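Use the Scala method .split(",") to split the line into an array of strings, then call .replaceAll() on each field that contains "?" to remove it. Two caveats: .replaceAll() takes a regular expression, so a bare "?" has to be escaped as "\\?" (or use .replace("?", ""), which treats its argument literally), and a plain split on "," also breaks inside quoted fields such as "?2,500.00", so the split needs to skip commas that sit inside double quotes.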
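Here is a minimal sketch of that approach in plain Scala, using the sample line from the question. The names `line`, `fields`, `cleanAmount` and `cleaned` are made up for illustration, and the assumption that only columns 3 to 5 hold amounts comes from reading the sample line, not from any schema.

```scala
// Sample line copied from the question (res94).
val line = "360,10/02/2014,\"?2,500.00\",?0.00,\"?2,500.00\""

// Split only on commas outside double quotes: the lookahead requires an even
// number of quote characters between the comma and the end of the line.
val fields: Array[String] = line.split(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)")

// Keep only digits and dots, mirroring the Hive pattern [^\d\.].
// This also drops the surrounding quotes, the "?" and the thousands comma.
def cleanAmount(s: String): String = s.replaceAll("[^\\d.]", "")

// Assumption: the first two columns (id, date) stay as they are and the
// remaining columns are amounts. Cleaning the date would strip its slashes.
val cleaned: Array[String] = fields.take(2) ++ fields.drop(2).map(cleanAmount)

println(cleaned.mkString("|"))
```

If csv2 is an RDD[String], as the csv2.first call in the question suggests, the same two steps could probably be applied per line inside csv2.map(...). For anything beyond a quick cleanup, a proper CSV parser is likely more robust than a regex split.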
On Thu, Feb 18, 2016 at 2:09 PM, Mich Talebzadeh <m...@peridale.co.uk> wrote:

> Hi,
>
> What is the equivalent of this Hive statement in Spark
>
> select "?2,500.00", REGEXP_REPLACE("?2,500.00",'[^\\d\\.]','');
> +------------+----------+--+
> | _c0        | _c1      |
> +------------+----------+--+
> | ?2,500.00  | 2500.00  |
> +------------+----------+--+
>
> Basically I want to get rid of "?" and "," in the csv file
>
> The full csv line is
>
> scala> csv2.first
> res94: String = 360,10/02/2014,"?2,500.00",?0.00,"?2,500.00"
>
> I want to transform that string into 5 columns and use "," as the split
>
> Thanks,
>
> Dr Mich Talebzadeh