Alternatively, you can use the built-in function regexp_extract, which avoids defining a UDF.
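A minimal sketch of that approach, assuming the df and col1 from the original question below; the regex captures everything after the last "/":

```scala
import org.apache.spark.sql.functions.regexp_extract

// "([^/]+)$" matches the final path segment; group 1 is the capture,
// e.g. "/client/service/version/method" -> "method".
df.select(regexp_extract($"col1", "([^/]+)$", 1).as("method")).show()
```

Since regexp_extract is a native Catalyst expression, it should also be cheaper than a Scala UDF on 5+ million rows.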
> On May 12, 2016, at 20:27, Ewan Leith <[email protected]> wrote:
>
> You could use a UDF pretty easily, something like this should work, the
> lastElement function could be changed to do pretty much any string
> manipulation you want.
>
> import org.apache.spark.sql.functions.udf
>
> def lastElement(input: String) = input.split("/").last
>
> val lastElementUdf = udf(lastElement(_:String))
>
> df.select(lastElementUdf($"col1")).show()
>
> Ewan
>
>
> From: Bharathi Raja [mailto:[email protected]]
> Sent: 12 May 2016 11:40
> To: Raghavendra Pandey <[email protected]>; Bharathi Raja
> <[email protected]>
> Cc: User <[email protected]>
> Subject: RE: Spark 1.6.0: substring on df.select
>
> Thanks Raghav.
>
> I have 5+ million records. I feel creating multiple columns is not an
> optimal way.
>
> Please suggest any other alternate solution.
> Can’t we do string operations directly in df.select?
>
> Regards,
> Raja
>
> From: Raghavendra Pandey <[email protected]>
> Sent: 11 May 2016 09:04 PM
> To: Bharathi Raja <[email protected]>
> Cc: User <[email protected]>
> Subject: Re: Spark 1.6.0: substring on df.select
>
> You can create a column with the count of "/" per row. Then take the max of
> it and split each row into that many columns, using null as filler where a
> row has fewer segments.
>
> Raghav
>
> On 11 May 2016 20:37, "Bharathi Raja" <[email protected]> wrote:
> Hi,
>
> I have a dataframe column col1 with values something like
> “/client/service/version/method”. The number of “/” are not constant.
> Could you please help me to extract all methods from the column col1?
>
> In Pig I used SUBSTRING with LAST_INDEX_OF(“/”).
>
> Thanks in advance.
> Regards,
> Raja