Re: [Spark Java] Add new column in DataSet based on existed column

2018-03-28 Thread Divya Gehlot
Hi,

Here is an example snippet in Scala:

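import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.{col, to_date}

// Convert to a Date type (assumes df is declared as a var)
val timestamp2datetype: (Column) => Column = (x) => { to_date(x) }
df = df.withColumn("date", timestamp2datetype(col("end_date")))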
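
On the Java side, withColumn takes any Column expression, and org.apache.spark.sql.functions provides year, month, dayofmonth and hour, which can be applied to a timestamp column directly. If custom per-row logic is still wanted, it could be wrapped in a UDF. Below is a minimal sketch of only that per-row logic in plain Java. It uses java.time rather than Joda-Time and assumes the timestamps carry a UTC offset. The class and method names are hypothetical and not part of Spark.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration; not part of Spark or of the original thread.
final class IsoTimestampFields {

    // Parses an ISO-8601 timestamp that includes an offset,
    // e.g. "2018-03-28T15:16:00+08:00", and returns {year, month, day, hour}
    // as written in the timestamp's own offset (no time zone conversion).
    static int[] extract(String isoTimestamp) {
        OffsetDateTime t =
            OffsetDateTime.parse(isoTimestamp, DateTimeFormatter.ISO_OFFSET_DATE_TIME);
        return new int[] {
            t.getYear(), t.getMonthValue(), t.getDayOfMonth(), t.getHour()
        };
    }

    public static void main(String[] args) {
        int[] f = extract("2018-03-28T15:16:00+08:00");
        System.out.println("year=" + f[0] + " month=" + f[1]
            + " day=" + f[2] + " hour=" + f[3]);
    }
}
```

Each extracted field would then become its own column, one withColumn call per field. Where the built-in functions are sufficient they are probably the simpler choice, since no UDF is needed.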

Hope this helps!

Thanks,

Divya



On 28 March 2018 at 15:16, Junfeng Chen wrote:

> I am working on adding a date-derived field to an existing dataset.
>
> The current dataset contains a column named timestamp in ISO format. I
> want to parse this field into a Joda-Time type, and then extract the year,
> month, day, and hour as new columns attached to the original dataset.
> I have tried the df.withColumn function, but it seems to support only simple
> expressions rather than custom functions like MapFunction.
> How can I solve this?
>
> Thanks!
>
>
>
> Regards,
> Junfeng Chen
>


[Spark Java] Add new column in DataSet based on existed column

2018-03-28 Thread Junfeng Chen
I am working on adding a date-derived field to an existing dataset.

The current dataset contains a column named timestamp in ISO format. I want
to parse this field into a Joda-Time type, and then extract the year, month,
day, and hour as new columns attached to the original dataset.
I have tried the df.withColumn function, but it seems to support only simple
expressions rather than custom functions like MapFunction.
How can I solve this?

Thanks!



Regards,
Junfeng Chen