Another, simpler approach would be:
scala> val findf = sqlContext.sql("select client_id, from_unixtime(ts/1000, 'yyyy-MM-dd') ts from ts")
findf: org.apache.spark.sql.DataFrame = [client_id: string, ts: string]

scala> findf.show
+--------------------+----------+
|           client_id|        ts|
+--------------------+----------+
|cd646551-fceb-416...|2016-11-01|
|3bc61951-0f49-43b...|2016-11-01|
|688acc61-753f-4a3...|2016-11-23|
|5ff1eb6c-14ec-471...|2016-11-23|
+--------------------+----------+
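
For anyone who wants to sanity-check the conversion outside Spark, here is a plain-Scala sketch of what from_unixtime(ts/1000, 'yyyy-MM-dd') computes. Note this sketch pins the zone to UTC for determinism, whereas Spark's from_unixtime uses the session time zone, so dates near midnight can differ:

```scala
import java.time.Instant
import java.time.ZoneOffset
import java.time.format.DateTimeFormatter

// Plain-Scala equivalent of from_unixtime(ts/1000, 'yyyy-MM-dd').
// Zone pinned to UTC here; Spark uses the session time zone instead.
def millisToDate(ts: Long): String =
  DateTimeFormatter.ofPattern("yyyy-MM-dd")
    .withZone(ZoneOffset.UTC)
    .format(Instant.ofEpochMilli(ts))
```

For example, millisToDate(1477989416803L) returns "2016-11-01", matching the first row above.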

I registered a temp table ("ts") from the original DataFrame first.
Thanks
Deepak

On Mon, Dec 5, 2016 at 1:49 PM, Deepak Sharma <deepakmc...@gmail.com> wrote:

> This is the correct way to do it. The timestamp conversion mentioned earlier
> was not correct:
>
> scala> val ts1 = from_unixtime($"ts"/1000, "yyyy-MM-dd")
> ts1: org.apache.spark.sql.Column = fromunixtime((ts / 1000),yyyy-MM-dd)
>
> scala> val finaldf = df.withColumn("ts1",ts1)
> finaldf: org.apache.spark.sql.DataFrame = [client_id: string, ts: string,
> ts1: string]
>
> scala> finaldf.show
> +--------------------+-------------+----------+
> |           client_id|           ts|       ts1|
> +--------------------+-------------+----------+
> |cd646551-fceb-416...|1477989416803|2016-11-01|
> |3bc61951-0f49-43b...|1477983725292|2016-11-01|
> |688acc61-753f-4a3...|1479899459947|2016-11-23|
> |5ff1eb6c-14ec-471...|1479901374026|2016-11-23|
> +--------------------+-------------+----------+
>
>
> Thanks
> Deepak
>
> On Mon, Dec 5, 2016 at 1:46 PM, Deepak Sharma <deepakmc...@gmail.com>
> wrote:
>
>> This is how you can do it in Scala:
>> scala> val ts1 = from_unixtime($"ts", "yyyy-MM-dd")
>> ts1: org.apache.spark.sql.Column = fromunixtime(ts,yyyy-MM-dd)
>>
>> scala> val finaldf = df.withColumn("ts1",ts1)
>> finaldf: org.apache.spark.sql.DataFrame = [client_id: string, ts: string,
>> ts1: string]
>>
>> scala> finaldf.show
>> +--------------------+-------------+-----------+
>> |           client_id|           ts|        ts1|
>> +--------------------+-------------+-----------+
>> |cd646551-fceb-416...|1477989416803|48805-08-14|
>> |3bc61951-0f49-43b...|1477983725292|48805-06-09|
>> |688acc61-753f-4a3...|1479899459947|48866-02-22|
>> |5ff1eb6c-14ec-471...|1479901374026|48866-03-16|
>> +--------------------+-------------+-----------+
>>
>> The year comes out wrong here. Maybe the input timestamp is in
>> milliseconds rather than seconds. Not sure.
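
The wrong year is a units problem: the ts values are epoch milliseconds, while from_unixtime expects epoch seconds, so each value gets read as tens of thousands of years' worth of seconds. A quick plain-Scala check with java.time (pinned to UTC) illustrates it:

```scala
import java.time.{Instant, ZoneOffset}

val millis = 1477989416803L  // first ts value from the DataFrame above

// Interpreted as SECONDS (what from_unixtime assumes), the date lands
// tens of thousands of years in the future:
val asSeconds = Instant.ofEpochSecond(millis).atZone(ZoneOffset.UTC).getYear
// asSeconds == 48805, matching the 48805-08-14 row above

// Interpreted as MILLISECONDS, it is the expected 2016 date:
val asMillis = Instant.ofEpochMilli(millis).atZone(ZoneOffset.UTC).getYear
// asMillis == 2016
```

This is exactly why dividing by 1000 before calling from_unixtime, as in the $"ts"/1000 version earlier in the thread, produces the right dates.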
>>
>> Thanks
>> Deepak
>>
>> On Mon, Dec 5, 2016 at 1:34 PM, Devi P.V <devip2...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Thanks for replying to my question.
>>> I am using scala
>>>
>>> On Mon, Dec 5, 2016 at 1:20 PM, Marco Mistroni <mmistr...@gmail.com>
>>> wrote:
>>>
>>>> Hi
>>>>  In Python you can use datetime.fromtimestamp(...).strftime('%Y%m%d')...
>>>> Which spark API are you using?
>>>> Kr
>>>>
>>>> On 5 Dec 2016 7:38 am, "Devi P.V" <devip2...@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I have a dataframe like following,
>>>>>
>>>>> +--------------------------+-------------+
>>>>> |client_id                 |timestamp    |
>>>>> +--------------------------+-------------+
>>>>> |cd646551-fceb-4166-acbc-b9|1477989416803|
>>>>> |3bc61951-0f49-43bf-9848-b2|1477983725292|
>>>>> |688acc61-753f-4a33-a034-bc|1479899459947|
>>>>> |5ff1eb6c-14ec-4716-9798-00|1479901374026|
>>>>> +--------------------------+-------------+
>>>>>
>>>>>  I want to convert the timestamp column into yyyy-MM-dd format.
>>>>> How can I do this?
>>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>
>>>
>>
>>
>> --
>> Thanks
>> Deepak
>> www.bigdatabig.com
>> www.keosha.net
>>
>
>
>
> --
> Thanks
> Deepak
> www.bigdatabig.com
> www.keosha.net
>



-- 
Thanks
Deepak
www.bigdatabig.com
www.keosha.net
