Yet another approach:

scala> val df1 = df.selectExpr("client_id", "from_unixtime(ts/1000,'yyyy-MM-dd') as ts")

Mgr. Michal Šenkýř

On 5.12.2016 09:22, Deepak Sharma wrote:
Another, simpler approach is:
scala> val findf = sqlContext.sql("select client_id,from_unixtime(ts/1000,'yyyy-MM-dd') ts from ts")
findf: org.apache.spark.sql.DataFrame = [client_id: string, ts: string]

scala> findf.show
+--------------------+----------+
|           client_id|        ts|
+--------------------+----------+
|cd646551-fceb-416...|2016-11-01|
|3bc61951-0f49-43b...|2016-11-01|
|688acc61-753f-4a3...|2016-11-23|
|5ff1eb6c-14ec-471...|2016-11-23|
+--------------------+----------+

I registered the original DataFrame as a temp table named ts.
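
The registration step itself is not shown in the message. A minimal sketch of what it could look like follows, assuming Spark 2.2 or later (on Spark 1.x, as used in the thread, the counterpart is df.registerTempTable("ts")). The names TempViewSketch and toDates are made up for illustration, ts is assumed to be a Long column here rather than a string, and the explicit cast to bigint is an addition that is not in the original query.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical wrapper around the temp-table approach from the thread.
object TempViewSketch {
  // Registers df under the name "ts" and converts epoch milliseconds to yyyy-MM-dd.
  def toDates(spark: SparkSession, df: DataFrame): DataFrame = {
    df.createOrReplaceTempView("ts")
    // from_unixtime expects seconds, so the millisecond value is divided by 1000 first.
    spark.sql(
      "select client_id, from_unixtime(cast(ts/1000 as bigint),'yyyy-MM-dd') ts from ts")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("ts-to-date")
      // Fix the session time zone so the formatted date does not depend on the machine.
      .config("spark.sql.session.timeZone", "UTC")
      .getOrCreate()
    import spark.implicits._

    // Two sample rows from the thread; the ids are shown truncated there as well.
    val df = Seq(
      ("cd646551-fceb-416...", 1477989416803L),
      ("688acc61-753f-4a3...", 1479899459947L)
    ).toDF("client_id", "ts")

    toDates(spark, df).show()
    spark.stop()
  }
}
```

The view name has to match the table name used in the SQL text, which is why both are ts here.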
Thanks
Deepak

On Mon, Dec 5, 2016 at 1:49 PM, Deepak Sharma <deepakmc...@gmail.com> wrote:

    This is the correct way to do it. The timestamps are in
    milliseconds, while from_unixtime expects seconds, so divide by
    1000 first:

    scala> val ts1 = from_unixtime($"ts"/1000, "yyyy-MM-dd")
    ts1: org.apache.spark.sql.Column = fromunixtime((ts / 1000),yyyy-MM-dd)

    scala> val finaldf = df.withColumn("ts1",ts1)
    finaldf: org.apache.spark.sql.DataFrame = [client_id: string, ts: string, ts1: string]

    scala> finaldf.show
    +--------------------+-------------+----------+
    |           client_id|           ts|       ts1|
    +--------------------+-------------+----------+
    |cd646551-fceb-416...|1477989416803|2016-11-01|
    |3bc61951-0f49-43b...|1477983725292|2016-11-01|
    |688acc61-753f-4a3...|1479899459947|2016-11-23|
    |5ff1eb6c-14ec-471...|1479901374026|2016-11-23|
    +--------------------+-------------+----------+


    Thanks
    Deepak

    On Mon, Dec 5, 2016 at 1:46 PM, Deepak Sharma <deepakmc...@gmail.com> wrote:

        This is how you can do it in Scala:
        scala> val ts1 = from_unixtime($"ts", "yyyy-MM-dd")
        ts1: org.apache.spark.sql.Column = fromunixtime(ts,yyyy-MM-dd)

        scala> val finaldf = df.withColumn("ts1",ts1)
        finaldf: org.apache.spark.sql.DataFrame = [client_id: string, ts: string, ts1: string]

        scala> finaldf.show
        +--------------------+-------------+-----------+
        |           client_id|           ts|        ts1|
        +--------------------+-------------+-----------+
        |cd646551-fceb-416...|1477989416803|48805-08-14|
        |3bc61951-0f49-43b...|1477983725292|48805-06-09|
        |688acc61-753f-4a3...|1479899459947|48866-02-22|
        |5ff1eb6c-14ec-471...|1479901374026|48866-03-16|
        +--------------------+-------------+-----------+

        The year comes out wrong here. Maybe the input timestamp is
        not correct. Not sure.
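
A likely explanation for the year 48805, sketched with plain java.time rather than Spark: the ts values look like epoch milliseconds, while from_unixtime interprets its argument as seconds. The object name EpochUnits and its helper names are made up for illustration, and UTC is assumed, so the exact day can differ from a Spark session running in another time zone.

```scala
import java.time.{Instant, LocalDate, ZoneOffset}

// Hypothetical helper, not part of Spark: shows how the unit of the
// epoch value changes the resulting calendar date (UTC assumed).
object EpochUnits {
  // Interpret the value as seconds since 1970-01-01, as from_unixtime does.
  def fromSeconds(value: Long): LocalDate =
    Instant.ofEpochSecond(value).atZone(ZoneOffset.UTC).toLocalDate

  // Interpret the value as milliseconds since 1970-01-01.
  def fromMillis(value: Long): LocalDate =
    Instant.ofEpochMilli(value).atZone(ZoneOffset.UTC).toLocalDate

  def main(args: Array[String]): Unit = {
    val ts = 1477989416803L // first sample row from the thread
    println(fromSeconds(ts).getYear) // 48805: milliseconds misread as seconds
    println(fromMillis(ts))          // 2016-11-01
    println(fromSeconds(ts / 1000))  // 2016-11-01: dividing by 1000 first gives the same date
  }
}
```

Reading a 13-digit millisecond value as seconds overshoots by a factor of 1000, which is what pushes the year tens of thousands of years into the future.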

        Thanks
        Deepak

        On Mon, Dec 5, 2016 at 1:34 PM, Devi P.V <devip2...@gmail.com> wrote:

            Hi,

            Thanks for replying to my question.
            I am using Scala.

            On Mon, Dec 5, 2016 at 1:20 PM, Marco Mistroni <mmistr...@gmail.com> wrote:

                Hi,
                In Python you can use
                datetime.fromtimestamp(......).strftime('%Y%m%d')........
                Which spark API are you using?
                Kr

                On 5 Dec 2016 7:38 am, "Devi P.V" <devip2...@gmail.com> wrote:

                    Hi all,

                    I have a DataFrame like the following:

                    +--------------------------+-------------+
                    |client_id                 |timestamp    |
                    +--------------------------+-------------+
                    |cd646551-fceb-4166-acbc-b9|1477989416803|
                    |3bc61951-0f49-43bf-9848-b2|1477983725292|
                    |688acc61-753f-4a33-a034-bc|1479899459947|
                    |5ff1eb6c-14ec-4716-9798-00|1479901374026|
                    +--------------------------+-------------+

                    I want to convert the timestamp column into
                    yyyy-MM-dd format.
                    How can I do this?


                    Thanks





-- Thanks
        Deepak
        www.bigdatabig.com <http://www.bigdatabig.com>
        www.keosha.net <http://www.keosha.net>



