I am using Python Spark 1.6 with the --packages
datastax:spark-cassandra-connector:1.6.0-M1-s_2.10

I need to convert a timestamp string into a Unix epoch timestamp. The
unix_timestamp() function assumes the current time zone; however, my
string data is UTC and encodes the time zone offset as zero. I have not
been able to find a way to get the Unix time calculated correctly.
SimpleDateFormat does not have good time zone support.
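To make the discrepancy concrete, here is the same conversion in plain Python, outside Spark: calendar.timegm() interprets a broken-down time as UTC, unlike time.mktime(), which applies the local zone (this is just an illustration of the value I am after):

```python
import calendar
from datetime import datetime

# Parse the string with no time zone information attached...
dt = datetime.strptime("2016-03-12 00:30:30", "%Y-%m-%d %H:%M:%S")

# ...then treat the broken-down time as UTC. calendar.timegm() is the
# UTC counterpart of time.mktime(), which would apply the local zone.
epoch = calendar.timegm(dt.timetuple())

print(epoch)  # 1457742630
```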

I could write a UDF to adjust for time zones; however, this seems like a
hack.
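For reference, the hack would look something like this (a sketch only; the function name is mine, and the commented-out registration assumes the Spark 1.6 sqlContext.registerFunction API):

```python
import calendar
from datetime import datetime

def utc_string_to_epoch(s):
    # Drop any fractional seconds ("2016-03-12 00:30:30.0" -> first 19
    # chars), parse, and treat the result as UTC rather than the JVM's
    # default time zone.
    dt = datetime.strptime(s[:19], "%Y-%m-%d %H:%M:%S")
    return calendar.timegm(dt.timetuple())

# Hypothetical registration for use from SQL in Spark 1.6:
# from pyspark.sql.types import LongType
# sqlContext.registerFunction("utc_epoch", utc_string_to_epoch, LongType())
```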

I tried using to_utc_timestamp(created, 'GMT'); however, this creates a
timestamp. I have not been able to figure out how to convert this to a
Unix timestamp, i.e. a long representing the epoch.

Any suggestions?

stmnt = "select \
            row_key, created, count, unix_timestamp(created) as unixTimeStamp, \
            unix_timestamp(created, 'yyyy-MM-dd HH:mm:ss.z') as etc \
         from \
            rawTable \
         where \
             (created > '{0}') and (created <= '{1}') \
             and \
             (row_key = 'blue' \
                        or row_key = 'red' \
                    )".format('2016-03-12 00:30:00+0000', '2016-03-12 04:30:00+0000')


Sample output

root
 |-- row_key: string (nullable = true)
 |-- created: timestamp (nullable = true)
 |-- count: long (nullable = true)
 |-- unixTimeStamp: long (nullable = true)
 |-- etc: long (nullable = true)

2016-03-12 00:30:30.0 should be 1457742630 not 1457771430
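The gap between the two values is exactly eight hours, which is consistent with unix_timestamp() applying a local offset such as US Pacific time (my assumption about the cluster's default zone):

```python
wrong = 1457771430     # what unix_timestamp(created) returns
expected = 1457742630  # the epoch for 2016-03-12 00:30:30 UTC

diff = wrong - expected
print(diff, diff // 3600)  # 28800 seconds = 8 hours
```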

+-------+---------------------+-----+-------------+----------+
|row_key|created              |count|unixTimeStamp|utc       |
+-------+---------------------+-----+-------------+----------+
|red    |2016-03-12 00:30:30.0|2    |1457771430   |1457771430|
|red    |2016-03-12 00:30:45.0|1    |1457771445   |1457771445|
+-------+---------------------+-----+-------------+----------+


From the Spark functions Javadoc
(https://spark.apache.org/docs/1.5.0/api/java/org/apache/spark/sql/functions.html):

static Column unix_timestamp(Column s)
    Converts time string in format yyyy-MM-dd HH:mm:ss to Unix timestamp
    (in seconds), using the default timezone and the default locale,
    return null if fail.

static Column unix_timestamp(Column s, java.lang.String p)
    Convert time string with given pattern (see
    http://docs.oracle.com/javase/tutorial/i18n/format/simpleDateFormat.html)
    to Unix time stamp (in seconds), return null if fail.

