I am using Python Spark 1.6 with --packages
datastax:spark-cassandra-connector:1.6.0-M1-s_2.10.
I need to convert a timestamp string into a Unix epoch timestamp. The
unix_timestamp() function assumes the current time zone; however, my
string data is UTC, with the time zone encoded as a zero offset. I have not
been able to find a way to get the Unix time calculated correctly.
SimpleDateFormat does not have good time zone support. Any suggestions?
I could write a UDF to adjust for time zones; however, this seems like a
hack.
I tried using to_utc_timestamp(created, 'gmt'); however, this creates a
timestamp. I have not been able to figure out how to convert this to a Unix
timestamp, i.e. a long representing the epoch.
Any suggestions?
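One workaround I can sketch (utc_to_epoch is a hypothetical helper, not part of the Spark API): parse the string as UTC in plain Python using calendar.timegm, which interprets the parsed struct_time as UTC instead of local time, and register it as a UDF:

```python
import calendar
import time

def utc_to_epoch(ts):
    """Parse a 'yyyy-MM-dd HH:mm:ss.S' string as UTC; return epoch seconds."""
    # Drop the fractional-second part; strptime's %S does not accept it.
    base, _, _frac = ts.partition('.')
    # calendar.timegm treats the struct_time as UTC, unlike time.mktime,
    # which assumes the local time zone.
    return calendar.timegm(time.strptime(base, '%Y-%m-%d %H:%M:%S'))

print(utc_to_epoch('2016-03-12 00:30:30.0'))  # 1457742630
```

It could then be exposed to SQL with Spark 1.6's registration API, e.g. sqlContext.registerFunction('utc_epoch', utc_to_epoch, LongType()) with LongType imported from pyspark.sql.types.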
stmnt = "select \
            row_key, created, count, unix_timestamp(created) as unixTimeStamp, \
            unix_timestamp(created, 'yyyy-MM-dd HH:mm:ss.z') as etc \
        from \
            rawTable \
        where \
            (created > '{0}') and (created <= '{1}') \
        and \
            (row_key = 'blue' \
             or row_key = 'red' \
            )".format('2016-03-12 00:30:00+0000', '2016-03-12 04:30:00+0000')
Sample output:
root
|-- row_key: string (nullable = true)
|-- created: timestamp (nullable = true)
|-- count: long (nullable = true)
|-- unixTimeStamp: long (nullable = true)
|-- etc: long (nullable = true)
2016-03-12 00:30:30.0 should be 1457742630, not 1457771430:
+-------------+---------------------+-----+-------------+----------+
|row_key |created |count|unixTimeStamp|utc |
+-------------+---------------------+-----+-------------+----------+
|red|2016-03-12 00:30:30.0|2 |1457771430 |1457771430|
|red|2016-03-12 00:30:45.0|1 |1457771445 |1457771445|
static Column unix_timestamp(Column s)
    Converts time string in format yyyy-MM-dd HH:mm:ss to Unix timestamp
    (in seconds), using the default timezone and the default locale, return
    null if fail.
    <https://spark.apache.org/docs/1.5.0/api/java/org/apache/spark/sql/functions.html#unix_timestamp(org.apache.spark.sql.Column)>

static Column unix_timestamp(Column s, java.lang.String p)
    Convert time string with given pattern (see
    [http://docs.oracle.com/javase/tutorial/i18n/format/simpleDateFormat.html])
    to Unix time stamp (in seconds), return null if fail.
    <https://spark.apache.org/docs/1.5.0/api/java/org/apache/spark/sql/functions.html#unix_timestamp(org.apache.spark.sql.Column,%20java.lang.String)>
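Since created is already a timestamp column, another possibility I can sketch (assuming the Cassandra connector stores the correct UTC instant internally) is to cast it straight to a long, which should yield epoch seconds without re-parsing a string in the session time zone:

```sql
select row_key, created, cast(created as bigint) as unixTimeStamp
from rawTable
where (created > '2016-03-12 00:30:00+0000')
  and (created <= '2016-03-12 04:30:00+0000')
```

This sidesteps unix_timestamp() and its default-time-zone behavior entirely; whether the internal instant is correct depends on how the connector wrote the column.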