[ https://issues.apache.org/jira/browse/SPARK-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194892#comment-15194892 ]
Adrian Wang commented on SPARK-13837:
-------------------------------------
Which timezone is your system in?
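One way to answer this is to print the local timezone and its UTC offset on the two dates from the report. The following is only a minimal sketch using the Python standard library. The helper name local_utc_offset_hours is hypothetical and is not part of Spark or PySpark. It assumes the naive datetimes are meant as local wall-clock time.

```python
import calendar
import datetime
import time


def local_utc_offset_hours(moment):
    """Return the local UTC offset, in hours, at a naive local datetime.

    Hypothetical helper for illustration; not part of Spark or PySpark.
    """
    # Interpret the naive datetime as local wall-clock time.
    epoch = int(time.mktime(moment.timetuple()))
    # Re-reading the local broken-down time as if it were UTC shifts the
    # epoch by exactly the local offset in seconds.
    shifted = calendar.timegm(time.localtime(epoch))
    return (shifted - epoch) / 3600.0


if __name__ == "__main__":
    print("Timezone names (standard, DST):", time.tzname)
    for ts in (datetime.datetime(2015, 2, 20, 0, 0, 2),
               datetime.datetime(2015, 10, 9, 0, 0, 2)):
        print(ts, "UTC offset (hours):", local_utc_offset_hours(ts))
```

If the two offsets differ, daylight saving time is in effect on one of the dates, which may be worth mentioning in the reply.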
> SQL Context function to_date() returns wrong date
> -------------------------------------------------
>
> Key: SPARK-13837
> URL: https://issues.apache.org/jira/browse/SPARK-13837
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.5.1
> Environment: Python version:
> 2.7.6 (default, Mar 22 2014, 22:59:56)
> [GCC 4.8.2]
> Reporter: Arnaud Caruso
>
> When using the SQL Context function to_date on a timestamp, it sometimes
> returns the wrong date.
> Here's how to reproduce the bug in Python:
> # sc and sqlCtx are the contexts provided by the pyspark shell
> import datetime
> from pyspark.sql.types import StructField, StructType, TimestampType
> data = [[datetime.datetime(2015, 2, 20, 0, 0, 2)],
>         [datetime.datetime(2015, 10, 9, 0, 0, 2)]]
> rddData = sc.parallelize(data)
> fields = [StructField('timestamp', TimestampType(), True)]
> schema = StructType(fields)
> data_table = sqlCtx.createDataFrame(data, schema)
> sqlCtx.registerDataFrameAsTable(data_table, "data")
> query = "SELECT timestamp, TO_DATE(timestamp) FROM data"
> df = sqlCtx.sql(query)
> df.collect()
> Here are the results I get:
> [Row(timestamp=datetime.datetime(2015, 2, 20, 0, 0, 2),
> _c1=datetime.date(2015, 2, 20)),
> Row(timestamp=datetime.datetime(2015, 10, 9, 0, 0, 2),
> _c1=datetime.date(2015, 10, 8))]
> The first date is right, but the second is wrong: it returns October 8th
> instead of October 9th.