Gerhard Fiedler created SPARK-12683:
---------------------------------------
Summary: SQL timestamp is wrong when accessed as Python datetime
Key: SPARK-12683
URL: https://issues.apache.org/jira/browse/SPARK-12683
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 1.6.0, 1.5.2, 1.5.1
Environment: Windows 7 Pro x64
Python 3.4.3
py4j 0.9
Reporter: Gerhard Fiedler
SQL timestamp data looks correct when displayed through {{.show()}}, but is
wrong when retrieved as a Python {{datetime}} through {{.collect()}}: in the
example below, the hour comes back as 18 instead of 12.
{code}
from datetime import datetime
from pyspark import SparkContext
from pyspark.sql import SQLContext
if __name__ == "__main__":
    spark_context = SparkContext(appName='SparkBugTimestampHour')
    sql_context = SQLContext(spark_context)
    sql_text = """select cast('2100-09-09 12:11:10.09' as timestamp) as ts"""
    data_frame = sql_context.sql(sql_text)
    data_frame.show(truncate=False)
    # Result from .show() (as expected, looks correct):
    # +----------------------+
    # |ts                    |
    # +----------------------+
    # |2100-09-09 12:11:10.09|
    # +----------------------+
    rows = data_frame.collect()
    row = rows[0]
    ts = row[0]
    print('ts={ts}'.format(ts=ts))
    # Expected result from this print statement:
    # ts=2100-09-09 12:11:10.090000
    #
    # Actual, wrong result (note the hours being 18 instead of 12):
    # ts=2100-09-09 18:11:10.090000
    #
    # This error seems to be dependent on some characteristic of the system. We couldn't reproduce
    # this on all of our systems, but it is not clear what the differences are. One difference is
    # the processor: it failed on Intel Xeon E5-2687W v2.
    assert isinstance(ts, datetime)
    assert ts.year == 2100 and ts.month == 9 and ts.day == 9
    assert ts.minute == 11 and ts.second == 10 and ts.microsecond == 90000
    if ts.hour != 12:
        print('hour is not correct; should be 12, is actually {hour}'.format(hour=ts.hour))
    spark_context.stop()
{code}
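To help narrow this down on an affected machine, the size of the shift can be measured without Spark. The following is a minimal sketch that uses only the expected and actual values quoted in the comments above. The helper names ({{hour_shift}}, {{to_epoch_micros}}, {{from_epoch_micros}}) are hypothetical and not part of PySpark. It shows that the result is off by exactly six whole hours, with minutes, seconds and microseconds intact, and that a conversion through epoch microseconds by plain arithmetic (no local-time lookup) preserves the value. Whether the shift actually comes from a time zone or DST conversion is an assumption that this report does not confirm.

```python
from datetime import datetime, timedelta

# Naive reference point; all arithmetic below ignores the local time zone.
EPOCH = datetime(1970, 1, 1)


def to_epoch_micros(ts):
    """Microseconds between EPOCH and a naive datetime, using integer arithmetic."""
    delta = ts - EPOCH
    return (delta.days * 86400 + delta.seconds) * 1000000 + delta.microseconds


def from_epoch_micros(micros):
    """Inverse of to_epoch_micros; does not consult the system's local time."""
    return EPOCH + timedelta(microseconds=micros)


def hour_shift(expected, actual):
    """Difference between two naive datetimes, in hours."""
    return (actual - expected).total_seconds() / 3600.0


# Values taken from the comments in the reproduction script above.
expected = datetime(2100, 9, 9, 12, 11, 10, 90000)
actual = datetime(2100, 9, 9, 18, 11, 10, 90000)

shift = hour_shift(expected, actual)
round_trip = from_epoch_micros(to_epoch_micros(expected))

print('shift in hours: {shift}'.format(shift=shift))
print('round trip intact: {ok}'.format(ok=(round_trip == expected)))
```

If the round trip is intact but {{.collect()}} still returns a shifted hour, the conversion that depends on the system is likely somewhere other than this kind of plain arithmetic, for example in a local-time lookup. That is a guess to be checked, not a finding.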