Ihor Bobak created SPARK-27353:
----------------------------------
Summary: PySpark Row __repr__ bug
Key: SPARK-27353
URL: https://issues.apache.org/jira/browse/SPARK-27353
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 2.4.0
Reporter: Ihor Bobak
The Row class has this implementation of __repr__:

    def __repr__(self):
        """Printable representation of Row used in Python REPL."""
        if hasattr(self, "__fields__"):
            return "Row(%s)" % ", ".join("%s=%r" % (k, v)
                                         for k, v in zip(self.__fields__,
                                                         tuple(self)))
        else:
            return "<Row(%s)>" % ", ".join(self)

The last line fails when a Row without field names contains a non-string value, for example a datetime.date instance, because str.join() accepts only str items:

TypeError                                 Traceback (most recent call last)
<ipython-input-41-02c2f5a33c6e> in <module>
      2     print(*row.values)
      3     df_row = Row(*row.values)
----> 4     print(repr(df_row))
      5     break
      6

E:\spark\spark-2.3.2-bin-without-hadoop\python\pyspark\sql\types.py in __repr__(self)
   1579                                          for k, v in zip(self.__fields__, tuple(self)))
   1580         else:
-> 1581             return "<Row(%s)>" % ", ".join(self)
   1582
   1583

TypeError: sequence item 0: expected str instance, datetime.date found
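
The failure can be reproduced without a Spark installation. The following is only a minimal sketch: RowSketch and buggy_repr are hypothetical names made up for illustration, not part of PySpark, and the class only mimics a Row built from positional values (so it has no __fields__). Its __repr__ shows one possible fix, formatting each value with repr() before joining; this is a suggestion, not necessarily how the project would resolve the issue.

```python
import datetime


class RowSketch(tuple):
    """Hypothetical stand-in for a pyspark.sql.Row created without field names."""

    def __new__(cls, *args):
        return tuple.__new__(cls, args)

    def buggy_repr(self):
        # Mirrors the failing branch: str.join() raises TypeError
        # as soon as it meets an item that is not a str.
        return "<Row(%s)>" % ", ".join(self)

    def __repr__(self):
        # Possible fix (an assumption): convert every value with repr() first.
        return "<Row(%s)>" % ", ".join(repr(v) for v in self)


row = RowSketch(datetime.date(2019, 4, 2), "a")

try:
    row.buggy_repr()
    buggy_error = None
except TypeError as exc:
    buggy_error = str(exc)

fixed = repr(row)
print(buggy_error)
print(fixed)  # <Row(datetime.date(2019, 4, 2), 'a')>
```

Using repr() rather than str() keeps the output consistent with the named-field branch, which already formats values with %r.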