Ihor Bobak created SPARK-27353:
----------------------------------

             Summary: PySpark Row __repr__ bug
                 Key: SPARK-27353
                 URL: https://issues.apache.org/jira/browse/SPARK-27353
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 2.4.0
            Reporter: Ihor Bobak


Row class has this implementation of __repr__:

    def __repr__(self):
        """Printable representation of Row used in Python REPL."""
        if hasattr(self, "__fields__"):
            return "Row(%s)" % ", ".join("%s=%r" % (k, v)
                                         for k, v in zip(self.__fields__, tuple(self)))
        else:
            return "<Row(%s)>" % ", ".join(self)

 

The last line fails when the row holds a datetime.date instance (or any other non-string value), because str.join accepts only str items:


TypeError                                 Traceback (most recent call last)
<ipython-input-41-02c2f5a33c6e> in <module>
      2     print(*row.values)
      3     df_row = Row(*row.values)
----> 4     print(repr(df_row))
      5     break
      6 

E:\spark\spark-2.3.2-bin-without-hadoop\python\pyspark\sql\types.py in __repr__(self)
   1579                                          for k, v in zip(self.__fields__, tuple(self)))
   1580         else:
-> 1581             return "<Row(%s)>" % ", ".join(self)
   1582 
   1583 

TypeError: sequence item 0: expected str instance, datetime.date found
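
The failure can be reproduced without a Spark installation. The sketch below is only an illustration: PositionalRow and PatchedRow are hypothetical stand-ins that copy the failing branch of __repr__ onto a plain tuple subclass. They are not the real pyspark.sql.Row. Formatting each value with repr() before joining is one possible fix, not necessarily the one the project would adopt.

```python
import datetime


class PositionalRow(tuple):
    """Hypothetical stand-in for a Row created without field names."""

    def __repr__(self):
        # Same expression as the failing branch quoted above:
        # str.join raises TypeError on any item that is not a str.
        return "<Row(%s)>" % ", ".join(self)


class PatchedRow(tuple):
    """Same stand-in, with each value passed through repr() before joining."""

    def __repr__(self):
        return "<Row(%s)>" % ", ".join(repr(v) for v in self)


row_values = (datetime.date(2019, 4, 2), "a", 1)

error_message = None
try:
    repr(PositionalRow(row_values))
except TypeError as exc:
    # Captured so the failure can be inspected instead of aborting the script.
    error_message = str(exc)

patched_repr = repr(PatchedRow(row_values))

print(error_message)
print(patched_repr)
```

With repr() applied per value, strings appear quoted and dates appear as datetime.date(...), which matches how the named-field branch already formats values with %r.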



 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
