tbcs commented on a change in pull request #24448: [SPARK-23299][SQL][PYSPARK]
Fix __repr__ behaviour for Rows
URL: https://github.com/apache/spark/pull/24448#discussion_r279123127
##########
File path: python/pyspark/sql/tests/test_types.py
##########
@@ -739,6 +739,11 @@ def test_timestamp_microsecond(self):
         tst = TimestampType()
         self.assertEqual(tst.toInternal(datetime.datetime.max) % 1000000,
                          999999)
+    # regression test for SPARK-23299
+    def test_row_without_column_name(self):
+        row = Row("Alice", 11)
+        self.assertEqual(repr(row), "<Row(Alice, 11)>")
Review comment:
> test non-ascii compatible characters

I have added a test with unicode values. As expected, it fails in Python 2.7
because __repr__ formats the values with %s. I think it would be reasonable to
switch from %s to %r for the individual tuple values, as originally suggested
in the JIRA ticket. I've made that change and adapted the doctest accordingly.
What do you think?
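
To make the difference between the two conversions concrete, here is a minimal sketch. `SketchRow` and `repr_with_s` are hypothetical names used only for illustration; this is not the actual `pyspark.sql.Row` implementation, just a tuple subclass that mimics the unnamed-row case from the test above.

```python
class SketchRow(tuple):
    """Hypothetical stand-in for an unnamed Row; not the real pyspark class."""

    def __new__(cls, *values):
        return super().__new__(cls, values)

    def repr_with_s(self):
        # %s applies str() to each value, so strings lose their quotes.
        return "<Row(%s)>" % ", ".join("%s" % (v,) for v in self)

    def __repr__(self):
        # %r applies repr() to each value, so strings keep their quotes
        # and non-ASCII text is rendered by repr() rather than str().
        return "<Row(%s)>" % ", ".join("%r" % (v,) for v in self)


row = SketchRow("Alice", 11)
s_form = row.repr_with_s()  # "<Row(Alice, 11)>"
r_form = repr(row)          # "<Row('Alice', 11)>"
print(s_form)
print(r_form)
```

With %r the expected string in the regression test would need the quotes around 'Alice', which is why the doctest had to be adapted.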