[
https://issues.apache.org/jira/browse/BEAM-11929?focusedWorklogId=563465&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-563465
]
ASF GitHub Bot logged work on BEAM-11929:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 10/Mar/21 02:38
Start Date: 10/Mar/21 02:38
Worklog Time Spent: 10m
Work Description: udim commented on a change in pull request #14156:
URL: https://github.com/apache/beam/pull/14156#discussion_r590946586
##########
File path: sdks/python/apache_beam/pvalue.py
##########
@@ -666,15 +666,14 @@ def as_dict(self):
return dict(self.__dict__)
def __iter__(self):
- for _, value in sorted(self.__dict__.items()):
+ for _, value in self.__dict__.items():
yield value
def __repr__(self):
- return 'Row(%s)' % ', '.join(
- '%s=%r' % kv for kv in sorted(self.__dict__.items()))
+ return 'Row(%s)' % ', '.join('%s=%r' % kv for kv in self.__dict__.items())
def __hash__(self):
- return hash(type(sorted(self.__dict__.items())))
+ return hash(type(self.__dict__.items()))
def __eq__(self, other):
return type(self) == type(other) and self.__dict__ == other.__dict__
Review comment:
Comparing dicts ignores order. Is that intentional?
```
>>> {1:2, 3:4} == {1:2, 3:4}
True
>>> {1:2, 3:4} == {3:4, 1:2}
True
```
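If field order is meant to be significant for equality, one option is to compare the item sequences instead of the dicts. A minimal sketch, assuming Python 3.7+ dict ordering; `SketchRow` is a hypothetical stand-in for illustration, not the actual `Row` class:

```python
class SketchRow(object):
  """Hypothetical stand-in for Row; not the Beam implementation."""
  def __init__(self, **kwargs):
    # Keyword arguments keep their call order, so __dict__ does too.
    self.__dict__.update(kwargs)

  def __eq__(self, other):
    # Lists of (name, value) pairs compare element by element, so
    # field order matters here, unlike a plain dict == dict check.
    return (
        type(self) == type(other) and
        list(self.__dict__.items()) == list(other.__dict__.items()))

  # Defining __eq__ alone makes the class unhashable; hashing is left
  # out of this sketch on purpose.


same_order = SketchRow(a=1, b=2) == SketchRow(a=1, b=2)
swapped_order = SketchRow(a=1, b=2) == SketchRow(b=2, a=1)
```

Here `same_order` is `True` and `swapped_order` is `False`, while the underlying dicts of the two swapped rows still compare equal.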
##########
File path: sdks/python/apache_beam/pvalue.py
##########
@@ -666,15 +666,14 @@ def as_dict(self):
return dict(self.__dict__)
def __iter__(self):
- for _, value in sorted(self.__dict__.items()):
+ for _, value in self.__dict__.items():
yield value
def __repr__(self):
- return 'Row(%s)' % ', '.join(
- '%s=%r' % kv for kv in sorted(self.__dict__.items()))
+ return 'Row(%s)' % ', '.join('%s=%r' % kv for kv in self.__dict__.items())
def __hash__(self):
- return hash(type(sorted(self.__dict__.items())))
+ return hash(type(self.__dict__.items()))
Review comment:
I tried running this interactively but got the same hash:
```
>>> hash(type({}.items()))
5913863196444
>>> hash(type({1:2}.items()))
5913863196444
```
Is it working as intended?
I guess this is technically okay according to the docs, but probably not
what you wanted: `The only required property is that objects which compare
equal have the same hash value`.
[ref](https://docs.python.org/3/reference/datamodel.html#object.__hash__)
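For illustration, a hash that depends on the field contents while staying consistent with the order-insensitive `__eq__` could look roughly like this. It is only a sketch: `HashSketchRow` is a hypothetical stand-in, not the actual `Row` class, and it assumes every field value is hashable:

```python
class HashSketchRow(object):
  """Hypothetical stand-in for Row; not the Beam implementation."""
  def __init__(self, **kwargs):
    self.__dict__.update(kwargs)

  def __eq__(self, other):
    # Same order-insensitive comparison as in the diff above.
    return type(self) == type(other) and self.__dict__ == other.__dict__

  def __hash__(self):
    # A frozenset ignores ordering, so rows that compare equal under
    # __eq__ always hash the same. Raises TypeError if any field value
    # is unhashable (an assumption of this sketch).
    return hash(frozenset(self.__dict__.items()))


# The hash in the diff only sees the dict_items type, so it is the
# same for every dict regardless of contents.
constant_hash = hash(type({}.items())) == hash(type({1: 2}.items()))

first = HashSketchRow(a=1, b=2)
second = HashSketchRow(b=2, a=1)
consistent = first == second and hash(first) == hash(second)
```

Both `constant_hash` and `consistent` evaluate to `True`. If equality were made order-sensitive instead, hashing `tuple(self.__dict__.items())` would be the matching choice.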
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 563465)
Time Spent: 2h (was: 1h 50m)
> DataframeTransform, BatchRowsAsDataFrame do not preserve field order when
> schema created with beam.Row
> -----------------------------------------------------------------------------------------------------
>
> Key: BEAM-11929
> URL: https://issues.apache.org/jira/browse/BEAM-11929
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Affects Versions: 2.26.0, 2.27.0, 2.28.0
> Reporter: Brian Hulette
> Assignee: Brian Hulette
> Priority: P2
> Labels: dataframe-api
> Fix For: 2.29.0
>
> Time Spent: 2h
> Remaining Estimate: 0h
>
> The workaround is to use a NamedTuple instance with DataframeTransform.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)