gemini-code-assist[bot] commented on code in PR #37071: URL: https://github.com/apache/beam/pull/37071#discussion_r2615104444
########## sdks/python/apache_beam/typehints/row.py: ########## @@ -0,0 +1,74 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +from typing import NamedTuple + + +class Row(object): + """A dynamic schema'd row object. + + This objects attributes are initialized from the keywords passed into its + constructor, e.g. Row(x=3, y=4) will create a Row with two attributes x and y. + + More importantly, when a Row object is returned from a `Map`, `FlatMap`, or + `DoFn` type inference is able to deduce the schema of the resulting + PCollection, e.g. + + pc | beam.Map(lambda x: Row(x=x, y=0.5 * x)) + + when applied to a PCollection of ints will produce a PCollection with schema + `(x=int, y=float)`. + + Note that in Beam 2.30.0 and later, Row objects are sensitive to field order. + So `Row(x=3, y=4)` is not considered equal to `Row(y=4, x=3)`. + """ + def __init__(self, **kwargs): + self.__dict__.update(kwargs) + + def as_dict(self): + return dict(self.__dict__) + + # For compatibility with named tuples. + _asdict = as_dict + + def __iter__(self): + for _, value in self.__dict__.items(): + yield value + + def __repr__(self): + return 'Row(%s)' % ', '.join('%s=%r' % kv for kv in self.__dict__.items()) + + def __hash__(self): + return hash(self.__dict__.items()) Review Comment:  The `__hash__` method will raise a `TypeError` because `dict.items()` returns a view object, which is not hashable in Python 3. To fix this, you should convert the items to a hashable, ordered type like a tuple. Given that field order is important for `Row` objects, converting to a tuple is the correct approach. ```suggestion return hash(tuple(self.__dict__.items())) ``` ########## sdks/python/apache_beam/typehints/row.py: ########## @@ -0,0 +1,74 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +from typing import NamedTuple + + +class Row(object): + """A dynamic schema'd row object. + + This objects attributes are initialized from the keywords passed into its + constructor, e.g. Row(x=3, y=4) will create a Row with two attributes x and y. + + More importantly, when a Row object is returned from a `Map`, `FlatMap`, or + `DoFn` type inference is able to deduce the schema of the resulting + PCollection, e.g. + + pc | beam.Map(lambda x: Row(x=x, y=0.5 * x)) + + when applied to a PCollection of ints will produce a PCollection with schema + `(x=int, y=float)`. + + Note that in Beam 2.30.0 and later, Row objects are sensitive to field order. + So `Row(x=3, y=4)` is not considered equal to `Row(y=4, x=3)`. + """ + def __init__(self, **kwargs): + self.__dict__.update(kwargs) + + def as_dict(self): + return dict(self.__dict__) + + # For compatibility with named tuples. + _asdict = as_dict + + def __iter__(self): + for _, value in self.__dict__.items(): + yield value + + def __repr__(self): + return 'Row(%s)' % ', '.join('%s=%r' % kv for kv in self.__dict__.items()) + + def __hash__(self): + return hash(self.__dict__.items()) + + def __eq__(self, other): + if type(self) == type(other): + other_dict = other.__dict__ + elif type(other) == type(NamedTuple): + other_dict = other._asdict() Review Comment:  The check `type(other) == type(NamedTuple)` is incorrect for identifying `NamedTuple` instances. `typing.NamedTuple` is a class factory, and instances of named tuples do not have `NamedTuple` as their type. A more robust way to check for namedtuple-like objects is to see if it's a tuple and has an `_asdict` method. ```suggestion elif isinstance(other, tuple) and hasattr(other, '_asdict'): other_dict = other._asdict() ``` ########## CHANGES.md: ########## @@ -74,6 +74,10 @@ * Support configuring Firestore database on ReadFn transforms (Java) ([#36904](https://github.com/apache/beam/issues/36904)). +* Moved `Row` to `apache_beam.typehints.row` to avoid import cycles and improve module organization. Kept a compatibility alias in `pvalue.py` (Python) ([#35095](https://github.com/apache/beam/issues/35095)). + + Review Comment:  These extra blank lines can be removed for better readability and to keep the changelog concise. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
