gemini-code-assist[bot] commented on code in PR #37071:
URL: https://github.com/apache/beam/pull/37071#discussion_r2615104444


##########
sdks/python/apache_beam/typehints/row.py:
##########
@@ -0,0 +1,74 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from typing import NamedTuple
+
+
+class Row(object):
+  """A dynamic schema'd row object.
+
+  This objects attributes are initialized from the keywords passed into its
+  constructor, e.g. Row(x=3, y=4) will create a Row with two attributes x and 
y.
+
+  More importantly, when a Row object is returned from a `Map`, `FlatMap`, or
+  `DoFn` type inference is able to deduce the schema of the resulting
+  PCollection, e.g.
+
+      pc | beam.Map(lambda x: Row(x=x, y=0.5 * x))
+
+  when applied to a PCollection of ints will produce a PCollection with schema
+  `(x=int, y=float)`.
+
+  Note that in Beam 2.30.0 and later, Row objects are sensitive to field order.
+  So `Row(x=3, y=4)` is not considered equal to `Row(y=4, x=3)`.
+  """
+  def __init__(self, **kwargs):
+    self.__dict__.update(kwargs)
+
+  def as_dict(self):
+    return dict(self.__dict__)
+
+  # For compatibility with named tuples.
+  _asdict = as_dict
+
+  def __iter__(self):
+    for _, value in self.__dict__.items():
+      yield value
+
+  def __repr__(self):
+    return 'Row(%s)' % ', '.join('%s=%r' % kv for kv in self.__dict__.items())
+
+  def __hash__(self):
+    return hash(self.__dict__.items())

Review Comment:
   ![critical](https://www.gstatic.com/codereviewagent/critical.svg)
   
   The `__hash__` method will raise a `TypeError` because `dict.items()` 
returns a view object, which is not hashable in Python 3. To fix this, you 
should convert the items to a hashable, ordered type like a tuple. Given that 
field order is important for `Row` objects, converting to a tuple is the 
correct approach.
   
   ```suggestion
       return hash(tuple(self.__dict__.items()))
   ```



##########
sdks/python/apache_beam/typehints/row.py:
##########
@@ -0,0 +1,74 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from typing import NamedTuple
+
+
+class Row(object):
+  """A dynamic schema'd row object.
+
+  This objects attributes are initialized from the keywords passed into its
+  constructor, e.g. Row(x=3, y=4) will create a Row with two attributes x and 
y.
+
+  More importantly, when a Row object is returned from a `Map`, `FlatMap`, or
+  `DoFn` type inference is able to deduce the schema of the resulting
+  PCollection, e.g.
+
+      pc | beam.Map(lambda x: Row(x=x, y=0.5 * x))
+
+  when applied to a PCollection of ints will produce a PCollection with schema
+  `(x=int, y=float)`.
+
+  Note that in Beam 2.30.0 and later, Row objects are sensitive to field order.
+  So `Row(x=3, y=4)` is not considered equal to `Row(y=4, x=3)`.
+  """
+  def __init__(self, **kwargs):
+    self.__dict__.update(kwargs)
+
+  def as_dict(self):
+    return dict(self.__dict__)
+
+  # For compatibility with named tuples.
+  _asdict = as_dict
+
+  def __iter__(self):
+    for _, value in self.__dict__.items():
+      yield value
+
+  def __repr__(self):
+    return 'Row(%s)' % ', '.join('%s=%r' % kv for kv in self.__dict__.items())
+
+  def __hash__(self):
+    return hash(self.__dict__.items())
+
+  def __eq__(self, other):
+    if type(self) == type(other):
+      other_dict = other.__dict__
+    elif type(other) == type(NamedTuple):
+      other_dict = other._asdict()

Review Comment:
   ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   The check `type(other) == type(NamedTuple)` is incorrect for identifying 
`NamedTuple` instances. `typing.NamedTuple` is a class factory, and instances 
of named tuples do not have `NamedTuple` as their type. A more robust way to 
check for namedtuple-like objects is to see if it's a tuple and has an 
`_asdict` method.
   
   ```suggestion
       elif isinstance(other, tuple) and hasattr(other, '_asdict'):
         other_dict = other._asdict()
   ```



##########
CHANGES.md:
##########
@@ -74,6 +74,10 @@
 
 * Support configuring Firestore database on ReadFn transforms (Java) 
([#36904](https://github.com/apache/beam/issues/36904)).
 
+* Moved `Row` to `apache_beam.typehints.row` to avoid import cycles and 
improve module organization. Kept a compatibility alias in `pvalue.py` (Python) 
([#35095](https://github.com/apache/beam/issues/35095)).
+
+

Review Comment:
   ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   These extra blank lines can be removed for better readability and to keep 
the changelog concise.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to