gemini-code-assist[bot] commented on code in PR #39073:
URL: https://github.com/apache/beam/pull/39073#discussion_r3460763497


##########
sdks/python/apache_beam/typehints/trivial_inference_test.py:
##########
@@ -491,28 +489,15 @@ def testPyCallable(self):
         [int])
 
   def testDataClassFields(self):
-    @dataclasses.dataclass
-    class BaseClass:
-      pass
-
     @dataclasses.dataclass
     class MyDataClass:
       id: int
       name: str
-      tags: list[str]
-      custom: BaseClass
 
     self.assertReturnType(
-        typehints.Tuple[int, str, typehints.List[str], BaseClass],
-        python_callable.PythonCallableWithSource(
-            "lambda x: (x.id, x.name, x.tags, x.custom)"), [MyDataClass])
-
-    options = PipelineOptions(['--exclude_infer_dataclass_field_type'])
-    with scoped_pipeline_options(options):
-      self.assertReturnType(
-          typehints.Tuple[int, str, typehints.Any, typehints.Any],
-          python_callable.PythonCallableWithSource(
-              "lambda x: (x.id, x.name, x.tags, x.custom)"), [MyDataClass])
+        typehints.Tuple[int, str],
+        python_callable.PythonCallableWithSource("lambda x: (x.id, x.name)"),
+        [MyDataClass])

Review Comment:
   ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   If we retain type normalization for dataclass fields in `opcodes.py`, we can 
also keep the test coverage for complex dataclass field types (such as lists 
and custom classes) while still omitting the pipeline options test.
   
   ```suggestion
       @dataclasses.dataclass
       class BaseClass:
         pass
   
       @dataclasses.dataclass
       class MyDataClass:
         id: int
         name: str
         tags: list[str]
         custom: BaseClass
   
       self.assertReturnType(
           typehints.Tuple[int, str, typehints.List[str], BaseClass],
           python_callable.PythonCallableWithSource(
               "lambda x: (x.id, x.name, x.tags, x.custom)"), [MyDataClass])
   ```



##########
CHANGES.md:
##########
@@ -79,6 +79,11 @@
 
 ## Breaking Changes
 
+* (Python) Typehints of dataclass fields are honored during type inferences. 
To restore the behavior of fallback-to-any,
+  use pipeline option `--exclude_infer_dataclass_field_type` 
([#38797](https://github.com/apache/beam/issues/38797)).
+  However fixing forward is recommended.
+||||||| dd3a4549d8 (Normalize types in dataclass field type resolving (#38797))
+* X behavior was changed ([#X](https://github.com/apache/beam/issues/X)).

Review Comment:
   ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   Unresolved merge conflict markers and duplicate entries were accidentally 
committed here. Since this PR reverts the dataclass field type inference 
feature, please remove this entire breaking change entry (including the 
duplicate lines 87-89).



##########
sdks/python/apache_beam/metrics/metric.py:
##########
@@ -234,7 +234,7 @@ def __init__(self, metric_name: MetricName) -> None:
       self.add = MetricUpdater(cells.StringSetCell, metric_name)  # type: 
ignore[method-assign]
 
   class DelegatingBoundedTrie(BoundedTrie):
-    """Metrics BoundedTrie that Delegates functionality to 
MetricsEnvironment."""
+    """Metrics StringSet that Delegates functionality to MetricsEnvironment."""

Review Comment:
   ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   This docstring change is incorrect. `DelegatingBoundedTrie` delegates 
`BoundedTrie` functionality, not `StringSet`.
   
   ```suggestion
       """Metrics BoundedTrie that Delegates functionality to 
MetricsEnvironment."""
   ```



##########
sdks/python/apache_beam/typehints/opcodes.py:
##########
@@ -452,9 +451,8 @@ def _getattr(o, name):
   elif inspect.isclass(o) and dataclasses.is_dataclass(o):
     field = o.__dataclass_fields__.get(name)
     if field is not None:
-      return resolve_dataclass_field_type(field.type)
+      return field.type

Review Comment:
   ![high](https://www.gstatic.com/codereviewagent/high-priority.svg)
   
   Returning `field.type` directly without normalization will cause type 
inference to fail or behave unexpectedly for complex types (e.g., `list[str]`, 
`dict[str, int]`). We should still normalize the type using 
`typehints.normalize` to support complex dataclass field types, while safely 
avoiding the problematic pipeline options check that caused the PostCommit 
failure.
   
   ```suggestion
         from apache_beam.typehints import typehints
         return typehints.normalize(field.type)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to