gemini-code-assist[bot] commented on code in PR #37585:
URL: https://github.com/apache/beam/pull/37585#discussion_r2801359749


##########
sdks/python/apache_beam/yaml/yaml_provider.py:
##########
@@ -878,6 +878,22 @@ def create(elements: Iterable[Any], reshuffle: Optional[bool] = True):
     if not isinstance(elements, Iterable) or isinstance(elements, (dict, str)):
       raise TypeError('elements must be a list of elements')
 
+    if elements:
+      # Normalize elements to be all dicts or all primitives.
+      # If we have a mix, we want to treat them all as dicts for the purpose
+      # of schema inference (so we can have a schema like
+      # Row(element=..., other_field=...)).
+      # Note that we don't want to change the elements themselves if they
+      # are already all dicts or all primitives, as that would change the
+      # resulting schema (e.g. from int to Row(element=int)).
+      is_dict = [isinstance(e, dict) for e in elements]
+      if not all(is_dict) and any(is_dict):
+        elements = [
+            e if isinstance(e, dict) else {
+                'element': e
+            } for e in elements
+        ]

Review Comment:
   ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   The mixed-type check builds an intermediate list, `is_dict`, holding one
boolean per element, which costs extra memory when `elements` is large. A
single pass that tracks whether a dict and a non-dict have each been seen
avoids that allocation and can stop as soon as a mix is detected.
   
   ```python
         has_dict = False
         has_non_dict = False
         for e in elements:
           if isinstance(e, dict):
             has_dict = True
           else:
             has_non_dict = True
           if has_dict and has_non_dict:
             break
   
         if has_dict and has_non_dict:
           elements = [
               e if isinstance(e, dict) else {
                   'element': e
               } for e in elements
           ]
   ```
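
For checking the behavior outside the PR, the same single-pass logic can be sketched as a standalone helper. This is only an illustration: the name `normalize_mixed_elements` is hypothetical and not part of `yaml_provider.py`, and it assumes `elements` is a re-iterable sequence such as a list.

```python
from typing import Any, List, Sequence


def normalize_mixed_elements(elements: Sequence[Any]) -> List[Any]:
  """Wraps non-dicts as {'element': e}, but only for mixed input.

  Hypothetical helper for illustration; assumes `elements` can be
  iterated more than once (e.g. a list).
  """
  has_dict = False
  has_non_dict = False
  for e in elements:
    if isinstance(e, dict):
      has_dict = True
    else:
      has_non_dict = True
    if has_dict and has_non_dict:
      # A mix has been found; no need to scan the rest.
      break

  if not (has_dict and has_non_dict):
    # All dicts, all primitives, or empty: leave elements unchanged.
    return list(elements)
  return [e if isinstance(e, dict) else {'element': e} for e in elements]


print(normalize_mixed_elements([1, {'a': 2}]))
```

Homogeneous input is returned unchanged, so the inferred schema stays `int` rather than becoming `Row(element=int)`, as the comment in the diff intends.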



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]