gemini-code-assist[bot] commented on code in PR #37585:
URL: https://github.com/apache/beam/pull/37585#discussion_r2801359749
##########
sdks/python/apache_beam/yaml/yaml_provider.py:
##########
@@ -878,6 +878,22 @@ def create(elements: Iterable[Any], reshuffle: Optional[bool] = True):
     if not isinstance(elements, Iterable) or isinstance(elements, (dict, str)):
       raise TypeError('elements must be a list of elements')
+    if elements:
+      # Normalize elements to be all dicts or all primitives.
+      # If we have a mix, we want to treat them all as dicts for the purpose
+      # of schema inference (so we can have a schema like
+      # Row(element=..., other_field=...)).
+      # Note that we don't want to change the elements themselves if they
+      # are already all dicts or all primitives, as that would change the
+      # resulting schema (e.g. from int to Row(element=int)).
+      is_dict = [isinstance(e, dict) for e in elements]
+      if not all(is_dict) and any(is_dict):
+        elements = [
+            e if isinstance(e, dict) else {
+                'element': e
+            } for e in elements
+        ]
Review Comment:

The current implementation for checking mixed types builds an intermediate
list `is_dict` holding one boolean per element, which costs extra memory when
there are many elements, and `all(is_dict)` and `any(is_dict)` then scan that
list again. A more efficient approach is to iterate once to detect whether
there is a mix of types, without creating the intermediate list. The loop can
also short-circuit as soon as a mix is detected.
```python
has_dict = False
has_non_dict = False
for e in elements:
  if isinstance(e, dict):
    has_dict = True
  else:
    has_non_dict = True
  # Stop scanning as soon as both kinds have been seen.
  if has_dict and has_non_dict:
    break
if has_dict and has_non_dict:
  # Wrap bare values so every element is a dict for schema inference.
  elements = [
      e if isinstance(e, dict) else {
          'element': e
      } for e in elements
  ]
```
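For comparison, the same check can probably be written more compactly with
generator expressions, which also avoid the intermediate list because `any()`
stops at the first match. This is only a sketch: the helper name
`normalize_mixed_elements` is hypothetical and not part of the PR, and it
assumes `elements` can be iterated more than once (a list or tuple, not a
one-shot generator).

```python
from typing import Any, List, Sequence


def normalize_mixed_elements(elements: Sequence[Any]) -> List[Any]:
  """Hypothetical helper: wrap bare values only when dicts and non-dicts mix."""
  # Generator expressions let any() stop at the first match, so no
  # per-element list of booleans is built. Each call is a separate pass,
  # which is why `elements` must be re-iterable.
  has_dict = any(isinstance(e, dict) for e in elements)
  has_non_dict = any(not isinstance(e, dict) for e in elements)
  if has_dict and has_non_dict:
    return [e if isinstance(e, dict) else {'element': e} for e in elements]
  # All dicts, all non-dicts, or empty: leave the values unchanged.
  return list(elements)
```

With a mixed input such as `[1, {'a': 2}]`, only the bare value is wrapped;
inputs that are all dicts or all primitives are returned unchanged, which
matches the intent described in the diff comments.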