gemini-code-assist[bot] commented on code in PR #35884:
URL: https://github.com/apache/beam/pull/35884#discussion_r2280345297
##########
sdks/python/apache_beam/ml/anomaly/detectors/pyod_adapter.py:
##########
@@ -77,7 +77,13 @@ def run_inference(
   ) -> Iterable[PredictionResult]:
     np_batch = []
     for row in batch:
-      np_batch.append(np.fromiter(row, dtype=np.float64))
+      features = []
+      for value in row:
+        if isinstance(value, (list, tuple, np.ndarray)):
+          features.extend(value)
+        else:
+          features.append(value)
+      np_batch.append(np.array(features, dtype=np.float64))
Review Comment:

The current implementation is correct, but it can be made more efficient and
concise.
It builds an intermediate list `features` for each row and then calls
`np.array()`, which can be slower than `np.fromiter()` for creating 1D arrays
from iterables.
A more performant approach is to flatten the features on the fly with a
generator function and pass that generator to `np.fromiter`. This avoids
creating the intermediate list for each row.
```python
def _flatten_row(row_values):
  for value in row_values:
    if isinstance(value, (list, tuple, np.ndarray)):
      yield from value
    else:
      yield value

np_batch = [
    np.fromiter(_flatten_row(row), dtype=np.float64) for row in batch
]
```
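
To sanity-check that the generator version is a drop-in replacement, here is a minimal standalone sketch comparing the two approaches. The names `flatten_row` and `to_array_via_list` and the sample `rows` are hypothetical and not part of the PR; the sketch assumes only that each row mixes scalars with flat list, tuple, or 1D `np.ndarray` fields.

```python
import numpy as np


def flatten_row(row_values):
  """Yield scalars from a row, expanding list/tuple/ndarray fields one level."""
  for value in row_values:
    if isinstance(value, (list, tuple, np.ndarray)):
      yield from value
    else:
      yield value


def to_array_via_list(row_values):
  """Mirror of the approach in the diff: build a list, then call np.array."""
  features = []
  for value in row_values:
    if isinstance(value, (list, tuple, np.ndarray)):
      features.extend(value)
    else:
      features.append(value)
  return np.array(features, dtype=np.float64)


# Hypothetical rows mixing scalars with vector-valued fields.
rows = [
    (1.0, [2.0, 3.0], 4),
    (5, np.array([6.0, 7.0]), 8.0),
]

via_list = [to_array_via_list(row) for row in rows]
via_generator = [
    np.fromiter(flatten_row(row), dtype=np.float64) for row in rows
]

# Both approaches should produce identical 1D float64 arrays.
for a, b in zip(via_list, via_generator):
  assert a.shape == (4,)
  assert np.array_equal(a, b)
```

Both versions expand only one level, so they agree on flat fields like the ones above. Whether the generator version is actually faster likely depends on row size and NumPy version, so it may be worth timing on representative data before switching.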