gemini-code-assist[bot] commented on code in PR #35884:
URL: https://github.com/apache/beam/pull/35884#discussion_r2280345297


##########
sdks/python/apache_beam/ml/anomaly/detectors/pyod_adapter.py:
##########
@@ -77,7 +77,13 @@ def run_inference(
   ) -> Iterable[PredictionResult]:
     np_batch = []
     for row in batch:
-      np_batch.append(np.fromiter(row, dtype=np.float64))
+      features = []
+      for value in row:
+        if isinstance(value, (list, tuple, np.ndarray)):
+          features.extend(value)
+        else:
+          features.append(value)
+      np_batch.append(np.array(features, dtype=np.float64))

Review Comment:
   ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   The current implementation is correct, but it can be made more efficient and concise.
   It builds an intermediate `features` list for each row and then passes it to `np.array()`, which can be slower than `np.fromiter()` for creating 1D arrays from iterables.
   
   A more performant approach is to flatten the features on the fly with a generator function and pass that generator to `np.fromiter`. This avoids creating the intermediate list for each row.
   
   ```python
       def _flatten_row(row_values):
         # Yield scalars unchanged; expand list/tuple/ndarray values one level.
         for value in row_values:
           if isinstance(value, (list, tuple, np.ndarray)):
             yield from value
           else:
             yield value

       np_batch = [
           np.fromiter(_flatten_row(row), dtype=np.float64) for row in batch
       ]
   ```
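
To check that the suggestion preserves behavior, the two approaches can be compared side by side. The sketch below is illustrative only: `flatten_with_list`, `flatten_with_generator`, and `sample_batch` are hypothetical names and data that are not part of the PR, and the rows are assumed to hold scalars or one-level sequences of numbers.

```python
import numpy as np


def flatten_with_list(row):
  """Mirrors the approach in the PR: collect values in a list, then convert."""
  features = []
  for value in row:
    if isinstance(value, (list, tuple, np.ndarray)):
      features.extend(value)
    else:
      features.append(value)
  return np.array(features, dtype=np.float64)


def _flatten_row(row_values):
  """Generator from the suggestion: yields the flattened values lazily."""
  for value in row_values:
    if isinstance(value, (list, tuple, np.ndarray)):
      yield from value
    else:
      yield value


def flatten_with_generator(row):
  """Mirrors the suggestion: feed the generator straight into np.fromiter."""
  return np.fromiter(_flatten_row(row), dtype=np.float64)


# Hypothetical sample rows mixing scalars, lists, tuples, and 1-D arrays.
sample_batch = [
    (1.0, [2.0, 3.0], np.array([4.0, 5.0])),
    (6, (7, 8), 9, 10),
]

list_result = [flatten_with_list(row) for row in sample_batch]
gen_result = [flatten_with_generator(row) for row in sample_batch]

# Both approaches should produce identical float64 arrays for each row.
for a, b in zip(list_result, gen_result):
  assert np.array_equal(a, b)
```

Both variants flatten only one level of nesting, so a nested list or a multi-dimensional `np.ndarray` inside a row is not fully flattened by either. Whether the generator version is measurably faster likely depends on row width, so a quick benchmark on representative data would be worth running before relying on the performance claim.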



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
