claudevdm commented on code in PR #34709:
URL: https://github.com/apache/beam/pull/34709#discussion_r2054744084


##########
sdks/python/apache_beam/ml/anomaly/detectors/pyod_adapter.py:
##########
@@ -0,0 +1,104 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+import pickle
+from typing import Any
+from typing import Dict
+from typing import Iterable
+from typing import Optional
+from typing import Sequence
+
+import numpy as np
+from pyod.models.base import BaseDetector as PyODBaseDetector
+
+import apache_beam as beam
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.ml.anomaly.detectors.offline import OfflineDetector
+from apache_beam.ml.anomaly.specifiable import specifiable
+from apache_beam.ml.anomaly.thresholds import FixedThreshold
+from apache_beam.ml.inference.base import KeyedModelHandler
+from apache_beam.ml.inference.base import ModelHandler
+from apache_beam.ml.inference.base import PredictionResult
+from apache_beam.ml.inference.base import _PostProcessingModelHandler

Review Comment:
   Should we remove underscores from these (e.g. _PostProcessingModelHandler -> 
PostProcessingModelHandler) since they are being imported? 



##########
sdks/python/apache_beam/ml/anomaly/transforms.py:
##########
@@ -454,6 +454,13 @@ def _restore_and_convert(
         ])
     return orig_key, (temp_key, result)
 
+  def _select_features(self, elem: Tuple[Any,
+                                         beam.Row]) -> Tuple[Any, beam.Row]:
+    assert self._offline_detector._features is not None
+    k, v = elem
+    row_dict = v._asdict()
+    return k, beam.Row(**{k: row_dict[k] for k in 
self._offline_detector._features})  # pylint: disable=line-too-long

Review Comment:
   Any reason why we cant split this to two lines? Also here it looks a bit 
strange to access _features (with underscore) on _offline_detector.



##########
sdks/python/apache_beam/ml/anomaly/detectors/pyod_adapter_test.py:
##########
@@ -0,0 +1,160 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+import logging
+import os.path
+import pickle
+import shutil
+import tempfile
+import unittest
+
+import numpy as np
+from parameterized import parameterized
+
+import apache_beam as beam
+from apache_beam.ml.anomaly.base import AnomalyPrediction
+from apache_beam.ml.anomaly.base import AnomalyResult
+from apache_beam.ml.anomaly.transforms import AnomalyDetection
+from apache_beam.ml.anomaly.transforms_test import _keyed_result_is_equal_to
+from apache_beam.options.pipeline_options import PipelineOptions
+from apache_beam.testing.util import assert_that
+from apache_beam.testing.util import equal_to
+
+# Protect against environments where onnx and pytorch library is not available.
+# pylint: disable=wrong-import-order, wrong-import-position, ungrouped-imports
+try:
+  import pyod
+  from apache_beam.ml.anomaly.detectors.pyod_adapter import PyODFactory
+  from pyod.models.iforest import IForest
+except ImportError:
+  raise unittest.SkipTest('PyOD dependencies are not installed')
+
+
+class PyODIForestTest(unittest.TestCase):
+  def setUp(self) -> None:
+    self.tmp_dir = tempfile.mkdtemp()
+
+    seed = 1234
+    model = IForest(random_state=seed)
+    model.fit(self.get_train_data())
+    self.tmp_fn = os.path.join(self.tmp_dir, 'iforest_pickled')

Review Comment:
   Can we rename tmp_fn pickled_model_uri?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to