[GitHub] [beam] ryanthompson591 commented on a diff in pull request #17800: [BEAM-14535] Added support for pandas in sklearn inference runner

GitBox Thu, 09 Jun 2022 07:45:53 -0700


ryanthompson591 commented on code in PR #17800:
URL: https://github.com/apache/beam/pull/17800#discussion_r893599417



##########
sdks/python/apache_beam/ml/inference/sklearn_inference.py:
##########
@@ -41,20 +43,57 @@ class ModelFileType(enum.Enum):
   JOBLIB = 2
 
 
-class SklearnInferenceRunner(InferenceRunner[numpy.ndarray,
+class SklearnInferenceRunner(InferenceRunner[Union[numpy.ndarray,
+                                                   pandas.DataFrame],
                                              PredictionResult,
                                              BaseEstimator]):
   def run_inference(
-      self, batch: List[numpy.ndarray], model: BaseEstimator,
+      self,
+      batch: List[Union[numpy.ndarray, pandas.DataFrame]],
+      model: BaseEstimator,
       **kwargs) -> Iterable[PredictionResult]:
+    if isinstance(batch[0], numpy.ndarray):
+      return SklearnInferenceRunner._predict_np_array(batch, model)
+    elif isinstance(batch[0], pandas.DataFrame):
+      return SklearnInferenceRunner._predict_pandas_dataframe(batch, model)

Review Comment:
   There was a conversation going on with heejong and the xlang folks, as 
well as a separate one with the TFMA group, about setting the input type 
correctly.
   
   I think that sort of feature and a set_input_type would go well hand in 
hand.
   
   Instead of implementing that in this PR, I'm thinking about making a 
cleaner set_input_type PR that makes this more generic.
   
   Let me leave an issue for it here:
   https://github.com/apache/beam/issues/21769
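
   To make the idea concrete, here is a rough sketch of what choosing the 
input type up front could look like, instead of checking 
isinstance(batch[0], ...) on every call. Everything in it is an assumption 
for illustration: TypedRunnerSketch, register, the shape of set_input_type, 
and RowSumModel are hypothetical names, not Beam or sklearn APIs, and plain 
list and dict stand in for numpy.ndarray and pandas.DataFrame so the snippet 
runs without extra dependencies.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional

# Hypothetical handler signature: (batch, model) -> predictions.
Handler = Callable[[List[Any], Any], Iterable[Any]]


class TypedRunnerSketch:
  """Illustrative only; this is not Beam's InferenceRunner API."""
  def __init__(self) -> None:
    self._handlers: Dict[type, Handler] = {}
    self._input_type: Optional[type] = None

  def register(self, input_type: type, handler: Handler) -> None:
    # Associate one prediction path with one element type.
    self._handlers[input_type] = handler

  def set_input_type(self, input_type: type) -> None:
    # Hypothetical: pick the element type once, before any batch arrives.
    if input_type not in self._handlers:
      raise TypeError(f'No handler registered for {input_type.__name__}')
    self._input_type = input_type

  def run_inference(self, batch: List[Any], model: Any) -> Iterable[Any]:
    # No per-batch isinstance check, so an empty batch is not a special case.
    if self._input_type is None:
      raise ValueError('Call set_input_type() before run_inference().')
    return self._handlers[self._input_type](batch, model)


class RowSumModel:
  """Stand-in for an sklearn estimator: predict() returns each row's sum."""
  def predict(self, rows):
    return [sum(row) for row in rows]


def predict_rows(batch, model):
  # Stand-in for the numpy.ndarray path: elements are already rows.
  return model.predict(batch)


def predict_records(batch, model):
  # Stand-in for the pandas.DataFrame path: order columns by name first.
  rows = [[record[key] for key in sorted(record)] for record in batch]
  return model.predict(rows)


runner = TypedRunnerSketch()
runner.register(list, predict_rows)
runner.register(dict, predict_records)

runner.set_input_type(dict)
predictions = list(
    runner.run_inference([{'a': 1, 'b': 2}, {'a': 3, 'b': 4}], RowSumModel()))
```

   Whether the real change should register handlers like this, or carry the 
type on the model loader instead, is exactly what the follow-up issue would 
need to settle.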



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
