tvalentyn commented on code in PR #22949: URL: https://github.com/apache/beam/pull/22949#discussion_r961953253
########## website/www/site/content/en/documentation/sdks/python-machine-learning.md: ########## @@ -165,7 +165,82 @@ For detailed instructions explaining how to build and run a pipeline that uses M ## Beam Java SDK support -RunInference API is available to Beam Java SDK 2.41.0 and later through Apache Beam's [Multi-language Pipelines framework](https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines). Please see [here](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/RunInference.java) for the Java wrapper transform to use and please see [here](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/test/java/org/apache/beam/sdk/extensions/python/transforms/RunInferenceTransformTest.java) for some example pipelines. +The RunInference API is available with the Beam Java SDK versions 2.41.0 and later through Apache Beam's [Multi-language Pipelines framework](https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines). For information about the Java wrapper transform, see [RunInference.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/RunInference.java). For example pipelines, see [RunInferenceTransformTest.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/test/java/org/apache/beam/sdk/extensions/python/transforms/RunInferenceTransformTest.java). + +## TensorFlow support + +To use TensorFlow with the RunInference API, you need to do the following: + +* Use `tfx_bsl` version 1.10.0 or later. +* Create a model handler using `tfx_bsl.public.beam.run_inference.CreateModelHandler()`. +* Use the model handler with the [`apache_beam.ml.inference.base.RunInference`](/releases/pydoc/current/apache_beam.ml.inference.base.html) transform. 
+ +A sample pipeline might look like the following example: + +``` +import apache_beam as beam +from apache_beam.ml.inference.base import RunInference +from tensorflow_serving.apis import prediction_log_pb2 +from tfx_bsl.public.proto import model_spec_pb2 +from tfx_bsl.public.tfxio import TFExampleRecord +from tfx_bsl.public.beam.run_inference import CreateModelHandler + +pipeline = beam.Pipeline() +tfexample_beam_record = TFExampleRecord(file_pattern=predict_values_five_times_table) +saved_model_spec = model_spec_pb2.SavedModelSpec(model_path=save_model_dir_multiply) +inference_spec_type = model_spec_pb2.InferenceSpecType(saved_model_spec=saved_model_spec) +model_handler = CreateModelHandler(inference_spec_type) +with pipeline as p: + _ = (p | tfexample_beam_record.RawRecordBeamSource() + | RunInference(model_handler) + | beam.Map(print) + ) +``` + +First, within `tfx_bsl`, create a model handler. For more information, see [run_inference.py](https://github.com/tensorflow/tfx-bsl/blob/d1fca25e5eeaac9ef0111ec13e7634df836f36f6/tfx_bsl/public/beam/run_inference.py) in the TensorFlow GitHub repository. + +``` +tf_handler = CreateModelHandler(inference_spec_type) + +# unkeyed +RunInference(tf_handler) + +# keyed +RunInference(KeyedModelHandler(tf_handler)) +``` + +The model handler that is created from within `tfx-bsl` is always unkeyed. To make a keyed model handler, wrap the unkeyed model handler in the keyed model handler, which would then take the `tfx-bsl` model handler as a parameter. For example: + +``` +from apache_beam.ml.inference.base import RunInference +from apache_beam.ml.inference.base import KeyedModelHandler +RunInference(KeyedModelHandler(tf_handler)) +``` + +If you are unsure if your data is keyed, you can also use the `maybe_keyed` handler. + +Next, import the required modules: Review Comment: module imports come first, so this is out of place. Again, this was mentioned in the example, and somewhat self-evident. 
########## website/www/site/content/en/documentation/sdks/python-machine-learning.md: ########## @@ -165,7 +165,29 @@ For detailed instructions explaining how to build and run a pipeline that uses M ## Beam Java SDK support -RunInference API is available to Beam Java SDK 2.41.0 and later through Apache Beam's [Multi-language Pipelines framework](https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines). Please see [here](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/RunInference.java) for the Java wrapper transform to use and please see [here](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/test/java/org/apache/beam/sdk/extensions/python/transforms/RunInferenceTransformTest.java) for some example pipelines. +The RunInference API is available with the Beam Java SDK versions 2.41.0 and later through Apache Beam's [Multi-language Pipelines framework](https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines). For information about the Java wrapper transform, see [RunInference.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/RunInference.java). For example pipelines, see [RunInferenceTransformTest.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/test/java/org/apache/beam/sdk/extensions/python/transforms/RunInferenceTransformTest.java). + +## TensorFlow support Review Comment: > file_pattern=predict_values_five_times_table) `predict_values_five_times_table` and `save_model_dir_multiply` are not defined in this snippet, so it's somewhat confusing. 
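One possible way to address this, as a sketch only: define both names as placeholders near the top of the snippet, before they are passed to `TFExampleRecord` and `SavedModelSpec`. The path values below are hypothetical stand-ins; the PR does not say where the input records or the saved model actually live.

```python
# Hypothetical placeholder values: the PR does not state the real locations.

# File pattern for the input TFRecord files that hold the examples to score.
predict_values_five_times_table = "/path/to/input/five_times_table*.tfrecord"

# Directory that contains the exported TensorFlow SavedModel.
save_model_dir_multiply = "/path/to/saved_model_multiply"
```

With these two assignments in place, the sample pipeline no longer refers to undefined names, and readers can see which values they need to substitute.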
########## website/www/site/content/en/documentation/sdks/python-machine-learning.md: ########## @@ -165,7 +165,82 @@ For detailed instructions explaining how to build and run a pipeline that uses M ## Beam Java SDK support -RunInference API is available to Beam Java SDK 2.41.0 and later through Apache Beam's [Multi-language Pipelines framework](https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines). Please see [here](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/RunInference.java) for the Java wrapper transform to use and please see [here](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/test/java/org/apache/beam/sdk/extensions/python/transforms/RunInferenceTransformTest.java) for some example pipelines. +The RunInference API is available with the Beam Java SDK versions 2.41.0 and later through Apache Beam's [Multi-language Pipelines framework](https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines). For information about the Java wrapper transform, see [RunInference.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/RunInference.java). For example pipelines, see [RunInferenceTransformTest.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/test/java/org/apache/beam/sdk/extensions/python/transforms/RunInferenceTransformTest.java). + +## TensorFlow support + +To use TensorFlow with the RunInference API, you need to do the following: + +* Use `tfx_bsl` version 1.10.0 or later. +* Create a model handler using `tfx_bsl.public.beam.run_inference.CreateModelHandler()`. +* Use the model handler with the [`apache_beam.ml.inference.base.RunInference`](/releases/pydoc/current/apache_beam.ml.inference.base.html) transform. 
+ +A sample pipeline might look like the following example: + +``` +import apache_beam as beam +from apache_beam.ml.inference.base import RunInference +from tensorflow_serving.apis import prediction_log_pb2 +from tfx_bsl.public.proto import model_spec_pb2 +from tfx_bsl.public.tfxio import TFExampleRecord +from tfx_bsl.public.beam.run_inference import CreateModelHandler + +pipeline = beam.Pipeline() +tfexample_beam_record = TFExampleRecord(file_pattern=predict_values_five_times_table) +saved_model_spec = model_spec_pb2.SavedModelSpec(model_path=save_model_dir_multiply) +inference_spec_type = model_spec_pb2.InferenceSpecType(saved_model_spec=saved_model_spec) +model_handler = CreateModelHandler(inference_spec_type) +with pipeline as p: + _ = (p | tfexample_beam_record.RawRecordBeamSource() + | RunInference(model_handler) + | beam.Map(print) + ) +``` + +First, within `tfx_bsl`, create a model handler. For more information, see [run_inference.py](https://github.com/tensorflow/tfx-bsl/blob/d1fca25e5eeaac9ef0111ec13e7634df836f36f6/tfx_bsl/public/beam/run_inference.py) in the TensorFlow GitHub repository. + +``` +tf_handler = CreateModelHandler(inference_spec_type) + +# unkeyed +RunInference(tf_handler) + +# keyed +RunInference(KeyedModelHandler(tf_handler)) +``` + +The model handler that is created from within `tfx-bsl` is always unkeyed. To make a keyed model handler, wrap the unkeyed model handler in the keyed model handler, which would then take the `tfx-bsl` model handler as a parameter. 
For example: Review Comment: Optional: > is always unkeyed If there is description in beam docs re: keyed versus unkeyed, we could link it here ########## website/www/site/content/en/documentation/sdks/python-machine-learning.md: ########## @@ -165,7 +165,82 @@ For detailed instructions explaining how to build and run a pipeline that uses M ## Beam Java SDK support -RunInference API is available to Beam Java SDK 2.41.0 and later through Apache Beam's [Multi-language Pipelines framework](https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines). Please see [here](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/RunInference.java) for the Java wrapper transform to use and please see [here](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/test/java/org/apache/beam/sdk/extensions/python/transforms/RunInferenceTransformTest.java) for some example pipelines. +The RunInference API is available with the Beam Java SDK versions 2.41.0 and later through Apache Beam's [Multi-language Pipelines framework](https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines). For information about the Java wrapper transform, see [RunInference.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/RunInference.java). For example pipelines, see [RunInferenceTransformTest.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/test/java/org/apache/beam/sdk/extensions/python/transforms/RunInferenceTransformTest.java). + +## TensorFlow support + +To use TensorFlow with the RunInference API, you need to do the following: + +* Use `tfx_bsl` version 1.10.0 or later. +* Create a model handler using `tfx_bsl.public.beam.run_inference.CreateModelHandler()`. 
+* Use the model handler with the [`apache_beam.ml.inference.base.RunInference`](/releases/pydoc/current/apache_beam.ml.inference.base.html) transform. + +A sample pipeline might look like the following example: + +``` +import apache_beam as beam +from apache_beam.ml.inference.base import RunInference +from tensorflow_serving.apis import prediction_log_pb2 +from tfx_bsl.public.proto import model_spec_pb2 +from tfx_bsl.public.tfxio import TFExampleRecord +from tfx_bsl.public.beam.run_inference import CreateModelHandler + +pipeline = beam.Pipeline() +tfexample_beam_record = TFExampleRecord(file_pattern=predict_values_five_times_table) +saved_model_spec = model_spec_pb2.SavedModelSpec(model_path=save_model_dir_multiply) +inference_spec_type = model_spec_pb2.InferenceSpecType(saved_model_spec=saved_model_spec) +model_handler = CreateModelHandler(inference_spec_type) +with pipeline as p: + _ = (p | tfexample_beam_record.RawRecordBeamSource() + | RunInference(model_handler) + | beam.Map(print) + ) +``` + +First, within `tfx_bsl`, create a model handler. For more information, see [run_inference.py](https://github.com/tensorflow/tfx-bsl/blob/d1fca25e5eeaac9ef0111ec13e7634df836f36f6/tfx_bsl/public/beam/run_inference.py) in the TensorFlow GitHub repository. + +``` +tf_handler = CreateModelHandler(inference_spec_type) + +# unkeyed +RunInference(tf_handler) + +# keyed +RunInference(KeyedModelHandler(tf_handler)) +``` + +The model handler that is created from within `tfx-bsl` is always unkeyed. To make a keyed model handler, wrap the unkeyed model handler in the keyed model handler, which would then take the `tfx-bsl` model handler as a parameter. For example: Review Comment: ```suggestion The model handler that is created with `CreateModelHandler()` is always unkeyed. To make a keyed model handler, wrap the unkeyed model handler in the keyed model handler, which would then take the `tfx-bsl` model handler as a parameter. 
For example: ```
