[GitHub] [beam] tvalentyn commented on a diff in pull request #22949: Adding TensorFlow support to the Machine Learning overview page

GitBox Fri, 02 Sep 2022 12:31:16 -0700


tvalentyn commented on code in PR #22949:
URL: https://github.com/apache/beam/pull/22949#discussion_r961953253



##########
website/www/site/content/en/documentation/sdks/python-machine-learning.md:
##########
@@ -165,7 +165,82 @@ For detailed instructions explaining how to build and run 
a pipeline that uses M
 
 ## Beam Java SDK support
 
-RunInference API is available to Beam Java SDK 2.41.0 and later through Apache 
Beam's [Multi-language Pipelines 
framework](https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines).
 Please see 
[here](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/RunInference.java)
 for the Java wrapper transform to use and please see 
[here](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/test/java/org/apache/beam/sdk/extensions/python/transforms/RunInferenceTransformTest.java)
 for some example pipelines.
+The RunInference API is available with the Beam Java SDK versions 2.41.0 and 
later through Apache Beam's [Multi-language Pipelines 
framework](https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines).
 For information about the Java wrapper transform, see 
[RunInference.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/RunInference.java).
 For example pipelines, see 
[RunInferenceTransformTest.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/test/java/org/apache/beam/sdk/extensions/python/transforms/RunInferenceTransformTest.java).
+
+## TensorFlow support
+
+To use TensorFlow with the RunInference API, you need to do the following:
+
+* Use `tfx_bsl` version 1.10.0 or later.
+* Create a model handler using 
`tfx_bsl.public.beam.run_inference.CreateModelHandler()`.
+* Use the model handler with the 
[`apache_beam.ml.inference.base.RunInference`](/releases/pydoc/current/apache_beam.ml.inference.base.html)
 transform.
+
+A sample pipeline might look like the following example:
+
+```
+import apache_beam as beam
+from apache_beam.ml.inference.base import RunInference
+from tensorflow_serving.apis import prediction_log_pb2
+from tfx_bsl.public.proto import model_spec_pb2
+from tfx_bsl.public.tfxio import TFExampleRecord
+from tfx_bsl.public.beam.run_inference import CreateModelHandler
+
+pipeline = beam.Pipeline()
+tfexample_beam_record = TFExampleRecord(file_pattern=predict_values_five_times_table)
+saved_model_spec = model_spec_pb2.SavedModelSpec(model_path=save_model_dir_multiply)
+inference_spec_type = model_spec_pb2.InferenceSpecType(saved_model_spec=saved_model_spec)
+model_handler = CreateModelHandler(inference_spec_type)
+with pipeline as p:
+    _ = (p | tfexample_beam_record.RawRecordBeamSource()
+           | RunInference(model_handler)
+           | beam.Map(print)
+        )
+```
+
+First, within `tfx_bsl`, create a model handler. For more information, see 
[run_inference.py](https://github.com/tensorflow/tfx-bsl/blob/d1fca25e5eeaac9ef0111ec13e7634df836f36f6/tfx_bsl/public/beam/run_inference.py)
 in the TensorFlow GitHub repository.
+
+```
+tf_handler = CreateModelHandler(inference_spec_type)
+
+# unkeyed
+RunInference(tf_handler)
+
+# keyed
+RunInference(KeyedModelHandler(tf_handler))
+```
+
+The model handler that is created from within `tfx-bsl` is always unkeyed. To 
make a keyed model handler, wrap the unkeyed model handler in the keyed model 
handler, which would then take the `tfx-bsl` model handler as a parameter. For 
example:
+
+```
+from apache_beam.ml.inference.base import RunInference
+from apache_beam.ml.inference.base import KeyedModelHandler
+RunInference(KeyedModelHandler(tf_handler))
+```
+
+If you are unsure if your data is keyed, you can also use the `maybe_keyed` 
handler.
+
+Next, import the required modules:

Review Comment:
   Module imports come first, so this step is out of place. Again, the imports 
were already shown in the example, and they are somewhat self-evident.



##########
website/www/site/content/en/documentation/sdks/python-machine-learning.md:
##########
@@ -165,7 +165,29 @@ For detailed instructions explaining how to build and run 
a pipeline that uses M
 
 ## Beam Java SDK support
 
-RunInference API is available to Beam Java SDK 2.41.0 and later through Apache 
Beam's [Multi-language Pipelines 
framework](https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines).
 Please see 
[here](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/RunInference.java)
 for the Java wrapper transform to use and please see 
[here](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/test/java/org/apache/beam/sdk/extensions/python/transforms/RunInferenceTransformTest.java)
 for some example pipelines.
+The RunInference API is available with the Beam Java SDK versions 2.41.0 and 
later through Apache Beam's [Multi-language Pipelines 
framework](https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines).
 For information about the Java wrapper transform, see 
[RunInference.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/RunInference.java).
 For example pipelines, see 
[RunInferenceTransformTest.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/test/java/org/apache/beam/sdk/extensions/python/transforms/RunInferenceTransformTest.java).
+
+## TensorFlow support

Review Comment:
   > file_pattern=predict_values_five_times_table)
   
   `predict_values_five_times_table` and `save_model_dir_multiply` are not 
defined in this snippet, so it's somewhat confusing.
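
One way to address this would be to define both names before they are used. A minimal sketch, assuming placeholder locations: both paths below are hypothetical and are not taken from the PR.

```python
# Hypothetical placeholder values; the PR does not say where the data or the
# model actually live.

# File pattern for the TFRecord file(s) holding the serialized tf.Example inputs.
predict_values_five_times_table = "/tmp/five_times_table/predict.tfrecord"

# Directory containing the exported TensorFlow SavedModel.
save_model_dir_multiply = "/tmp/saved_models/multiply_five"
```

With these two assignments placed above the `TFExampleRecord(...)` and `SavedModelSpec(...)` calls, the sample no longer refers to undefined names.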



##########
website/www/site/content/en/documentation/sdks/python-machine-learning.md:
##########
@@ -165,7 +165,82 @@ For detailed instructions explaining how to build and run 
a pipeline that uses M
 
 ## Beam Java SDK support
 
-RunInference API is available to Beam Java SDK 2.41.0 and later through Apache 
Beam's [Multi-language Pipelines 
framework](https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines).
 Please see 
[here](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/RunInference.java)
 for the Java wrapper transform to use and please see 
[here](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/test/java/org/apache/beam/sdk/extensions/python/transforms/RunInferenceTransformTest.java)
 for some example pipelines.
+The RunInference API is available with the Beam Java SDK versions 2.41.0 and 
later through Apache Beam's [Multi-language Pipelines 
framework](https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines).
 For information about the Java wrapper transform, see 
[RunInference.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/RunInference.java).
 For example pipelines, see 
[RunInferenceTransformTest.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/test/java/org/apache/beam/sdk/extensions/python/transforms/RunInferenceTransformTest.java).
+
+## TensorFlow support
+
+To use TensorFlow with the RunInference API, you need to do the following:
+
+* Use `tfx_bsl` version 1.10.0 or later.
+* Create a model handler using 
`tfx_bsl.public.beam.run_inference.CreateModelHandler()`.
+* Use the model handler with the 
[`apache_beam.ml.inference.base.RunInference`](/releases/pydoc/current/apache_beam.ml.inference.base.html)
 transform.
+
+A sample pipeline might look like the following example:
+
+```
+import apache_beam as beam
+from apache_beam.ml.inference.base import RunInference
+from tensorflow_serving.apis import prediction_log_pb2
+from tfx_bsl.public.proto import model_spec_pb2
+from tfx_bsl.public.tfxio import TFExampleRecord
+from tfx_bsl.public.beam.run_inference import CreateModelHandler
+
+pipeline = beam.Pipeline()
+tfexample_beam_record = TFExampleRecord(file_pattern=predict_values_five_times_table)
+saved_model_spec = model_spec_pb2.SavedModelSpec(model_path=save_model_dir_multiply)
+inference_spec_type = model_spec_pb2.InferenceSpecType(saved_model_spec=saved_model_spec)
+model_handler = CreateModelHandler(inference_spec_type)
+with pipeline as p:
+    _ = (p | tfexample_beam_record.RawRecordBeamSource()
+           | RunInference(model_handler)
+           | beam.Map(print)
+        )
+```
+
+First, within `tfx_bsl`, create a model handler. For more information, see 
[run_inference.py](https://github.com/tensorflow/tfx-bsl/blob/d1fca25e5eeaac9ef0111ec13e7634df836f36f6/tfx_bsl/public/beam/run_inference.py)
 in the TensorFlow GitHub repository.
+
+```
+tf_handler = CreateModelHandler(inference_spec_type)
+
+# unkeyed
+RunInference(tf_handler)
+
+# keyed
+RunInference(KeyedModelHandler(tf_handler))
+```
+
+The model handler that is created from within `tfx-bsl` is always unkeyed. To 
make a keyed model handler, wrap the unkeyed model handler in the keyed model 
handler, which would then take the `tfx-bsl` model handler as a parameter. For 
example:

Review Comment:
   Optional:
   
   > is always unkeyed
   
   If the Beam docs already describe keyed versus unkeyed model handlers, we 
could link to that description here.
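
To illustrate the distinction for readers of this thread, here is a plain-Python sketch of the idea. It is not Beam's implementation and does not use the Beam API. `unkeyed_infer` and `make_keyed` are hypothetical names, and the multiply-by-five "model" is an assumption suggested by the variable names in the sample.

```python
from typing import Callable, Iterable, List, Tuple


def unkeyed_infer(batch: Iterable[float]) -> List[float]:
    """Stand-in for an unkeyed handler: bare examples in, predictions out."""
    return [x * 5 for x in batch]


def make_keyed(
    infer: Callable[[Iterable[float]], List[float]]
) -> Callable[[Iterable[Tuple[str, float]]], List[Tuple[str, float]]]:
    """Wrap an unkeyed inference function so it accepts (key, example) pairs."""

    def keyed_infer(batch: Iterable[Tuple[str, float]]) -> List[Tuple[str, float]]:
        pairs = list(batch)
        keys = [key for key, _ in pairs]
        # Strip the keys, run the unkeyed inference, then re-attach the keys.
        predictions = infer([example for _, example in pairs])
        return list(zip(keys, predictions))

    return keyed_infer


keyed_infer = make_keyed(unkeyed_infer)

unkeyed_result = unkeyed_infer([1.0, 2.0])
keyed_result = keyed_infer([("a", 1.0), ("b", 2.0)])
```

The wrapper only carries the key alongside each example so that every prediction can be matched back to its input. That is the role the diff gives to wrapping the `tfx-bsl` handler in `KeyedModelHandler`.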



##########
website/www/site/content/en/documentation/sdks/python-machine-learning.md:
##########
@@ -165,7 +165,82 @@ For detailed instructions explaining how to build and run 
a pipeline that uses M
 
 ## Beam Java SDK support
 
-RunInference API is available to Beam Java SDK 2.41.0 and later through Apache 
Beam's [Multi-language Pipelines 
framework](https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines).
 Please see 
[here](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/RunInference.java)
 for the Java wrapper transform to use and please see 
[here](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/test/java/org/apache/beam/sdk/extensions/python/transforms/RunInferenceTransformTest.java)
 for some example pipelines.
+The RunInference API is available with the Beam Java SDK versions 2.41.0 and 
later through Apache Beam's [Multi-language Pipelines 
framework](https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines).
 For information about the Java wrapper transform, see 
[RunInference.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/RunInference.java).
 For example pipelines, see 
[RunInferenceTransformTest.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/test/java/org/apache/beam/sdk/extensions/python/transforms/RunInferenceTransformTest.java).
+
+## TensorFlow support
+
+To use TensorFlow with the RunInference API, you need to do the following:
+
+* Use `tfx_bsl` version 1.10.0 or later.
+* Create a model handler using 
`tfx_bsl.public.beam.run_inference.CreateModelHandler()`.
+* Use the model handler with the 
[`apache_beam.ml.inference.base.RunInference`](/releases/pydoc/current/apache_beam.ml.inference.base.html)
 transform.
+
+A sample pipeline might look like the following example:
+
+```
+import apache_beam as beam
+from apache_beam.ml.inference.base import RunInference
+from tensorflow_serving.apis import prediction_log_pb2
+from tfx_bsl.public.proto import model_spec_pb2
+from tfx_bsl.public.tfxio import TFExampleRecord
+from tfx_bsl.public.beam.run_inference import CreateModelHandler
+
+pipeline = beam.Pipeline()
+tfexample_beam_record = TFExampleRecord(file_pattern=predict_values_five_times_table)
+saved_model_spec = model_spec_pb2.SavedModelSpec(model_path=save_model_dir_multiply)
+inference_spec_type = model_spec_pb2.InferenceSpecType(saved_model_spec=saved_model_spec)
+model_handler = CreateModelHandler(inference_spec_type)
+with pipeline as p:
+    _ = (p | tfexample_beam_record.RawRecordBeamSource()
+           | RunInference(model_handler)
+           | beam.Map(print)
+        )
+```
+
+First, within `tfx_bsl`, create a model handler. For more information, see 
[run_inference.py](https://github.com/tensorflow/tfx-bsl/blob/d1fca25e5eeaac9ef0111ec13e7634df836f36f6/tfx_bsl/public/beam/run_inference.py)
 in the TensorFlow GitHub repository.
+
+```
+tf_handler = CreateModelHandler(inference_spec_type)
+
+# unkeyed
+RunInference(tf_handler)
+
+# keyed
+RunInference(KeyedModelHandler(tf_handler))
+```
+
+The model handler that is created from within `tfx-bsl` is always unkeyed. To 
make a keyed model handler, wrap the unkeyed model handler in the keyed model 
handler, which would then take the `tfx-bsl` model handler as a parameter. For 
example:

Review Comment:
   ```suggestion
   The model handler that is created with `CreateModelHandler()` is always 
unkeyed. To make a keyed model handler, wrap the unkeyed model handler in the 
keyed model handler, which would then take the `tfx-bsl` model handler as a 
parameter. For example:
   ```



