damccorm commented on code in PR #30351:
URL: https://github.com/apache/beam/pull/30351#discussion_r1518265761


##########
learning/prompts/documentation-lookup-nolinks/42_ai_custom_inference.md:
##########
@@ -0,0 +1,47 @@
+Prompt:
+How can I use custom remote inference calls in my Apache Beam pipeline?
+
+Response:
+The recommended approach for running inference within an Apache Beam pipeline is to use the RunInference API provided by the Apache Beam Python SDK. This feature lets you seamlessly incorporate ML models into your pipeline or execute remote inference calls.
+
+When developing custom inference calls, consider the following factors:
+* **API quotas**: heavy loads may lead to reaching API quota limits. You can 
use `PipelineOptions` to specify the maximum number of parallel API calls. Use 
`direct_num_workers` for the Direct Runner or `max_num_workers` for the Google 
Cloud Dataflow Runner. Refer to the Beam Capability Matrix for information 
about other runners.
+* **Error handling**: handle API call failures gracefully. Consider implementing exponential backoff for retries or routing failed calls to a dead-letter queue, as illustrated in the sketch after this list.
+* **Monitoring**: incorporate monitoring and performance metrics to track the 
performance of your inference calls and the health of your pipeline.
+* **Batching**: batching can be used to send multiple inputs in a single API 
call for improved efficiency.
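
The following sketch illustrates the quota and error-handling points above. It is only an illustration under assumptions: `make_remote_call` stands in for whatever method your API client actually exposes, and the option values are arbitrary.

```python
import logging
import time

from apache_beam.options.pipeline_options import PipelineOptions

# Cap parallelism to stay within API quotas: direct_num_workers applies to the
# Direct Runner; use max_num_workers instead on Dataflow. Pass these options
# when constructing the pipeline, e.g. beam.Pipeline(options=options).
options = PipelineOptions(direct_num_workers=4)


def call_with_backoff(client, batch_request, max_retries=5):
    """Send a batch request, retrying failed calls with exponential backoff."""
    delay = 1.0
    for attempt in range(max_retries):
        try:
            # Hypothetical client method; substitute your API client's call.
            return client.make_remote_call(request=batch_request)
        except Exception as error:  # catch the client's specific error types in practice
            if attempt == max_retries - 1:
                raise  # or route the failed request to a dead-letter queue instead
            logging.warning('Remote call failed (%s); retrying in %.1fs', error, delay)
            time.sleep(delay)
            delay *= 2
```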
+
+To execute external API calls with the `RunInference` transform, you will need 
to create a `beam.DoFn` in the form of a custom model handler:
+
+```python
+class CustomModelHandler(ModelHandler):
+  """DoFn that accepts a batch of inputs and sends that batch to the remote 
API for inference"""
+
+  def load_model(self):
+    """Initiate the Custom remote API client."""
+    client = ... # Initialize the client
+    return client
+
+  def run_inference(self, batch, model, inference_args=None):
+
+    # Prepare a batch request for all inputs in the batch.
+    inputs = ... # Process inputs from the batch
+    input_requests = ... # Prepare input requests for the model
+    batch_request = ... # Prepare batch request for the model
+
+    # Send the batch request to the remote endpoint.
+    responses = model.(request=batch_request).responses

Review Comment:
   I don't think this makes sense. Could we update it to something like:
   
   ```
   responses = client.remote_call(request=batch_request).responses
   ```
   
   and rename `model` to `client` above?



##########
learning/prompts/documentation-lookup-nolinks/33_ai_ml_pipelines.md:
##########
@@ -0,0 +1,22 @@
+Prompt:
+How can I use Apache Beam in my AI/ML pipeline?
+
+Response:
+Apache Beam is an open-source, unified programming model for defining batch 
and streaming data-parallel processing pipelines. Each Apache Beam pipeline is 
represented as a Directed Acyclic Graph (DAG) of data processing operations 
constructed using the Apache Beam SDK in one of the supported languages. To 
execute a pipeline, you need to deploy it to one of the supported Beam runners.
+
+You can use Apache Beam for various tasks within your AI/ML pipeline, 
including data validation, preprocessing, model validation, and model 
deployment and inference.
+
+Apache Beam offers a rich set of I/O connectors and `PTransforms` that allow 
for reading and writing data from and to various data sources and sinks, as 
well as performing data validation.
+
+For data preprocessing, Apache Beam provides the `MLTransform` class. This 
feature allows you to ensure data consistency by applying the same 
preprocessing steps for both training and inference.
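
As a brief sketch of how `MLTransform` is typically wired into a pipeline (the vocabulary transform, column name, and temporary artifact location below are illustrative assumptions, not part of the original text):

```python
import tempfile

import apache_beam as beam
from apache_beam.ml.transforms.base import MLTransform
from apache_beam.ml.transforms.tft import ComputeAndApplyVocabulary

# Artifacts (here, the computed vocabulary) are written to this location during
# training and can be read back at inference time to reproduce the same steps.
artifact_location = tempfile.mkdtemp()

data = [
    {'x': ['I', 'love', 'Beam']},
    {'x': ['Beam', 'is', 'awesome']},
]

with beam.Pipeline() as p:
    (
        p
        | beam.Create(data)
        | MLTransform(write_artifact_location=artifact_location).with_transform(
            ComputeAndApplyVocabulary(columns=['x']))
        | beam.Map(print)
    )
```

Pointing `MLTransform` at the same artifact location at inference time replays the identical preprocessing, which is what keeps training and serving consistent.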
+
+Additionally, Apache Beam allows integration with pre-trained models from 
PyTorch, Scikit-learn, and TensorFlow.
+
+To execute machine learning inference tasks, Apache Beam provides the 
RunInference API. `RunInference` is a `PTransform` that supports the use of custom models. For an example of running inference on a custom model loaded 
with spaCy, refer to the Apache Beam GitHub repository.
+
+The `RunInference` transform efficiently handles models of any size, making it 
suitable for large language models (LLMs) and other complex architectures. You 
can find an example of deploying and performing inference on large language 
models (LLMs) in the 'RunInference' section of the Apache Beam documentation.
+
+For more information on implementing AI/ML pipelines using Apache Beam, see 
the 'Get started with AI/ML pipelines' section in the Apache Beam documentation.
+
+
+

Review Comment:
   ```suggestion
   For more information on implementing AI/ML pipelines using Apache Beam, see 
the 'Get started with AI/ML pipelines' section in the Apache Beam documentation.
   ```
   
   Nit: remove trailing whitespace



##########
learning/prompts/documentation-lookup-nolinks/40_ai_bigquery_ml.md:
##########
@@ -0,0 +1,50 @@
+Prompt:
+How can I integrate BigQuery ML with Apache Beam?
+
+Response:
+BigQuery ML allows you to develop and execute ML models using GoogleSQL 
queries. You can seamlessly integrate these models into your Apache Beam 
pipeline using TFX Basic Shared Libraries (tfx_bsl) and perform inference on 
them using the RunInference API.
+
+For an example of training a basic BigQuery ML model, refer to the BigQuery 
documentation.
+
+Once you have trained your model, you will need to export it. Here is an 
example BigQuery command to export a model to a Google Cloud Storage bucket:
+
+```
+bq extract -m <model name> gs://<cloud storage path>
+```
+
+To incorporate your BigQuery ML model into an Apache Beam pipeline using 
`tfx_bsl`, it must be saved in the TensorFlow SavedModel format. You will need 
to download the model to your local directory to perform local predictions:
+
+```python
+import apache_beam as beam
+import tensorflow as tf
+from google.protobuf import text_format
+from tensorflow.python.framework import tensor_util
+from tfx_bsl.beam import run_inference
+from tfx_bsl.public.beam import RunInference
+from tfx_bsl.public.proto import model_spec_pb2
+
+inputs = tf.train.Example(...)
+
+model_path = <path to the directory where the model is stored>
+
+def extract_prediction(response):
+    # Extract the prediction from the response depending on the signature of the model
+
+with beam.Pipeline() as p:
+    res = (
+        p
+        | beam.Create([inputs])
+        | RunInference(
+            model_spec_pb2.InferenceSpecType(
+                saved_model_spec=model_spec_pb2.SavedModelSpec(
+                    model_path=model_path,
+                    signature_name=['serving_default'])))
+        | beam.ParDo(extract_prediction)
+    )
+```
+
+This example uses the `RunInference` transform from the `tfx_bsl` library, 
directing it to the local directory where the model is stored. The transform 
takes `tf.train.Example` type elements as inputs and produces 
`tensorflow_serving.apis.prediction_log_pb2.PredictionLog` type elements as 
outputs. Depending on the signature of your model, you can extract values from 
the output. For the prediction output format of exported models for each model 
type, refer to the 'Prediction output format' section in the BigQuery 
documentation.
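
As an illustrative sketch of the extraction step, fleshing out the `extract_prediction` placeholder above (the output tensor key `predicted_label` is an assumption; the actual keys depend on the model type and signature, as described in the BigQuery documentation):

```python
def extract_prediction(response):
    # response is a tensorflow_serving.apis.prediction_log_pb2.PredictionLog;
    # its predict_log field carries the PredictResponse with named output tensors.
    outputs = response.predict_log.response.outputs
    # 'predicted_label' is a hypothetical key - inspect outputs.keys() for your model.
    yield outputs['predicted_label'].float_val
```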
+
+
+
+

Review Comment:
   ```suggestion
   This example uses the `RunInference` transform from the `tfx_bsl` library, 
directing it to the local directory where the model is stored. The transform 
takes `tf.train.Example` type elements as inputs and produces 
`tensorflow_serving.apis.prediction_log_pb2.PredictionLog` type elements as 
outputs. Depending on the signature of your model, you can extract values from 
the output. For the prediction output format of exported models for each model 
type, refer to the 'Prediction output format' section in the BigQuery 
documentation.
   ```
   
   Spacing nit



##########
learning/prompts/documentation-lookup-nolinks/38_ai_runinference.md:
##########
@@ -0,0 +1,37 @@
+Prompt:
+How can I run inference on a trained ML model using Apache Beam?
+
+Response:
+Apache Beam enables efficient inference on both local and remote ML models 
within your pipelines through the RunInference API. This functionality is 
available in the Python SDK versions 2.40.0 and later. The Java SDK versions 
2.41.0 and later also support the API through Apache Beam’s Multi-language 
Pipelines framework. The `RunInference` transform performs inference on a 
`PCollection` of examples using an ML model and outputs a `PCollection` 
containing both the input examples and the corresponding output predictions.
+
+Key features of the RunInference API include:
+* Support for both batch and streaming inference.
+* Centralized model management for efficient memory and bandwidth usage.
+* Compatibility with multiple model frameworks and model hubs.
+* Automatic model refreshing to ensure the latest model version is used.
+* GPU support for model inference.
+
+The RunInference API supports a variety of frameworks and model hubs, 
including TensorFlow, PyTorch, Scikit-learn, XGBoost, Hugging Face, TensorFlow 
Hub, Vertex AI, TensorRT, and ONNX. Additionally, you can easily integrate 
custom model frameworks.
+
+To import a model into your Apache Beam pipeline, you will need to configure 
the `ModelHandler` object, which wraps the underlying model and allows you to 
set necessary environment variables for inference.
+
+Here is an example of importing a PyTorch model handler for use in your 
pipeline:
+
+```python
+from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerTensor
+from apache_beam.ml.inference.base import RunInference
+
+model_handler = PytorchModelHandlerTensor(
+  # Model handler setup
+)
+
+with pipeline as p:
+    predictions = p |  'Read' >> beam.ReadFromSource('a_source')
+                    | 'RunInference' >> RunInference(model_handler)

Review Comment:
   ```suggestion
       predictions = p |  'Read' >> beam.ReadFromSource('a_source')
                       |  'RunInference' >> RunInference(model_handler)
   ```
   Spacing nit



##########
learning/prompts/documentation-lookup-nolinks/43_ai_llm_inference.md:
##########
@@ -0,0 +1,56 @@
+Prompt:
+Ho can I use Apache Beam to run inference on Large Language Models (LLMs)?

Review Comment:
   ```suggestion
   How can I use Apache Beam to run inference on Large Language Models (LLMs)?
   ```



##########
learning/prompts/documentation-lookup/33_ai_ml_pipelines.md:
##########
@@ -0,0 +1,23 @@
+Prompt:
+How can I use Apache Beam in my AI/ML pipeline?
+
+Response:
+Apache Beam is an open-source, unified programming model for defining batch 
and streaming data-parallel processing pipelines. Each Apache Beam pipeline is 
represented as a Directed Acyclic Graph (DAG) of data processing operations 
constructed using the Apache Beam SDK in one of the [supported 
languages](https://beam.apache.org/documentation/sdks/java/). To execute a 
pipeline, you need to deploy it to one of the supported [Beam 
runners](https://beam.apache.org/documentation/runners/capability-matrix/).
+
+You can use Apache Beam for various tasks within your AI/ML pipeline, 
including data validation, preprocessing, model validation, and model 
deployment and inference.
+
+Apache Beam offers a rich set of [I/O 
connectors](https://beam.apache.org/documentation/io/connectors/) and 
[transforms](https://beam.apache.org/documentation/transforms/python/) that 
allow for reading and writing data from and to various data sources and sinks, 
as well as performing data validation.
+
+For data preprocessing, Apache Beam provides the 
[MLTransform](https://beam.apache.org/documentation/ml/preprocess-data/) class. 
This feature allows you to ensure data consistency by applying the same 
preprocessing steps for both training and inference.
+
+Additionally, Apache Beam allows integration with pre-trained models from 
[PyTorch](https://pytorch.org/), 
[Scikit-learn](https://scikit-learn.org/stable/), and 
[TensorFlow](https://www.tensorflow.org/).
+
+To execute machine learning inference tasks, Apache Beam provides the 
RunInference API.
+[`RunInference`](https://beam.apache.org/documentation/transforms/python/elementwise/runinference/)
 is a [`PTransform`](https://beam.apache.org/documentation/basics/#ptransform) 
that supports the use of custom models. For an example of running 
inference on a custom model loaded with [spaCy](https://spacy.io/), refer to 
the [Bring your own ML model to Beam 
RunInference](https://github.com/apache/beam/blob/master/examples/notebooks/beam-ml/run_custom_inference.ipynb)
 example in the Apache Beam GitHub repository.
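
As a minimal sketch of what a `RunInference` step looks like in a pipeline (the scikit-learn handler and the model path below are illustrative assumptions):

```python
import apache_beam as beam
import numpy
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.sklearn_inference import ModelFileType
from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerNumpy

# Hypothetical path to a pickled scikit-learn model.
model_handler = SklearnModelHandlerNumpy(
    model_uri='gs://my-bucket/my_model.pkl',
    model_file_type=ModelFileType.PICKLE)

with beam.Pipeline() as p:
    (
        p
        | beam.Create([numpy.array([1.0, 2.0]), numpy.array([3.0, 4.0])])
        | RunInference(model_handler)
        | beam.Map(print)  # each element is a PredictionResult(example, inference)
    )
```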
+
+The `RunInference` transform efficiently handles models of any size, making it 
suitable for large language models (LLMs) and other complex architectures. You 
can find an example of deploying and performing inference on large language 
models (LLMs) in the 
[RunInference](https://beam.apache.org/documentation/transforms/python/elementwise/runinference/)
 section of the Apache Beam documentation.
+
+For more information on implementing AI/ML pipelines using Apache Beam, see 
the [Get started with AI/ML 
pipelines](https://beam.apache.org/documentation/ml/overview/) section in the 
Apache Beam documentation.
+
+
+

Review Comment:
   ```suggestion
   For more information on implementing AI/ML pipelines using Apache Beam, see 
the [Get started with AI/ML 
pipelines](https://beam.apache.org/documentation/ml/overview/) section in the 
Apache Beam documentation.
   ```
   
   Spacing nit



##########
learning/prompts/documentation-lookup-nolinks/43_ai_llm_inference.md:
##########
@@ -0,0 +1,56 @@
+Prompt:
+Ho can I use Apache Beam to run inference on Large Language Models (LLMs)?
+
+Response:

Review Comment:
   I think we probably need to rework this response - it mostly just describes how to run inference generally and doesn't have much content on large models. I see this was largely pulled from https://beam.apache.org/documentation/ml/large-language-modeling/, which needs to be updated as well; I added https://github.com/apache/beam/pull/30585 - could you pull in the info from that PR here?
   
   


