[GitHub] [beam] rszper commented on a diff in pull request #25947: Add documentation for the auto model updates

via GitHub Mon, 27 Mar 2023 13:43:54 -0700


rszper commented on code in PR #25947:
URL: https://github.com/apache/beam/pull/25947#discussion_r1149770971



##########
website/www/site/content/en/documentation/ml/side-input-updates.md:
##########
@@ -15,22 +15,22 @@ See the License for the specific language governing permissions and
 limitations under the License.
 -->
 
-# Use Slowly-Updating Side Input Pattern to Auto Update Models in RunInference Transform
+# Use slowly-updating side input patterns to auto-update models
 
-The pipeline in this example uses [RunInference](https://beam.apache.org/documentation/transforms/python/elementwise/runinference/) PTransform with a `side input` PCollection that emits `ModelMetadata` to run inferences on images using open source Tensorflow models trained on `imagenet`.
+The pipeline in this example uses a [RunInference](https://beam.apache.org/documentation/transforms/python/elementwise/runinference/) `PTransform` with a side input `PCollection` that emits `ModelMetadata` to run inferences on images using open source Tensorflow models trained on `imagenet`.
 
-In this example, we will use `WatchFilePattern` as a side input. `WatchFilePattern` is used to watch for the file updates matching the `file_pattern`
-based on timestamps and emits the latest [ModelMetadata](https://beam.apache.org/documentation/transforms/python/elementwise/runinference/), which is used in
-`RunInference` PTransform for the dynamic auto model updates without the need for stopping the beam pipeline.
+This example uses `WatchFilePattern` as a side input. `WatchFilePattern` is used to watch for the file updates matching the `file_pattern`
+based on timestamps. It emits the latest [ModelMetadata](https://beam.apache.org/documentation/transforms/python/elementwise/runinference/), which is used in
+the RunInference `PTransform` to dynamically update the model without stopping the Beam pipeline.
 
-**Note**: Slowly-updating side input pattern is non-deterministic.
+**Note**: Slowly-updating side input patterns are non-deterministic.
 
 ### Setting up source
 
-We will use PubSub topic as a source to read the image names. 
- * PubSub topic emits a `UTF-8` encoded model path that will be used read and preprocess images for running the inference.
+To read the image names, use a Pub/Sub topic as the source. 
+ * The Pub/Sub topic emits a `UTF-8` encoded model path that is used to read and preprocess images to run the inference.
 
-### Models for image segmentation
+## Models for image segmentation
 
 For the purpose of this example, use models saved in [HDF5](https://www.tensorflow.org/tutorials/keras/save_and_load#hdf5_format) format. Initially, pass a model to the Tensorflow ModelHandler for predictions until there is an update via side input. 
 After a while, upload a model that matches the `file_pattern` to the GCS bucket. The bucket path will be used a glob pattern and is passed to the `WatchFilePattern`.

Review Comment:
   "GCS bucket" should be spelled out as "Google Cloud Storage bucket".
   
   Also, the second sentence has a grammatical error: "used a glob pattern" is missing "as". I think it should be:
   
   The bucket path will be used as a glob pattern and is passed to the `WatchFilePattern`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
