AnandInguva commented on code in PR #21887:
URL: https://github.com/apache/beam/pull/21887#discussion_r898119092


##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -235,3 +235,54 @@ He looked up and saw the sun and stars .;moon
 Each line has data separated by a semicolon ";".
 The first item is the sentence with the last word masked. The second item
 is the word that the model predicts for the mask.
+
+---
+## MNIST digit classification
+[`sklearn_mnist_classification.py`](./sklearn_mnist_classification.py) contains
+an implementation for a RunInference pipeline that performs image 
classification on handwritten digits from the 
[MNIST](https://en.wikipedia.org/wiki/MNIST_database) database.
+
+The pipeline reads rows of pixels corresponding to a digit, performs basic 
preprocessing, passes the pixels to the Scikit-learn implementation of 
RunInference, and then writes the predictions to a text file.
+
+### Dataset and model for digit classification
+- **Required**: A path to a file called `INPUT` that contains the label and pixels
to
+feed into the model. Each row should contain comma-separated elements.
The first element is the label. All subsequent values are pixels, from pixel0
to pixel783 (784 pixels in total). It should look something like this:
+```
+1,0,0,0...
+0,0,0,0...
+1,0,0,0...
+4,0,0,0...
+...
+```
+- **Required**: A path to a file called `OUTPUT`, to which the pipeline will
+write the predictions.
+- **Required**: A path to a file called `MODEL_PATH` that contains a pickled
scikit-learn model trained on MNIST data. Refer to the
scikit-learn
[documentation](https://scikit-learn.org/stable/model_persistence.html) for how
to serialize models.
+
+
+### Running `sklearn_mnist_classification.py`
+
+To run the MNIST classification pipeline locally, use the following command:
+```sh
+python -m apache_beam.examples.inference.sklearn_mnist_classification \
+  --input_file INPUT \
+  --output OUTPUT \
+  --model_path MODEL_PATH
+```
+For example:
+```sh
+python -m apache_beam.examples.inference.sklearn_mnist_classification \
+  --input_file mnist_data.csv \
+  --output predictions.csv \
+  --model_path mnist_model_svm.pickle
+```
+
+This writes the output to the `predictions.csv` with contents like:

Review Comment:
   ```suggestion
   This writes the output to the `predictions.txt` with contents like:
   ```
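
On the input format this hunk describes (a label followed by comma-separated pixel values), a minimal sketch of how one row could be split might help readers sanity-check their `INPUT` file. The helper name `parse_row` is hypothetical and is not taken from `sklearn_mnist_classification.py`; the preprocessing in the actual example may differ.

```python
from typing import List, Tuple


def parse_row(line: str) -> Tuple[int, List[int]]:
    """Split one comma-separated row into (label, pixels).

    Hypothetical helper for illustration only: the first value is
    treated as the label and every remaining value as a pixel.
    """
    values = [int(v) for v in line.strip().split(",")]
    return values[0], values[1:]


# A shortened row for illustration; a real row carries one value per pixel.
label, pixels = parse_row("4,0,0,255,0\n")
print(label, pixels)  # prints: 4 [0, 0, 255, 0]
```

Keeping the label separate from the pixels matters because only the pixel values are meant to be passed to the model.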


