damccorm commented on code in PR #33183:
URL: https://github.com/apache/beam/pull/33183#discussion_r1852953984
##########
examples/notebooks/beam-ml/automatic_model_refresh.ipynb:
##########
@@ -244,135 +233,145 @@
"# To expedite the model update process, it's recommended to set
num_workers>1.\n",
"# https://github.com/apache/beam/issues/28776\n",
"options.view_as(WorkerOptions).num_workers = 5"
- ],
- "metadata": {
- "id": "wWjbnq6X-4uE"
- },
- "execution_count": null,
- "outputs": [{
- "output_type": "stream",
- "name": "stdout",
- "text": [
- "\n"
- ]
- }]
+ ]
},
{
"cell_type": "markdown",
- "source": [
- "Install the `tensorflow` and `tensorflow_hub` dependencies on
Dataflow. Use the `requirements_file` pipeline option to pass these
dependencies."
- ],
"metadata": {
"id": "HTJV8pO2Wcw4"
- }
+ },
+ "source": [
+ "Install the `tensorflow` and `tensorflow_hub` dependencies on
Dataflow. Use the `requirements_file` pipeline option to pass these
dependencies."
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "lEy4PkluWbdm"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\n"
+ ]
+ }
+ ],
"source": [
"# In a requirements file, define the dependencies required for the
pipeline.\n",
"!printf
'tensorflow==2.15.0\\ntensorflow_hub==0.16.1\\nkeras==2.15.0\\nPillow==11.0.0'
> ./requirements.txt\n",
"# Install the pipeline dependencies on Dataflow.\n",
"options.view_as(SetupOptions).requirements_file =
'./requirements.txt'"
- ],
- "metadata": {
- "id": "lEy4PkluWbdm"
- },
- "execution_count": null,
- "outputs": [{
- "output_type": "stream",
- "name": "stdout",
- "text": [
- "\n"
- ]
- }]
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "_AUNH_GJk_NE"
+ },
"source": [
"## Use the TensorFlow model handler\n",
" This example uses `TFModelHandlerTensor` as the model handler and
the `resnet_101` model trained on [ImageNet](https://www.image-net.org/).\n",
"\n",
"\n",
"For the Dataflow runner, you need to store the model in a remote
location that the Apache Beam pipeline can access. For this example, download
the `ResNet101` model, and upload it to the Google Cloud Storage bucket.\n"
- ],
- "metadata": {
- "id": "_AUNH_GJk_NE"
- }
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "ibkWiwVNvyrn"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\n"
+ ]
+ }
+ ],
"source": [
"model = tf.keras.applications.resnet.ResNet101()\n",
"model.save('resnet101_weights_tf_dim_ordering_tf_kernels.keras')\n",
"# After saving the model locally, upload the model to GCS bucket and
provide that gcs bucket `URI` as `model_uri` to the `TFModelHandler`\n",
"# Replace `BUCKET_NAME` value with actual bucket name.\n",
Review Comment:
Can we get rid of this comment now?
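Side note on the requirements cell in this hunk: the `!printf` one-liner can also be written in plain Python, which avoids shell-escaping the `\n` sequences. This is just an illustrative sketch; the file name and the version pins are the ones from the diff above, everything else is my rewrite, not notebook code.

```python
from pathlib import Path

# Pin the pipeline dependencies that Dataflow should install on each worker.
# Versions copied from the diff above.
pins = [
    "tensorflow==2.15.0",
    "tensorflow_hub==0.16.1",
    "keras==2.15.0",
    "Pillow==11.0.0",
]

# Equivalent of the notebook's `!printf '...' > ./requirements.txt` step.
Path("./requirements.txt").write_text("\n".join(pins))
```

The resulting file is then passed to the pipeline via `options.view_as(SetupOptions).requirements_file = './requirements.txt'`, exactly as the cell below it does.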
##########
examples/notebooks/beam-ml/automatic_model_refresh.ipynb:
##########
@@ -534,108 +541,118 @@
" | \"ApplyWindowing\" >>
beam.WindowInto(beam.window.FixedWindows(10))\n",
" | \"RunInference\" >>
RunInference(model_handler=model_handler,\n",
"
model_metadata_pcoll=side_input_pcoll))"
- ],
- "metadata": {
- "id": "_AjvvexJ_hUq"
- },
- "execution_count": null,
- "outputs": [{
- "output_type": "stream",
- "name": "stdout",
- "text": [
- "\n"
- ]
- }]
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "lTA4wRWNDVis"
+ },
"source": [
"4. Post-process the `PredictionResult` object.\n",
"When the inference is complete, RunInference outputs a
`PredictionResult` object that contains the fields `example`, `inference`, and
`model_id`. The `model_id` field identifies the model used to run the
inference. The `PostProcessor` returns the predicted label and the model ID
used to run the inference on the predicted label."
- ],
- "metadata": {
- "id": "lTA4wRWNDVis"
- }
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "9TB76fo-_vZJ"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\n"
+ ]
+ }
+ ],
"source": [
"post_processor = (\n",
" inferences\n",
" | \"PostProcessResults\" >> beam.ParDo(PostProcessor())\n",
" | \"LogResults\" >> beam.Map(logging.info))"
- ],
- "metadata": {
- "id": "9TB76fo-_vZJ"
- },
- "execution_count": null,
- "outputs": [{
- "output_type": "stream",
- "name": "stdout",
- "text": [
- "\n"
- ]
- }]
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "wYp-mBHHjOjA"
+ },
"source": [
"### Watch for the model update\n",
"\n",
"After the pipeline starts processing data, when you see output
emitted from the RunInference `PTransform`, upload a `resnet152` model saved in
the `.keras` format to a Google Cloud Storage bucket location that matches the
`file_pattern` you defined earlier.\n"
- ],
- "metadata": {
- "id": "wYp-mBHHjOjA"
- }
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "FpUfNBSWH9Xy"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\n"
+ ]
+ }
+ ],
"source": [
"model = tf.keras.applications.resnet.ResNet152()\n",
"model.save('resnet152_weights_tf_dim_ordering_tf_kernels.keras')\n",
"# Replace the `BUCKET_NAME` with the actual bucket name.\n",
Review Comment:
Same - can we get rid of this?
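One more note on the `PostProcessor` cell in this hunk: the markdown above it says RunInference emits a `PredictionResult` with `example`, `inference`, and `model_id` fields, and that the post-processor returns the predicted label plus the model ID. A dependency-free sketch of that logic, for reviewers without Beam installed (the stand-in `PredictionResult` class, the label list, and the scores are illustrative; only the field names come from the notebook text):

```python
from typing import Any, List, NamedTuple, Tuple

class PredictionResult(NamedTuple):
    # Stand-in for the PredictionResult the notebook describes: the input
    # example, the raw inference output, and the id of the model that ran it.
    example: Any
    inference: Any
    model_id: str

def post_process(result: PredictionResult, labels: List[str]) -> Tuple[str, str]:
    # Pick the highest-scoring class and report which model produced it,
    # mirroring what the notebook's PostProcessor DoFn is described as doing.
    scores = result.inference
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best], result.model_id

# Hypothetical scores over three classes; BUCKET_NAME is the same placeholder
# the notebook uses.
labels = ["cat", "dog", "fox"]
result = PredictionResult(example=None,
                          inference=[0.1, 0.7, 0.2],
                          model_id="gs://BUCKET_NAME/resnet101.keras")
print(post_process(result, labels))  # ('dog', 'gs://BUCKET_NAME/resnet101.keras')
```

When the watched `file_pattern` picks up the `resnet152` upload, only `model_id` in this tuple changes, which is what makes the model swap visible in the logged results.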
--
This is an automated message from the Apache Git Service.