damccorm commented on code in PR #23884:
URL: https://github.com/apache/beam/pull/23884#discussion_r1009429763
##########
examples/notebooks/beam-ml/run_inference_pytorch_tensorflow_sklearn.ipynb:
##########
@@ -0,0 +1,1349 @@
+{
+ "nbformat": 4,
+ "nbformat_minor": 0,
+ "metadata": {
+ "colab": {
+ "provenance": [],
+ "collapsed_sections": [
+ "5kkjbcIzZIf6",
+ "vA1UmbFRb5C-",
+ "-7ABKlZvkFHy"
+ ]
+ },
+ "kernelspec": {
+ "name": "python3",
+ "display_name": "Python 3"
+ },
+ "language_info": {
+ "name": "python"
+ }
+ },
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "LzOTNrs_P6Vv"
+ },
+ "outputs": [],
+ "source": [
+ "# @title ###### Licensed to the Apache Software Foundation (ASF),
Version 2.0 (the \"License\")\n",
+ "\n",
+ "# Licensed to the Apache Software Foundation (ASF) under one\n",
+ "# or more contributor license agreements. See the NOTICE file\n",
+ "# distributed with this work for additional information\n",
+ "# regarding copyright ownership. The ASF licenses this file\n",
+ "# to you under the Apache License, Version 2.0 (the\n",
+ "# \"License\"); you may not use this file except in compliance\n",
+ "# with the License. You may obtain a copy of the License at\n",
+ "#\n",
+ "# http://www.apache.org/licenses/LICENSE-2.0\n",
+ "#\n",
+ "# Unless required by applicable law or agreed to in writing,\n",
+ "# software distributed under the License is distributed on an\n",
+ "# \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY\n",
+ "# KIND, either express or implied. See the License for the\n",
+ "# specific language governing permissions and limitations\n",
+ "# under the License"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "## RunInference in Beam"
+ ],
+ "metadata": {
+ "id": "faayYQYrQzY3"
+ }
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "Starting with Apache Beam 2.40.0, you can use the RunInference API
to run local and remote inference with machine learning (ML) models in batch
and streaming pipelines. The RunInference API leverages Apache Beam concepts,
such as the BatchElements transform and the Shared class, so that you can use
models in your pipelines to create transforms optimized for machine learning
inference.\n",
+ "\n",
+ "For more details about the RunInference API, see
https://beam.apache.org/documentation/sdks/python-machine-learning/"
+ ],
+ "metadata": {
+ "id": "JjAt1GesQ9sg"
+ }
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "In this notebook, we show how to use RunInference with three
popular ML frameworks: PyTorch, TensorFlow, and scikit-learn. We
showcase three pipelines that use a text classification model to generate
predictions.\n",
+ "\n",
+ "The steps needed to build each pipeline can be summarized
as follows:\n",
+ "* Read the input text.\n",
+ "* Preprocess the text if needed.\n",
+ "* Run inference with the PyTorch, TensorFlow, or scikit-learn model.\n",
+ "* Postprocess the output from RunInference if needed."
+ ],
+ "metadata": {
+ "id": "A8xNRyZMW1yK"
+ }
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "### RunInference with a PyTorch Model\n",
+ "\n",
+ "\n"
+ ],
+ "metadata": {
+ "id": "CTtBTpsHZFCk"
+ }
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "#### Install Dependencies"
+ ],
+ "metadata": {
+ "id": "5kkjbcIzZIf6"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "!pip install --upgrade pip\n",
+ "!pip install 'apache_beam[gcp]>=2.40.0'\n",
+ "!pip install transformers\n",
+ "!pip install google-api-core==1.32"
+ ],
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 1000
+ },
+ "id": "MRASwRTxY-2u",
+ "outputId": "28760c59-c4dc-4486-dbd2-e7ac2c92c3b8"
+ },
+ "execution_count": null,
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "Looking in indexes: https://pypi.org/simple,
https://us-python.pkg.dev/colab-wheels/public/simple/\n",
+ "Requirement already satisfied: pip in
/usr/local/lib/python3.7/dist-packages (22.3)\n",
+ "\u001b[33mWARNING: Running pip as the 'root' user can result in
broken permissions and conflicting behaviour with the system package manager.
It is recommended to use a virtual environment instead:
https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n",
+ "\u001b[0m\u001b[33mWARNING: Running pip as the 'root' user can
result in broken permissions and conflicting behaviour with the system package
manager. It is recommended to use a virtual environment instead:
https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n",
+ "\u001b[0mLooking in indexes: https://pypi.org/simple,
https://us-python.pkg.dev/colab-wheels/public/simple/\n",
+ "Requirement already satisfied: transformers in
/usr/local/lib/python3.7/dist-packages (4.23.1)\n",
+ "Requirement already satisfied: tqdm>=4.27 in
/usr/local/lib/python3.7/dist-packages (from transformers) (4.64.1)\n",
+ "Requirement already satisfied: numpy>=1.17 in
/usr/local/lib/python3.7/dist-packages (from transformers) (1.21.6)\n",
+ "Requirement already satisfied: packaging>=20.0 in
/usr/local/lib/python3.7/dist-packages (from transformers) (21.3)\n",
+ "Requirement already satisfied: huggingface-hub<1.0,>=0.10.0 in
/usr/local/lib/python3.7/dist-packages (from transformers) (0.10.1)\n",
+ "Requirement already satisfied: importlib-metadata in
/usr/local/lib/python3.7/dist-packages (from transformers) (4.13.0)\n",
+ "Requirement already satisfied: pyyaml>=5.1 in
/usr/local/lib/python3.7/dist-packages (from transformers) (6.0)\n",
+ "Requirement already satisfied: tokenizers!=0.11.3,<0.14,>=0.11.1
in /usr/local/lib/python3.7/dist-packages (from transformers) (0.13.1)\n",
+ "Requirement already satisfied: regex!=2019.12.17 in
/usr/local/lib/python3.7/dist-packages (from transformers) (2022.6.2)\n",
+ "Requirement already satisfied: filelock in
/usr/local/lib/python3.7/dist-packages (from transformers) (3.8.0)\n",
+ "Requirement already satisfied: requests in
/usr/local/lib/python3.7/dist-packages (from transformers) (2.28.1)\n",
+ "Requirement already satisfied: typing-extensions>=3.7.4.3 in
/usr/local/lib/python3.7/dist-packages (from
huggingface-hub<1.0,>=0.10.0->transformers) (4.1.1)\n",
+ "Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in
/usr/local/lib/python3.7/dist-packages (from packaging>=20.0->transformers)
(3.0.9)\n",
+ "Requirement already satisfied: zipp>=0.5 in
/usr/local/lib/python3.7/dist-packages (from importlib-metadata->transformers)
(3.9.0)\n",
+ "Requirement already satisfied: charset-normalizer<3,>=2 in
/usr/local/lib/python3.7/dist-packages (from requests->transformers) (2.1.1)\n",
+ "Requirement already satisfied: certifi>=2017.4.17 in
/usr/local/lib/python3.7/dist-packages (from requests->transformers)
(2022.9.24)\n",
+ "Requirement already satisfied: idna<4,>=2.5 in
/usr/local/lib/python3.7/dist-packages (from requests->transformers) (2.10)\n",
+ "Requirement already satisfied: urllib3<1.27,>=1.21.1 in
/usr/local/lib/python3.7/dist-packages (from requests->transformers)
(1.24.3)\n",
+ "\u001b[33mWARNING: Running pip as the 'root' user can result in
broken permissions and conflicting behaviour with the system package manager.
It is recommended to use a virtual environment instead:
https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n",
+ "\u001b[0mLooking in indexes: https://pypi.org/simple,
https://us-python.pkg.dev/colab-wheels/public/simple/\n",
+ "Collecting google-api-core==1.32\n",
+ " Using cached google_api_core-1.32.0-py2.py3-none-any.whl (93
kB)\n",
+ "Requirement already satisfied: protobuf<4.0.0dev,>=3.12.0 in
/usr/local/lib/python3.7/dist-packages (from google-api-core==1.32) (3.20.3)\n",
+ "Requirement already satisfied: pytz in
/usr/local/lib/python3.7/dist-packages (from google-api-core==1.32) (2022.4)\n",
+ "Requirement already satisfied: google-auth<2.0dev,>=1.25.0 in
/usr/local/lib/python3.7/dist-packages (from google-api-core==1.32) (1.35.0)\n",
+ "Requirement already satisfied:
googleapis-common-protos<2.0dev,>=1.6.0 in
/usr/local/lib/python3.7/dist-packages (from google-api-core==1.32) (1.56.4)\n",
+ "Requirement already satisfied: setuptools>=40.3.0 in
/usr/local/lib/python3.7/dist-packages (from google-api-core==1.32) (57.4.0)\n",
+ "Requirement already satisfied: six>=1.13.0 in
/usr/local/lib/python3.7/dist-packages (from google-api-core==1.32) (1.15.0)\n",
+ "Requirement already satisfied: packaging>=14.3 in
/usr/local/lib/python3.7/dist-packages (from google-api-core==1.32) (21.3)\n",
+ "Requirement already satisfied: requests<3.0.0dev,>=2.18.0 in
/usr/local/lib/python3.7/dist-packages (from google-api-core==1.32) (2.28.1)\n",
+ "Requirement already satisfied: rsa<5,>=3.1.4 in
/usr/local/lib/python3.7/dist-packages (from
google-auth<2.0dev,>=1.25.0->google-api-core==1.32) (4.9)\n",
+ "Requirement already satisfied: cachetools<5.0,>=2.0.0 in
/usr/local/lib/python3.7/dist-packages (from
google-auth<2.0dev,>=1.25.0->google-api-core==1.32) (4.2.4)\n",
+ "Requirement already satisfied: pyasn1-modules>=0.2.1 in
/usr/local/lib/python3.7/dist-packages (from
google-auth<2.0dev,>=1.25.0->google-api-core==1.32) (0.2.8)\n",
+ "Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in
/usr/local/lib/python3.7/dist-packages (from
packaging>=14.3->google-api-core==1.32) (3.0.9)\n",
+ "Requirement already satisfied: idna<4,>=2.5 in
/usr/local/lib/python3.7/dist-packages (from
requests<3.0.0dev,>=2.18.0->google-api-core==1.32) (2.10)\n",
+ "Requirement already satisfied: certifi>=2017.4.17 in
/usr/local/lib/python3.7/dist-packages (from
requests<3.0.0dev,>=2.18.0->google-api-core==1.32) (2022.9.24)\n",
+ "Requirement already satisfied: urllib3<1.27,>=1.21.1 in
/usr/local/lib/python3.7/dist-packages (from
requests<3.0.0dev,>=2.18.0->google-api-core==1.32) (1.24.3)\n",
+ "Requirement already satisfied: charset-normalizer<3,>=2 in
/usr/local/lib/python3.7/dist-packages (from
requests<3.0.0dev,>=2.18.0->google-api-core==1.32) (2.1.1)\n",
+ "Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in
/usr/local/lib/python3.7/dist-packages (from
pyasn1-modules>=0.2.1->google-auth<2.0dev,>=1.25.0->google-api-core==1.32)
(0.4.8)\n",
+ "Installing collected packages: google-api-core\n",
+ " Attempting uninstall: google-api-core\n",
+ " Found existing installation: google-api-core 1.33.2\n",
+ " Uninstalling google-api-core-1.33.2:\n",
+ " Successfully uninstalled google-api-core-1.33.2\n",
+ "Successfully installed google-api-core-1.32.0\n",
+ "\u001b[33mWARNING: Running pip as the 'root' user can result in
broken permissions and conflicting behaviour with the system package manager.
It is recommended to use a virtual environment instead:
https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n",
+ "\u001b[0m"
+ ]
+ },
+ {
+ "output_type": "display_data",
+ "data": {
+ "application/vnd.colab-display-data+json": {
+ "pip_warning": {
+ "packages": [
+ "google"
+ ]
+ }
+ }
+ },
+ "metadata": {}
+ }
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "#### Model\n",
+ "\n",
+ "We use a pretrained text classification model,
[distilbert-base-uncased-finetuned-sst-2-english](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english?text=I+like+you.+I+love+you).
This model is a fine-tuned checkpoint of DistilBERT-base-uncased, fine-tuned on
the SST-2 dataset.\n"
+ ],
+ "metadata": {
+ "id": "ObRPUrlEbjHj"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "! git lfs install\n",
+ "! git clone
https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english\n",
+ "! ls"
+ ],
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "vfDyy4WNQaJM",
+ "outputId": "75683116-f415-4956-f44c-baa953c564e1"
+ },
+ "execution_count": null,
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "Error: Failed to call git rev-parse --git-dir --show-toplevel:
\"fatal: not a git repository (or any of the parent directories): .git\\n\"\n",
+ "Git LFS initialized.\n",
+ "fatal: destination path
'distilbert-base-uncased-finetuned-sst-2-english' already exists and is not an
empty directory.\n",
+ "'=2.40.0' distilbert-base-uncased-finetuned-sst-2-english
sample_data\n"
+ ]
+ }
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "#### Helper Functions"
+ ],
+ "metadata": {
+ "id": "vA1UmbFRb5C-"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "from collections import defaultdict\n",
+ "\n",
+ "import torch\n",
+ "from transformers import DistilBertForSequenceClassification,
DistilBertTokenizer, DistilBertConfig\n",
+ "\n",
+ "import apache_beam as beam\n",
+ "from apache_beam.ml.inference import RunInference\n",
+ "from apache_beam.ml.inference.base import PredictionResult,
KeyedModelHandler\n",
+ "from apache_beam.ml.inference.pytorch_inference import
PytorchModelHandlerKeyedTensor\n",
+ "\n",
+ "\n",
+ "class
HuggingFaceStripBatchingWrapper(DistilBertForSequenceClassification):\n",
+ " \"\"\"Wrapper around the HuggingFace model. RunInference requires
the batched model\n",
+ " output as a list of dicts instead of a dict of lists. Another workaround,
which\n",
+ " disables batching instead, can be found here:\n",
+ "
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/inference/pytorch_language_modeling.py\"\"\"\n",
+ " def forward(self, **kwargs):\n",
+ " output = super().forward(**kwargs)\n",
+ " return [dict(zip(output, v)) for v in zip(*output.values())]\n",
+ "\n",
+ "\n",
+ "\n",
+ "class Tokenize(beam.DoFn):\n",
+ " def __init__(self, model_name: str):\n",
+ " self._model_name = model_name\n",
+ "\n",
+ " def setup(self):\n",
+ " self._tokenizer =
DistilBertTokenizer.from_pretrained(self._model_name)\n",
+ " \n",
+ " def process(self, text_input: str):\n",
+ " # Pad the token tensors to max length so that all of
the tensors\n",
+ " # have the same length and can be stacked by the RunInference
API. Normally, you would batch first,\n",
+ " # then tokenize the batch and pad each tensor to the max
length in the batch.\n",
+ " # see:
https://beam.apache.org/documentation/sdks/python-machine-learning/#unable-to-batch-tensor-elements\n",
+ " tokens = self._tokenizer(text_input, return_tensors='pt',
padding='max_length', max_length=512)\n",
+ " # Squeeze because tokenization adds an extra batch dimension, which has
size 1\n",
+ " # in this case because we're tokenizing one element at a time.\n",
+ " tokens = {key: torch.squeeze(val) for key, val in
tokens.items()}\n",
+ " return [(text_input, tokens)]\n",
+ "\n",
+ "class PostProcessor(beam.DoFn):\n",
+ " def process(self, tuple_):\n",
+ " text_input, prediction_result = tuple_\n",
+ " softmax =
torch.nn.Softmax(dim=-1)(prediction_result.inference['logits']).detach().numpy()\n",
+ " return [{\"input\": text_input, \"softmax\": softmax}]"
+ ],
+ "metadata": {
+ "id": "c4ZwN8wsbvgK"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "#### RunInference Pipeline"
+ ],
+ "metadata": {
+ "id": "WYYbQTMWctkW"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "inputs = [\n",
+ " \"This is the worst food I have ever eaten\",\n",
+ " \"In my soul and in my heart, I’m convinced I’m wrong!\",\n",
+ " \"Be with me always—take any form—drive me mad! only do not leave
me in this abyss, where I cannot find you!\",\n",
+ " \"Do I want to live? Would you like to live with your soul in the
grave?\",\n",
+ " \"Honest people don’t hide their deeds.\",\n",
+ " \"Nelly, I am Heathcliff! He’s always, always in my mind: not as
a pleasure, any more than I am always a pleasure to myself, but as my own
being.\",\n",
+ "]"
+ ],
+ "metadata": {
+ "id": "lLb8D2n2n09n"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "model_handler = PytorchModelHandlerKeyedTensor(\n",
+ "
state_dict_path=\"./distilbert-base-uncased-finetuned-sst-2-english/pytorch_model.bin\",\n",
+ " model_class=HuggingFaceStripBatchingWrapper,\n",
+ " model_params={\"config\":
DistilBertConfig.from_pretrained(\"./distilbert-base-uncased-finetuned-sst-2-english/config.json\")},\n",
+ " device='cuda:0')\n",
+ "\n",
+ "keyed_model_handler = KeyedModelHandler(model_handler)\n",
+ "\n",
+ "with beam.Pipeline() as pipeline:\n",
+ " _ = (pipeline | \"Create inputs\" >> beam.Create(inputs)\n",
+ " | \"Tokenize\" >>
beam.ParDo(Tokenize(\"distilbert-base-uncased-finetuned-sst-2-english\"))\n",
+ " | \"Inference\" >>
RunInference(model_handler=keyed_model_handler)\n",
+ " | \"Postprocess\" >> beam.ParDo(PostProcessor())\n",
+ " | \"Print\" >> beam.Map(lambda x: print(f\"Input:
{x['input']} -> negative={100 * x['softmax'][0]:.4f}%/positive={100 *
x['softmax'][1]:.4f}%\"))\n",
+ " )"
+ ],
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 269
+ },
+ "id": "TDmMARxGb751",
+ "outputId": "437e168a-b4c5-463b-ce5f-09a8cb8d8191"
+ },
+ "execution_count": null,
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stderr",
+ "text": [
+ "/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:10:
FutureWarning: PytorchModelHandlerKeyedTensor is experimental. No
backwards-compatibility guarantees.\n",
+ " # Remove the CWD from sys.path while we load stuff.\n",
+
"WARNING:apache_beam.runners.interactive.interactive_environment:Dependencies
required for Interactive Beam PCollection visualization are not available,
please use: `pip install apache-beam[interactive]` to install necessary
dependencies to enable all data visualization features.\n"
+ ]
+ },
+ {
+ "output_type": "display_data",
+ "data": {
+ "application/javascript": [
+ "\n",
+ " if (typeof window.interactive_beam_jquery ==
'undefined') {\n",
+ " var jqueryScript =
document.createElement('script');\n",
+ " jqueryScript.src =
'https://code.jquery.com/jquery-3.4.1.slim.min.js';\n",
+ " jqueryScript.type = 'text/javascript';\n",
+ " jqueryScript.onload = function() {\n",
+ " var datatableScript =
document.createElement('script');\n",
+ " datatableScript.src =
'https://cdn.datatables.net/1.10.20/js/jquery.dataTables.min.js';\n",
+ " datatableScript.type = 'text/javascript';\n",
+ " datatableScript.onload = function() {\n",
+ " window.interactive_beam_jquery =
jQuery.noConflict(true);\n",
+ "
window.interactive_beam_jquery(document).ready(function($){\n",
+ " \n",
+ " });\n",
+ " }\n",
+ " document.head.appendChild(datatableScript);\n",
+ " };\n",
+ " document.head.appendChild(jqueryScript);\n",
+ " } else {\n",
+ "
window.interactive_beam_jquery(document).ready(function($){\n",
+ " \n",
+ " });\n",
+ " }"
+ ]
+ },
+ "metadata": {}
+ },
+ {
+ "output_type": "stream",
+ "name": "stderr",
+ "text": [
+ "/usr/local/lib/python3.7/dist-packages/dill/_dill.py:472:
FutureWarning: PytorchModelHandlerKeyedTensor is experimental. No
backwards-compatibility guarantees.\n",
+ " obj = StockUnpickler.load(self)\n",
+ "/usr/local/lib/python3.7/dist-packages/dill/_dill.py:472:
FutureWarning: PytorchModelHandlerKeyedTensor is experimental. No
backwards-compatibility guarantees.\n",
+ " obj = StockUnpickler.load(self)\n"
+ ]
+ },
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "Input: This is the worst food I have ever eaten ->
negative=99.9777%/positive=0.0223%\n",
+ "Input: In my soul and in my heart, I’m convinced I’m wrong! ->
negative=1.6313%/positive=98.3687%\n",
+ "Input: Be with me always—take any form—drive me mad! only do not
leave me in this abyss, where I cannot find you! ->
negative=62.1188%/positive=37.8812%\n",
+ "Input: Do I want to live? Would you like to live with your soul
in the grave? -> negative=73.6841%/positive=26.3159%\n",
+ "Input: Honest people don’t hide their deeds. ->
negative=0.2377%/positive=99.7623%\n",
+ "Input: Nelly, I am Heathcliff! He’s always, always in my mind:
not as a pleasure, any more than I am always a pleasure to myself, but as my
own being. -> negative=0.0672%/positive=99.9328%\n"
+ ]
+ }
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "### RunInference with a TensorFlow Model\n"
+ ],
+ "metadata": {
+ "id": "7KXeaQg3eCcp"
+ }
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "Note: TensorFlow models are supported through tfx-bsl."
+ ],
+ "metadata": {
+ "id": "hEHxNka4eOhC"
+ }
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "#### Install Dependencies"
+ ],
+ "metadata": {
+ "id": "8KyXULYbeYlD"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "!pip install --upgrade pip\n",
+ "!pip install apache_beam[gcp]==2.41.0\n",
+ "!pip install tensorflow==2.8\n",
+ "!pip install tfx_bsl\n",
+ "!pip install tensorflow-text==2.8.1"
+ ],
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 1000
+ },
+ "id": "uqWJhQBlc4oT",
+ "outputId": "2a17a966-fe2d-45d8-b6b9-02534f40c9a8"
+ },
+ "execution_count": null,
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "Looking in indexes: https://pypi.org/simple,
https://us-python.pkg.dev/colab-wheels/public/simple/\n",
+ "Requirement already satisfied: pip in
/usr/local/lib/python3.7/dist-packages (22.3)\n",
+ "\u001b[33mWARNING: Running pip as the 'root' user can result in
broken permissions and conflicting behaviour with the system package manager.
It is recommended to use a virtual environment instead:
https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n",
+ "\u001b[0mLooking in indexes: https://pypi.org/simple,
https://us-python.pkg.dev/colab-wheels/public/simple/\n",
+ "Collecting apache_beam[gcp]==2.41.0\n",
+ " Downloading
apache_beam-2.41.0-cp37-cp37m-manylinux2010_x86_64.whl (10.9 MB)\n",
+ "\u001b[2K
\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m10.9/10.9
MB\u001b[0m \u001b[31m42.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+ "\u001b[?25hRequirement already satisfied: proto-plus<2,>=1.7.1 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(1.22.1)\n",
+ "Requirement already satisfied: pydot<2,>=1.2.0 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(1.3.0)\n",
+ "Requirement already satisfied: numpy<1.23.0,>=1.14.3 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(1.21.6)\n",
+ "Requirement already satisfied: pyarrow<8.0.0,>=0.15.1 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(6.0.1)\n",
+ "Requirement already satisfied: fastavro<2,>=0.23.6 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(1.6.1)\n",
+ "Requirement already satisfied: hdfs<3.0.0,>=2.1.0 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(2.7.0)\n",
+ "Requirement already satisfied: dill<0.3.2,>=0.3.1.1 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(0.3.1.1)\n",
+ "Requirement already satisfied: requests<3.0.0,>=2.24.0 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(2.28.1)\n",
+ "Requirement already satisfied: python-dateutil<3,>=2.8.0 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(2.8.2)\n",
+ "Requirement already satisfied: pytz>=2018.3 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(2022.4)\n",
+ "Requirement already satisfied: crcmod<2.0,>=1.7 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0) (1.7)\n",
+ "Requirement already satisfied: protobuf<4,>=3.12.2 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(3.20.3)\n",
+ "Requirement already satisfied: cloudpickle<3,>=2.1.0 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(2.1.0)\n",
+ "Requirement already satisfied: orjson<4.0 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(3.8.0)\n",
+ "Requirement already satisfied: pymongo<4.0.0,>=3.8.0 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(3.12.3)\n",
+ "Requirement already satisfied: grpcio<2,>=1.33.1 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(1.49.1)\n",
+ "Requirement already satisfied: typing-extensions>=3.7.0 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(4.1.1)\n",
+ "Requirement already satisfied: httplib2<0.21.0,>=0.8 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(0.17.4)\n",
+ "Requirement already satisfied: google-cloud-language<2,>=1.3.0 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(1.3.2)\n",
+ "Requirement already satisfied: google-cloud-pubsub<3,>=2.1.0 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(2.13.10)\n",
+ "Requirement already satisfied: google-apitools<0.5.32,>=0.5.31 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(0.5.31)\n",
+ "Requirement already satisfied:
google-cloud-recommendations-ai<0.8.0,>=0.1.0 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(0.7.1)\n",
+ "Requirement already satisfied: cachetools<5,>=3.1.0 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(4.2.4)\n",
+ "Requirement already satisfied: google-cloud-bigtable<2,>=0.31.1
in /usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(1.7.2)\n",
+ "Requirement already satisfied: google-cloud-dlp<4,>=3.0.0 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(3.9.2)\n",
+ "Requirement already satisfied: google-auth-httplib2<0.2.0,>=0.1.0
in /usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(0.1.0)\n",
+ "Requirement already satisfied: google-cloud-datastore<2,>=1.8.0
in /usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(1.8.0)\n",
+ "Requirement already satisfied: google-cloud-spanner<2,>=1.13.0 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(1.19.3)\n",
+ "Requirement already satisfied:
google-cloud-bigquery-storage<2.14,>=2.6.3 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(2.13.2)\n",
+ "Requirement already satisfied: google-cloud-vision<2,>=0.38.0 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(1.0.2)\n",
+ "Requirement already satisfied: google-cloud-core<3,>=0.28.1 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(1.7.3)\n",
+ "Requirement already satisfied:
google-cloud-videointelligence<2,>=1.8.0 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(1.16.3)\n",
+ "Requirement already satisfied: grpcio-gcp<1,>=0.2.2 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(0.2.2)\n",
+ "Requirement already satisfied: google-cloud-pubsublite<2,>=1.2.0
in /usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(1.6.0)\n",
+ "Requirement already satisfied: google-auth<3,>=1.18.0 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(1.35.0)\n",
+ "Requirement already satisfied: google-cloud-bigquery<3,>=1.6.0 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(1.21.0)\n",
+ "Requirement already satisfied: google-api-core!=2.8.2,<3 in
/usr/local/lib/python3.7/dist-packages (from apache_beam[gcp]==2.41.0)
(1.32.0)\n",
+ "Requirement already satisfied: six>=1.13.0 in
/usr/local/lib/python3.7/dist-packages (from
google-api-core!=2.8.2,<3->apache_beam[gcp]==2.41.0) (1.15.0)\n",
+ "Requirement already satisfied: packaging>=14.3 in
/usr/local/lib/python3.7/dist-packages (from
google-api-core!=2.8.2,<3->apache_beam[gcp]==2.41.0) (21.3)\n",
+ "Requirement already satisfied:
googleapis-common-protos<2.0dev,>=1.6.0 in
/usr/local/lib/python3.7/dist-packages (from
google-api-core!=2.8.2,<3->apache_beam[gcp]==2.41.0) (1.56.4)\n",
+ "Requirement already satisfied: setuptools>=40.3.0 in
/usr/local/lib/python3.7/dist-packages (from
google-api-core!=2.8.2,<3->apache_beam[gcp]==2.41.0) (57.4.0)\n",
+ "Requirement already satisfied: fasteners>=0.14 in
/usr/local/lib/python3.7/dist-packages (from
google-apitools<0.5.32,>=0.5.31->apache_beam[gcp]==2.41.0) (0.18)\n",
+ "Requirement already satisfied: oauth2client>=1.4.12 in
/usr/local/lib/python3.7/dist-packages (from
google-apitools<0.5.32,>=0.5.31->apache_beam[gcp]==2.41.0) (4.1.3)\n",
+ "Requirement already satisfied: rsa<5,>=3.1.4 in
/usr/local/lib/python3.7/dist-packages (from
google-auth<3,>=1.18.0->apache_beam[gcp]==2.41.0) (4.9)\n",
+ "Requirement already satisfied: pyasn1-modules>=0.2.1 in
/usr/local/lib/python3.7/dist-packages (from
google-auth<3,>=1.18.0->apache_beam[gcp]==2.41.0) (0.2.8)\n",
+ "Requirement already satisfied:
google-resumable-media!=0.4.0,<0.5.0dev,>=0.3.1 in
/usr/local/lib/python3.7/dist-packages (from
google-cloud-bigquery<3,>=1.6.0->apache_beam[gcp]==2.41.0) (0.4.1)\n",
+ "Requirement already satisfied:
grpc-google-iam-v1<0.13dev,>=0.12.3 in /usr/local/lib/python3.7/dist-packages
(from google-cloud-bigtable<2,>=0.31.1->apache_beam[gcp]==2.41.0) (0.12.4)\n",
+ "Requirement already satisfied: grpcio-status>=1.16.0 in
/usr/local/lib/python3.7/dist-packages (from
google-cloud-pubsub<3,>=2.1.0->apache_beam[gcp]==2.41.0) (1.48.2)\n",
+ "Requirement already satisfied: overrides<7.0.0,>=6.0.1 in
/usr/local/lib/python3.7/dist-packages (from
google-cloud-pubsublite<2,>=1.2.0->apache_beam[gcp]==2.41.0) (6.5.0)\n",
+ "Requirement already satisfied: docopt in
/usr/local/lib/python3.7/dist-packages (from
hdfs<3.0.0,>=2.1.0->apache_beam[gcp]==2.41.0) (0.6.2)\n",
+ "Requirement already satisfied: pyparsing>=2.1.4 in
/usr/local/lib/python3.7/dist-packages (from
pydot<2,>=1.2.0->apache_beam[gcp]==2.41.0) (3.0.9)\n",
+ "Requirement already satisfied: urllib3<1.27,>=1.21.1 in
/usr/local/lib/python3.7/dist-packages (from
requests<3.0.0,>=2.24.0->apache_beam[gcp]==2.41.0) (1.24.3)\n",
+ "Requirement already satisfied: charset-normalizer<3,>=2 in
/usr/local/lib/python3.7/dist-packages (from
requests<3.0.0,>=2.24.0->apache_beam[gcp]==2.41.0) (2.1.1)\n",
+ "Requirement already satisfied: idna<4,>=2.5 in
/usr/local/lib/python3.7/dist-packages (from
requests<3.0.0,>=2.24.0->apache_beam[gcp]==2.41.0) (2.10)\n",
+ "Requirement already satisfied: certifi>=2017.4.17 in
/usr/local/lib/python3.7/dist-packages (from
requests<3.0.0,>=2.24.0->apache_beam[gcp]==2.41.0) (2022.9.24)\n",
+ "Requirement already satisfied: pyasn1>=0.1.7 in
/usr/local/lib/python3.7/dist-packages (from
oauth2client>=1.4.12->google-apitools<0.5.32,>=0.5.31->apache_beam[gcp]==2.41.0)
(0.4.8)\n",
+ "Installing collected packages: apache_beam\n",
+ " Attempting uninstall: apache_beam\n",
+ " Found existing installation: apache-beam 2.42.0\n",
+ " Uninstalling apache-beam-2.42.0:\n",
+ " Successfully uninstalled apache-beam-2.42.0\n",
+ "Successfully installed apache_beam-2.41.0\n",
+ "\u001b[33mWARNING: Running pip as the 'root' user can result in
broken permissions and conflicting behaviour with the system package manager.
It is recommended to use a virtual environment instead:
https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n",
+ "\u001b[0m"
+ ]
+ },
+ {
+ "output_type": "display_data",
+ "data": {
+ "application/vnd.colab-display-data+json": {
+ "pip_warning": {
+ "packages": [
+ "apache_beam"
+ ]
+ }
+ }
+ },
+ "metadata": {}
+ },
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "Looking in indexes: https://pypi.org/simple,
https://us-python.pkg.dev/colab-wheels/public/simple/\n",
+ "Collecting tensorflow==2.8\n",
+ " Downloading
https://us-python.pkg.dev/colab-wheels/public/tensorflow/tensorflow-2.8.0%2Bzzzcolab20220506162203-cp37-cp37m-linux_x86_64.whl
(668.3 MB)\n",
+ "\u001b[2K
\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m
\u001b[32m668.3/668.3 MB\u001b[0m \u001b[31m2.1 MB/s\u001b[0m eta
\u001b[36m0:00:00\u001b[0m\n",
+ "\u001b[?25hRequirement already satisfied: libclang>=9.0.1 in
/usr/local/lib/python3.7/dist-packages (from tensorflow==2.8) (14.0.6)\n",
+ "Requirement already satisfied: h5py>=2.9.0 in
/usr/local/lib/python3.7/dist-packages (from tensorflow==2.8) (3.1.0)\n",
+ "Requirement already satisfied: google-pasta>=0.1.1 in
/usr/local/lib/python3.7/dist-packages (from tensorflow==2.8) (0.2.0)\n",
+ "Collecting keras<2.9,>=2.8.0rc0\n",
+ " Downloading keras-2.8.0-py2.py3-none-any.whl (1.4 MB)\n",
+ "\u001b[2K
\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.4/1.4
MB\u001b[0m \u001b[31m17.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+ "\u001b[?25hRequirement already satisfied: flatbuffers>=1.12 in
/usr/local/lib/python3.7/dist-packages (from tensorflow==2.8) (1.12)\n",
+ "Requirement already satisfied: gast>=0.2.1 in
/usr/local/lib/python3.7/dist-packages (from tensorflow==2.8) (0.4.0)\n",
+ "Requirement already satisfied: opt-einsum>=2.3.2 in
/usr/local/lib/python3.7/dist-packages (from tensorflow==2.8) (3.3.0)\n",
+ "Requirement already satisfied: six>=1.12.0 in
/usr/local/lib/python3.7/dist-packages (from tensorflow==2.8) (1.15.0)\n",
+ "Requirement already satisfied: grpcio<2.0,>=1.24.3 in
/usr/local/lib/python3.7/dist-packages (from tensorflow==2.8) (1.49.1)\n",
+ "Requirement already satisfied: typing-extensions>=3.6.6 in
/usr/local/lib/python3.7/dist-packages (from tensorflow==2.8) (4.1.1)\n",
+ "Requirement already satisfied:
tensorflow-io-gcs-filesystem>=0.23.1 in /usr/local/lib/python3.7/dist-packages
(from tensorflow==2.8) (0.27.0)\n",
+ "Requirement already satisfied: setuptools in
/usr/local/lib/python3.7/dist-packages (from tensorflow==2.8) (57.4.0)\n",
+ "Requirement already satisfied: keras-preprocessing>=1.1.1 in
/usr/local/lib/python3.7/dist-packages (from tensorflow==2.8) (1.1.2)\n",
+ "Requirement already satisfied: absl-py>=0.4.0 in
/usr/local/lib/python3.7/dist-packages (from tensorflow==2.8) (1.3.0)\n",
+ "Requirement already satisfied: protobuf>=3.9.2 in
/usr/local/lib/python3.7/dist-packages (from tensorflow==2.8) (3.20.3)\n",
+ "Requirement already satisfied: astunparse>=1.6.0 in
/usr/local/lib/python3.7/dist-packages (from tensorflow==2.8) (1.6.3)\n",
+ "Requirement already satisfied: termcolor>=1.1.0 in
/usr/local/lib/python3.7/dist-packages (from tensorflow==2.8) (2.0.1)\n",
+ "Collecting tf-estimator-nightly==2.8.0.dev2021122109\n",
+ " Downloading
tf_estimator_nightly-2.8.0.dev2021122109-py2.py3-none-any.whl (462 kB)\n",
+ "\u001b[2K
\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m462.5/462.5
kB\u001b[0m \u001b[31m19.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+ "\u001b[?25hCollecting tensorboard<2.9,>=2.8\n",
+ " Downloading tensorboard-2.8.0-py3-none-any.whl (5.8 MB)\n",
+ "\u001b[2K
\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m5.8/5.8
MB\u001b[0m \u001b[31m62.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+ "\u001b[?25hRequirement already satisfied: wrapt>=1.11.0 in
/usr/local/lib/python3.7/dist-packages (from tensorflow==2.8) (1.14.1)\n",
+ "Requirement already satisfied: numpy>=1.20 in
/usr/local/lib/python3.7/dist-packages (from tensorflow==2.8) (1.21.6)\n",
+ "Requirement already satisfied: wheel<1.0,>=0.23.0 in
/usr/local/lib/python3.7/dist-packages (from
astunparse>=1.6.0->tensorflow==2.8) (0.37.1)\n",
+ "Requirement already satisfied: cached-property in
/usr/local/lib/python3.7/dist-packages (from h5py>=2.9.0->tensorflow==2.8)
(1.5.2)\n",
+ "Requirement already satisfied: markdown>=2.6.8 in
/usr/local/lib/python3.7/dist-packages (from
tensorboard<2.9,>=2.8->tensorflow==2.8) (3.4.1)\n",
+ "Requirement already satisfied: requests<3,>=2.21.0 in
/usr/local/lib/python3.7/dist-packages (from
tensorboard<2.9,>=2.8->tensorflow==2.8) (2.28.1)\n",
+ "Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1
in /usr/local/lib/python3.7/dist-packages (from
tensorboard<2.9,>=2.8->tensorflow==2.8) (0.4.6)\n",
+ "Requirement already satisfied: werkzeug>=0.11.15 in
/usr/local/lib/python3.7/dist-packages (from
tensorboard<2.9,>=2.8->tensorflow==2.8) (1.0.1)\n",
+ "Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in
/usr/local/lib/python3.7/dist-packages (from
tensorboard<2.9,>=2.8->tensorflow==2.8) (1.8.1)\n",
+ "Requirement already satisfied: google-auth<3,>=1.6.3 in
/usr/local/lib/python3.7/dist-packages (from
tensorboard<2.9,>=2.8->tensorflow==2.8) (1.35.0)\n",
+ "Requirement already satisfied:
tensorboard-data-server<0.7.0,>=0.6.0 in /usr/local/lib/python3.7/dist-packages
(from tensorboard<2.9,>=2.8->tensorflow==2.8) (0.6.1)\n",
+ "Requirement already satisfied: cachetools<5.0,>=2.0.0 in
/usr/local/lib/python3.7/dist-packages (from
google-auth<3,>=1.6.3->tensorboard<2.9,>=2.8->tensorflow==2.8) (4.2.4)\n",
+ "Requirement already satisfied: pyasn1-modules>=0.2.1 in
/usr/local/lib/python3.7/dist-packages (from
google-auth<3,>=1.6.3->tensorboard<2.9,>=2.8->tensorflow==2.8) (0.2.8)\n",
+ "Requirement already satisfied: rsa<5,>=3.1.4 in
/usr/local/lib/python3.7/dist-packages (from
google-auth<3,>=1.6.3->tensorboard<2.9,>=2.8->tensorflow==2.8) (4.9)\n",
+ "Requirement already satisfied: requests-oauthlib>=0.7.0 in
/usr/local/lib/python3.7/dist-packages (from
google-auth-oauthlib<0.5,>=0.4.1->tensorboard<2.9,>=2.8->tensorflow==2.8)
(1.3.1)\n",
+ "Requirement already satisfied: importlib-metadata>=4.4 in
/usr/local/lib/python3.7/dist-packages (from
markdown>=2.6.8->tensorboard<2.9,>=2.8->tensorflow==2.8) (4.13.0)\n",
+ "Requirement already satisfied: idna<4,>=2.5 in
/usr/local/lib/python3.7/dist-packages (from
requests<3,>=2.21.0->tensorboard<2.9,>=2.8->tensorflow==2.8) (2.10)\n",
+ "Requirement already satisfied: charset-normalizer<3,>=2 in
/usr/local/lib/python3.7/dist-packages (from
requests<3,>=2.21.0->tensorboard<2.9,>=2.8->tensorflow==2.8) (2.1.1)\n",
+ "Requirement already satisfied: certifi>=2017.4.17 in
/usr/local/lib/python3.7/dist-packages (from
requests<3,>=2.21.0->tensorboard<2.9,>=2.8->tensorflow==2.8) (2022.9.24)\n",
+ "Requirement already satisfied: urllib3<1.27,>=1.21.1 in
/usr/local/lib/python3.7/dist-packages (from
requests<3,>=2.21.0->tensorboard<2.9,>=2.8->tensorflow==2.8) (1.24.3)\n",
+ "Requirement already satisfied: zipp>=0.5 in
/usr/local/lib/python3.7/dist-packages (from
importlib-metadata>=4.4->markdown>=2.6.8->tensorboard<2.9,>=2.8->tensorflow==2.8)
(3.9.0)\n",
+ "Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in
/usr/local/lib/python3.7/dist-packages (from
pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard<2.9,>=2.8->tensorflow==2.8)
(0.4.8)\n",
+ "Requirement already satisfied: oauthlib>=3.0.0 in
/usr/local/lib/python3.7/dist-packages (from
requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard<2.9,>=2.8->tensorflow==2.8)
(3.2.1)\n",
+ "Installing collected packages: tf-estimator-nightly, keras,
tensorboard, tensorflow\n",
+ " Attempting uninstall: keras\n",
+ " Found existing installation: keras 2.9.0\n",
+ " Uninstalling keras-2.9.0:\n",
+ " Successfully uninstalled keras-2.9.0\n",
+ " Attempting uninstall: tensorboard\n",
+ " Found existing installation: tensorboard 2.9.1\n",
+ " Uninstalling tensorboard-2.9.1:\n",
+ " Successfully uninstalled tensorboard-2.9.1\n",
+ " Attempting uninstall: tensorflow\n",
+ " Found existing installation: tensorflow 2.9.2\n",
+ " Uninstalling tensorflow-2.9.2:\n",
+ " Successfully uninstalled tensorflow-2.9.2\n",
+ "Successfully installed keras-2.8.0 tensorboard-2.8.0
tensorflow-2.8.0+zzzcolab20220506162203
tf-estimator-nightly-2.8.0.dev2021122109\n",
+ "\u001b[33mWARNING: Running pip as the 'root' user can result in
broken permissions and conflicting behaviour with the system package manager.
It is recommended to use a virtual environment instead:
https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n",
+ "\u001b[0mLooking in indexes: https://pypi.org/simple,
https://us-python.pkg.dev/colab-wheels/public/simple/\n",
+ "Collecting tfx_bsl\n",
+ " Downloading
tfx_bsl-1.10.1-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (21.6
MB)\n",
+ "\u001b[2K
\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m21.6/21.6
MB\u001b[0m \u001b[31m49.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+ "\u001b[?25hRequirement already satisfied:
tensorflow-metadata<1.11.0,>=1.10.0 in /usr/local/lib/python3.7/dist-packages
(from tfx_bsl) (1.10.0)\n",
+ "Collecting
tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5\n",
+ " Downloading
tensorflow-2.10.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
(578.0 MB)\n",
+ "\u001b[2K
\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m
\u001b[32m578.0/578.0 MB\u001b[0m \u001b[31m2.4 MB/s\u001b[0m eta
\u001b[36m0:00:00\u001b[0m\n",
+ "\u001b[?25hRequirement already satisfied:
google-api-python-client<2,>=1.7.11 in /usr/local/lib/python3.7/dist-packages
(from tfx_bsl) (1.12.11)\n",
+ "Collecting
tensorflow-serving-api!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15\n",
+ " Downloading tensorflow_serving_api-2.10.0-py2.py3-none-any.whl
(37 kB)\n",
+ "Requirement already satisfied: numpy<2,>=1.16 in
/usr/local/lib/python3.7/dist-packages (from tfx_bsl) (1.21.6)\n",
+ "Requirement already satisfied: apache-beam[gcp]<3,>=2.40 in
/usr/local/lib/python3.7/dist-packages (from tfx_bsl) (2.41.0)\n",
+ "Requirement already satisfied: absl-py<2.0.0,>=0.9 in
/usr/local/lib/python3.7/dist-packages (from tfx_bsl) (1.3.0)\n",
+ "Requirement already satisfied: protobuf<3.21,>=3.13 in
/usr/local/lib/python3.7/dist-packages (from tfx_bsl) (3.20.3)\n",
+ "Requirement already satisfied: pyarrow<7,>=6 in
/usr/local/lib/python3.7/dist-packages (from tfx_bsl) (6.0.1)\n",
+ "Requirement already satisfied: pandas<2,>=1.0 in
/usr/local/lib/python3.7/dist-packages (from tfx_bsl) (1.3.5)\n",
+ "Requirement already satisfied: requests<3.0.0,>=2.24.0 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (2.28.1)\n",
+ "Requirement already satisfied: dill<0.3.2,>=0.3.1.1 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (0.3.1.1)\n",
+ "Requirement already satisfied: pymongo<4.0.0,>=3.8.0 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (3.12.3)\n",
+ "Requirement already satisfied: cloudpickle<3,>=2.1.0 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (2.1.0)\n",
+ "Requirement already satisfied: fastavro<2,>=0.23.6 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (1.6.1)\n",
+ "Requirement already satisfied: pydot<2,>=1.2.0 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (1.3.0)\n",
+ "Requirement already satisfied: pytz>=2018.3 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (2022.4)\n",
+ "Requirement already satisfied: grpcio<2,>=1.33.1 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (1.49.1)\n",
+ "Requirement already satisfied: crcmod<2.0,>=1.7 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (1.7)\n",
+ "Requirement already satisfied: httplib2<0.21.0,>=0.8 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (0.17.4)\n",
+ "Requirement already satisfied: hdfs<3.0.0,>=2.1.0 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (2.7.0)\n",
+ "Requirement already satisfied: proto-plus<2,>=1.7.1 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (1.22.1)\n",
+ "Requirement already satisfied: typing-extensions>=3.7.0 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (4.1.1)\n",
+ "Requirement already satisfied: orjson<4.0 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (3.8.0)\n",
+ "Requirement already satisfied: python-dateutil<3,>=2.8.0 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (2.8.2)\n",
+ "Requirement already satisfied: cachetools<5,>=3.1.0 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (4.2.4)\n",
+ "Requirement already satisfied: google-cloud-spanner<2,>=1.13.0 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (1.19.3)\n",
+ "Requirement already satisfied: grpcio-gcp<1,>=0.2.2 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (0.2.2)\n",
+ "Requirement already satisfied:
google-cloud-videointelligence<2,>=1.8.0 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (1.16.3)\n",
+ "Requirement already satisfied: google-cloud-language<2,>=1.3.0 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (1.3.2)\n",
+ "Requirement already satisfied: google-cloud-pubsub<3,>=2.1.0 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (2.13.10)\n",
+ "Requirement already satisfied: google-cloud-core<3,>=0.28.1 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (1.7.3)\n",
+ "Requirement already satisfied: google-cloud-dlp<4,>=3.0.0 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (3.9.2)\n",
+ "Requirement already satisfied: google-auth<3,>=1.18.0 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (1.35.0)\n",
+ "Requirement already satisfied: google-auth-httplib2<0.2.0,>=0.1.0
in /usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (0.1.0)\n",
+ "Requirement already satisfied: google-cloud-bigtable<2,>=0.31.1
in /usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (1.7.2)\n",
+ "Requirement already satisfied:
google-cloud-bigquery-storage<2.14,>=2.6.3 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (2.13.2)\n",
+ "Requirement already satisfied: google-api-core!=2.8.2,<3 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (1.32.0)\n",
+ "Requirement already satisfied: google-cloud-datastore<2,>=1.8.0
in /usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (1.8.0)\n",
+ "Requirement already satisfied:
google-cloud-recommendations-ai<0.8.0,>=0.1.0 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (0.7.1)\n",
+ "Requirement already satisfied: google-apitools<0.5.32,>=0.5.31 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (0.5.31)\n",
+ "Requirement already satisfied: google-cloud-vision<2,>=0.38.0 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (1.0.2)\n",
+ "Requirement already satisfied: google-cloud-bigquery<3,>=1.6.0 in
/usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (1.21.0)\n",
+ "Requirement already satisfied: google-cloud-pubsublite<2,>=1.2.0
in /usr/local/lib/python3.7/dist-packages (from
apache-beam[gcp]<3,>=2.40->tfx_bsl) (1.6.0)\n",
+ "Requirement already satisfied: six<2dev,>=1.13.0 in
/usr/local/lib/python3.7/dist-packages (from
google-api-python-client<2,>=1.7.11->tfx_bsl) (1.15.0)\n",
+ "Requirement already satisfied: uritemplate<4dev,>=3.0.0 in
/usr/local/lib/python3.7/dist-packages (from
google-api-python-client<2,>=1.7.11->tfx_bsl) (3.0.1)\n",
+ "Collecting tensorflow-estimator<2.11,>=2.10.0\n",
+ " Downloading tensorflow_estimator-2.10.0-py2.py3-none-any.whl
(438 kB)\n",
+ "\u001b[2K
\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m438.7/438.7
kB\u001b[0m \u001b[31m31.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+ "\u001b[?25hRequirement already satisfied: opt-einsum>=2.3.2 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(3.3.0)\n",
+ "Requirement already satisfied: h5py>=2.9.0 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(3.1.0)\n",
+ "Requirement already satisfied: packaging in
/usr/local/lib/python3.7/dist-packages (from
tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(21.3)\n",
+ "Requirement already satisfied: gast<=0.4.0,>=0.2.1 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(0.4.0)\n",
+ "Requirement already satisfied: astunparse>=1.6.0 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(1.6.3)\n",
+ "Requirement already satisfied: termcolor>=1.1.0 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(2.0.1)\n",
+ "Requirement already satisfied: setuptools in
/usr/local/lib/python3.7/dist-packages (from
tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(57.4.0)\n",
+ "Collecting protobuf<3.21,>=3.13\n",
+ " Downloading
protobuf-3.19.6-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1
MB)\n",
+ "\u001b[2K
\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.1/1.1
MB\u001b[0m \u001b[31m24.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+ "\u001b[?25hRequirement already satisfied: google-pasta>=0.1.1 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(0.2.0)\n",
+ "Requirement already satisfied: wrapt>=1.11.0 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(1.14.1)\n",
+ "Requirement already satisfied:
tensorflow-io-gcs-filesystem>=0.23.1 in /usr/local/lib/python3.7/dist-packages
(from
tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(0.27.0)\n",
+ "Collecting flatbuffers>=2.0\n",
+ " Downloading flatbuffers-22.9.24-py2.py3-none-any.whl (26 kB)\n",
+ "Collecting tensorboard<2.11,>=2.10\n",
+ " Downloading tensorboard-2.10.1-py3-none-any.whl (5.9 MB)\n",
+ "\u001b[2K
\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m5.9/5.9
MB\u001b[0m \u001b[31m58.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+ "\u001b[?25hCollecting keras<2.11,>=2.10.0\n",
+ " Downloading keras-2.10.0-py2.py3-none-any.whl (1.7 MB)\n",
+ "\u001b[2K
\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.7/1.7
MB\u001b[0m \u001b[31m38.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+ "\u001b[?25hRequirement already satisfied: libclang>=13.0.0 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(14.0.6)\n",
+ "Requirement already satisfied: keras-preprocessing>=1.1.1 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(1.1.2)\n",
+ "Requirement already satisfied:
googleapis-common-protos<2,>=1.52.0 in /usr/local/lib/python3.7/dist-packages
(from tensorflow-metadata<1.11.0,>=1.10.0->tfx_bsl) (1.56.4)\n",
+ "Requirement already satisfied: wheel<1.0,>=0.23.0 in
/usr/local/lib/python3.7/dist-packages (from
astunparse>=1.6.0->tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(0.37.1)\n",
+ "Requirement already satisfied: fasteners>=0.14 in
/usr/local/lib/python3.7/dist-packages (from
google-apitools<0.5.32,>=0.5.31->apache-beam[gcp]<3,>=2.40->tfx_bsl) (0.18)\n",
+ "Requirement already satisfied: oauth2client>=1.4.12 in
/usr/local/lib/python3.7/dist-packages (from
google-apitools<0.5.32,>=0.5.31->apache-beam[gcp]<3,>=2.40->tfx_bsl) (4.1.3)\n",
+ "Requirement already satisfied: pyasn1-modules>=0.2.1 in
/usr/local/lib/python3.7/dist-packages (from
google-auth<3,>=1.18.0->apache-beam[gcp]<3,>=2.40->tfx_bsl) (0.2.8)\n",
+ "Requirement already satisfied: rsa<5,>=3.1.4 in
/usr/local/lib/python3.7/dist-packages (from
google-auth<3,>=1.18.0->apache-beam[gcp]<3,>=2.40->tfx_bsl) (4.9)\n",
+ "Requirement already satisfied:
google-resumable-media!=0.4.0,<0.5.0dev,>=0.3.1 in
/usr/local/lib/python3.7/dist-packages (from
google-cloud-bigquery<3,>=1.6.0->apache-beam[gcp]<3,>=2.40->tfx_bsl) (0.4.1)\n",
+ "Requirement already satisfied:
grpc-google-iam-v1<0.13dev,>=0.12.3 in /usr/local/lib/python3.7/dist-packages
(from google-cloud-bigtable<2,>=0.31.1->apache-beam[gcp]<3,>=2.40->tfx_bsl)
(0.12.4)\n",
+ "Requirement already satisfied: grpcio-status>=1.16.0 in
/usr/local/lib/python3.7/dist-packages (from
google-cloud-pubsub<3,>=2.1.0->apache-beam[gcp]<3,>=2.40->tfx_bsl) (1.48.2)\n",
+ "Requirement already satisfied: overrides<7.0.0,>=6.0.1 in
/usr/local/lib/python3.7/dist-packages (from
google-cloud-pubsublite<2,>=1.2.0->apache-beam[gcp]<3,>=2.40->tfx_bsl)
(6.5.0)\n",
+ "Requirement already satisfied: cached-property in
/usr/local/lib/python3.7/dist-packages (from
h5py>=2.9.0->tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(1.5.2)\n",
+ "Requirement already satisfied: docopt in
/usr/local/lib/python3.7/dist-packages (from
hdfs<3.0.0,>=2.1.0->apache-beam[gcp]<3,>=2.40->tfx_bsl) (0.6.2)\n",
+ "Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in
/usr/local/lib/python3.7/dist-packages (from
packaging->tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(3.0.9)\n",
+ "Requirement already satisfied: certifi>=2017.4.17 in
/usr/local/lib/python3.7/dist-packages (from
requests<3.0.0,>=2.24.0->apache-beam[gcp]<3,>=2.40->tfx_bsl) (2022.9.24)\n",
+ "Requirement already satisfied: charset-normalizer<3,>=2 in
/usr/local/lib/python3.7/dist-packages (from
requests<3.0.0,>=2.24.0->apache-beam[gcp]<3,>=2.40->tfx_bsl) (2.1.1)\n",
+ "Requirement already satisfied: idna<4,>=2.5 in
/usr/local/lib/python3.7/dist-packages (from
requests<3.0.0,>=2.24.0->apache-beam[gcp]<3,>=2.40->tfx_bsl) (2.10)\n",
+ "Requirement already satisfied: urllib3<1.27,>=1.21.1 in
/usr/local/lib/python3.7/dist-packages (from
requests<3.0.0,>=2.24.0->apache-beam[gcp]<3,>=2.40->tfx_bsl) (1.24.3)\n",
+ "Requirement already satisfied: werkzeug>=1.0.1 in
/usr/local/lib/python3.7/dist-packages (from
tensorboard<2.11,>=2.10->tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(1.0.1)\n",
+ "Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1
in /usr/local/lib/python3.7/dist-packages (from
tensorboard<2.11,>=2.10->tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(0.4.6)\n",
+ "Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in
/usr/local/lib/python3.7/dist-packages (from
tensorboard<2.11,>=2.10->tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(1.8.1)\n",
+ "Requirement already satisfied:
tensorboard-data-server<0.7.0,>=0.6.0 in /usr/local/lib/python3.7/dist-packages
(from
tensorboard<2.11,>=2.10->tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(0.6.1)\n",
+ "Requirement already satisfied: markdown>=2.6.8 in
/usr/local/lib/python3.7/dist-packages (from
tensorboard<2.11,>=2.10->tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(3.4.1)\n",
+ "Requirement already satisfied: requests-oauthlib>=0.7.0 in
/usr/local/lib/python3.7/dist-packages (from
google-auth-oauthlib<0.5,>=0.4.1->tensorboard<2.11,>=2.10->tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(1.3.1)\n",
+ "Requirement already satisfied: importlib-metadata>=4.4 in
/usr/local/lib/python3.7/dist-packages (from
markdown>=2.6.8->tensorboard<2.11,>=2.10->tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(4.13.0)\n",
+ "Requirement already satisfied: pyasn1>=0.1.7 in
/usr/local/lib/python3.7/dist-packages (from
oauth2client>=1.4.12->google-apitools<0.5.32,>=0.5.31->apache-beam[gcp]<3,>=2.40->tfx_bsl)
(0.4.8)\n",
+ "Requirement already satisfied: zipp>=0.5 in
/usr/local/lib/python3.7/dist-packages (from
importlib-metadata>=4.4->markdown>=2.6.8->tensorboard<2.11,>=2.10->tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(3.9.0)\n",
+ "Requirement already satisfied: oauthlib>=3.0.0 in
/usr/local/lib/python3.7/dist-packages (from
requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard<2.11,>=2.10->tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5->tfx_bsl)
(3.2.1)\n",
+ "Installing collected packages: keras, flatbuffers,
tensorflow-estimator, protobuf, tensorboard, tensorflow,
tensorflow-serving-api, tfx_bsl\n",
+ " Attempting uninstall: keras\n",
+ " Found existing installation: keras 2.8.0\n",
+ " Uninstalling keras-2.8.0:\n",
+ " Successfully uninstalled keras-2.8.0\n",
+ " Attempting uninstall: flatbuffers\n",
+ " Found existing installation: flatbuffers 1.12\n",
+ " Uninstalling flatbuffers-1.12:\n",
+ " Successfully uninstalled flatbuffers-1.12\n",
+ " Attempting uninstall: tensorflow-estimator\n",
+ " Found existing installation: tensorflow-estimator 2.9.0\n",
+ " Uninstalling tensorflow-estimator-2.9.0:\n",
+ " Successfully uninstalled tensorflow-estimator-2.9.0\n",
+ " Attempting uninstall: protobuf\n",
+ " Found existing installation: protobuf 3.20.3\n",
+ " Uninstalling protobuf-3.20.3:\n",
+ " Successfully uninstalled protobuf-3.20.3\n",
+ " Attempting uninstall: tensorboard\n",
+ " Found existing installation: tensorboard 2.8.0\n",
+ " Uninstalling tensorboard-2.8.0:\n",
+ " Successfully uninstalled tensorboard-2.8.0\n",
+ " Attempting uninstall: tensorflow\n",
+ " Found existing installation: tensorflow
2.8.0+zzzcolab20220506162203\n",
+ " Uninstalling tensorflow-2.8.0+zzzcolab20220506162203:\n",
+ " Successfully uninstalled
tensorflow-2.8.0+zzzcolab20220506162203\n",
+ "Successfully installed flatbuffers-22.9.24 keras-2.10.0
protobuf-3.19.6 tensorboard-2.10.1 tensorflow-2.10.0
tensorflow-estimator-2.10.0 tensorflow-serving-api-2.10.0 tfx_bsl-1.10.1\n",
+ "\u001b[33mWARNING: Running pip as the 'root' user can result in
broken permissions and conflicting behaviour with the system package manager.
It is recommended to use a virtual environment instead:
https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n",
+ "\u001b[0m"
+ ]
+ },
+ {
+ "output_type": "display_data",
+ "data": {
+ "application/vnd.colab-display-data+json": {
+ "pip_warning": {
+ "packages": [
+ "google"
+ ]
+ }
+ }
+ },
+ "metadata": {}
+ },
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "Looking in indexes: https://pypi.org/simple,
https://us-python.pkg.dev/colab-wheels/public/simple/\n",
+ "Collecting tensorflow-text==2.8.1\n",
+ " Downloading
tensorflow_text-2.8.1-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
(4.9 MB)\n",
+ "\u001b[2K
\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m4.9/4.9
MB\u001b[0m \u001b[31m39.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+ "\u001b[?25hRequirement already satisfied: tensorflow-hub>=0.8.0
in /usr/local/lib/python3.7/dist-packages (from tensorflow-text==2.8.1)
(0.12.0)\n",
+ "Collecting tensorflow<2.9,>=2.8.0\n",
+ " Downloading
tensorflow-2.8.3-cp37-cp37m-manylinux2010_x86_64.whl (497.9 MB)\n",
+ "\u001b[2K
\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m
\u001b[32m497.9/497.9 MB\u001b[0m \u001b[31m2.7 MB/s\u001b[0m eta
\u001b[36m0:00:00\u001b[0m\n",
+ "\u001b[?25hRequirement already satisfied: termcolor>=1.1.0 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1) (2.0.1)\n",
+ "Requirement already satisfied: protobuf<3.20,>=3.9.2 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1) (3.19.6)\n",
+ "Requirement already satisfied: flatbuffers>=1.12 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1) (22.9.24)\n",
+ "Requirement already satisfied: typing-extensions>=3.6.6 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1) (4.1.1)\n",
+ "Requirement already satisfied: setuptools in
/usr/local/lib/python3.7/dist-packages (from
tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1) (57.4.0)\n",
+ "Requirement already satisfied: opt-einsum>=2.3.2 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1) (3.3.0)\n",
+ "Requirement already satisfied: grpcio<2.0,>=1.24.3 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1) (1.49.1)\n",
+ "Requirement already satisfied: absl-py>=0.4.0 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1) (1.3.0)\n",
+ "Collecting tensorboard<2.9,>=2.8\n",
+ " Using cached tensorboard-2.8.0-py3-none-any.whl (5.8 MB)\n",
+ "Collecting tensorflow-estimator<2.9,>=2.8\n",
+ " Downloading tensorflow_estimator-2.8.0-py2.py3-none-any.whl
(462 kB)\n",
+ "\u001b[2K
\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m462.3/462.3
kB\u001b[0m \u001b[31m25.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+ "\u001b[?25hRequirement already satisfied: numpy>=1.20 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1) (1.21.6)\n",
+ "Requirement already satisfied:
tensorflow-io-gcs-filesystem>=0.23.1 in /usr/local/lib/python3.7/dist-packages
(from tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1) (0.27.0)\n",
+ "Requirement already satisfied: libclang>=9.0.1 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1) (14.0.6)\n",
+ "Requirement already satisfied: wrapt>=1.11.0 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1) (1.14.1)\n",
+ "Requirement already satisfied: google-pasta>=0.1.1 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1) (0.2.0)\n",
+ "Collecting keras<2.9,>=2.8.0rc0\n",
+ " Using cached keras-2.8.0-py2.py3-none-any.whl (1.4 MB)\n",
+ "Requirement already satisfied: gast>=0.2.1 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1) (0.4.0)\n",
+ "Requirement already satisfied: h5py>=2.9.0 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1) (3.1.0)\n",
+ "Requirement already satisfied: six>=1.12.0 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1) (1.15.0)\n",
+ "Requirement already satisfied: keras-preprocessing>=1.1.1 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1) (1.1.2)\n",
+ "Requirement already satisfied: astunparse>=1.6.0 in
/usr/local/lib/python3.7/dist-packages (from
tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1) (1.6.3)\n",
+ "Requirement already satisfied: wheel<1.0,>=0.23.0 in
/usr/local/lib/python3.7/dist-packages (from
astunparse>=1.6.0->tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1) (0.37.1)\n",
+ "Requirement already satisfied: cached-property in
/usr/local/lib/python3.7/dist-packages (from
h5py>=2.9.0->tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1) (1.5.2)\n",
+ "Requirement already satisfied: markdown>=2.6.8 in
/usr/local/lib/python3.7/dist-packages (from
tensorboard<2.9,>=2.8->tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1)
(3.4.1)\n",
+ "Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1
in /usr/local/lib/python3.7/dist-packages (from
tensorboard<2.9,>=2.8->tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1)
(0.4.6)\n",
+ "Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in
/usr/local/lib/python3.7/dist-packages (from
tensorboard<2.9,>=2.8->tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1)
(1.8.1)\n",
+ "Requirement already satisfied: requests<3,>=2.21.0 in
/usr/local/lib/python3.7/dist-packages (from
tensorboard<2.9,>=2.8->tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1)
(2.28.1)\n",
+ "Requirement already satisfied: google-auth<3,>=1.6.3 in
/usr/local/lib/python3.7/dist-packages (from
tensorboard<2.9,>=2.8->tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1)
(1.35.0)\n",
+ "Requirement already satisfied: werkzeug>=0.11.15 in
/usr/local/lib/python3.7/dist-packages (from
tensorboard<2.9,>=2.8->tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1)
(1.0.1)\n",
+ "Requirement already satisfied:
tensorboard-data-server<0.7.0,>=0.6.0 in /usr/local/lib/python3.7/dist-packages
(from tensorboard<2.9,>=2.8->tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1)
(0.6.1)\n",
+ "Requirement already satisfied: rsa<5,>=3.1.4 in
/usr/local/lib/python3.7/dist-packages (from
google-auth<3,>=1.6.3->tensorboard<2.9,>=2.8->tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1)
(4.9)\n",
+ "Requirement already satisfied: pyasn1-modules>=0.2.1 in
/usr/local/lib/python3.7/dist-packages (from
google-auth<3,>=1.6.3->tensorboard<2.9,>=2.8->tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1)
(0.2.8)\n",
+ "Requirement already satisfied: cachetools<5.0,>=2.0.0 in
/usr/local/lib/python3.7/dist-packages (from
google-auth<3,>=1.6.3->tensorboard<2.9,>=2.8->tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1)
(4.2.4)\n",
+ "Requirement already satisfied: requests-oauthlib>=0.7.0 in
/usr/local/lib/python3.7/dist-packages (from
google-auth-oauthlib<0.5,>=0.4.1->tensorboard<2.9,>=2.8->tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1)
(1.3.1)\n",
+ "Requirement already satisfied: importlib-metadata>=4.4 in
/usr/local/lib/python3.7/dist-packages (from
markdown>=2.6.8->tensorboard<2.9,>=2.8->tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1)
(4.13.0)\n",
+ "Requirement already satisfied: certifi>=2017.4.17 in
/usr/local/lib/python3.7/dist-packages (from
requests<3,>=2.21.0->tensorboard<2.9,>=2.8->tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1)
(2022.9.24)\n",
+ "Requirement already satisfied: charset-normalizer<3,>=2 in
/usr/local/lib/python3.7/dist-packages (from
requests<3,>=2.21.0->tensorboard<2.9,>=2.8->tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1)
(2.1.1)\n",
+ "Requirement already satisfied: idna<4,>=2.5 in
/usr/local/lib/python3.7/dist-packages (from
requests<3,>=2.21.0->tensorboard<2.9,>=2.8->tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1)
(2.10)\n",
+ "Requirement already satisfied: urllib3<1.27,>=1.21.1 in
/usr/local/lib/python3.7/dist-packages (from
requests<3,>=2.21.0->tensorboard<2.9,>=2.8->tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1)
(1.24.3)\n",
+ "Requirement already satisfied: zipp>=0.5 in
/usr/local/lib/python3.7/dist-packages (from
importlib-metadata>=4.4->markdown>=2.6.8->tensorboard<2.9,>=2.8->tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1)
(3.9.0)\n",
+ "Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in
/usr/local/lib/python3.7/dist-packages (from
pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard<2.9,>=2.8->tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1)
(0.4.8)\n",
+ "Requirement already satisfied: oauthlib>=3.0.0 in
/usr/local/lib/python3.7/dist-packages (from
requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard<2.9,>=2.8->tensorflow<2.9,>=2.8.0->tensorflow-text==2.8.1)
(3.2.1)\n",
+ "Installing collected packages: tensorflow-estimator, keras,
tensorboard, tensorflow, tensorflow-text\n",
+ " Attempting uninstall: tensorflow-estimator\n",
+ " Found existing installation: tensorflow-estimator 2.10.0\n",
+ " Uninstalling tensorflow-estimator-2.10.0:\n",
+ " Successfully uninstalled tensorflow-estimator-2.10.0\n",
+ " Attempting uninstall: keras\n",
+ " Found existing installation: keras 2.10.0\n",
+ " Uninstalling keras-2.10.0:\n",
+ " Successfully uninstalled keras-2.10.0\n",
+ " Attempting uninstall: tensorboard\n",
+ " Found existing installation: tensorboard 2.10.1\n",
+ " Uninstalling tensorboard-2.10.1:\n",
+ " Successfully uninstalled tensorboard-2.10.1\n",
+ " Attempting uninstall: tensorflow\n",
+ " Found existing installation: tensorflow 2.10.0\n",
+ " Uninstalling tensorflow-2.10.0:\n",
+ " Successfully uninstalled tensorflow-2.10.0\n",
+ "\u001b[31mERROR: pip's dependency resolver does not currently
take into account all the packages that are installed. This behaviour is the
source of the following dependency conflicts.\n",
+ "tfx-bsl 1.10.1 requires
tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<3,>=1.15.5,
but you have tensorflow 2.8.3 which is incompatible.\n",
+ "tensorflow-serving-api 2.10.0 requires tensorflow<3,>=2.10.0, but
you have tensorflow 2.8.3 which is incompatible.\u001b[0m\u001b[31m\n",
+ "\u001b[0mSuccessfully installed keras-2.8.0 tensorboard-2.8.0
tensorflow-2.8.3 tensorflow-estimator-2.8.0 tensorflow-text-2.8.1\n",
+ "\u001b[33mWARNING: Running pip as the 'root' user can result in
broken permissions and conflicting behaviour with the system package manager.
It is recommended to use a virtual environment instead:
https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n",
+ "\u001b[0m"
+ ]
+ }
+ ]
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "import numpy as np\n",
+ "import tensorflow as tf\n",
+ "import tensorflow_text as text\n",
+ "from scipy.special import expit\n",
+ "\n",
+ "import apache_beam as beam\n",
+ "import tfx_bsl\n",
+ "from tfx_bsl.public.beam import RunInference\n",
+ "from tfx_bsl.public import tfxio\n",
+ "from tfx_bsl.public.proto import model_spec_pb2\n",
+ "from tfx_bsl.public.tfxio import TFExampleRecord\n",
+ "from tensorflow_serving.apis import prediction_log_pb2"
+ ],
+ "metadata": {
+ "id": "642maF_redwC"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "#### Model"
+ ],
+ "metadata": {
+ "id": "h2JP7zsqerCT"
+ }
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "Download a pretrained binary classifier from GCS that performs sentiment
analysis on the IMDB dataset. This model was trained by following this
[tutorial](https://www.tensorflow.org/tutorials/keras/text_classification)."
+ ],
+ "metadata": {
+ "id": "ydYQ_5EyfeEM"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "model_dir = \"gs://apache-beam-testing-ml-examples/imdb_bert\""
+ ],
+ "metadata": {
+ "id": "BucRWly0flz8"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "#### Helper Functions"
+ ],
+ "metadata": {
+ "id": "GZ-Ioc8ZfyIT"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "class ExampleProcessor:\n",
+ " \"\"\"\n",
+ " Process the raw text input to a format suitable for
RunInference.\n",
+ " TensorFlow model handler expects a serialized tf.Example as
input\n",
+ " \"\"\"\n",
+ " def create_example(self, feature):\n",
+ " return tf.train.Example(\n",
+ " features=tf.train.Features(\n",
+ " feature={'x' : self.create_feature(feature)})\n",
+ " )\n",
+ "\n",
+ " def create_feature(self, element):\n",
+ " return
tf.train.Feature(bytes_list=tf.train.BytesList(value=[element]))\n",
+ "\n",
+ "class PredictionProcessor(beam.DoFn):\n",
+ " \"\"\"\n",
+ " Process the RunInference output to return the input text and the
sigmoid probability\n",
+ " \"\"\"\n",
+ " def process(\n",
+ " self,\n",
+ " element: prediction_log_pb2.PredictionLog):\n",
+ " predict_log = element.predict_log\n",
+ " input_value =
tf.train.Example.FromString(predict_log.request.inputs['text'].string_val[0])\n",
+ " output_value = predict_log.response.outputs\n",
+ " # print(output_value)\n",
+ " yield (f\"input is
[{input_value.features.feature['x'].bytes_list.value}] output is
{expit(output_value['classifier'].float_val)}\")"
+ ],
+ "metadata": {
+ "id": "pZ0LNtHUfsRq"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "#### Prepare the Input"
+ ],
+ "metadata": {
+ "id": "PZVwI4BbgaAI"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "inputs = np.array([\n",
+ " b\"this is such an amazing movie\",\n",
+ " b\"The movie was great\",\n",
+ " b\"The movie was okish\",\n",
+ " b\"The movie was terrible\"\n",
+ "])"
+ ],
+ "metadata": {
+ "id": "TOXX1KMKi_mm"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "input_strings_file = 'input_strings.tfrecord'\n",
+ "\n",
+ "# Preprocess the input, because RunInference expects a serialized
tf.Example as input.\n",
+ "# Write the processed input to a file.\n",
+ "# This can also be done as a pipeline step by using beam.Map().\n",
+ "\n",
+ "with tf.io.TFRecordWriter(input_strings_file) as writer:\n",
+ " for i in inputs:\n",
+ " example = ExampleProcessor().create_example(feature=i)\n",
+ " writer.write(example.SerializeToString())"
+ ],
+ "metadata": {
+ "id": "O2Y15WmfgZXQ"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "#### RunInference Pipeline"
+ ],
+ "metadata": {
+ "id": "BYkQl_l8gRgo"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "saved_model_spec =
model_spec_pb2.SavedModelSpec(model_path=model_dir)\n",
+ "inference_spec_type =
model_spec_pb2.InferenceSpecType(saved_model_spec=saved_model_spec)\n",
+ "\n",
+ "# A Beam IO that reads a file of serialized tf.Examples\n",
+ "tfexample_beam_record =
TFExampleRecord(file_pattern='input_strings.tfrecord')\n",
+ "\n",
+ "with beam.Pipeline() as pipeline:\n",
+ " _ = ( pipeline | \"Create Input PCollection\" >>
tfexample_beam_record.RawRecordBeamSource()\n",
+ " | \"Do Inference\" >>
RunInference(inference_spec_type)\n",
+ " | \"Post Process\" >>
beam.ParDo(PredictionProcessor())\n",
+ " | beam.Map(print)\n",
+ " )\n"
+ ],
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "uh5bMhxdgA7Q",
+ "outputId": "2a22059f-519c-44f7-e36f-59e09b1cb24a"
+ },
+ "execution_count": null,
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stderr",
+ "text": [
+ "WARNING:tensorflow:From
/usr/local/lib/python3.7/dist-packages/tfx_bsl/beam/run_inference.py:615: load
(from tensorflow.python.saved_model.loader_impl) is deprecated and will be
removed in a future version.\n",
+ "Instructions for updating:\n",
+ "This function will only be available through the v1 compatibility
library as tf.compat.v1.saved_model.loader.load or
tf.compat.v1.saved_model.load. There will be a new function for importing
SavedModels in Tensorflow 2.0.\n",
+ "WARNING:apache_beam.io.tfrecordio:Couldn't find python-snappy so
the implementation of _TFRecordUtil._masked_crc32c is not as fast as it could
be.\n"
+ ]
+ },
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "input is [[b'this is such an amazing movie']] output is
[0.99906057]\n",
+ "input is [[b'The movie was great']] output is [0.99307914]\n",
+ "input is [[b'The movie was okish']] output is [0.03274685]\n",
+ "input is [[b'The movie was terrible']] output is [0.00680008]\n"
+ ]
+ }
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "### RunInference with Scikit-Learn\n"
+ ],
+ "metadata": {
+ "id": "8wBUckzHjGV6"
+ }
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "#### Install Dependency"
+ ],
+ "metadata": {
+ "id": "6ArL_55kjxkO"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "!pip install --upgrade pip\n",
+ "!pip install apache_beam[gcp]==2.41.0"
+ ],
+ "metadata": {
+ "id": "R4p6Mil0jxSy"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "import pickle\n",
+ "\n",
+ "import apache_beam as beam\n",
+ "from apache_beam.ml.inference import RunInference\n",
+ "from apache_beam.ml.inference.sklearn_inference import
SklearnModelHandlerNumpy, ModelFileType"
+ ],
+ "metadata": {
+ "id": "_YtRRxh1hLag"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "#### Model\n",
+ "\n",
+ "Train and save a sentiment analysis pipeline that classifies movie
reviews as either positive or negative."
+ ],
+ "metadata": {
+ "id": "-7ABKlZvkFHy"
+ }
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "Train model based on this
[tutorial](https://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html#exercise-2-sentiment-analysis-on-movie-reviews)"
Review Comment:
Sorry I missed this earlier. Could we do the same thing we do above with
TensorFlow: just give a link to a GCS bucket and say "This model was trained
by following this
[tutorial](https://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html#exercise-2-sentiment-analysis-on-movie-reviews)"?