charlespnh commented on issue #35788:
URL: https://github.com/apache/beam/issues/35788#issuecomment-3156531328
The weird thing is that if I point `model_artifact_path` to the local path of
the model (`./knn_model.pkl`) instead of the GCS path, it works. The error
`INFO:apache_beam.utils.subprocess_server:NotImplementedError: Method not implemented!`
occurs in both cases, though.
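One way to narrow this down might be to check whether the pickle can be read from each location outside the pipeline. The sketch below is a debugging aid under stated assumptions, not what the KNN transform does internally: `is_remote` and `load_model` are hypothetical names, and the remote branch relies on Beam's `FileSystems.open`, which needs the GCP extras installed to resolve `gs://` paths.

```python
import pickle
from urllib.parse import urlparse


def is_remote(path: str) -> bool:
    """Return True when the path carries a URL scheme such as gs://."""
    return "://" in path and bool(urlparse(path).scheme)


def load_model(path: str):
    """Hypothetical helper: unpickle a model from a local or remote path."""
    if is_remote(path):
        # Imported lazily so the local branch works without Beam installed.
        from apache_beam.io.filesystems import FileSystems

        handle = FileSystems.open(path)
        try:
            return pickle.load(handle)
        finally:
            handle.close()
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

Calling `load_model("./knn_model.pkl")` and then the same function with the GCS path would show whether the remote read fails on its own. If it does, the problem is more likely access or the path than the transform.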
Here is what the log looks like when I use the local path to the model
instead:
```
INFO:root:Missing pipeline option (runner). Executing pipeline using the
default runner: DirectRunner.
Building pipeline...
WARNING:root:Could not load ML transform module
apache_beam.ml.transforms.embeddings.open_ai: No module named 'openai'. Please
install the necessary module dependencies
WARNING:root:Could not load ML transform module
apache_beam.ml.transforms.embeddings.tensorflow_hub: No module named
'tensorflow'. Please install the necessary module dependencies
INFO:apache_beam.yaml.yaml_transform:Expanding "ReadFromBigQuery" at line 23
WARNING:apache_beam.transforms.core:('No iterator is returned by the process
method in %s.', <class
'apache_beam.io.gcp.bigquery_read_internal._PassThroughThenCleanupTempDatasets.expand.<locals>.CleanUpProjects'>)
INFO:apache_beam.yaml.yaml_transform:Expanding "KNN" at line 31
INFO:apache_beam.utils.subprocess_server:Starting service with
['/Users/charlesnguyen/.apache_beam/cache/venvs/5316be40511ca369dcddc7cc8c711c62d053c3cdfda7fc4168fbb4232ba67964/bin/python'
'-m' 'apache_beam.runners.portability.expansion_service_main' '--port' '62974'
'--fully_qualified_name_glob=*' '--pickle_library=cloudpickle'
'--requirements_file=/Users/charlesnguyen/.apache_beam/cache/venvs/5316be40511ca369dcddc7cc8c711c62d053c3cdfda7fc4168fbb4232ba67964-requirements.txt'
'--serve_loopback_worker']
INFO:apache_beam.utils.subprocess_server:INFO:apache_beam.runners.portability.stager:Executing
command:
['/Users/charlesnguyen/.apache_beam/cache/venvs/5316be40511ca369dcddc7cc8c711c62d053c3cdfda7fc4168fbb4232ba67964/bin/python',
'-m', 'pip', 'download', '--dest',
'/var/folders/c9/3ylgx85x553b8451_158qxpr0000gn/T/dataflow-requirements-cache',
'-r',
'/var/folders/c9/3ylgx85x553b8451_158qxpr0000gn/T/tmpp_2wzrpt/tmp_requirements.txt',
'--exists-action', 'i', '--no-deps', '--implementation', 'cp', '--abi',
'cp311', '--platform', 'manylinux2014_x86_64']
INFO:apache_beam.utils.subprocess_server:INFO:apache_beam.runners.portability.stager:Executing
command:
['/Users/charlesnguyen/.apache_beam/cache/venvs/5316be40511ca369dcddc7cc8c711c62d053c3cdfda7fc4168fbb4232ba67964/bin/python',
'-m', 'pip', 'download', '--dest',
'/var/folders/c9/3ylgx85x553b8451_158qxpr0000gn/T/dataflow-requirements-cache',
'-r',
'/var/folders/c9/3ylgx85x553b8451_158qxpr0000gn/T/tmpa4ugtyer/tmp_requirements.txt',
'--exists-action', 'i', '--no-deps', '--implementation', 'cp', '--abi',
'cp311', '--platform', 'manylinux2014_x86_64']
INFO:apache_beam.utils.subprocess_server:INFO:__main__:Listening for
expansion requests at 62974
INFO:apache_beam.utils.subprocess_server:ERROR:grpc._server:Exception
calling application: Method not implemented!
INFO:apache_beam.utils.subprocess_server:Traceback (most recent call last):
INFO:apache_beam.utils.subprocess_server: File
"/Users/charlesnguyen/.apache_beam/cache/venvs/5316be40511ca369dcddc7cc8c711c62d053c3cdfda7fc4168fbb4232ba67964/lib/python3.11/site-packages/grpc/_server.py",
line 610, in _call_behavior
INFO:apache_beam.utils.subprocess_server: response_or_iterator =
behavior(argument, context)
INFO:apache_beam.utils.subprocess_server:
^^^^^^^^^^^^^^^^^^^^^^^^^^^
INFO:apache_beam.utils.subprocess_server: File
"/Users/charlesnguyen/Code/beam/sdks/python/apache_beam/portability/api/org/apache/beam/model/job_management/v1/beam_expansion_api_pb2_grpc.py",
line 46, in DiscoverSchemaTransform
INFO:apache_beam.utils.subprocess_server: raise
NotImplementedError('Method not implemented!')
INFO:apache_beam.utils.subprocess_server:NotImplementedError: Method not
implemented!
INFO:apache_beam.utils.subprocess_server:INFO:root:Missing pipeline option
(runner). Executing pipeline using the default runner: DirectRunner.
INFO:apache_beam.yaml.yaml_transform:Expanding "LogForTesting" at line 37
Running pipeline...
INFO:apache_beam.runners.worker.statecache:Creating state cache with size
104857600
INFO:apache_beam.runners.portability.fn_api_runner.worker_handlers:starting
control server on port 62992
INFO:apache_beam.runners.portability.fn_api_runner.worker_handlers:starting
data server on port 62993
INFO:apache_beam.runners.portability.fn_api_runner.worker_handlers:starting
state server on port 62994
INFO:apache_beam.runners.portability.fn_api_runner.worker_handlers:starting
logging server on port 62995
INFO:apache_beam.runners.portability.fn_api_runner.worker_handlers:Requesting
worker at 0.0.0.0:62974
INFO:apache_beam.runners.portability.fn_api_runner.worker_handlers:self.control_address:
localhost:62992
INFO:apache_beam.utils.subprocess_server:INFO:apache_beam.runners.worker.statecache:Creating
state cache with size 0
INFO:apache_beam.utils.subprocess_server:INFO:apache_beam.runners.worker.sdk_worker:Creating
insecure control channel for localhost:62992.
INFO:apache_beam.utils.subprocess_server:INFO:apache_beam.runners.worker.sdk_worker:Control
channel established.
INFO:apache_beam.utils.subprocess_server:INFO:apache_beam.runners.worker.sdk_worker:Initializing
SDKHarness with unbounded number of workers.
WARNING: All log messages before absl::InitializeLog() is called are written
to STDERR
I0000 00:00:1754425496.285480 8399329 fork_posix.cc:77] Other threads are
currently calling into gRPC, skipping fork() handlers
I0000 00:00:1754425496.575693 8399329 check_gcp_environment_no_op.cc:29]
ALTS: Platforms other than Linux and Windows are not supported
INFO:apache_beam.io.gcp.bigquery:Sent BigQuery Storage API CreateReadSession
request:
data_format: AVRO
table: "projects/apache-beam-testing/datasets/charlesnguyen/tables/test"
read_options {
selected_fields: "embedding"
row_restriction: "id = 5"
}
Received 1 streams
data_format: 1
estimated_total_bytes_scanned: 6178480
Avro Schema:schema: "{\n \"type\": \"record\",\n \"name\":
\"__root__\",\n \"fields\": [\n {\n \"name\":
\"embedding\",\n \"type\": {\n \"type\": \"array\",\n
\"items\": \"double\"\n }\n }\n ]\n}"
.
INFO:apache_beam.io.gcp.bigquery:Started BigQuery Storage API read from
stream
projects/apache-beam-testing/locations/us-central1/sessions/CAISDDRMYU9NVC0zYjJYaRoCaWYaAmpx/streams/GgJpZhoCanEoAg.
I0000 00:00:1754425502.589118 8399329 fork_posix.cc:77] Other threads are
currently calling into gRPC, skipping fork() handlers
I0000 00:00:1754425502.830988 8399329 check_gcp_environment_no_op.cc:29]
ALTS: Platforms other than Linux and Windows are not supported
INFO:apache_beam.utils.subprocess_server:INFO:apache_beam.runners.worker.sdk_worker:Creating
insecure state channel for localhost:62994.
INFO:apache_beam.utils.subprocess_server:INFO:apache_beam.runners.worker.sdk_worker:State
channel established.
INFO:apache_beam.utils.subprocess_server:INFO:apache_beam.runners.worker.data_plane:Creating
client data channel for localhost:62993
INFO:apache_beam.utils.subprocess_server:INFO:root:BatchElements statistics:
element_count=1 batch_count=1 next_batch_size=1 timings=[]
INFO:root:{"example": [0.038988106768286936, 0.025528610286658572,
-0.10666340067582115, 0.029036934163589195, -0.005526015184928034,
0.055632029892425765, 0.04430381301332253, ...
```
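In the log above, lines relayed from the expansion service subprocess carry an outer `INFO:apache_beam.utils.subprocess_server:` prefix, so an inner `ERROR:grpc._server:` record shows up at INFO level. The following is a minimal sketch for recovering the inner level when reading such logs. The `unwrap` helper and its regular expression are assumptions for illustration, not part of Beam, and the sketch expects each record on a single line (the log as pasted here is hard-wrapped by email).

```python
import re

# Prefix added when the parent process relays subprocess output (as seen in the log).
OUTER = "INFO:apache_beam.utils.subprocess_server:"
# Default logging layout "LEVEL:logger:message".
INNER = re.compile(r"^(DEBUG|INFO|WARNING|ERROR|CRITICAL):([^:\s]+):(.*)$")


def unwrap(line: str):
    """Return (level, logger, message) for the innermost record on a line.

    level and logger are None when the text has no header of its own,
    which is the case for relayed traceback lines.
    """
    body = line[len(OUTER):] if line.startswith(OUTER) else line
    match = INNER.match(body)
    if match is None:
        return None, None, body.strip()
    level, logger, message = match.groups()
    return level, logger, message.strip()


if __name__ == "__main__":
    sample = OUTER + "ERROR:grpc._server:Exception calling application: Method not implemented!"
    print(unwrap(sample))
```

Filtering on the recovered level makes it easier to separate real subprocess errors, such as the `DiscoverSchemaTransform` traceback, from routine relayed INFO messages.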