Hi,

I am trying to run the sample WordCount Beam job with the PortableRunner by
following the documentation here:

https://beam.apache.org/documentation/runners/spark/

I want to run this as a spark-submit command with YARN as the resource manager.

Can you please let me know what is missing here? Thanks for your help.


I ran the command below and got the error that follows:

spark-submit --master yarn --deploy-mode client --driver-memory 2g \
  --executor-memory 1g --executor-cores 1 WordCount.py --input "XXXX" \
  --output "XXXX" --runner PortableRunner --job_endpoint localhost:8099

  File "/usr/local/lib64/python3.7/site-packages/apache_beam/pipeline.py", line 503, in __exit__
    self.run().wait_until_finish()
  File "/usr/local/lib64/python3.7/site-packages/apache_beam/pipeline.py", line 483, in run
    self._options).run(False)
  File "/usr/local/lib64/python3.7/site-packages/apache_beam/pipeline.py", line 496, in run
    return self.runner.run_pipeline(self, self._options)
  File "/usr/local/lib64/python3.7/site-packages/apache_beam/runners/portability/portable_runner.py", line 384, in run_pipeline
    job_service_plan.submit(proto_pipeline)
  File "/usr/local/lib64/python3.7/site-packages/apache_beam/runners/portability/portable_runner.py", line 105, in submit
    prepare_response.staging_session_token)
  File "/usr/local/lib64/python3.7/site-packages/apache_beam/runners/portability/portable_runner.py", line 190, in stage
    staging_location='')
  File "/usr/local/lib64/python3.7/site-packages/apache_beam/runners/portability/stager.py", line 229, in stage_job_resources
    self.stage_artifact(pickled_session_file, staged_path)
  File "/usr/local/lib64/python3.7/site-packages/apache_beam/runners/portability/portable_stager.py", line 98, in stage_artifact
    self._artifact_staging_stub.PutArtifact(artifact_request_generator())
  File "/usr/local/lib64/python3.7/site-packages/grpc/_channel.py", line 1011, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/usr/local/lib64/python3.7/site-packages/grpc/_channel.py", line 729, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
            status = StatusCode.UNIMPLEMENTED
            details = "Method not found: org.apache.beam.model.job_management.v1.ArtifactStagingService/PutArtifact"
            debug_error_string = "{"created":"@1589406258.175447016","description":"Error received from peer ipv6:[::1]:8098","file":"src/core/lib/surface/call.cc","file_line":1056,"grpc_message":"Method not found: org.apache.beam.model.job_management.v1.ArtifactStagingService/PutArtifact","grpc_status":12}"
