[ 
https://issues.apache.org/jira/browse/BEAM-9249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Nazif updated BEAM-9249:
-----------------------------
    Summary: Python SDK DirectRunner with multiprocessing fails with 
grpc.RPCError: 'Received message larger than max'  (was: Python SDK 
DirectRunner with multiprocessing fails with grpc.RPCError@ 'Received message 
larger than max')

> Python SDK DirectRunner with multiprocessing fails with grpc.RPCError: 
> 'Received message larger than max'
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-9249
>                 URL: https://issues.apache.org/jira/browse/BEAM-9249
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-direct
>    Affects Versions: 2.19.0
>            Reporter: Enis Nazif
>            Priority: Major
>
> When running a pipeline using the Python SDK, DirectRunner, and a 
> direct_running_mode of 'multi_processing' or 'multi_threading', the job fails 
> with grpc.RPCError: 'Received message larger than max'.
>
> Reproducible example:
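> {code:python}
> import apache_beam as beam
> from apache_beam.options.pipeline_options import PipelineOptions
>
> options = PipelineOptions(direct_running_mode='multi_processing', num_direct_runners=4)
> # Applying a transform to the pipeline returns a PCollection, hence p.pipeline below.
> p = beam.Pipeline(options=options) | beam.Create([i for i in range(1000000)])
> p.pipeline.run()
> {code}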
>  
> Output:
> {code:java}
> Traceback (most recent call last):
>   File "/opt/man/releases/python-medusa/36-1/lib/python3.6/runpy.py", line 193, in _run_module_as_main
>     "__main__", mod_spec)
>   File "/opt/man/releases/python-medusa/36-1/lib/python3.6/runpy.py", line 85, in _run_code
>     exec(code, run_globals)
>   File "/users/is/enazif/beam/sdks/python/apache_beam/runners/worker/sdk_worker_main.py", line 249, in <module>
>     main(sys.argv)
>   File "/users/is/enazif/beam/sdks/python/apache_beam/runners/worker/sdk_worker_main.py", line 160, in main
>     sdk_pipeline_options.view_as(ProfilingOptions))
>   File "/users/is/enazif/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", line 137, in run
>     for work_request in control_stub.Control(get_responses()):
>   File "/users/is/enazif/pyenvs/04-02-2020-1/lib/python3.6/site-packages/grpcio-1.18.0+ahl1-py3.6-linux-x86_64.egg/grpc/_channel.py", line 364, in __next__
>     return self._next()
>   File "/users/is/enazif/pyenvs/04-02-2020-1/lib/python3.6/site-packages/grpcio-1.18.0+ahl1-py3.6-linux-x86_64.egg/grpc/_channel.py", line 358, in _next
>     raise self
> grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
>   status = StatusCode.RESOURCE_EXHAUSTED
>   details = "Received message larger than max (4806980 vs. 4194304)"
>   debug_error_string = "{"created":"@1580828874.271749579","description":"Received message larger than max (4806980 vs. 4194304)","file":"src/core/ext/filters/message_size/message_size_filter.cc","file_line":174,"grpc_status":8}"
> >
> Exception in thread run_worker:
> Traceback (most recent call last):
>   File "/opt/man/releases/python-medusa/36-1/lib/python3.6/threading.py", line 916, in _bootstrap_inner
>     self.run()
>   File "/opt/man/releases/python-medusa/36-1/lib/python3.6/threading.py", line 864, in run
>     self._target(*self._args, **self._kwargs)
>   File "/users/is/enazif/beam/sdks/python/apache_beam/runners/portability/local_job_service.py", line 218, in run
>     'Worker subprocess exited with return code %s' % p.returncode)
> RuntimeError: Worker subprocess exited with return code 1
> {code}
>  
> This appears to happen because grpc.max_receive_message_length and 
> grpc.max_send_message_length are set on the gRPC server side but not on the 
> channel (client) side 
> ([https://github.com/apache/beam/blob/7547ac6b273e6e2ffe7d69775606e14c0fd455b2/sdks/python/apache_beam/runners/worker/channel_factory.py#L28|https://github.com/apache/beam/blob/7547ac6b273e6e2ffe7d69775606e14c0fd455b2/sdks/python/apache_beam/runners/worker/channel_factory.py#L28]).
> Adding grpc.max_receive_message_length and grpc.max_send_message_length to 
> the default channel options there fixes the issue.
>  
>  
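The fix described above can be sketched roughly as follows. This is a minimal illustration, not Beam's actual channel_factory.py code: the helper names channel_options and open_channel are hypothetical, and only the two grpc.* option keys come from the report. It assumes -1 is used to mean "unlimited" for these gRPC size options, and that the 4194304-byte limit in the traceback is gRPC's default 4 MiB receive cap.

```python
# Hypothetical helpers; only the two "grpc.*" option keys come from the report.

# The limit reported in the traceback: 4 MiB.
DEFAULT_GRPC_MAX_MESSAGE_BYTES = 4 * 1024 * 1024

# Assumption: gRPC interprets -1 as "no limit" for the message-size options.
UNLIMITED = -1


def channel_options(max_message_bytes=UNLIMITED):
    """Return channel arguments that raise the message-size limits."""
    return [
        ("grpc.max_receive_message_length", max_message_bytes),
        ("grpc.max_send_message_length", max_message_bytes),
    ]


def open_channel(target, max_message_bytes=UNLIMITED):
    """Open an insecure client channel with the size limits applied."""
    import grpc  # imported lazily so channel_options() works without grpcio

    return grpc.insecure_channel(
        target, options=channel_options(max_message_bytes))
```

With something like open_channel('localhost:50051') on the worker side, the 4806980-byte message from the traceback would no longer be rejected by the client's receive limit, provided the server is configured with matching limits.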



--
This message was sent by Atlassian Jira
(v8.3.4#803005)