[jira] [Commented] (BEAM-3645) Support multi-process execution on the FnApiRunner
[ https://issues.apache.org/jira/browse/BEAM-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924585#comment-16924585 ] Yifan Mai commented on BEAM-3645: - While testing this I noticed that the multi-process runner does not handle SIGINT gracefully. To reproduce, run wordcount.py using the "Run with multiprocessing mode" instructions from the comment above (in Python 3). Expected: wordcount terminates gracefully when Ctrl-C is pressed during pipeline execution (similarly to default direct runner) Actual: wordcount hangs forever after printing the following once per worker: {code} Exception in thread run_worker: Traceback (most recent call last): File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner self.run() File "/usr/lib/python3.6/threading.py", line 864, in run self._target(*self._args, **self._kwargs) File "/usr/local/google/home/yifanmai/venv/wordcount/lib/python3.6/site-packages/apache_beam/runners/portability/local_job_service.py", line 216, in run 'Worker subprocess exited with return code %s' % p.returncode) RuntimeError: Worker subprocess exited with return code 1 {code} > Support multi-process execution on the FnApiRunner > -- > > Key: BEAM-3645 > URL: https://issues.apache.org/jira/browse/BEAM-3645 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Affects Versions: 2.2.0, 2.3.0 >Reporter: Charles Chen >Assignee: Hannah Jiang >Priority: Major > Fix For: 2.15.0 > > Time Spent: 35h 20m > Remaining Estimate: 0h > > https://issues.apache.org/jira/browse/BEAM-3644 gave us a 15x performance > gain over the previous DirectRunner. We can do even better in multi-core > environments by supporting multi-process execution in the FnApiRunner, to > scale past Python GIL limitations. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (BEAM-3645) Support multi-process execution on the FnApiRunner
[ https://issues.apache.org/jira/browse/BEAM-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896575#comment-16896575 ] Hannah Jiang commented on BEAM-3645: Previously, shuffle data, which we call buffer as well with FnApi, is processed by a single worker. Now, the shuffle data, except WindowGroupingBuffer, is partitioned to N chunks and each chunk is processed by a worker, thus processing performance is improved. Depending on running environment, partitioned shuffles can be multi-thread or multi-process processed. Fn Api provides following running environment. 1. EmbeddedWorkerHandler 2. EmbeddedGrpcWorkerHandler 3. SubprocessSdkWorkerHandler 4. DockerSdkWorkerHandler 5. ExternalWorkerHandler Workers running with SubprocessSdkWorkerHandler are running at processes, so if we want to multi-process shuffle data, we should use this environment. Workers running with other worker handlers are running at a single process. {code:java} --direct_num_workers {code} option is used to define number of partitions of shuffle, default value is 1. > Support multi-process execution on the FnApiRunner > -- > > Key: BEAM-3645 > URL: https://issues.apache.org/jira/browse/BEAM-3645 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Affects Versions: 2.2.0, 2.3.0 >Reporter: Charles Chen >Assignee: Hannah Jiang >Priority: Major > Fix For: 2.15.0 > > Time Spent: 35h > Remaining Estimate: 0h > > https://issues.apache.org/jira/browse/BEAM-3644 gave us a 15x performance > gain over the previous DirectRunner. We can do even better in multi-core > environments by supporting multi-process execution in the FnApiRunner, to > scale past Python GIL limitations. -- This message was sent by Atlassian JIRA (v7.6.14#76016)