[
https://issues.apache.org/jira/browse/BEAM-3645?focusedWorklogId=255816&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-255816
]
ASF GitHub Bot logged work on BEAM-3645:
----------------------------------------
Author: ASF GitHub Bot
Created on: 07/Jun/19 11:13
Start Date: 07/Jun/19 11:13
Worklog Time Spent: 10m
Work Description: robertwb commented on pull request #8769: [WIP]
[BEAM-3645] support multi processes for Python FnApiRunner with
EmbeddedGrpcWorkerHandler
URL: https://github.com/apache/beam/pull/8769#discussion_r291543845
##########
File path: sdks/python/apache_beam/runners/portability/fn_api_runner.py
##########
@@ -1035,19 +1102,32 @@ def stop_worker(self):
 class EmbeddedGrpcWorkerHandler(GrpcWorkerHandler):
   def __init__(self, num_workers_payload, state, provision_info):
     super(EmbeddedGrpcWorkerHandler, self).__init__(state, provision_info)
-    self._num_threads = int(num_workers_payload) if num_workers_payload else 1
+    self._num_workers = int(num_workers_payload) \
+        if num_workers_payload else 1
+    self._worker_list = []
   def start_worker(self):
-    self.worker = sdk_worker.SdkHarness(
-        self.control_address, worker_count=self._num_threads)
-    self.worker_thread = threading.Thread(
-        name='run_worker', target=self.worker.run)
-    self.worker_thread.daemon = True
-    self.worker_thread.start()
+    _work_commend_line = b'%s -m apache_beam.runners.worker.sdk_worker_main' \
Review comment:
(Actually, the ability to manage multiple workers shouldn't be tied to a
particular type of worker: we should be able to handle multiple docker workers,
multiple in-process workers, multiple sub-process workers, etc. This suggests
a new class that delegates to a set of WorkerHandlers. Sharing a single
control and data service, rather than starting one for each worker, is an
optimization that can come later.)
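
A minimal sketch of the delegation idea in the review comment above, not the
code from the pull request. Only start_worker and stop_worker appear in the
diff; the WorkerHandler base class shown here, MultiWorkerHandler,
handler_factory, and RecordingHandler are hypothetical names introduced for
illustration. Each delegate is assumed to manage its own control and data
service, since sharing one is described as a later optimization.

```python
class WorkerHandler(object):
    """Stand-in for the runner's handler interface (assumed shape)."""

    def start_worker(self):
        raise NotImplementedError

    def stop_worker(self):
        raise NotImplementedError


class MultiWorkerHandler(WorkerHandler):
    """Hypothetical handler that delegates to a set of WorkerHandlers."""

    def __init__(self, handler_factory, num_workers):
        # handler_factory is a hypothetical zero-argument callable returning
        # a fresh handler of any kind (docker, in-process, sub-process, ...).
        self._handlers = [handler_factory() for _ in range(num_workers)]

    def start_worker(self):
        for handler in self._handlers:
            handler.start_worker()

    def stop_worker(self):
        # Try to stop every worker, then re-raise the first failure, if any.
        first_error = None
        for handler in self._handlers:
            try:
                handler.stop_worker()
            except Exception as e:  # pylint: disable=broad-except
                if first_error is None:
                    first_error = e
        if first_error is not None:
            raise first_error


class RecordingHandler(WorkerHandler):
    """Toy handler used only to exercise the sketch."""

    def __init__(self):
        self.events = []

    def start_worker(self):
        self.events.append('start')

    def stop_worker(self):
        self.events.append('stop')
```

With this shape, the number of workers and the kind of worker vary
independently: the same wrapper could be handed a factory for any handler
type without that type knowing about multiplicity.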
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 255816)
Time Spent: 2.5h (was: 2h 20m)
> Support multi-process execution on the FnApiRunner
> --------------------------------------------------
>
> Key: BEAM-3645
> URL: https://issues.apache.org/jira/browse/BEAM-3645
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py-core
> Affects Versions: 2.2.0, 2.3.0
> Reporter: Charles Chen
> Assignee: Hannah Jiang
> Priority: Major
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> https://issues.apache.org/jira/browse/BEAM-3644 gave us a 15x performance
> gain over the previous DirectRunner. We can do even better in multi-core
> environments by supporting multi-process execution in the FnApiRunner, to
> scale past Python GIL limitations.