[ https://issues.apache.org/jira/browse/BEAM-3645?focusedWorklogId=282591&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-282591 ]
ASF GitHub Bot logged work on BEAM-3645:
----------------------------------------
Author: ASF GitHub Bot
Created on: 25/Jul/19 10:06
Start Date: 25/Jul/19 10:06
Worklog Time Spent: 10m
Work Description: robertwb commented on pull request #8979: [BEAM-3645]
add multiplexing for python FnApiRunner
URL: https://github.com/apache/beam/pull/8979#discussion_r307209414
##########
File path: sdks/python/apache_beam/runners/worker/data_plane.py
##########
@@ -69,22 +69,78 @@ def close(self):
self._close_callback(self.get())
-class DataChannel(with_metaclass(abc.ABCMeta, object)):
-  """Represents a channel for reading and writing data over the data plane.
+class DataChannel(object):
+
+  def __init__(self):
+    self.data_conn = DataChannelConnection()
+
+
+class InMemoryDataChannel(DataChannel):
+  """An in-memory implementation of a DataChannel.
+
+  This channel is two-sided. What is written to one side is read by the other.
+  The inverse() method returns the other side of an instance.
+  """
+
+  def __init__(self, inverse=None, data_conn=None):
+    self.data_conn = data_conn or InMemoryDataChannelConnection()
+    self._inverse = inverse or InMemoryDataChannel(
+        self, self.data_conn.inverse())
+
+  def inverse(self):
+    return self._inverse
-  Read from this channel with the input_elements method::
+
+class GrpcClientDataChannel(DataChannel):
+  """A DataChannel wrapping the client side of a BeamFnData connection."""
+
+  def __init__(self, data_stub):
+    self.data_conn = GrpcDataChannelConnection()
+    self.data_conn._start_reader(data_stub.Data(
+        self.data_conn._write_outputs()))
+
+
+class GrpcServerDataChannel(
+    beam_fn_api_pb2_grpc.BeamFnDataServicer, DataChannel):
+  """A DataChannel wrapping the server side of a BeamFnData connection."""
+
+  def __init__(self):
+    self.data_conn = GrpcDataChannelConnection()
+
+  def Data(self, elements_iterator, context):
+    # py27 doesn't support recursive import at module level
+    from apache_beam.runners.portability import fn_api_runner
+    worker_id = dict(context.invocation_metadata()).get('worker_id')
+    # if data_plane_test, GrpcServer is not created, hence conn_handler will
+    # throw out an error.
+    try:
+      conn_handler = fn_api_runner.GrpcServer.get_data_conn_handler()
+      self.data_conn = conn_handler.get(worker_id)
+    except:
+      pass
+    self.data_conn._start_reader(elements_iterator)
Review comment:
_start_reader probably shouldn't be private if it's meant to be called here.
(I like the name; maybe rename set_inputs of the control connection similarly
for consistency.)
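
One way to read the suggestion above, as a minimal sketch rather than the PR's actual code: the connection exposes a public start_reader method, and the channel calls only that public name. SketchDataConnection, SketchServerDataChannel, and the get helper are hypothetical stand-ins invented for this illustration; the gRPC servicer base class and worker_id lookup from the diff are left out so the example runs on its own.

```python
import queue
import threading


class SketchDataConnection(object):
  """Hypothetical stand-in for the connection class in the diff."""

  def __init__(self):
    self._received = queue.Queue()
    self._reader = None

  def start_reader(self, elements_iterator):
    """Public entry point: drains elements_iterator on a daemon thread."""
    self._reader = threading.Thread(
        target=self._read_inputs, args=(elements_iterator,))
    self._reader.daemon = True
    self._reader.start()

  def _read_inputs(self, elements_iterator):
    # Stays private: only start_reader is expected to call this.
    for element in elements_iterator:
      self._received.put(element)

  def get(self, timeout=5):
    # Hypothetical helper so callers can observe what was read.
    return self._received.get(timeout=timeout)


class SketchServerDataChannel(object):
  """Hypothetical stand-in for the server-side channel, without gRPC."""

  def __init__(self):
    self.data_conn = SketchDataConnection()

  def Data(self, elements_iterator, context=None):
    # The channel touches only the connection's public API.
    self.data_conn.start_reader(elements_iterator)
```

With this shape, a call such as SketchServerDataChannel().Data(iter(['a', 'b'])) starts the reader without reaching into a leading-underscore attribute of another class.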
Issue Time Tracking
-------------------
Worklog Id: (was: 282591)
> Support multi-process execution on the FnApiRunner
> --------------------------------------------------
>
> Key: BEAM-3645
> URL: https://issues.apache.org/jira/browse/BEAM-3645
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py-core
> Affects Versions: 2.2.0, 2.3.0
> Reporter: Charles Chen
> Assignee: Hannah Jiang
> Priority: Major
> Fix For: 2.15.0
>
> Time Spent: 31h
> Remaining Estimate: 0h
>
> https://issues.apache.org/jira/browse/BEAM-3644 gave us a 15x performance
> gain over the previous DirectRunner. We can do even better in multi-core
> environments by supporting multi-process execution in the FnApiRunner, to
> scale past Python GIL limitations.