Since I don't think this is a contentious change.

On Fri, Apr 19, 2019 at 9:25 AM Lukasz Cwik <lc...@google.com> wrote:
> Yes, using T makes sense.
>
> The WindowedValue was meant to be a context object in the SDK harness that
> propagates various information about the current element. We have discussed
> in the past:
> * making optimizations which would pass around less of the context
>   information if we know that the DoFns don't need it (for example, all the
>   values share the same window).
> * versioning the encoding separately from the WindowedValue context object
>   (see recent discussion about element timestamp precision [1])
> * the runner may want its own representation of a context object that
>   makes sense for it, which isn't necessarily a WindowedValue.
>
> Feel free to cut a JIRA about this and start working on a change towards
> this.
>
> 1:
> https://lists.apache.org/thread.html/221b06e81bba335d0ea8d770212cc7ee047dba65bec7978368a51473@%3Cdev.beam.apache.org%3E
>
> On Fri, Apr 19, 2019 at 3:18 AM jincheng sun <sunjincheng...@gmail.com>
> wrote:
>
>> Hi Beam devs,
>>
>> I read some of the docs about `Communicating over the Fn API` in Beam. I
>> feel that Beam has a very good design for the Control Plane/Data Plane/State
>> Plane/Logging services, and it is described in the <How to send and receive
>> data> document. When communicating between Runner and SDK Harness, the
>> Data Plane API uses WindowedValue (an immutable triple of value,
>> timestamp, and windows) as the contract object between Runner and SDK
>> Harness.
>> I see the interface definitions for sending and receiving data in
>> the code as follows:
>>
>> - org.apache.beam.runners.fnexecution.data.FnDataService
>>
>> public interface FnDataService {
>>   <T> InboundDataClient receive(
>>       LogicalEndpoint inputLocation,
>>       Coder<WindowedValue<T>> coder,
>>       FnDataReceiver<WindowedValue<T>> listener);
>>
>>   <T> CloseableFnDataReceiver<WindowedValue<T>> send(
>>       LogicalEndpoint outputLocation,
>>       Coder<WindowedValue<T>> coder);
>> }
>>
>> - org.apache.beam.fn.harness.data.BeamFnDataClient
>>
>> public interface BeamFnDataClient {
>>   <T> InboundDataClient receive(
>>       ApiServiceDescriptor apiServiceDescriptor,
>>       LogicalEndpoint inputLocation,
>>       Coder<WindowedValue<T>> coder,
>>       FnDataReceiver<WindowedValue<T>> receiver);
>>
>>   <T> CloseableFnDataReceiver<WindowedValue<T>> send(
>>       Endpoints.ApiServiceDescriptor apiServiceDescriptor,
>>       LogicalEndpoint outputLocation,
>>       Coder<WindowedValue<T>> coder);
>> }
>>
>> Both `Coder<WindowedValue<T>>` and `FnDataReceiver<WindowedValue<T>>` use
>> `WindowedValue` as the data structure that both the Runner and the SDK
>> Harness understand. Control Plane/Data Plane/State Plane/Logging is a
>> high-level abstraction; services such as Control Plane and Logging are
>> common requirements for all multi-language platforms. For example, the
>> Flink community is also discussing how to support Python UDFs, as well as
>> how to deal with the Docker environment, data transfer, state access,
>> logging, etc. If Beam can further abstract these service interfaces,
>> i.e., make the interface definitions compatible with multiple engines, and
>> finally provide them to other projects in the form of class libraries, it
>> will definitely help other platforms that want to support multiple
>> languages. So could Beam further abstract the interface definitions of
>> FnDataService and BeamFnDataClient?
>> Here I am throwing out a minnow to catch a whale: take the
>> FnDataService#receive interface as an example, and turn
>> `WindowedValue<T>` into `T` so that other platforms can extend it
>> arbitrarily, as follows:
>>
>> <T> InboundDataClient receive(LogicalEndpoint inputLocation, Coder<T>
>> coder, FnDataReceiver<T> listener);
>>
>> What do you think?
>>
>> Feel free to correct me if I have misunderstood anything. Any feedback
>> is welcome!
>>
>> Regards,
>> Jincheng
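
For illustration only, here is a minimal, self-contained sketch of what the proposed change from `WindowedValue<T>` to `T` could look like. None of these types are Beam's: `Receiver`, `Coder`, `Windowed`, and `InMemoryDataService` are hypothetical, simplified stand-ins for `FnDataReceiver`, `Coder`, `WindowedValue`, and `FnDataService`, and the endpoint is a plain string rather than a `LogicalEndpoint`. The point is only that a service that is generic in `T` never has to mention the windowed type: a Beam-like caller picks `T = Windowed<String>`, while another engine picks `T = String`.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

class GenericDataServiceSketch {

  /** Hypothetical, simplified analogue of FnDataReceiver<T>. */
  interface Receiver<T> {
    void accept(T element);
  }

  /** Hypothetical, simplified analogue of a Coder<T>. */
  interface Coder<T> {
    byte[] encode(T value);

    T decode(byte[] bytes);
  }

  /** Hypothetical, simplified analogue of WindowedValue<V>: value and timestamp only. */
  static final class Windowed<V> {
    final V value;
    final long timestampMillis;

    Windowed(V value, long timestampMillis) {
      this.value = value;
      this.timestampMillis = timestampMillis;
    }
  }

  /** Data service generic in T: it only moves bytes and never refers to Windowed. */
  static final class InMemoryDataService {
    private final Map<String, Consumer<byte[]>> inbound = new HashMap<>();

    <T> void receive(String inputLocation, Coder<T> coder, Receiver<T> listener) {
      inbound.put(inputLocation, bytes -> listener.accept(coder.decode(bytes)));
    }

    /** Assumes receive() was already called for the same location. */
    <T> Receiver<T> send(String outputLocation, Coder<T> coder) {
      return element -> inbound.get(outputLocation).accept(coder.encode(element));
    }
  }

  static final Coder<String> STRING_CODER =
      new Coder<String>() {
        public byte[] encode(String value) {
          return value.getBytes(StandardCharsets.UTF_8);
        }

        public String decode(byte[] bytes) {
          return new String(bytes, StandardCharsets.UTF_8);
        }
      };

  /** Toy encoding "<timestamp>:<value>"; a real encoding would differ. */
  static final Coder<Windowed<String>> WINDOWED_STRING_CODER =
      new Coder<Windowed<String>>() {
        public byte[] encode(Windowed<String> w) {
          return STRING_CODER.encode(w.timestampMillis + ":" + w.value);
        }

        public Windowed<String> decode(byte[] bytes) {
          String s = STRING_CODER.decode(bytes);
          int sep = s.indexOf(':');
          return new Windowed<>(s.substring(sep + 1), Long.parseLong(s.substring(0, sep)));
        }
      };

  public static void main(String[] args) {
    InMemoryDataService service = new InMemoryDataService();

    // Another engine: T is a plain String, no windowing context at all.
    List<String> plain = new ArrayList<>();
    service.receive("engine-x", STRING_CODER, plain::add);
    service.send("engine-x", STRING_CODER).accept("hello");

    // Beam-like caller: T carries the windowing context.
    List<Windowed<String>> windowed = new ArrayList<>();
    service.receive("beam", WINDOWED_STRING_CODER, windowed::add);
    service.send("beam", WINDOWED_STRING_CODER).accept(new Windowed<>("hello", 42L));

    System.out.println(plain.get(0));
    System.out.println(windowed.get(0).value + "@" + windowed.get(0).timestampMillis);
  }
}
```

In this sketch the choice of context object lives entirely in the coder and the receiver supplied by the caller, which is roughly the separation the thread describes between the encoding and the WindowedValue context object.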