Since I don't think this is a contentious change.

On Fri, Apr 19, 2019 at 9:25 AM Lukasz Cwik <lc...@google.com> wrote:

> Yes, using T makes sense.
>
> The WindowedValue was meant to be a context object in the SDK harness that
> propagates various information about the current element. In the past we
> have discussed:
> * making optimizations which would pass around less of the context
> information if we know that the DoFns don't need it (for example, all the
> values share the same window).
> * versioning the encoding separately from the WindowedValue context object
> (see recent discussion about element timestamp precision [1])
> * letting the runner use its own representation of a context object that
> makes sense for it, which isn't necessarily a WindowedValue.
>
> Feel free to file a JIRA for this and start working on a change in this
> direction.
>
> 1:
> https://lists.apache.org/thread.html/221b06e81bba335d0ea8d770212cc7ee047dba65bec7978368a51473@%3Cdev.beam.apache.org%3E
>
> On Fri, Apr 19, 2019 at 3:18 AM jincheng sun <sunjincheng...@gmail.com>
> wrote:
>
>> Hi Beam devs,
>>
>> I read some of the docs about `Communicating over the Fn API` in Beam. I
>> feel that Beam has a very good design for the Control Plane/Data Plane/State
>> Plane/Logging services, as described in the <How to send and receive
>> data> document. When the Runner and the SDK Harness communicate, the
>> Data Plane API uses WindowedValue (an immutable triple of value,
>> timestamp, and windows) as the contract object between them. The
>> interface definitions for sending and receiving data in the code are as
>> follows:
>>
>> - org.apache.beam.runners.fnexecution.data.FnDataService
>>
>> public interface FnDataService {
>>   <T> InboundDataClient receive(
>>       LogicalEndpoint inputLocation,
>>       Coder<WindowedValue<T>> coder,
>>       FnDataReceiver<WindowedValue<T>> listener);
>>
>>   <T> CloseableFnDataReceiver<WindowedValue<T>> send(
>>       LogicalEndpoint outputLocation, Coder<WindowedValue<T>> coder);
>> }
>>
>>
>>
>> - org.apache.beam.fn.harness.data.BeamFnDataClient
>>
>> public interface BeamFnDataClient {
>>   <T> InboundDataClient receive(
>>       ApiServiceDescriptor apiServiceDescriptor,
>>       LogicalEndpoint inputLocation,
>>       Coder<WindowedValue<T>> coder,
>>       FnDataReceiver<WindowedValue<T>> receiver);
>>
>>   <T> CloseableFnDataReceiver<WindowedValue<T>> send(
>>       Endpoints.ApiServiceDescriptor apiServiceDescriptor,
>>       LogicalEndpoint outputLocation,
>>       Coder<WindowedValue<T>> coder);
>> }
>>
>>
>> Both `Coder<WindowedValue<T>>` and `FnDataReceiver<WindowedValue<T>>` use
>> `WindowedValue` as the data structure that both the Runner and the SDK
>> Harness understand. The Control Plane/Data Plane/State Plane/Logging
>> services are a high-level abstraction, and services such as the Control
>> Plane and Logging are common requirements for all multi-language
>> platforms. For example, the Flink community is also discussing how to
>> support Python UDFs, including how to handle the docker environment, data
>> transfer, state access, logging, etc. If Beam can further abstract these
>> service interfaces, i.e., make the interface definitions compatible with
>> multiple engines, and eventually provide them to other projects as class
>> libraries, it will definitely help other platforms that want to support
>> multiple languages. So could Beam further abstract the interface
>> definitions of FnDataService and BeamFnDataClient? As a starting point for
>> discussion, take the FnDataService#receive interface as an example and
>> turn `WindowedValue<T>` into `T`, so that other platforms can extend it
>> arbitrarily, as follows:
>>
>> <T> InboundDataClient receive(LogicalEndpoint inputLocation, Coder<T>
>> coder, FnDataReceiver<T> listener);
>>
>> What do you think?
>>
>> Feel free to correct me if I have misunderstood anything. Any feedback
>> is welcome!
>>
>>
>> Regards,
>> Jincheng
>>
>

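To make the proposal quoted above concrete, here is a rough, self-contained
sketch of what a data service that is generic in `T` might look like. It does
not use the real Beam classes: `Coder`, `FnDataReceiver`, `WindowedValue`,
`InMemoryDataService`, `STRING_CODER` and `windowed` are all simplified,
hypothetical stand-ins, and the byte layout of the windowed coder is made up
for illustration only. The point is just that a Beam-style caller can pick
`T = WindowedValue<String>` while another engine picks `T = String`, and the
service itself never looks inside the element.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class GenericDataPlaneSketch {

  /** Hypothetical stand-in for Beam's Coder: element to bytes and back. */
  public interface Coder<T> {
    byte[] encode(T value);
    T decode(byte[] bytes);
  }

  /** Hypothetical stand-in for Beam's FnDataReceiver. */
  public interface FnDataReceiver<T> {
    void accept(T element);
  }

  /** Simplified stand-in for WindowedValue: value, timestamp, one window name. */
  public static final class WindowedValue<T> {
    public final T value;
    public final long timestampMillis;
    public final String window;

    public WindowedValue(T value, long timestampMillis, String window) {
      this.value = value;
      this.timestampMillis = timestampMillis;
      this.window = window;
    }
  }

  /** Data service that is generic in T; it only decodes frames and forwards them. */
  public static final class InMemoryDataService {
    public <T> void receive(List<byte[]> frames, Coder<T> coder, FnDataReceiver<T> listener) {
      for (byte[] frame : frames) {
        listener.accept(coder.decode(frame));
      }
    }
  }

  /** Plain UTF-8 string coder, usable by an engine that has no window concept. */
  public static final Coder<String> STRING_CODER = new Coder<String>() {
    @Override public byte[] encode(String value) {
      return value.getBytes(StandardCharsets.UTF_8);
    }
    @Override public String decode(byte[] bytes) {
      return new String(bytes, StandardCharsets.UTF_8);
    }
  };

  /** Wraps a value coder so the timestamp and window travel with each element. */
  public static <T> Coder<WindowedValue<T>> windowed(final Coder<T> valueCoder) {
    return new Coder<WindowedValue<T>>() {
      @Override public byte[] encode(WindowedValue<T> wv) {
        byte[] window = wv.window.getBytes(StandardCharsets.UTF_8);
        byte[] value = valueCoder.encode(wv.value);
        // Illustrative layout: timestamp, window length, window bytes, value bytes.
        return ByteBuffer.allocate(8 + 4 + window.length + value.length)
            .putLong(wv.timestampMillis).putInt(window.length).put(window).put(value).array();
      }
      @Override public WindowedValue<T> decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        long timestamp = buf.getLong();
        byte[] window = new byte[buf.getInt()];
        buf.get(window);
        byte[] value = new byte[buf.remaining()];
        buf.get(value);
        return new WindowedValue<>(
            valueCoder.decode(value), timestamp, new String(window, StandardCharsets.UTF_8));
      }
    };
  }

  public static void main(String[] args) {
    InMemoryDataService service = new InMemoryDataService();

    // A Beam-style caller chooses T = WindowedValue<String>.
    Coder<WindowedValue<String>> beamCoder = windowed(STRING_CODER);
    List<byte[]> beamFrames = new ArrayList<>();
    beamFrames.add(beamCoder.encode(new WindowedValue<>("hello", 1000L, "global")));
    service.receive(beamFrames, beamCoder,
        wv -> System.out.println(wv.value + " @" + wv.timestampMillis + " in " + wv.window));

    // Another engine chooses T = String, with no window context on the wire.
    List<byte[]> plainFrames = new ArrayList<>();
    plainFrames.add(STRING_CODER.encode("hello"));
    service.receive(plainFrames, STRING_CODER, s -> System.out.println(s));
  }
}
```

In this sketch the window and timestamp handling lives entirely in the coder
chosen by the caller, which is roughly the separation between the encoding and
the context object discussed earlier in the thread.
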