Hi Lukasz,

Thanks for your affirmation and provide more contextual information. :)

Would you please give me the contributor permission?  My JIRA ID is
sunjincheng121.

I would like to create/assign tickets for this work.

Thanks,
Jincheng

Lukasz Cwik <lc...@google.com> 于2019年4月20日周六 上午12:26写道:

> Since I don't think this is a contentious change.
>
> On Fri, Apr 19, 2019 at 9:25 AM Lukasz Cwik <lc...@google.com> wrote:
>
>> Yes, using T makes sense.
>>
>> The WindowedValue was meant to be a context object in the SDK harness
>> that propagates various information about the current element. We have
>> discussed in the past about:
>> * making optimizations which would pass around less of the context
>> information if we know that the DoFns don't need it (for example, all the
>> values share the same window).
>> * versioning the encoding separately from the WindowedValue context
>> object (see recent discussion about element timestamp precision [1])
>> * the runner may want its own representation of a context object that
>> makes sense for it which isn't a WindowedValue necessarily.
>>
>> Feel free to cut a JIRA about this and start working on a change towards
>> this.
>>
>> 1:
>> https://lists.apache.org/thread.html/221b06e81bba335d0ea8d770212cc7ee047dba65bec7978368a51473@%3Cdev.beam.apache.org%3E
>>
>> On Fri, Apr 19, 2019 at 3:18 AM jincheng sun <sunjincheng...@gmail.com>
>> wrote:
>>
>>> Hi Beam devs,
>>>
>>> I read some of the docs about `Communicating over the Fn API` in Beam. I
>>> feel that Beam has a very good design for Control Plane/Data Plane/State
>>> Plane/Logging services, and it is described in <How to send and receive
>>> data> document. When communicating between Runner and SDK Harness, the
>>> DataPlane API will be WindowedValue(An immutable triple of value,
>>> timestamp, and windows.) As a contract object between Runner and SDK
>>> Harness. I see the interface definitions for sending and receiving data in
>>> the code as follows:
>>>
>>> - org.apache.beam.runners.fnexecution.data.FnDataService
>>>
>>> public interface FnDataService {
>>>>   <T> InboundDataClient receive(LogicalEndpoint inputLocation,
>>>> Coder<WindowedValue<T>> coder, FnDataReceiver<WindowedValue<T>> listener);
>>>>   <T> CloseableFnDataReceiver<WindowedValue<T>> send(
>>>>       LogicalEndpoint outputLocation, Coder<WindowedValue<T>> coder);
>>>> }
>>>
>>>
>>>
>>> - org.apache.beam.fn.harness.data.BeamFnDataClient
>>>
>>> public interface BeamFnDataClient {
>>>>   <T> InboundDataClient receive(ApiServiceDescriptor
>>>> apiServiceDescriptor, LogicalEndpoint inputLocation,
>>>> Coder<WindowedValue<T>> coder, FnDataReceiver<WindowedValue<T>> receiver);
>>>>   <T> CloseableFnDataReceiver<WindowedValue<T>>
>>>> send(BeamFnDataGrpcClient Endpoints.ApiServiceDescriptor
>>>> apiServiceDescriptor, LogicalEndpoint outputLocation,
>>>> Coder<WindowedValue<T>> coder);
>>>> }
>>>
>>>
>>> Both `Coder<WindowedValue<T>>` and `FnDataReceiver<WindowedValue<T>>`
>>> use `WindowedValue` as the data structure that both sides of Runner and SDK
>>> Harness know each other. Control Plane/Data Plane/State Plane/Logging is a
>>> highly abstraction, such as Control Plane and Logging, these are common
>>> requirements for all multi-language platforms. For example, the Flink
>>> community is also discussing how to support Python UDF, as well as how to
>>> deal with docker environment. how to data transfer, how to state access,
>>> how to logging etc. If Beam can further abstract these service interfaces,
>>> i.e., interface definitions are compatible with multiple engines, and
>>> finally provided to other projects in the form of class libraries, it
>>> definitely will help other platforms that want to support multiple
>>> languages. So could beam can further abstract the interface definition of
>>> FnDataService's BeamFnDataClient? Here I am to throw out a minnow to catch
>>> a whale, take the FnDataService#receive interface as an example, and turn
>>> `WindowedValue<T>` into `T` so that other platforms can be extended
>>> arbitrarily, as follows:
>>>
>>> <T> InboundDataClient receive(LogicalEndpoint inputLocation, Coder<T>
>>> coder, FnDataReceiver<T>> listener);
>>>
>>> What do you think?
>>>
>>> Feel free to correct me if there any incorrect understanding. And
>>> welcome any feedback!
>>>
>>>
>>> Regards,
>>> Jincheng
>>>
>>

Reply via email to