This thread has lain dormant while I was pulled onto other things. I'm
looping back now so there are no surprises in my upcoming (third)
revision of PR #662 [1], which uses protocol buffers instead of JSON schema
or Avro (the two prior versions; now I know what the runner API looks like
in every format :-)).

Here's the reasoning:

1. Since the Fn API requires the SDK harness to have protocol buffers
support, there is no portability to be gained by having a proto-independent
JSON schema or Avro schema for the Runner API. As currently designed, a
language will need proto support in order to implement a Beam SDK.

2. Since proto has a JSON format that can be used for human readability,
there's not really a usability benefit to using JSON schema and some other
form of JSON.

3. Proto can generate helper libraries for each language, which is nicer
than JSON schema, where support for generating POJOs, etc., might be
incomplete or strange for some languages.

4. Some of the core generic "graph with stuff on the nodes and edges"
definitions can be shared.
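
To make point 4 a bit more concrete, here is a minimal Python sketch of
what I mean by a shareable "graph with stuff on the nodes and edges". This
is not the proto from the PR: the Graph class, its methods, and the
transform/coder names are all made up for illustration. The only point is
that the graph structure need not know what the payloads mean.

```python
from dataclasses import dataclass, field
from typing import Dict, Generic, List, Tuple, TypeVar

N = TypeVar("N")  # payload carried by each node
E = TypeVar("E")  # payload carried by each edge


@dataclass
class Graph(Generic[N, E]):
    """Generic directed graph; hypothetical, not taken from the PR."""

    nodes: Dict[str, N] = field(default_factory=dict)
    edges: List[Tuple[str, str, E]] = field(default_factory=list)

    def add_node(self, node_id: str, payload: N) -> None:
        self.nodes[node_id] = payload

    def add_edge(self, src: str, dst: str, payload: E) -> None:
        # Reject edges whose endpoints were never added as nodes.
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError(f"unknown endpoint in edge {src} -> {dst}")
        self.edges.append((src, dst, payload))

    def successors(self, node_id: str) -> List[str]:
        # Ids of the nodes reachable over one outgoing edge.
        return [dst for src, dst, _ in self.edges if src == node_id]


# Assumed usage: transform names on the nodes, a coder name on the edge.
pipeline: Graph[str, str] = Graph()
pipeline.add_node("read", "ReadFromSource")
pipeline.add_node("count", "CountPerKey")
pipeline.add_edge("read", "count", "KvCoder")
```

The same container could carry different payload types for different
purposes, which is the kind of sharing I have in mind.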

If I've overlooked something, I'd love to hear about it.

Kenn

[1] https://github.com/apache/beam/pull/662

On Fri, Jul 15, 2016 at 8:24 AM, Lukasz Cwik <[email protected]>
wrote:

> Just to give people an update, I'm still working on collecting data.
>
> On Wed, Jun 29, 2016 at 10:47 AM, Aljoscha Krettek <[email protected]>
> wrote:
>
> > My bad, I didn't know that. Thanks for the clarification!
> >
> > On Wed, 29 Jun 2016 at 16:38 Daniel Kulp <[email protected]> wrote:
> >
> > >
> > > > On Jun 27, 2016, at 10:24 AM, Aljoscha Krettek <[email protected]>
> > > wrote:
> > > >
> > > > Out of the systems you suggested Thrift and ProtoBuf3 + gRPC are
> > > > probably best suited for the task. Both of these provide a way for
> > > > generating serializers as well as for specifying an RPC interface.
> > > > Avro and FlatBuffers are only dealing in serializers and we would
> > > > have to roll our own RPC system on top of these.
> > >
> > >
> > > Just a point of clarification, Avro does handle RPC as well as
> > > serialization.   It's one of the main bullets on their overview page:
> > >
> > > http://avro.apache.org/docs/current/index.html
> > >
> > > Unfortunately, their documentation around the subject really sucks.
> > > Some info at:
> > >
> > > https://cwiki.apache.org/confluence/display/AVRO/Porting+Existing+RPC+Frameworks
> > >
> > > and a “quick start”:
> > >
> > > https://github.com/phunt/avro-rpc-quickstart
> > >
> > >
> > >
> > > --
> > > Daniel Kulp
> > > [email protected] - http://dankulp.com/blog
> > > Talend Community Coder - http://coders.talend.com
> > >
> > >
> >
>
