Interesting. And given that Beam Schemas are recursive (a row can contain
nested rows), we might actually need something like xpath if we want to
make this fully general.
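
To make the nesting point concrete, here is a rough sketch of the simplest possible path lookup, far short of real xpath. It uses plain nested Maps as a stand-in for rows, not Beam's Row or Schema classes, and the dot-separated path syntax and the names RowPathSketch and resolve are assumptions made up for illustration.

```java
import java.util.Map;

/** Toy stand-in for nested rows: field name -> value, where a value may itself be a row. */
final class RowPathSketch {
  private RowPathSketch() {}

  /** Resolves a dot-separated path such as "location.latitude" against nested maps. */
  static Object resolve(Map<String, Object> row, String path) {
    Object current = row;
    for (String segment : path.split("\\.")) {
      // Each step must start from a (nested) row; a leaf value cannot be descended into.
      if (!(current instanceof Map)) {
        throw new IllegalArgumentException(
            "Cannot look up '" + segment + "' inside a non-row value in path " + path);
      }
      Map<?, ?> nested = (Map<?, ?>) current;
      if (!nested.containsKey(segment)) {
        throw new IllegalArgumentException("No field '" + segment + "' in path " + path);
      }
      current = nested.get(segment);
    }
    return current;
  }

  public static void main(String[] args) {
    Map<String, Object> row = Map.of(
        "userId", "u42",
        "location", Map.of("latitude", 47.6, "longitude", -122.3));
    System.out.println(resolve(row, "userId"));
    System.out.println(resolve(row, "location.latitude"));
  }
}
```

Anything beyond single-field steps (wildcards, array indexing, predicates) is where an xpath-like language would start to earn its keep.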

Reuven

On Mon, Jun 4, 2018 at 5:45 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:

> Yup, it makes sense, it's what I had in mind.
>
> In Apache Camel, in a Processor (similar to a DoFn), we can also bind
> expression languages directly to the arguments.
>
> We can imagine something like:
>
> @ProcessElement void process(@JsonPath("foo") String foo)
>
> @ProcessElement void process(@XPath("//foo") String foo)
>
> or even an expression language (simple/groovy/whatever).
>
> Regards
> JB
>
> On 04/06/2018 16:39, Reuven Lax wrote:
> > In the schema branch I have already added some annotations for Schema.
> > However in the future I think we could go even further and allow users
> > to pick individual fields out of the row schema. e.g. the user might
> > have a Schema with 100 fields, but only want to process userId and geo
> > location. I could imagine something like this
> >
> > @ProcessElement void process(@Field("userId") String userId,
> >     @Field("latitude") double lat, @Field("longitude") double lon) {
> > }
> >
> > And Beam could automatically extract the right fields for the user. In
> > fact we could do the same thing with KVs today - supplying annotations
> > to automatically unpack the KV.
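
A rough sketch of how that kind of extraction could work, using plain Java reflection rather than Beam itself. The @Field annotation here is a local stand-in for the proposed one, the row is just a Map, and FieldInjectionSketch, GeoFn and invoke are hypothetical names, so treat this as an illustration of the mechanism only.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;
import java.util.Map;

final class FieldInjectionSketch {
  private FieldInjectionSketch() {}

  /** Hypothetical parameter annotation naming the row field to inject. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.PARAMETER)
  @interface Field {
    String value();
  }

  /** Example user code: the row may have many fields, but only two are requested. */
  static final class GeoFn {
    String seen;

    public void process(@Field("userId") String userId, @Field("latitude") double lat) {
      seen = userId + "@" + lat;
    }
  }

  /** Finds the named method, looks up each annotated parameter in the row, and invokes it. */
  static void invoke(Object fn, String methodName, Map<String, Object> row) {
    for (Method m : fn.getClass().getMethods()) {
      if (!m.getName().equals(methodName)) {
        continue;
      }
      Parameter[] params = m.getParameters();
      Object[] args = new Object[params.length];
      for (int i = 0; i < params.length; i++) {
        Field field = params[i].getAnnotation(Field.class);
        if (field == null) {
          throw new IllegalArgumentException("Parameter " + i + " has no @Field annotation");
        }
        if (!row.containsKey(field.value())) {
          throw new IllegalArgumentException("Row has no field '" + field.value() + "'");
        }
        args[i] = row.get(field.value());
      }
      try {
        m.setAccessible(true);
        m.invoke(fn, args);
      } catch (ReflectiveOperationException e) {
        throw new IllegalStateException(e);
      }
      return;
    }
    throw new IllegalArgumentException("No public method named " + methodName);
  }
}
```

The user method never sees the fields it did not ask for, which is the readability win being described. A real implementation would presumably resolve the field lookups once against the schema rather than per element.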
> >
> > I do think there are a few nice ways to do side inputs as well, but it's
> > more work to design and implement, which is why I left it off (and given
> > that there is some design work, side input annotations should be discussed
> > on the dev list before implementation IMO).
> >
> > Reuven
> >
> > On Mon, Jun 4, 2018 at 5:29 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> >
> >     Hi Reuven,
> >
> >     That's a great improvement for users.
> >
> >     I don't see an easy way to have annotations for side inputs/outputs.
> >     I think we can also plan some extension annotations for schemas, like
> >     @Element(schema = foo) in addition to the type. Thoughts?
> >
> >     Regards
> >     JB
> >
> >     On 04/06/2018 16:06, Reuven Lax wrote:
> >     > Beam was created with an annotation-based processing API that allows
> >     > the framework to automatically inject parameters into a DoFn's process
> >     > method (and also allows the user to mark any method as the process
> >     > method using @ProcessElement). However, these annotations were never
> >     > completed. A specific set of parameters could be injected (e.g. the
> >     > window or PipelineOptions), but for anything else you had to access it
> >     > through the ProcessContext. This limited the readability advantage of
> >     > this API.
> >     >
> >     > A couple of months ago I spent a bit of time extending the set of
> >     > annotations allowed. In particular, the most common uses of
> >     > ProcessContext were accessing the input element and outputting
> >     > elements, and both of those can now be done without ProcessContext.
> >     > Example usage:
> >     >
> >     > new DoFn<InputT, OutputT>() {
> >     >   @ProcessElement
> >     >   public void process(@Element InputT element,
> >     >       OutputReceiver<OutputT> out) {
> >     >     out.output(convertInputToOutput(element));
> >     >   }
> >     > }
> >     >
> >     > No need for ProcessContext anywhere in this DoFn! The Beam framework
> >     > also does type checking - if the @Element type was not InputT, you
> >     > would have seen an error. Multi-output DoFns also work, using a
> >     > MultiOutputReceiver interface.
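
For anyone curious how injection plus the type check might fit together, here is a minimal sketch in plain Java. It is not Beam's implementation: @ProcessElement, @Element and OutputReceiver below are small local stand-ins declared inside the sketch, and ElementInjectionSketch, LengthFn and run are hypothetical names. The check is a simple assignability test done before any element is processed.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;
import java.util.ArrayList;
import java.util.List;

final class ElementInjectionSketch {
  private ElementInjectionSketch() {}

  /** Local stand-in: marks the method to call for each element. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  @interface ProcessElement {}

  /** Local stand-in: marks the parameter that receives the input element. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.PARAMETER)
  @interface Element {}

  /** Local stand-in for an output sink handed to the process method. */
  interface OutputReceiver<T> {
    void output(T value);
  }

  /** Example user function: no context object, just the element and a receiver. */
  static final class LengthFn {
    @ProcessElement
    public void process(@Element String word, OutputReceiver<Integer> out) {
      out.output(word.length());
    }
  }

  /** Validates the annotated method against inputType, then runs it over the inputs. */
  static <I, O> List<O> run(Object fn, Class<I> inputType, List<I> inputs) {
    Method process = null;
    for (Method m : fn.getClass().getDeclaredMethods()) {
      if (m.isAnnotationPresent(ProcessElement.class)) {
        process = m;
      }
    }
    if (process == null) {
      throw new IllegalArgumentException("No @ProcessElement method found");
    }
    Parameter[] params = process.getParameters();
    // Type check up front, before any element is processed.
    for (Parameter p : params) {
      if (p.isAnnotationPresent(Element.class)) {
        if (!p.getType().isAssignableFrom(inputType)) {
          throw new IllegalArgumentException(
              "@Element parameter is " + p.getType().getName()
                  + " but the input type is " + inputType.getName());
        }
      } else if (p.getType() != OutputReceiver.class) {
        throw new IllegalArgumentException("Unsupported parameter type " + p.getType().getName());
      }
    }
    List<O> outputs = new ArrayList<>();
    OutputReceiver<O> receiver = outputs::add;
    process.setAccessible(true);
    try {
      for (I input : inputs) {
        Object[] args = new Object[params.length];
        for (int i = 0; i < params.length; i++) {
          args[i] = params[i].isAnnotationPresent(Element.class) ? input : receiver;
        }
        process.invoke(fn, args);
      }
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
    return outputs;
  }
}
```

Calling run(new LengthFn(), String.class, List.of("a", "bbb")) collects the word lengths, while passing Integer.class as the input type fails the check with an IllegalArgumentException before anything runs.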
> >     >
> >     > I'll update the Beam docs later with this information, but most
> >     > information accessible from ProcessContext, OnTimerContext,
> >     > StartBundleContext, or FinishBundleContext can now be accessed via
> >     > this sort of injection. The main exceptions are side inputs and output
> >     > from finishBundle, both of which still require the context objects;
> >     > however, I hope to find time to provide direct access to those as well.
> >     >
> >     > pr/5331 (in progress) converts most of Beam's built-in transforms to
> >     > use this clearer style.
> >     >
> >     > Reuven
> >
> >     --
> >     Jean-Baptiste Onofré
> >     jbono...@apache.org
> >     http://blog.nanthrax.net
> >     Talend - http://www.talend.com
> >
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
