Yup, it makes sense, it's what I had in mind. In Apache Camel, in a Processor (similar to a DoFn), we can also pass directly languages to the arguments.
We can imagine something like: @ProcessElement void process(@json-path("foo") String foo) @ProcessElement void process(@xpath("//foo") String foo) or even a expression language (simple/groovy/whatever). Regards JB On 04/06/2018 16:39, Reuven Lax wrote: > In the schema branch I have already added some annotations for Schema. > However in the future I think we could go even further and allow users > to pick individual fields out of the row schema. e.g. the user might > have a Schema with 100 fields, but only want to process userId and geo > location. I could imagine something like this > > @ProcessElement void process(@Field("userId") String > userId, @Field("latitude") double lat, @Field("longitude") double long) { > } > > And Beam could automatically extract the right fields for the user. In > fact we could do the same thing with KVs today - supplying annotations > to automatically unpack the KV. > > I do think there are a few nice ways to do side inputs as well, but it's > more work to design implement which is why I left it off (and given that > there is some design work, side input annotations should be discussed on > the dev list before implementation IMO). > > Reuven > > On Mon, Jun 4, 2018 at 5:29 PM Jean-Baptiste Onofré <j...@nanthrax.net > <mailto:j...@nanthrax.net>> wrote: > > Hi Reuven, > > That's a great improvement for user. > > I don't see an easy way to have annotation about side input/output. > I think we can also plan some extension annotation about schema. Like > @Element(schema = foo) in addition of the type. Thoughts ? > > Regards > JB > > On 04/06/2018 16:06, Reuven Lax wrote: > > Beam was created with an annotation-based processing API, that allows > > the framework to automatically inject parameters to a DoFn's process > > method (and also allows the user to mark any method as the process > > method using @ProcessElement). However, these annotations were never > > completed. A specific set of parameters could be injected (e.g. the > > window or PipelineOptions), but for anything else you had to access it > > through the ProcessContext. This limited the readability advantage of > > this API. > > > > A couple of months ago I spent a bit of time extending the set of > > annotations allowed. In particular, the most common uses of > > ProcessContext were accessing the input element and outputting > elements, > > and both of those can now be done without ProcessContext. Example > usage: > > > > new DoFn<InputT, OutputT>() { > > @ProcessElement process(@Element InputT element, > > OutputReceiver<OutputT> out) { > > out.output(convertInputToOutput(element)); > > } > > } > > > > No need for ProcessContext anywhere in this DoFn! The Beam framework > > also does type checking - if the @Element type was not InputT, you > would > > have seen an error. Multi-output DoFns also work, using a > > MultiOutputReceiver interface. > > > > I'll update the Beam docs later with this information, but most > > information accessible from ProcessContext, OnTimerContext, > > StartBundleContext, or FinishBundleContext can now be accessed via > this > > sort of injection. The main exceptions are side inputs and output from > > finishbundle, both of which still require the context objects; > however I > > hope to find time to provide direct access to those as well. > > > > pr/5331 (in progress) converts most of Beam's built-in transforms > to use > > this clearer style. > > > > Reuven > > -- > Jean-Baptiste Onofré > jbono...@apache.org <mailto:jbono...@apache.org> > http://blog.nanthrax.net > Talend - http://www.talend.com > -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com