I would like to avoid complicating the capability matrix itself with such details. Hopefully, user documentation for each of these features will (eventually) give insight into what you could use them for, and we could cross-link to that. For now, you can refer to the Dataflow SDK documentation to get some of this information [1]. (We'll have that ported over to Beam soon.)
To answer your specific question about priority: you should probably prioritize the "what" parts over "where" over "when" over "how". That said, it is probably fine to advance to the next category once you have figured out the first few bullets in the current category.

[1] https://cloud.google.com/dataflow/model/programming-model

On Wed, Apr 20, 2016 at 2:11 AM, Jean-Baptiste Onofré <[email protected]> wrote:

> Hi Manu,
>
> generally speaking, we have to add a complete started guide with "real"
> use cases to illustrate beam usage.
>
> I'm preparing some website PR about this (with the overview of IOs,
> DSLs/SDKs, runners, etc we discussed early).
>
> Regards
> JB
>
> On 04/20/2016 10:22 AM, Manu Zhang wrote:
>
>> Guys,
>>
>> Do you think it's valuable to add real world use cases to capability
>> matrix <http://beam.incubator.apache.org/capability-matrix/> ?
>> Then, we could know why a particular capability is needed and which should
>> be prioritized for runner implementations.
>> I found some examples in the Dataflow paper (3.3)
>> <http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43864.pdf>
>> and another reference is
>> http://www.vldb.org/pvldb/vol8/p2040-Kejariwal.pdf.
>>
>> Thanks,
>> Manu Zhang
>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
