Hi Reuven,

Thanks for the update! Since I'm working with you on this, I fully agree, and it's a great doc that gathers the ideas together.
It's clearly something we have to add ASAP in Beam, because it would allow new use cases for our users (in a simple way) and open new areas for the runners (for instance, dataframe support in the Spark runner).

By the way, a while ago I created BEAM-3437 to track the PoC/PR around this.

Thanks!
Regards
JB

On 01/29/2018 02:08 AM, Reuven Lax wrote:
> Previously I submitted a proposal for adding schemas as a first-class concept
> on Beam PCollections. The proposal engendered quite a bit of discussion from
> the community - more discussion than I've seen from almost any of our
> proposals to date!
>
> Based on the feedback and comments, I reworked the proposal document quite a
> bit. It now talks more explicitly about the difference between dynamic schemas
> (where the schema is not fully known at graph-creation time) and static
> schemas (which are fully known at graph-creation time). Proposed APIs are more
> fleshed out now (again thanks to feedback from community members), and the
> document talks in more detail about evolving schemas in long-running streaming
> pipelines.
>
> Please take a look. I think this will be very valuable to Beam, and I welcome
> any feedback.
>
> https://docs.google.com/document/d/1tnG2DPHZYbsomvihIpXruUmQ12pHGK0QIvXS1FOTgRc/edit#
>
> Reuven

--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com