Thanks JB! I definitely recommend this API for all new file-based IOs instead of WriteFiles. I'm very curious whether it's a good fit for writing Parquet files too.
On Tue, Dec 19, 2017, 8:20 PM Jean-Baptiste Onofré <[email protected]> wrote:

> Sweet !!!
>
> Thanks Eugene.
>
> As part of my current work on ParquetIO, I will take a look !
>
> Regards
> JB
>
> On 12/19/2017 11:41 PM, Eugene Kirpichov wrote:
> > Hey all,
> >
> > A while ago I proposed an API http://s.apache.org/fileio-write .
> > It has just landed on master https://github.com/apache/beam/pull/3817, in
> > somewhat improved form compared to the initial proposal.
> >
> > I think it's a cool API and I'm excited that it'll be in Beam 2.3. Please give
> > it a try (e.g. by using 2.3.0-SNAPSHOT) :)
> >
> > Check out some examples in the Javadoc e.g.
> > https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java#L242
> >
> > The main selling points are:
> > - It is really really easy to write files of a custom format using this API
> > (example above shows how one could write List<String>'s to CSV with a header).
> > - The API is very Java8-friendly (much more so than the current
> > DynamicDestinations APIs in TextIO/AvroIO, which I would like to deprecate in
> > Beam 2.3)
> > - It gives a common API to use for various file-based IOs that want to get all
> > the fancy features - e.g. https://github.com/apache/beam/pull/4294 shows how to
> > do that with TFRecordIO and XmlIO: they previously didn't have access to
> > features like dynamic destinations, and now they do: you can use
> > TFRecordIO.sink() and XmlIO.sink() with FileIO.write() or writeDynamic().
> >
> > Thanks to +Reuven Lax <mailto:[email protected]> and +Chamikara Jayalath
> > <mailto:[email protected]> for reviews.
>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
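For anyone who wants a feel for the "custom format" bullet in the quoted announcement without opening the Javadoc, here is a minimal sketch of a CSV-with-header sink. To keep it runnable without Beam on the classpath, it uses a local stand-in interface, `Sink<T>`, that only mirrors the open/write/flush shape of a file sink; the class names `CsvSinkSketch` and `CsvSink` and the helper `render` are hypothetical names for this illustration and are not taken from the Beam codebase. It does no CSV escaping, so treat it as a shape demo rather than a real CSV writer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

class CsvSinkSketch {

  /** Local stand-in (assumption): mirrors an open/write/flush sink, NOT the real Beam type. */
  interface Sink<T> {
    void open(WritableByteChannel channel) throws IOException;

    void write(T element) throws IOException;

    void flush() throws IOException;
  }

  /** Writes each List<String> as one comma-joined line, preceded by a header line. */
  static class CsvSink implements Sink<List<String>> {
    private final String header;
    private PrintWriter writer;

    CsvSink(List<String> columnNames) {
      this.header = String.join(",", columnNames);
    }

    @Override
    public void open(WritableByteChannel channel) {
      // Wrap the channel once per file and emit the header first.
      writer = new PrintWriter(
          new OutputStreamWriter(Channels.newOutputStream(channel), StandardCharsets.UTF_8));
      writer.print(header + "\n");
    }

    @Override
    public void write(List<String> element) {
      // No quoting or escaping here; fields containing commas would break the output.
      writer.print(String.join(",", element) + "\n");
    }

    @Override
    public void flush() {
      writer.flush();
    }
  }

  /** Hypothetical helper: drives the sink against an in-memory channel and returns the text. */
  static String render(List<String> columns, List<List<String>> rows) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    WritableByteChannel channel = Channels.newChannel(bytes);
    CsvSink sink = new CsvSink(columns);
    sink.open(channel);
    for (List<String> row : rows) {
      sink.write(row);
    }
    sink.flush();
    return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws IOException {
    List<List<String>> rows = Arrays.asList(Arrays.asList("a", "1"), Arrays.asList("b", "2"));
    System.out.print(render(Arrays.asList("name", "count"), rows));
  }
}
```

In an actual pipeline the sink would implement Beam's own sink interface and be handed to FileIO.write() or writeDynamic(), the same way the announcement describes for TFRecordIO.sink() and XmlIO.sink(); the linked Javadoc example is the authoritative version.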
