Hi Pei,

Thanks for sharing.
I fully agree with the goals: as already discussed, the purpose is to have "pluggable" filesystems that will allow us to easily work with local, gs, hdfs, s3 filesystems (and more).
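To illustrate what "pluggable" could mean in practice, here is a minimal sketch, assuming a registry that picks a filesystem from the URI scheme. FileSystemRegistry and SimpleFileSystem are hypothetical names made up for this example; they are not the API from the design documents.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch only: not the API proposed in BEAM-59. */
class FileSystemRegistry {

  /** Hypothetical, deliberately tiny filesystem contract. */
  interface SimpleFileSystem {
    /** Returns a human-readable description of where the URI points. */
    String describe(URI uri);
  }

  // One filesystem implementation per URI scheme ("file", "gs", "hdfs", ...).
  private final Map<String, SimpleFileSystem> byScheme = new ConcurrentHashMap<>();

  /** Plugs in a filesystem for the given scheme, replacing any previous one. */
  void register(String scheme, SimpleFileSystem fs) {
    byScheme.put(scheme.toLowerCase(Locale.ROOT), fs);
  }

  /** Looks up the filesystem by URI scheme; a missing scheme is treated as local. */
  SimpleFileSystem forUri(URI uri) {
    String scheme =
        uri.getScheme() == null ? "file" : uri.getScheme().toLowerCase(Locale.ROOT);
    SimpleFileSystem fs = byScheme.get(scheme);
    if (fs == null) {
      throw new IllegalArgumentException("No filesystem registered for scheme: " + scheme);
    }
    return fs;
  }

  public static void main(String[] args) {
    FileSystemRegistry registry = new FileSystemRegistry();
    registry.register("file", uri -> "local:" + uri.getPath());
    registry.register("gs", uri -> "gcs:" + uri.getHost() + uri.getPath());

    URI remote = URI.create("gs://bucket/data/part-0.txt");
    URI local = URI.create("/tmp/part-0.txt");
    System.out.println(registry.forUri(remote).describe(remote));
    System.out.println(registry.forUri(local).describe(local));
  }
}
```

With this shape, supporting a new backend such as hdfs or s3 is one more register() call, and the file-based IOs do not need to know which backend they are talking to.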
After a quick first glance, it looks good to me. I will try to evaluate the impact later today.
IMHO, once this change is done, HdfsIO (in sdks/java/io) should be flagged as deprecated.
Regards
JB

On 11/17/2016 01:09 AM, Pei He wrote:
Hi,

I am working on BEAM-59 <https://issues.apache.org/jira/browse/BEAM-59> "IOChannelFactory redesign". The goals are:
1. Support file-based IOs (TextIO, AvroIO) with user-defined file systems.
2. Support configuring any user-defined file system.

I drafted the design proposal in two parts to address them in order:

Part 1: IOChannelFactory Redesign <https://docs.google.com/document/d/11TdPyZ9_zmjokhNWM3Id-XJsVG3qel2lhdKTknmZ_7M/edit#>
Summary:
Old API: WritableByteChannel create(String spec, String mimeType);
New API: WritableByteChannel create(URI uri, CreateOptions options);
Noticeable proposed changes:
1. Include the options parameter in most methods to specify behaviors.
2. Replace String with URI to include the scheme for file/directory locations.
3. Require file systems to provide a SeekableByteChannel for read.
4. Add methods such as getMetadata(), rename(), etc.

Part 2: Configurable BeamFileSystem <https://docs.google.com/document/d/1-7vo9nLRsEEzDGnb562PuL4q9mUiq_ZVpCAiyyJw8p8/edit#heading=h.p3gc3colc2cs>
Summary:
Old API: IOChannelUtils.getFactory(glob).match(glob);
New API: BeamFileSystems.getFileSystem(glob, config).match(glob);

Looking for comments and feedback.

Thanks
--
Pei
--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com