Thanks JB for the feedback.

Yes, we should provide a hadoop.fs.FileSystem adaptor. As you said, it will
make a range of file systems available in Beam.

And people can choose to implement BeamFileSystem directly to get the best
performance (for example, by providing bulk operations).
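
To make the bulk-operation point concrete, here is a rough sketch, not the
proposed API. Every name in it (SketchFileSystem, renameAll, AdaptorFileSystem,
NativeFileSystem, the mem:// scheme) is made up for illustration. It assumes
the interface ships a default bulk method that loops over the single-file
call, so an adaptor works unchanged, while a direct implementation can
override it with one batched backend call.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BulkRenameSketch {

    /** Hypothetical minimal interface; NOT the BeamFileSystem from the design doc. */
    interface SketchFileSystem {
        void rename(URI src, URI dst);

        /** Default bulk rename: falls back to one rename() call per file. */
        default void renameAll(List<URI> srcs, List<URI> dsts) {
            if (srcs.size() != dsts.size()) {
                throw new IllegalArgumentException("srcs and dsts must have the same size");
            }
            for (int i = 0; i < srcs.size(); i++) {
                rename(srcs.get(i), dsts.get(i));
            }
        }
    }

    /** Stands in for an adaptor over a generic backend; inherits the per-file loop. */
    static class AdaptorFileSystem implements SketchFileSystem {
        final List<String> calls = new ArrayList<>();

        @Override
        public void rename(URI src, URI dst) {
            calls.add("rename " + src + " -> " + dst);
        }
    }

    /** Stands in for a direct implementation whose backend has a batch call. */
    static class NativeFileSystem implements SketchFileSystem {
        final List<String> calls = new ArrayList<>();

        @Override
        public void rename(URI src, URI dst) {
            calls.add("rename " + src + " -> " + dst);
        }

        @Override
        public void renameAll(List<URI> srcs, List<URI> dsts) {
            if (srcs.size() != dsts.size()) {
                throw new IllegalArgumentException("srcs and dsts must have the same size");
            }
            // One recorded backend request for the whole batch.
            calls.add("batchRename of " + srcs.size() + " files");
        }
    }

    public static void main(String[] args) {
        List<URI> srcs = Arrays.asList(URI.create("mem://tmp/a"), URI.create("mem://tmp/b"));
        List<URI> dsts = Arrays.asList(URI.create("mem://out/a"), URI.create("mem://out/b"));

        AdaptorFileSystem adaptor = new AdaptorFileSystem();
        adaptor.renameAll(srcs, dsts);
        NativeFileSystem direct = new NativeFileSystem();
        direct.renameAll(srcs, dsts);

        System.out.println("adaptor backend calls: " + adaptor.calls.size());
        System.out.println("direct backend calls: " + direct.calls.size());
    }
}
```

With two files, the adaptor records two backend calls and the direct
implementation records one. Callers use renameAll() in both cases and do not
need to know which kind of file system they have.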

--
Pei



On Tue, Nov 29, 2016 at 11:11 AM, Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:

> Hi Pei,
>
> Thinking about this again, I understand that the purpose of the Beam
> filesystem is to avoid bringing a bunch of dependencies into the core. That
> makes perfect sense.
>
> So, I agree that a Beam filesystem abstraction is fine.
>
> My point is that we should provide a HadoopFilesystem extension/plugin for
> Beam filesystem asap: that would help us to support a good range of
> filesystems quickly.
>
> Just my $0.01 ;)
>
> Regards
> JB
>
>
> On 11/17/2016 08:18 PM, Pei He wrote:
>
>> Hi JB,
>> My proposals are based on the current IOChannelFactory, and how they are
>> used in FileBasedSink.
>>
>> Let me spend more time investigating the Hadoop FileSystem interface.
>> --
>> Pei
>>
>> On Thu, Nov 17, 2016 at 1:21 AM, Jean-Baptiste Onofré <j...@nanthrax.net>
>> wrote:
>>
>>> By the way, Pei, for the record: why introduce BeamFileSystem instead of
>>> using the Hadoop FileSystem interface?
>>>
>>> Thanks
>>> Regards
>>> JB
>>>
>>> On 11/17/2016 01:09 AM, Pei He wrote:
>>>
>>>> Hi,
>>>>
>>>> I am working on BEAM-59
>>>> <https://issues.apache.org/jira/browse/BEAM-59> "IOChannelFactory
>>>> redesign". The goals are:
>>>>
>>>> 1. Support file-based IOs (TextIO, AvroIO) with user-defined file
>>>> systems.
>>>>
>>>> 2. Support configuring any user-defined file system.
>>>>
>>>> And, I drafted the design proposal in two parts to address them in
>>>> order:
>>>>
>>>> Part 1: IOChannelFactory Redesign
>>>> <https://docs.google.com/document/d/11TdPyZ9_zmjokhNWM3Id-XJsVG3qel2lhdKTknmZ_7M/edit#>
>>>>
>>>> Summary:
>>>>
>>>> Old API: WritableByteChannel create(String spec, String mimeType);
>>>>
>>>> New API: WritableByteChannel create(URI uri, CreateOptions options);
>>>>
>>>> Noticeable proposed changes:
>>>>
>>>>
>>>>    1. Include the options parameter in most methods to specify behaviors.
>>>>    2. Replace String with URI to include the scheme for file/directory
>>>>       locations.
>>>>    3. Require file systems to provide a SeekableByteChannel for read.
>>>>    4. Add methods such as getMetadata(), rename(), etc.
>>>>
>>>>
>>>> Part 2: Configurable BeamFileSystem
>>>> <https://docs.google.com/document/d/1-7vo9nLRsEEzDGnb562PuL4q9mUiq_ZVpCAiyyJw8p8/edit#heading=h.p3gc3colc2cs>
>>>>
>>>> Summary:
>>>>
>>>> Old API: IOChannelUtils.getFactory(glob).match(glob);
>>>>
>>>> New API: BeamFileSystems.getFileSystem(glob, config).match(glob);
>>>>
>>>>
>>>> Looking for comments and feedback.
>>>>
>>>> Thanks
>>>>
>>>> --
>>>>
>>>> Pei
>>>>
>>>>
>>> --
>>> Jean-Baptiste Onofré
>>> jbono...@apache.org
>>> http://blog.nanthrax.net
>>> Talend - http://www.talend.com
>>>
>>>
>>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
