If VFS were mature enough for our needs, I'd give a +1 to using it in the
Beam Java SDK. Currently it's not, so we can't use it directly.
It's indeed a reasonable option to use the VFS API inside Beam, port
our FileSystem implementations to that API, and then potentially
donate them to the VFS project.
Upsides:
- big contribution to the Apache ecosystem
- more contributors
Downsides:
- we become dependent on VFS release cycles for fixing bugs in our
filesystem implementations, but maybe that's ok, depending on how frequent
its releases are and how well it's maintained in general.
- since the codebase is no longer under our full control, we become
dependent on the diligence of VFS committers and their testing procedures
for code quality. I'd assume they are diligent but, being unfamiliar with
the project, I see it as a possible risk.

I'd say the upsides outweigh the downsides. This seems like a very
substantial amount of work, but if someone's willing to do it, great.

As for creating a VfsIO transform: I'm very strongly against this. A
filesystem layer should transparently work with everything file-related,
and not be limited to use from a single transform. Same reason we don't
have GCSIO, S3IO, LocalIO, ZipIO etc.
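
To make that concrete, here is a minimal sketch of what I mean by a
filesystem layer that every file-related transform can share. All names in
it (SchemeRegistrySketch, MiniFileSystem, Registry, countLines, firstLine,
and the "mem" and "fake-gcs" schemes) are hypothetical and made up for
illustration; this is not the Beam FileSystems API or the VFS API, just the
general shape of a scheme-keyed registry.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; none of these names are Beam or VFS APIs. */
public class SchemeRegistrySketch {

  /** Minimal stand-in for a filesystem abstraction (assumption). */
  interface MiniFileSystem {
    String read(String path);
  }

  /** Maps a scheme such as "mem" to the implementation that serves it. */
  static final class Registry {
    private final Map<String, MiniFileSystem> byScheme = new HashMap<>();

    void register(String scheme, MiniFileSystem fs) {
      byScheme.put(scheme, fs);
    }

    /** Splits "scheme://path", looks up the scheme, and delegates the read. */
    String read(String spec) {
      int sep = spec.indexOf("://");
      if (sep < 0) {
        throw new IllegalArgumentException("No scheme in: " + spec);
      }
      String scheme = spec.substring(0, sep);
      MiniFileSystem fs = byScheme.get(scheme);
      if (fs == null) {
        throw new IllegalArgumentException("Unknown scheme: " + scheme);
      }
      return fs.read(spec.substring(sep + 3));
    }
  }

  // Two unrelated "transforms": neither knows which filesystem backs the spec.
  static int countLines(Registry registry, String spec) {
    return registry.read(spec).split("\n").length;
  }

  static String firstLine(Registry registry, String spec) {
    return registry.read(spec).split("\n")[0];
  }

  public static void main(String[] args) {
    Registry registry = new Registry();
    // Fake in-memory backends standing in for real storage systems.
    registry.register("mem", path -> "a\nb\nc");
    registry.register("fake-gcs", path -> "from " + path);

    System.out.println(countLines(registry, "mem://x"));
    System.out.println(firstLine(registry, "fake-gcs://bucket/obj"));
  }
}
```

Registering a new scheme makes it available to both consumers at once,
which is what a per-filesystem transform such as a VfsIO would not give us.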

On Mon, Mar 5, 2018 at 1:05 PM Romain Manni-Bucau <rmannibu...@gmail.com>
wrote:

> Not backing vfs by a filesystem sounds saner so VfsIO is probably the way
> to go. It would be a competitor to FileIO and hopefully its replacement in
> the mid/long term.
> What about doing the opposite: implementing a vfs filesystem for all the
> fs we support, potentially enriching vfs if needed? Then we can just drop
> the beam abstraction, from what I read.
> On 5 March 2018 at 20:49, "Reuven Lax" <re...@google.com> wrote:
>> The terminology is confusing here, since the existing FileIO is a PTransform.
>> VfsFilesystem would be a better name.
>> On Mon, Mar 5, 2018 at 11:46 AM Robert Bradshaw <rober...@google.com>
>> wrote:
>>> On Mon, Mar 5, 2018 at 11:38 AM Reuven Lax <re...@google.com> wrote:
>>>> What about a beam Filesystem impl on top of Vfs as an alternative
>>>> short-term solution? This would allow Vfs to be used with any IO.
>>> Yes, I think this is the VfsIO that was proposed.
>>>> On Mon, Mar 5, 2018 at 11:37 AM Robert Bradshaw <rober...@google.com>
>>>> wrote:
>>>>> On Mon, Mar 5, 2018 at 11:23 AM Romain Manni-Bucau <
>>>>> rmannibu...@gmail.com> wrote:
>>>>>> 2018-03-05 20:04 GMT+01:00 Chamikara Jayalath <chamik...@google.com>:
>>>>>>> I assume you mean https://commons.apache.org/proper/commons-vfs/.
>>>>>>> I'm not sure if we considered this when we originally implemented
>>>>>>> our own file-system abstraction but based on a quick look seems like
>>>>>>> this is Java only.
>>>>>> Yes, java only
>>>>>>> I think having a similar file-system abstraction for various
>>>>>>> languages is a plus point for Beam. Maybe we should consider a Java
>>>>>>> file-system implementation for VFS?
>>>>>> Can be an option but when I see the current complexity I'm not sure
>>>>>> mixing 2 abstractions would help, maybe just a VfsIO for java users would
>>>>>> be good enough - thinking out loud.
>>>>>> What sounds clear to me is that each language will need its own
>>>>>> abstraction - which kind of matches your proposal. However we can still
>>>>>> make it smooth and easy on the java side - which will likely stay
>>>>>> mainstream for some years - by using vfs as our java impl instead of
>>>>>> reimplementing the full abstraction? This way we keep our *API* but we
>>>>>> drop beam *impl* to just reuse VFS.
>>>>>> PS: for gcs https://github.com/ltouati/vfs-gcs can be a good example
>>>>>> on how it can work.
>>>>> I think a VfsIO makes a lot of sense in the short term, and will give
>>>>> us the experience needed to decide if we can move solely to VFS (for Java
>>>>> at least) for implementation, and possibly API in a future major release,
>>>>> in the long run.
