On Mon, Mar 5, 2018 at 11:14 AM Romain Manni-Bucau <[email protected]> wrote:
> 2018-03-05 19:54 GMT+01:00 Reuven Lax <[email protected]>:
>
>> Are the filesystem classes marked experimental? If so, precise
>> compatibility is less of a concern. However vfs does need to have better fs
>> support first.
>
> Does anyone have some cycles to list the details here? (even without being a
> spec, just a few structured bullet points with a small description
> sentence). I can get in touch with vfs to see what they think but I used it
> to write in my previous job (in java batches) so it sounds like a very good
> candidate to be pluggable.

Here are the current Java and Python Beam file-system abstractions, in case that's the information you are asking for.

https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystem.java
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/filesystem.py

These do indeed seem to be marked as experimental, but we have other file-based sources, as well as end users (for example, code using this directly from a ParDo), that might be relying on them. Additionally, as I mentioned earlier, these abstractions are similar to each other, which might help users who are transitioning from Java to Python and vice versa.

>> Also what about other languages?
>
> This is a bit "?" for me if other languages must go through java or not.
> Last option meaning we can't have any valid codebase and increasing the
> beam maintenance costs a lot. Since other languages should go through the
> portable API IMHO, and most - all? - runners are java based it would be a
> better way to go through vfs to have more pluggability than a custom system
> rarely extended in the ecosystem, no?
>
>> On Mon, Mar 5, 2018, 3:35 PM Romain Manni-Bucau <[email protected]>
>> wrote:
>>
>>> I'd say to beam 2.x and to beam 3 to move all IO/extension from the core
>>> to actual IO/extension modules. Sounds compatible this way - in the sense
>>> we can have it eagerly without breaking anything.
>>>
>>> wdyt?
>>>
>>> Romain Manni-Bucau
>>> @rmannibucau <https://twitter.com/rmannibucau> | Blog
>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>> <http://rmannibucau.wordpress.com> | Github
>>> <https://github.com/rmannibucau> | LinkedIn
>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>
>>> 2018-03-05 19:32 GMT+01:00 Reuven Lax <[email protected]>:
>>>
>>>> Actually FileIO is only somewhat related.
>>>>
>>>> It's an interesting proposal. However a quick look shows that vfs only
>>>> has read-only support for hdfs and I'm not sure it has any support for gcs.
>>>> Both are often used with Beam. Once vfs supports these filesystems it's
>>>> worth looking at.
>>>>
>>>> Maybe add to the beam 3.0 hotlist?
>>>>
>>>> On Mon, Mar 5, 2018, 3:26 PM Romain Manni-Bucau <[email protected]>
>>>> wrote:
>>>>
>>>>> Yes (FileIO being the visible part of the FileSystems iceberg ;)).
>>>>>
>>>>> Romain Manni-Bucau
>>>>>
>>>>> 2018-03-05 19:23 GMT+01:00 Reuven Lax <[email protected]>:
>>>>>
>>>>>> I'm confused, as FileIO doesn't seem the same as vfs. Are you maybe
>>>>>> referring to the filesystem abstraction instead?
>>>>>>
>>>>>> On Mon, Mar 5, 2018, 3:19 PM Romain Manni-Bucau <[email protected]> wrote:
>>>>>>
>>>>>>> Hi guys,
>>>>>>>
>>>>>>> What's the rationale behind the fileIO impl?
>>>>>>>
>>>>>>> Why not using commons-vfs + a pluggable format? Sounds way more open
>>>>>>> and reusable for end users than a few hardcoded supported formats, no?
>>>>>>> What's the blocker?
>>>>>>> If there is a blocker, can't we contribute to [vfs] to
>>>>>>> make it disappear?
>>>>>>>
>>>>>>> Romain Manni-Bucau
