On Mon, Mar 5, 2018 at 11:14 AM Romain Manni-Bucau <[email protected]>
wrote:

> 2018-03-05 19:54 GMT+01:00 Reuven Lax <[email protected]>:
>
>> Are the filesystem classes marked experimental? If so, precise
>> compatibility is less of a concern. However vfs does need to have better fs
>> support first.
>>
>
> Does anyone have some cycles to list the details here? It doesn't need to
> be a spec; a few structured bullet points, each with a short description
> sentence, would do. I can get in touch with vfs to see what they think, but
> I used it for writing in my previous job (in Java batches), so it sounds
> like a very good candidate to be pluggable.
>

Here are the current Java and Python Beam file-system abstractions, in case
that's the information you are asking for.

https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystem.java
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/filesystem.py

These are indeed marked as experimental, but we have other file-based
sources, as well as end users (for example, directly using this from a
ParDo), that might be using them. Additionally, these abstractions are
similar, as I mentioned earlier, which might help users who are transitioning
from Java to Python and vice versa.
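
To make the shape of such an abstraction concrete, here is a minimal,
purely illustrative Python sketch of a scheme-keyed, pluggable file-system
registry. This is not Beam's actual code: every name in it
(SketchFileSystem, InMemoryFileSystem, register, get_filesystem, the "mem"
scheme) is hypothetical and only meant to show the general idea of
dispatching on the URI scheme.

```python
import abc
import io
from urllib.parse import urlparse


class SketchFileSystem(abc.ABC):
    """Hypothetical minimal file-system interface (not Beam's real API)."""

    @classmethod
    @abc.abstractmethod
    def scheme(cls):
        """Return the URI scheme this implementation handles, e.g. 'mem'."""

    @abc.abstractmethod
    def create(self, path):
        """Return a writable binary stream for `path`."""

    @abc.abstractmethod
    def open(self, path):
        """Return a readable binary stream for `path`."""

    @abc.abstractmethod
    def exists(self, path):
        """Return True if `path` exists."""


_REGISTRY = {}  # maps scheme -> file-system class


def register(fs_cls):
    """Class decorator that makes an implementation discoverable by scheme."""
    _REGISTRY[fs_cls.scheme()] = fs_cls
    return fs_cls


def get_filesystem(path):
    """Pick the implementation from the path's URI scheme."""
    scheme = urlparse(path).scheme or "file"
    if scheme not in _REGISTRY:
        raise ValueError("no file system registered for scheme %r" % scheme)
    return _REGISTRY[scheme]()


class _MemWriter(io.BytesIO):
    """Buffer that publishes its bytes into the shared store when closed."""

    def __init__(self, store, path):
        super().__init__()
        self._store = store
        self._path = path

    def close(self):
        if not self.closed:
            self._store[self._path] = self.getvalue()
        super().close()


@register
class InMemoryFileSystem(SketchFileSystem):
    """Toy backend keeping file contents in a class-level dict."""

    _files = {}

    @classmethod
    def scheme(cls):
        return "mem"

    def create(self, path):
        return _MemWriter(self._files, path)

    def open(self, path):
        if path not in self._files:
            raise FileNotFoundError(path)
        return io.BytesIO(self._files[path])

    def exists(self, path):
        return path in self._files
```

A caller would only ever go through get_filesystem("mem://bucket/a.txt"),
so adding another backend is a matter of registering one more class. A
vfs-backed implementation could, in principle, sit behind the same kind of
interface.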



>
>
>>
>> Also what about other languages?
>>
>
> It is a bit unclear to me whether other languages must go through Java or
> not. The latter would mean we can't have a single valid codebase, which
> would increase Beam's maintenance costs a lot. Since other languages should
> go through the portable API IMHO, and most (all?) runners are Java-based,
> wouldn't it be better to go through vfs, which offers more pluggability
> than a custom system rarely extended in the ecosystem?
>
>
>>
>> On Mon, Mar 5, 2018, 3:35 PM Romain Manni-Bucau <[email protected]>
>> wrote:
>>
>>> I'd say add it to Beam 2.x, and use Beam 3 to move all IO/extensions
>>> from the core to actual IO/extension modules. That sounds compatible, in
>>> the sense that we can have it early without breaking anything.
>>>
>>> wdyt?
>>>
>>>
>>> Romain Manni-Bucau
>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>> <http://rmannibucau.wordpress.com> | Github
>>> <https://github.com/rmannibucau> | LinkedIn
>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>
>>> 2018-03-05 19:32 GMT+01:00 Reuven Lax <[email protected]>:
>>>
>>>> Actually FileIO is only somewhat related.
>>>>
>>>> It's an interesting proposal. However a quick look shows that vfs only
>>>> has read-only support for hdfs and I'm not sure it has any support for gcs.
>>>> Both are often used with Beam. Once vfs supports these filesystems it's
>>>> worth looking at.
>>>>
>>>> Maybe add it to the Beam 3.0 hotlist?
>>>>
>>>> On Mon, Mar 5, 2018, 3:26 PM Romain Manni-Bucau <[email protected]>
>>>> wrote:
>>>>
>>>>> Yes (FileIO being the visible part of the FileSystems iceberg ;)).
>>>>>
>>>>>
>>>>> 2018-03-05 19:23 GMT+01:00 Reuven Lax <[email protected]>:
>>>>>
>>>>>> I'm confused, as FileIO doesn't seem the same as vfs. Are you maybe
>>>>>> referring to the filesystem abstraction instead?
>>>>>>
>>>>>> On Mon, Mar 5, 2018, 3:19 PM Romain Manni-Bucau <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hi guys,
>>>>>>>
>>>>>>> What's the rationale behind the FileIO impl?
>>>>>>>
>>>>>>> Why not use commons-vfs + a pluggable format? Sounds way more open
>>>>>>> and reusable for end users than a few hardcoded supported formats, no?
>>>>>>> What's the blocker? If there is a blocker, can't we contribute to
>>>>>>> [vfs] to make it disappear?
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>
