I am a little concerned about the lack of a scheme. The scheme is an
important part of how filesystems work in Beam. If you look at [1],
you'll see that we figure out which filesystem to use based on the scheme.
This allows us to do ReadFromText('gs://my_gcs_bucket/my_file') or
ReadFromText('s3://...') or ReadFromText('hdfs://...') without the user
being concerned about where the files are stored.
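
To make the concern concrete, here is a rough sketch of the kind of
scheme-based lookup I mean. It is not the actual code in [1]: the
registry, the filesystem names, and the helpers get_scheme() and
pick_filesystem() are hypothetical stand-ins, and I am assuming a path
with no scheme is treated as a local file.

```python
import re

# Hypothetical registry: URI scheme -> filesystem name.
# The names are illustrative only, not Beam's real classes.
FILESYSTEMS_BY_SCHEME = {
    'gs': 'GCSFileSystem',
    's3': 'S3FileSystem',
    'hdfs': 'HadoopFileSystem',
}

# A scheme is the part before '://', e.g. 'gs' in 'gs://bucket/file'.
SCHEME_PATTERN = re.compile(r'(?P<scheme>[a-zA-Z][-a-zA-Z0-9+.]*)://.*')


def get_scheme(path):
    """Return the scheme of `path`, or None if it has none."""
    match = SCHEME_PATTERN.match(path)
    return match.group('scheme') if match else None


def pick_filesystem(path):
    """Choose a filesystem name purely from the path's scheme."""
    scheme = get_scheme(path)
    if scheme is None:
        # Assumption for this sketch: no scheme means a local path.
        return 'LocalFileSystem'
    try:
        return FILESYSTEMS_BY_SCHEME[scheme]
    except KeyError:
        raise ValueError('No filesystem registered for scheme %r' % scheme)


if __name__ == '__main__':
    print(pick_filesystem('gs://my_gcs_bucket/my_file'))
    print(pick_filesystem('/tmp/my_file'))
```

With a lookup like this, a path that carries no distinguishing scheme
(or only a generic one such as https, which is my assumption about how
Azure blobs are usually addressed) gives the dispatcher nothing to key
on.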
I have looked around a bit, and you're right that Azure Blob Store does not
seem to rely on any form of scheme. We'll need to think about how to make
this work...
cc: +Chamikara Jayalath <[email protected]>
[1]
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/filesystems.py#L66-L119
On Fri, Mar 27, 2020 at 10:27 AM Badrul Chowdhury <
[email protected]> wrote:
> Hi Pablo,
>
> Thanks for reviewing the proposal. I have replied to your comment about the
> return value of scheme() for Azure Blob Store, please let me know what you
> think.
>
>
> Thanks,
> Badrul
>
> On Tue, Mar 24, 2020 at 1:59 PM Badrul Chowdhury <
> [email protected]> wrote:
>
>> Hi All,
>>
>> I would love to hear your thoughts on my proposal for adding Python SDK
>> support for Azure Blob Store I/O:
>> https://docs.google.com/document/d/173e_gnDclwavqobiNjwxRlo9D1xjaZat98g6Yax0kGQ/edit?usp=sharing
>>
>> Stay safe!
>>
>> Thanks,
>> Badrul
>>
>
>
> --
>
> Cheers,
> Badrul
>