I also ran into another problem: I can't pass additional configuration to
a filesystem. For example, I can't write

    lines = pipeline | 'read' >> ReadFromText('s3://my-bucket/kinglear.txt',
                                              aws_config=AWSConfig())

because the ReadFromText class relies on PTransform's constructor, which
has a pre-defined set of arguments.
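
To make the "global context" alternative from my original message (quoted
below) more concrete, here is a minimal sketch. It is not Beam code: the
AWSConfig class, the aws_config_from_args helper and the --aws_* flag names
are all hypothetical, and I use plain argparse only to mimic how
pipeline-style flags could carry the settings. The environment variable
names are the standard ones that boto already picks up, used here as a
fallback.

```python
import argparse
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence


@dataclass(frozen=True)
class AWSConfig:
    """Hypothetical settings bundle; not part of Beam or boto."""
    access_key_id: Optional[str] = None
    secret_access_key: Optional[str] = None
    region: Optional[str] = None


def aws_config_from_args(
        argv: Sequence[str],
        environ: Optional[Mapping[str, str]] = None) -> AWSConfig:
    """Build an AWSConfig from pipeline-style flags (hypothetical names).

    Explicit flags win; otherwise fall back to the standard AWS
    environment variables.
    """
    if environ is None:
        environ = os.environ

    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    parser.add_argument('--aws_access_key_id', default=None)
    parser.add_argument('--aws_secret_access_key', default=None)
    parser.add_argument('--aws_region', default=None)

    # parse_known_args leaves flags meant for other components
    # (e.g. --runner) untouched instead of raising an error.
    known, _unknown = parser.parse_known_args(list(argv))

    return AWSConfig(
        access_key_id=(known.aws_access_key_id
                       or environ.get('AWS_ACCESS_KEY_ID')),
        secret_access_key=(known.aws_secret_access_key
                           or environ.get('AWS_SECRET_ACCESS_KEY')),
        region=known.aws_region or environ.get('AWS_DEFAULT_REGION'),
    )
```

With something like this, the filesystem could look the settings up once
from the pipeline's arguments instead of needing an extra keyword on every
ReadFromText call. It does not cover the case of different credentials for
different AWS operations.
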
This is probably turning into a topic for the dev list as well (did I add
it the right way?)
On Thu, Jul 6, 2017 at 1:27 PM, Dmitry Demeshchuk <[email protected]>
wrote:
> Hi folks,
>
> I'm working on an S3 filesystem for the Python SDK. It already works on
> the happy path for both reading and writing, but I feel like there are
> quite a few edge cases that I'm likely missing.
>
> So far, my approach has been: "look at the generic FileSystem
> implementation, look at how gcsio.py and gcsfilesystem.py are written, try
> to copy their approach as much as possible, at least for getting to the
> proof of concept".
>
> That said, I'd like to know a few things:
>
> 1. Are there any official or non-official guidelines or docs on writing
> filesystems? Even Java-specific ones may be really useful.
>
> 2. Are there any existing generic test suites that every filesystem is
> supposed to pass? Again, even if they exist only in the Java world, I'd
> still be down for trying to adopt them in the Python SDK too.
>
> 3. Are there any established ideas on how to pass AWS credentials to Beam
> so that the S3 filesystem actually works? I currently rely on the
> existing environment variables, which boto just picks up, but it sounds
> like setting them up in runners like Dataflow or Spark would be
> troublesome. I've seen this discussed a couple of times on the list, but
> couldn't tell whether any conclusion was reached. My personal preference
> would be to pass AWS settings in some global context (pipeline options,
> perhaps?), but there may be exceptions to that (say, people want to use
> different credentials for different AWS operations).
>
> Thanks!
>
> --
> Best regards,
> Dmitry Demeshchuk.
>
--
Best regards,
Dmitry Demeshchuk.