Re: Provide credentials for s3 writes

2020-10-08 Thread Ross Vandegrift
I've worked through adapting this to Dataflow; it's simple enough once you try all of the things that don't work. :) In setup.py, write out config files with an identity token and a boto3 config file. File-based config was essential, I couldn't get env vars working. Here's a sample. Be careful!
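
The sample itself isn't included in this snippet. As a rough sketch of the approach, assuming the worker fetches a GCP identity token from the metadata server and boto3 exchanges it via AssumeRoleWithWebIdentity: the role ARN, audience, and file paths below are placeholders, and the custom-command wiring follows the usual Beam setup.py pattern rather than the actual sample.

    # setup.py -- illustrative sketch only; the role ARN, audience, and paths
    # are placeholders, not values from the original sample.
    import os
    import urllib.request

    import setuptools
    from distutils.command.build import build as _build

    METADATA_URL = (
        "http://metadata.google.internal/computeMetadata/v1/instance/"
        "service-accounts/default/identity?audience=placeholder-audience")
    TOKEN_PATH = "/tmp/gcp_identity_token"
    AWS_CONFIG_PATH = os.path.expanduser("~/.aws/config")

    class write_aws_config(setuptools.Command):
        """Write an identity token and a boto3 config file on the worker."""
        user_options = []

        def initialize_options(self):
            pass

        def finalize_options(self):
            pass

        def run(self):
            # Fetch an identity token from the GCE metadata server.
            req = urllib.request.Request(
                METADATA_URL, headers={"Metadata-Flavor": "Google"})
            token = urllib.request.urlopen(req).read().decode("utf-8")
            with open(TOKEN_PATH, "w") as f:
                f.write(token)
            # File-based config: boto3 reads ~/.aws/config by default and will
            # call AssumeRoleWithWebIdentity using the token file.
            os.makedirs(os.path.dirname(AWS_CONFIG_PATH), exist_ok=True)
            with open(AWS_CONFIG_PATH, "w") as f:
                f.write("[default]\n"
                        "role_arn = arn:aws:iam::123456789012:role/placeholder\n"
                        "web_identity_token_file = %s\n" % TOKEN_PATH)

    class build(_build):
        # Run the credential step as part of the normal build on each worker.
        sub_commands = [("write_aws_config", None)] + _build.sub_commands

    setuptools.setup(
        name="my_pipeline",
        version="0.0.1",
        packages=setuptools.find_packages(),
        cmdclass={"build": build, "write_aws_config": write_aws_config})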

Re: Provide credentials for s3 writes

2020-10-01 Thread Ross Vandegrift
Can you explain that a little bit? Right now, our pipeline code is structured like this:

    if __name__ == '__main__':
        setup_credentials()  # exports env vars for default boto session
        run_pipeline()       # runs all the beam stuff

So I expect every worker to set up their environment b
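
A sketch of that structure (setup_credentials and run_pipeline are the names from the message; the bodies here are only illustrative guesses):

    import os

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    def setup_credentials():
        # Export env vars that boto3's default session reads; real values
        # would come from wherever the pipeline keeps its secrets.
        os.environ["AWS_ACCESS_KEY_ID"] = "placeholder"
        os.environ["AWS_SECRET_ACCESS_KEY"] = "placeholder"

    def run_pipeline():
        # Build and run the Beam pipeline; the actual transforms live here.
        with beam.Pipeline(options=PipelineOptions()) as p:
            p | beam.Create(["placeholder"])

    if __name__ == '__main__':
        setup_credentials()
        run_pipeline()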

Re: Provide credentials for s3 writes

2020-09-30 Thread Ross Vandegrift
I see - it'd be great if the s3 io code would accept a boto session, so the default process could be overridden. But it looks like the module lazily loads boto3 and uses the default session. So I think it'll work if we set up SDK env vars before the pipeline code, i.e., we'll try something like: o
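
A short sketch of that idea, exporting the standard AWS SDK environment variables before any S3 IO triggers client creation, so the lazily created default boto3 session picks them up (values are placeholders):

    import os

    # Must happen before anything builds an S3 client.
    os.environ["AWS_ACCESS_KEY_ID"] = "placeholder"
    os.environ["AWS_SECRET_ACCESS_KEY"] = "placeholder"
    os.environ["AWS_DEFAULT_REGION"] = "us-east-1"

    import boto3

    # The default session reads the variables above when the client is built.
    s3 = boto3.client("s3")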

Re: Provide credentials for s3 writes

2020-09-29 Thread Pablo Estrada
Hi Ross, it seems that this feature is missing (e.g. passing a pipeline option with authentication information for AWS). I'm sorry about that - that's pretty annoying. I wonder if you can use the setup.py file to add the default configuration yourself until we have appropriate support for a pipelin
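
A minimal sketch of that suggestion in its simplest form, a setup.py step that writes a static credentials file on each worker (the key values are placeholders; the identity-token variant appears in the 2020-10-08 message above):

    # Fragment for setup.py -- illustrative only; keys are placeholders.
    import os

    def write_default_aws_credentials():
        path = os.path.expanduser("~/.aws/credentials")
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write("[default]\n"
                    "aws_access_key_id = PLACEHOLDER\n"
                    "aws_secret_access_key = PLACEHOLDER\n")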

Provide credentials for s3 writes

2020-09-29 Thread Ross Vandegrift
Hello all, I have a Python pipeline that writes data to an s3 bucket. On my laptop it picks up the SDK credentials from my boto3 config and works great. Is it possible to provide credentials explicitly? I'd like to use remote Dataflow runners, which won't have implicit AWS credentials available
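
For context, a minimal sketch of such a pipeline, assuming the SDK's S3 filesystem support (apache_beam[aws]) is installed; the bucket name is a placeholder. On a laptop the default boto3 credential chain supplies the credentials; a remote worker has no such implicit configuration.

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    with beam.Pipeline(options=PipelineOptions()) as p:
        (p
         | beam.Create(["some", "records"])
         | beam.io.WriteToText("s3://example-bucket/output"))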