I've worked through adapting this to Dataflow; it's simple enough once you've
tried all of the things that don't work. :)
In setup.py, write out config files with an identity token and a boto3 config
file. File-based config was essential; I couldn't get env vars working.
Here's a sample. Be careful!
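A minimal sketch of that approach - assuming a build_py hook, an identity
token fetched from the GCE metadata server, and placeholder role ARN and
audience values:

import os
import urllib.request

import setuptools
from setuptools.command.build_py import build_py

TOKEN_PATH = '/tmp/identity_token'
METADATA_URL = (
    'http://metadata.google.internal/computeMetadata/v1/instance/'
    'service-accounts/default/identity?audience=<your-audience>')


class WriteBotoConfig(build_py):
    """Build hook: runs on each Dataflow worker when the package installs."""

    def run(self):
        # Fetch a Google identity token from the worker's metadata server.
        request = urllib.request.Request(
            METADATA_URL, headers={'Metadata-Flavor': 'Google'})
        with open(TOKEN_PATH, 'wb') as f:
            f.write(urllib.request.urlopen(request).read())

        # Point boto3's file-based config at that token via web identity.
        aws_dir = os.path.expanduser('~/.aws')
        os.makedirs(aws_dir, exist_ok=True)
        with open(os.path.join(aws_dir, 'config'), 'w') as f:
            f.write('[default]\n'
                    'role_arn = <your-aws-role-arn>\n'
                    'web_identity_token_file = %s\n' % TOKEN_PATH)

        build_py.run(self)


setuptools.setup(
    name='my_pipeline',
    version='0.0.1',
    packages=setuptools.find_packages(),
    cmdclass={'build_py': WriteBotoConfig},
)

One caveat: identity tokens expire, so a long-running job may need a way to
refresh the token file.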
Can you explain that a little bit? Right now, our pipeline code is structured
like this:
if __name__ == '__main__':
    setup_credentials()  # exports env vars for the default boto session
    run_pipeline()       # runs all the beam stuff
So I expect every worker to set up their environment before the pipeline code
runs.
I see - it'd be great if the s3 io code accepted a boto session, so the
default session could be overridden.
But it looks like the module lazy-loads boto3 and uses the default
session. So I think it'll work if we set up the SDK env vars before the
pipeline code runs.
i.e., we'll try something like:

import os

# set the SDK env vars before any pipeline code touches boto3
os.environ['AWS_ACCESS_KEY_ID'] = '<our key id>'
os.environ['AWS_SECRET_ACCESS_KEY'] = '<our secret key>'

run_pipeline()
Hi Ross,
it seems that this feature is missing (e.g. passing a pipeline option with
authentication information for AWS). I'm sorry about that - that's pretty
annoying.
I wonder if you can use the setup.py file to add the default configuration
yourself until we have appropriate support for a pipeline option.
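For concreteness, the "default configuration" boto3 reads is just an INI
file; a setup.py step could write something like this (placeholder keys) to
~/.aws/credentials on each worker:

[default]
aws_access_key_id = <access key id>
aws_secret_access_key = <secret access key>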
Hello all,
I have a python pipeline that writes data to an s3 bucket. On my laptop it
picks up the SDK credentials from my boto3 config and works great.
Is it possible to provide credentials explicitly? I'd like to use remote
Dataflow runners, which won't have implicit AWS credentials available.
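For reference, a minimal sketch of such a pipeline - the bucket name is a
placeholder:

import apache_beam as beam

# Locally this works because boto3 resolves credentials from ~/.aws;
# remote Dataflow workers have no such implicit AWS credentials.
with beam.Pipeline() as p:
    (p
     | beam.Create(['some', 'records'])
     | beam.io.WriteToText('s3://my-bucket/output'))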