Can you explain that a little bit? Right now, our pipeline code is structured
like this:
    if __name__ == '__main__':
        setup_credentials()  # exports env vars for default boto session
        run_pipeline()  # runs all the beam stuff
So I expect every worker to set up its environment before running any Beam
code. This seems to work fine. Is there an issue lurking here?
Ross
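
One possible issue, sketched under assumptions: a remote worker typically
imports (or unpickles) the pipeline code rather than running the file as a
script, so a block guarded by `if __name__ == '__main__'` would only execute
in the launching process. The stdlib-only sketch below illustrates that
difference without Beam. The module name `demo_pipeline` and the variable
`DEMO_AWS_ROLE_SESSION_NAME` are hypothetical names invented for the
demonstration, and it does not show how any particular runner loads code.

```python
import importlib.util
import os
import runpy
import tempfile
import textwrap

DEMO_VAR = 'DEMO_AWS_ROLE_SESSION_NAME'  # hypothetical variable, demo only

# Hypothetical stand-in for a pipeline module with a __main__ guard.
MODULE_SOURCE = textwrap.dedent("""
    import os

    def setup_credentials():
        os.environ['DEMO_AWS_ROLE_SESSION_NAME'] = 'my-beam-pipeline'

    if __name__ == '__main__':
        setup_credentials()
""")


def env_set_after(load):
    """Clear the demo variable, call load(), and report whether it got set."""
    os.environ.pop(DEMO_VAR, None)
    load()
    return DEMO_VAR in os.environ


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo_pipeline.py')
    with open(path, 'w') as f:
        f.write(MODULE_SOURCE)

    def run_as_script():
        # Executes the file with __name__ == '__main__', like `python file.py`.
        runpy.run_path(path, run_name='__main__')

    def import_as_module():
        # Executes the file with __name__ == 'demo_pipeline', like an import.
        spec = importlib.util.spec_from_file_location('demo_pipeline', path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)

    set_when_run = env_set_after(run_as_script)
    set_when_imported = env_set_after(import_as_module)

os.environ.pop(DEMO_VAR, None)  # leave the environment as we found it
print('set when run as a script:', set_when_run)
print('set when imported:', set_when_imported)
```

If the runner in use behaves like the import case, the variables would be
present on the launcher but absent on the workers, which could explain a
pipeline that works locally and fails remotely.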
On Wed, 2020-09-30 at 17:57 -0700, Pablo Estrada wrote:
> **This message came from an external sender.**
>
> You may need to set those up in setup.py so that the code runs in every
> worker at startup.
>
> On Wed, Sep 30, 2020, 10:16 AM Ross Vandegrift <
> [email protected]> wrote:
> > I see - it'd be great if the s3 io code would accept a boto session, so the
> > default process could be overridden.
> >
> > But it looks like the module lazy loads boto3 and uses the default
> > session. So I think it'll work if we set up SDK env vars before the
> > pipeline code.
> >
> > i.e., we'll try something like:
> >
> > os.environ['AWS_ROLE_ARN'] = 'arn:aws:...'
> > os.environ['AWS_ROLE_SESSION_NAME'] = 'my-beam-pipeline'
> > os.environ['AWS_WEB_IDENTITY_TOKEN_FILE'] = '/path/to/id_token'
> >
> > with beam.Pipeline(...) as p:
> >     ...
> >
> > Ross
> >
> > On Tue, 2020-09-29 at 14:29 -0700, Pablo Estrada wrote:
> > >
> > > Hi Ross,
> > > it seems that this feature is missing (e.g. passing a pipeline option with
> > > authentication information for AWS). I'm sorry about that - that's pretty
> > > annoying.
> > > I wonder if you can use the setup.py file to add the default configuration
> > > yourself until we have appropriate support for pipeline option-based
> > > authentication. Could you try adding this default config in setup.py?
> > > Best
> > > -P.
> > >
> > > On Tue, Sep 29, 2020 at 11:16 AM Ross Vandegrift <
> > > [email protected]> wrote:
> > > > Hello all,
> > > >
> > > > I have a python pipeline that writes data to an s3 bucket. On my laptop
> > > > it picks up the SDK credentials from my boto3 config and works great.
> > > >
> > > > Is it possible to provide credentials explicitly? I'd like to use remote
> > > > dataflow runners, which won't have implicit AWS credentials available.
> > > >
> > > > Thanks,
> > > > Ross
> > > >
> > >
> > > This message came from an external source. Please exercise caution when
> > > opening attachments or clicking on links.
> >
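
The suggestion in the thread to add the default configuration from setup.py
could take the form of a shared AWS config file rather than environment
variables. The following is a minimal sketch under assumptions: it uses only
the standard library, the helper name `write_default_aws_config` is
hypothetical, the role ARN is a placeholder, and how the helper would be
invoked from setup.py on each worker (and which home directory the worker
process reads) is not confirmed by the thread. The keys `role_arn`,
`web_identity_token_file` and `role_session_name` are the config-file
counterparts of the three environment variables quoted above.

```python
import configparser
import os
import tempfile


def write_default_aws_config(path, role_arn, token_file, session_name):
    """Write a [default] profile that assumes a role via a web identity token."""
    config = configparser.ConfigParser()
    config['default'] = {
        'role_arn': role_arn,
        'web_identity_token_file': token_file,
        'role_session_name': session_name,
    }
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, 'w') as f:
        config.write(f)
    return path


# Demo against a temporary directory instead of a real ~/.aws/config.
with tempfile.TemporaryDirectory() as tmp:
    written = write_default_aws_config(
        os.path.join(tmp, '.aws', 'config'),
        role_arn='arn:aws:iam::123456789012:role/example-role',  # placeholder
        token_file='/path/to/id_token',
        session_name='my-beam-pipeline',
    )
    check = configparser.ConfigParser()
    check.read(written)
    profile = dict(check['default'])

print(profile)
```

Writing a file has one advantage over exporting variables in the launcher:
it persists on the machine where it is written, so any process started there
later can read it, provided the step actually runs on every worker.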