We use native boto without the airflow S3 hook or airflow S3 connection storage. Boto can pick up authentication keys based on a hierarchy defined at http://boto.cloudhackers.com/en/latest/boto_config_tut.html#details, or you could consume the credentials some other way programmatically (environment variable, directly through the airflow db) and send them to your boto session yourself.
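For reference, both approaches look roughly like this with boto 2.x (untested sketch; the environment variable names, bucket and key are just placeholders):

    import os
    import boto

    # Let boto walk its own credential chain (env vars, ~/.boto,
    # /etc/boto.cfg, IAM instance role); no keys stored in airflow at all:
    conn = boto.connect_s3()

    # Or fetch the keys yourself (env vars, airflow db, etc.) and pass
    # them to boto explicitly:
    conn = boto.connect_s3(
        aws_access_key_id=os.environ.get('MY_AWS_ACCESS_KEY'),
        aws_secret_access_key=os.environ.get('MY_AWS_SECRET_KEY'),
    )

    bucket = conn.get_bucket('some-bucket')      # placeholder bucket name
    key = bucket.get_key('path/to/some/object')  # placeholder key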
That specific error looks more like the policies attached to your authentication credentials don't grant the right permissions on the resource you're after. In my experience, AWS policies are tricky. One gotcha I know of when accessing keys in S3 buckets via boto is that your AWS policy has to whitelist the ListBucket action on all buckets (not just the one you're after), which was unintuitive to me; there's a rough sketch of that policy shape at the bottom of this mail, below the quoted thread.

On Mon, Jul 25, 2016 at 8:07 PM, Nadeem Ahmed Nazeer <[email protected]> wrote:

> Thanks Paul.
>
> Has anyone tried using native boto instead of s3hook in airflow tasks
> (python callable)? I tried using it but get an "S3ResponseError: 403
> Forbidden". Just wondering if we are restricted to using only s3hook.
> The reason I want to use native boto is to avoid defining the connection
> for the s3hook.
>
> Thanks,
> Nadeem
>
> On Tue, Jul 19, 2016 at 10:47 AM, Paul Minton <[email protected]> wrote:
>
> > For 2) we're using something similar to
> > https://gist.github.com/syvineckruyk/d2c96b418ed509a174e0e718cb62b20a
> > to programmatically load Connection and Variable objects. Those
> > scripts run with each chef client run.
> >
> > On Tue, Jul 19, 2016 at 10:04 AM, Nadeem Ahmed Nazeer
> > <[email protected]> wrote:
> >
> > > Thanks for the response Paul. The Crypto package would be my last
> > > resort, but it would still not solve (2), where I am looking to
> > > create the connection automatically.
> > >
> > > Awaiting further answers..
> > >
> > > Thanks,
> > > Nadeem
> > >
> > > On Tue, Jul 19, 2016 at 9:03 AM, Paul Minton <[email protected]>
> > > wrote:
> > >
> > > > Nadeem, this doesn't directly answer either 1) or 2), but have you
> > > > considered using the option "is_extra_encrypted"? This encrypts
> > > > the extra JSON as it does the rest of the credentials on the
> > > > connection object (i.e. using a fernet key and the encryption
> > > > package).
> > > >
> > > > On Mon, Jul 18, 2016 at 10:00 PM, Nadeem Ahmed Nazeer
> > > > <[email protected]> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I'd appreciate it if someone could please provide assistance on
> > > > > this.
> > > > >
> > > > > Thanks,
> > > > > Nadeem
> > > > >
> > > > > On Fri, Jul 15, 2016 at 4:15 PM, Nadeem Ahmed Nazeer
> > > > > <[email protected]> wrote:
> > > > >
> > > > > > Hello,
> > > > > >
> > > > > > We are using the S3Hook in several of our airflow TIs to read
> > > > > > and write data from S3.
> > > > > >
> > > > > > We are creating an S3 connection from the UI where we choose
> > > > > > the options below:
> > > > > >
> > > > > > Conn Id - s3
> > > > > > Conn Type - S3
> > > > > > Extra - {"aws_access_key_id": "key", "aws_secret_access_key": "key"}
> > > > > >
> > > > > > In pipeline code we use this connection as below:
> > > > > >
> > > > > > s3 = S3Hook(s3_conn_id='s3')
> > > > > >
> > > > > > We are looking into other options to define this connection,
> > > > > > as it is a security issue to have the keys exposed like this.
> > > > > > We tried defining only the connection id and connection type
> > > > > > in the UI, without the keys. In this case, the tasks that read
> > > > > > from S3 succeed but the ones that delete or create
> > > > > > files/objects fail with a '403 Forbidden' error from S3.
Did > > > some > > > > > > digging in the S3_Hook code and found that if the keys are not in > > the > > > > > Extra > > > > > > parameter then it would use the boto config but that doesn't seem > > to > > > > work > > > > > > in my case for reasons I am unable to find. > > > > > > > > > > > > All our other python scripts interact with S3 using the boto > config > > > on > > > > > the > > > > > > system without any problems. > > > > > > > > > > > > 1) > > > > > > Need help on why the s3 hook isn't using the boto config. Am I > > > missing > > > > to > > > > > > pass some other parameters to this connection? > > > > > > > > > > > > 2) > > > > > > How to define the s3 connection as environmental variable? We are > > > > > > installing airflow via Chef and would want to have an > environmental > > > > > > variable like AIRFLOW_CONN_S3 created for this connection so that > > we > > > > > don't > > > > > > have to manually do it in the UI every time we run the setup. > > > > > > > > > > > > Documentation says, it has the connection has to be in a URI > > format. > > > On > > > > > > S3, I could access different buckets with the same connection. > But > > > > since > > > > > it > > > > > > has to be in URI format, does that mean i create one connection > per > > > > > bucket > > > > > > and use it? Did not find any examples of this anywhere hence > > asking. > > > > > > > > > > > > Thanks, > > > > > > Nadeem > > > > > > > > > > > > > > > > > > > > >
