Hi,

I'd appreciate it if someone could help with this; my original question is quoted below.
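To make the question a bit more concrete, this is roughly what one of the affected tasks does today. The bucket and key names are placeholders, the task function name is made up, and it mixes S3Hook calls with the raw boto objects the hook hands back, so please treat it as a sketch rather than our exact code:

    from airflow.hooks.S3_hook import S3Hook

    def rotate_report(**context):
        # 's3' is the Conn Id created in the UI (Extra currently holds the keys)
        hook = S3Hook(s3_conn_id='s3')

        # Reads like this keep working even when Extra is left empty
        key = hook.get_key('reports/latest.csv', bucket_name='my-bucket')
        data = key.get_contents_as_string()

        # Creating or deleting objects is what comes back as 403 Forbidden
        bucket = hook.get_bucket('my-bucket')
        bucket.new_key('reports/archive/latest.csv').set_contents_from_string(data)
        bucket.delete_key('reports/latest.csv')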
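For comparison, a standalone boto script on the same machine can both read and write without any keys in code, picking the credentials up from the system boto config (again, the bucket and key names here are made up):

    import boto

    # No keys passed in code; boto reads them from /etc/boto.cfg or ~/.boto
    conn = boto.connect_s3()
    bucket = conn.get_bucket('my-bucket')
    bucket.new_key('healthcheck/ping.txt').set_contents_from_string('ok')
    bucket.delete_key('healthcheck/ping.txt')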
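And on the environment-variable question, what I was hoping Chef could drop in is something along these lines. The exact URI shape is the part I can't find documented, so this is only my guess, shown via os.environ just to illustrate the value; Chef would export it in the environment:

    import os

    # Only a guess at the URI shape; this is exactly the part I can't find
    # documented. The secret would need to be URL-encoded if it contains
    # characters like '/' or '+'.
    os.environ['AIRFLOW_CONN_S3'] = 's3://AKIAXXXXXXXXXXXX:url-encoded-secret@'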
Thanks,
Nadeem

On Fri, Jul 15, 2016 at 4:15 PM, Nadeem Ahmed Nazeer <[email protected]> wrote:
> Hello,
>
> We are using the S3Hook in several of our Airflow TIs to read and write
> data from S3.
>
> We create an S3 connection from the UI with the following options:
>
> Conn Id - s3
> Conn Type - S3
> Extra - {"aws_access_key_id": "key", "aws_secret_access_key": "key"}
>
> In the pipeline code we use this connection like so:
>
> s3 = S3Hook(s3_conn_id='s3')
>
> We are looking into other ways to define this connection, since leaving
> the keys exposed like this is a security issue. We tried defining only the
> connection id and connection type in the UI, without the keys. In that
> case, the tasks that read from S3 succeed, but the ones that delete or
> create files/objects fail with a '403 Forbidden' error from S3. Digging
> into the S3_hook code, I found that if the keys are not in the Extra
> parameter it should fall back to the boto config, but that doesn't seem
> to work in my case for reasons I haven't been able to find.
>
> All our other Python scripts interact with S3 using the boto config on
> the system without any problems.
>
> 1)
> Why isn't the S3 hook using the boto config? Am I missing some other
> parameter that needs to be passed to this connection?
>
> 2)
> How do I define the S3 connection as an environment variable? We install
> Airflow via Chef and would like an environment variable such as
> AIRFLOW_CONN_S3 created for this connection, so that we don't have to set
> it up manually in the UI every time we run the setup.
>
> The documentation says the connection has to be in a URI format. On S3, I
> can access different buckets with the same connection. But since it has to
> be in URI format, does that mean I have to create one connection per
> bucket? I couldn't find any examples of this anywhere, hence the question.
>
> Thanks,
> Nadeem
