Hey Tyrone-
I just set this up on 1.7.1.2 and found the documentation confusing
too. Been meaning to improve it. To get S3 logging
configured I:
(a) Set up an S3Connection (let's call it foo) with only the extra
param set to the following json:
{ "s3_config_file": "/usr/local/airflow/.aws/credentials",
"s3_config_format": "aws" }
(b) Added a remote_log_conn_id key to the core section of airflow.cfg,
with a value of "foo" (my S3Connection name)
(c) Added a remote_base_log_folder key to the core section of
airflow.cfg, with a value of "s3://where/i/put/my/logs"
Everything worked after that.
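
For reference, here's a sketch of how those three pieces fit together
(the connection name, config path, and bucket are just my values;
substitute your own):

```ini
; airflow.cfg -- [core] section
[core]
remote_log_conn_id = foo
remote_base_log_folder = s3://where/i/put/my/logs
```

Note that the "foo" connection itself carries no host or credentials;
its extra JSON just points the S3 hook at a boto-style credentials file.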
-Jakob
On 15 June 2016 at 15:35, Tyrone Hinderson <[email protected]> wrote:
> @Jeremiah,
>
> http://pythonhosted.org/airflow/configuration.html#logs
>
> I used to log to s3 in 1.7.0, and my .aws/credentials file would take
> care of authenticating in the background. Now it appears that I need to set
> that "remote_log_conn_id" config field in order to continue logging to s3
> in 1.7.1.2. Rather than create the connection in the web UI (afaik,
> impractical to do programmatically), I'd like to use an
> "AIRFLOW_CONN_"-style env variable. I've tried a URI like
> s3://[access_key_id]:[secret_key]@[bucket].s3-[region].amazonaws.com, but
> that hasn't worked:
>
> =====================================
> [2016-06-15 21:40:26,583] {base_hook.py:53} INFO - Using connection to:
> [bucket].s3-us-east-1.amazonaws.com
>
> [2016-06-15 21:40:26,583] {logging.py:57} ERROR - Could not create an
> S3Hook with connection id "S3_LOGS". Please make sure that airflow[s3] is
> installed and the S3 connection exists.
>
> =====================================
>
> It's clear that my connection exists because of the "Using connection to:"
> line. However, I fear that my connection URI string is malformed. Can you
> provide some guidance on how to properly form an S3 connection URI? So far
> I've pieced one together from a mixture of Wikipedia's URI format
> <https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#Examples>
> and Amazon's S3 URI format
> <http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingBucket.html>.
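>
> To be concrete, the latest form I've tried looks like this (the keys
> are placeholders; I percent-encoded the secret in case "/" or "+"
> characters in it break URI parsing):
>
> ```shell
> # conn id must match remote_log_conn_id in airflow.cfg
> export AIRFLOW_CONN_S3_LOGS='s3://AKIAEXAMPLE:abc%2Fdef%2Bghi@S3'
> ```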
>
> On Tue, May 24, 2016 at 6:03 PM Jeremiah Lowin <[email protected]> wrote:
>
>> Where are you seeing that an S3 connection is required? It will only be
>> accessed if you've told Airflow to send logs to S3. The config option can
>> also be null (default) or a Google Storage location.
>>
>> The S3 connection is a standard Airflow connection. If you would like it to
>> use environment variables or a boto config, it will -- but the connection
>> object itself must be created in Airflow. See the S3 hook for details.
>>
>>
>> On Tue, May 24, 2016 at 3:57 PM George Leslie-Waksman <
>> [email protected]> wrote:
>>
>> > We ran into this issue as well. If you set the environment variable to
>> > anything random, it'll get ignored and control will pass through to
>> > .aws/credentials
>> >
>> > We used "n/a"
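>> >
>> > Concretely, something like this (the value itself is never used as a
>> > credential; boto just falls back to .aws/credentials):
>> >
>> > ```shell
>> > export AIRFLOW_CONN_S3=n/a
>> > ```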
>> >
>> > It's kind of annoying that the s3 connection is a) required, and b)
>> > poorly supported as an env var.
>> >
>> > On Tue, May 24, 2016 at 8:37 AM Tyrone Hinderson <[email protected]>
>> > wrote:
>> >
>> > > I was logging to S3 in 1.7.0, but now I need to create an S3
>> > > "Connection" in airflow (for remote_log_conn_id) to keep doing that
>> > > in 1.7.1.2. Rather than set this "S3" connection in the UI, I'd like
>> > > to set an AIRFLOW_CONN_S3 env variable. What does an airflow-friendly
>> > > s3 "connection string" look like?
>> > >
>> >
>>