Thanks Bolke and Feng!

I seem to have a working connection to GCS, but an error is occurring
in the gcs_task_handler in Airflow:

Traceback (most recent call last):
  File "/usr/local/bin/airflow", line 27, in <module>
    args.func(args)
  File "/usr/local/lib/python3.5/dist-packages/airflow/bin/cli.py", line 423, in run
    logging.shutdown()
  File "/usr/lib/python3.5/logging/__init__.py", line 1882, in shutdown
    h.close()
  File "/usr/local/lib/python3.5/dist-packages/airflow/utils/log/gcs_task_handler.py", line 87, in close
    self.gcs_write(log, remote_loc)
  File "/usr/local/lib/python3.5/dist-packages/airflow/utils/log/gcs_task_handler.py", line 144, in gcs_write
    log = '\n'.join([old_log, log]) if old_log else log
UnboundLocalError: local variable 'old_log' referenced before assignment

I believe the connection is working because the tasks are getting a 404
instead of 403 when trying to read from remote logs, but they aren't being
written because of the above error.

For example:

*** Unable to read remote log from gs://<mybucket>/<...>/2017-12-20T15:21:23.704614+00:00/1.log
*** <HttpError 404 when requesting https://www.googleapis.com/storage/v1/b/<mybucket>/o/<...>F2017-12-20T15%3A21%3A23.704614%2B00%3A00%2F1.log?alt=media returned "Not Found">
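
The UnboundLocalError in the traceback above is the usual symptom of a
variable that is assigned only inside a try block, when the failure path
never binds it. The following is a minimal sketch of that pattern and one
way to avoid it. It is not Airflow's actual code: read_remote,
merge_logs_buggy and merge_logs_fixed are hypothetical names used only for
illustration.

```python
def read_remote(path):
    # Hypothetical stand-in for a remote read that fails, e.g. on a 404.
    raise IOError("404 Not Found: " + path)


def merge_logs_buggy(log, path):
    try:
        old_log = read_remote(path)
    except IOError:
        pass  # old_log is never bound on this path
    # Raises UnboundLocalError when the read above failed.
    return '\n'.join([old_log, log]) if old_log else log


def merge_logs_fixed(log, path):
    old_log = None  # bind the name before the try so it always exists
    try:
        old_log = read_remote(path)
    except IOError:
        pass
    return '\n'.join([old_log, log]) if old_log else log
```

With the name bound up front, a missing remote log simply means the new
log is written on its own, which is what a first write to an empty bucket
path would need.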


On Wed, Dec 20, 2017 at 1:48 AM, Bolke de Bruin <bdbr...@gmail.com> wrote:

> Both will/should work; master is just cleaner and more manageable.
>
> B.
>
> Sent from my iPad
>
> > On 19 Dec 2017, at 23:44, Kevin Lam <ke...@fathomhealth.co> wrote:
> >
> > Looks like it might be related to
> > https://github.com/apache/incubator-airflow/commit/02ff8ae35dd16e6f23d29d7b24a5fb9c09d0b7a4?
> > Why isn't this fix on the v1-9 branches? Should I be using master instead?
> >
> >> On Tue, Dec 19, 2017 at 5:37 PM, Kevin Lam <ke...@fathomhealth.co> wrote:
> >>
> >> Hi Feng,
> >>
> >> Thanks for your help! Got it, I will try to push ahead with the
> >> Python-based logging config.
> >>
> >> I'm trying to set up GCS logging on Airflow v1-9-stable, and my
> >> logging_config.py seems to be causing a Python import error, triggered by
> >> 'from airflow import configuration':
> >>
> >> "Initialize database...
> >> Unable to load the config, contains a configuration error.
> >> Traceback (most recent call last):
> >>  File "/usr/lib/python3.5/logging/config.py", line 384, in resolve
> >>    self.importer(used)
> >> ImportError: No module named 'airflow.utils.log.logging_mixin.RedirectStdHandler'; 'airflow.utils.log.logging_mixin' is not a package
> >>
> >> The above exception was the direct cause of the following exception:
> >>
> >> Traceback (most recent call last):
> >>  File "/usr/lib/python3.5/logging/config.py", line 558, in configure
> >>    handler = self.configure_handler(handlers[name])
> >>  File "/usr/lib/python3.5/logging/config.py", line 708, in configure_handler
> >>    klass = self.resolve(cname)
> >>  File "/usr/lib/python3.5/logging/config.py", line 391, in resolve
> >>    raise v
> >>  File "/usr/lib/python3.5/logging/config.py", line 384, in resolve
> >>    self.importer(used)
> >> ValueError: Cannot resolve 'airflow.utils.log.logging_mixin.RedirectStdHandler': No module named 'airflow.utils.log.logging_mixin.RedirectStdHandler'; 'airflow.utils.log.logging_mixin' is not a package
> >>
> >> During handling of the above exception, another exception occurred:
> >>
> >> Traceback (most recent call last):
> >>  File "/usr/local/bin/airflow", line 16, in <module>
> >>    from airflow import configuration
> >>  File "/usr/local/lib/python3.5/dist-packages/airflow/__init__.py", line 31, in <module>
> >>    from airflow import settings
> >>  File "/usr/local/lib/python3.5/dist-packages/airflow/settings.py", line 148, in <module>
> >>    configure_logging()
> >>  File "/usr/local/lib/python3.5/dist-packages/airflow/logging_config.py", line 75, in configure_logging
> >>    raise e
> >>  File "/usr/local/lib/python3.5/dist-packages/airflow/logging_config.py", line 70, in configure_logging
> >>    dictConfig(logging_config)
> >>  File "/usr/lib/python3.5/logging/config.py", line 795, in dictConfig
> >>    dictConfigClass(config).configure()
> >>  File "/usr/lib/python3.5/logging/config.py", line 566, in configure
> >>    '%r: %s' % (name, e))
> >> ValueError: Unable to configure handler 'console': Cannot resolve 'airflow.utils.log.logging_mixin.RedirectStdHandler': No module named 'airflow.utils.log.logging_mixin.RedirectStdHandler'; 'airflow.utils.log.logging_mixin' is not a package
> >> HTTP/1.1 200 OK
> >> [the same "Unable to load the config" traceback is printed a second time]"
> >>
> >> Have you encountered this before?
> >>
> >> On Mon, Dec 18, 2017 at 8:53 PM, Feng Lu <fen...@google.com.invalid> wrote:
> >>
> >>> Hi Kevin,
> >>>
> >>> Kindly see my reply inline:
> >>>
> >>>> On Mon, Dec 18, 2017 at 3:28 PM, Kevin Lam <ke...@fathomhealth.co> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> I'm trying to get Airflow to use GCS for logging and have a few
> >>>> questions.
> >>>>
> >>>> We're currently using Airflow 1.9rc2, running in a Kubernetes Airflow
> >>>> deployment (similar to https://github.com/mumoshu/kube-airflow).
> >>>>
> >>>> 1/ It seems the logging code has been going through some changes in
> >>>> recent versions. What's the correct way to set up GCS for logging? Is it
> >>>> by just specifying remote_base_log_folder and remote_log_conn_id in
> >>>> airflow.cfg? Or by following this guide:
> >>>> http://airflow.readthedocs.io/en/latest/integration.html#gcp, using the
> >>>> Python-based logging config? Is there an Airflow version that we should
> >>>> use to be most stable?
> >>>>
> >>> The Python-based logging config is the right place to make changes. In
> >>> our test setup, we override airflow_local_settings.py much as the link
> >>> you pasted describes.
> >>> You may also want to set: [core] task_log_reader = gcs.task
> >>>
> >>>
> >>>>
> >>>> 2/ Is there a way to encode the GCS connection in a file so that one
> >>>> doesn't have to open the webserver and create it from the admin panel?
> >>>> It'd be nice if the GCS connection were created automatically.
> >>>>
> >>> Unfortunately, a GCS connection is tied to a GCP project and is
> >>> impossible to pre-populate.
> >>> Airflow 1.9 should fix the GCP connection type issue
> >>> (https://github.com/apache/incubator-airflow/commit/2f107d8a30910fd025774004d5c4c95407ed55c5),
> >>> so you can use the airflow connections CLI directly.
> >>>
> >>>
> >>>>
> >>>> Thanks in advance for your help!
> >>>>
> >>>
> >>
> >>
>
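
The "Unable to configure handler 'console'" failure quoted in the thread
above is raised by logging.config.dictConfig when it cannot import the
dotted class path given for a handler. The following is a minimal
reproduction using only the standard library, offered as a sketch. The
class path logging.handlers.NoSuchHandler is a deliberately nonexistent,
hypothetical name standing in for a handler class that the installed
version does not provide, and try_configure is a hypothetical helper.

```python
import logging.config

# Hypothetical config: the handler class path points at a name that does
# not exist, mirroring a config written for a different library version.
BROKEN_CONFIG = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        'console': {
            'class': 'logging.handlers.NoSuchHandler',
        },
    },
    'root': {'handlers': ['console'], 'level': 'INFO'},
}


def try_configure(config):
    """Return None on success, or the message of the error dictConfig raised."""
    try:
        logging.config.dictConfig(config)
    except ValueError as exc:
        return str(exc)
    return None


error = try_configure(BROKEN_CONFIG)
print(error)
```

Running a check like this against a logging config before deploying it
would surface a class path that the installed version cannot resolve,
without having to start any Airflow process.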
