Sorry, to clarify: this is now on the master branch.

On Wed, Dec 20, 2017 at 10:25 AM, Kevin Lam <[email protected]> wrote:
> Thanks Bolke and Feng!
>
> I seem to have a working connection with GCS, but it seems there is some
> error occurring in the gcs_task_handler in airflow:
>
> Traceback (most recent call last):
>   File "/usr/local/bin/airflow", line 27, in <module>
>     args.func(args)
>   File "/usr/local/lib/python3.5/dist-packages/airflow/bin/cli.py", line 423, in run
>     logging.shutdown()
>   File "/usr/lib/python3.5/logging/__init__.py", line 1882, in shutdown
>     h.close()
>   File "/usr/local/lib/python3.5/dist-packages/airflow/utils/log/gcs_task_handler.py", line 87, in close
>     self.gcs_write(log, remote_loc)
>   File "/usr/local/lib/python3.5/dist-packages/airflow/utils/log/gcs_task_handler.py", line 144, in gcs_write
>     log = '\n'.join([old_log, log]) if old_log else log
> UnboundLocalError: local variable 'old_log' referenced before assignment
>
> I believe the connection is working because the tasks are getting a 404
> instead of a 403 when trying to read from remote logs, but the logs aren't
> being written because of the above error.
>
> E.g.
>
> *** Unable to read remote log from gs://<mybucket>/<...>/2017-12-20T15:21:23.704614+00:00/1.log
> *** <HttpError 404 when requesting https://www.googleapis.com/storage/v1/b/<mybucket>/o/<...>F2017-12-20T15%3A21%3A23.704614%2B00%3A00%2F1.log?alt=media returned "Not Found">
>
>
> On Wed, Dec 20, 2017 at 1:48 AM, Bolke de Bruin <[email protected]> wrote:
>
>> Both will/should work, master is just cleaner and more manageable.
>>
>> B.
>>
>> Sent from my iPad
>>
>> > On 19 Dec 2017, at 23:44, Kevin Lam <[email protected]> wrote:
>> >
>> > Looks like it might be related to
>> > https://github.com/apache/incubator-airflow/commit/02ff8ae35dd16e6f23d29d7b24a5fb9c09d0b7a4?
>> > Why isn't this fix on the v1-9 branches? Should I be using master instead?
>> >
>> >> On Tue, Dec 19, 2017 at 5:37 PM, Kevin Lam <[email protected]> wrote:
>> >>
>> >> Hi Feng,
>> >>
>> >> Thanks for your help! Got it, will try to push on the python based
>> >> logging config.
>> >>
>> >> I'm trying to set up the GCS logging on airflow v1-9-stable and my
>> >> logging_config.py seems to be causing a python import error, caused by
>> >> 'from airflow import configuration'
>> >>
>> >> "Initialize database...
>> >> Unable to load the config, contains a configuration error.
>> >> Traceback (most recent call last):
>> >>   File "/usr/lib/python3.5/logging/config.py", line 384, in resolve
>> >>     self.importer(used)
>> >> ImportError: No module named 'airflow.utils.log.logging_mixin.RedirectStdHandler';
>> >> 'airflow.utils.log.logging_mixin' is not a package
>> >>
>> >> The above exception was the direct cause of the following exception:
>> >>
>> >> Traceback (most recent call last):
>> >>   File "/usr/lib/python3.5/logging/config.py", line 558, in configure
>> >>     handler = self.configure_handler(handlers[name])
>> >>   File "/usr/lib/python3.5/logging/config.py", line 708, in configure_handler
>> >>     klass = self.resolve(cname)
>> >>   File "/usr/lib/python3.5/logging/config.py", line 391, in resolve
>> >>     raise v
>> >>   File "/usr/lib/python3.5/logging/config.py", line 384, in resolve
>> >>     self.importer(used)
>> >> ValueError: Cannot resolve 'airflow.utils.log.logging_mixin.RedirectStdHandler':
>> >> No module named 'airflow.utils.log.logging_mixin.RedirectStdHandler';
>> >> 'airflow.utils.log.logging_mixin' is not a package
>> >>
>> >> During handling of the above exception, another exception occurred:
>> >>
>> >> Traceback (most recent call last):
>> >>   File "/usr/local/bin/airflow", line 16, in <module>
>> >>     from airflow import configuration
>> >>   File "/usr/local/lib/python3.5/dist-packages/airflow/__init__.py", line 31, in <module>
>> >>     from airflow import settings
>> >>   File "/usr/local/lib/python3.5/dist-packages/airflow/settings.py", line 148, in <module>
>> >>     configure_logging()
>> >>   File "/usr/local/lib/python3.5/dist-packages/airflow/logging_config.py", line 75, in configure_logging
>> >>     raise e
>> >>   File "/usr/local/lib/python3.5/dist-packages/airflow/logging_config.py", line 70, in configure_logging
>> >>     dictConfig(logging_config)
>> >>   File "/usr/lib/python3.5/logging/config.py", line 795, in dictConfig
>> >>     dictConfigClass(config).configure()
>> >>   File "/usr/lib/python3.5/logging/config.py", line 566, in configure
>> >>     '%r: %s' % (name, e))
>> >> ValueError: Unable to configure handler 'console': Cannot resolve
>> >> 'airflow.utils.log.logging_mixin.RedirectStdHandler': No module named
>> >> 'airflow.utils.log.logging_mixin.RedirectStdHandler';
>> >> 'airflow.utils.log.logging_mixin' is not a package
>> >> HTTP/1.1 200 OK
>> >> Unable to load the config, contains a configuration error.
>> >> Traceback (most recent call last):
>> >>   File "/usr/lib/python3.5/logging/config.py", line 384, in resolve
>> >>     self.importer(used)
>> >> ImportError: No module named 'airflow.utils.log.logging_mixin.RedirectStdHandler';
>> >> 'airflow.utils.log.logging_mixin' is not a package
>> >>
>> >> The above exception was the direct cause of the following exception:
>> >>
>> >> Traceback (most recent call last):
>> >>   File "/usr/lib/python3.5/logging/config.py", line 558, in configure
>> >>     handler = self.configure_handler(handlers[name])
>> >>   File "/usr/lib/python3.5/logging/config.py", line 708, in configure_handler
>> >>     klass = self.resolve(cname)
>> >>   File "/usr/lib/python3.5/logging/config.py", line 391, in resolve
>> >>     raise v
>> >>   File "/usr/lib/python3.5/logging/config.py", line 384, in resolve
>> >>     self.importer(used)
>> >> ValueError: Cannot resolve 'airflow.utils.log.logging_mixin.RedirectStdHandler':
>> >> No module named 'airflow.utils.log.logging_mixin.RedirectStdHandler';
>> >> 'airflow.utils.log.logging_mixin' is not a package
>> >>
>> >> During handling of the above exception, another exception occurred:
>> >>
>> >> Traceback (most recent call last):
>> >>   File "/usr/local/bin/airflow", line 16, in <module>
>> >>     from airflow import configuration
>> >>   File "/usr/local/lib/python3.5/dist-packages/airflow/__init__.py", line 31, in <module>
>> >>     from airflow import settings
>> >>   File "/usr/local/lib/python3.5/dist-packages/airflow/settings.py", line 148, in <module>
>> >>     configure_logging()
>> >>   File "/usr/local/lib/python3.5/dist-packages/airflow/logging_config.py", line 75, in configure_logging
>> >>     raise e
>> >>   File "/usr/local/lib/python3.5/dist-packages/airflow/logging_config.py", line 70, in configure_logging
>> >>     dictConfig(logging_config)
>> >>   File "/usr/lib/python3.5/logging/config.py", line 795, in dictConfig
>> >>     dictConfigClass(config).configure()
>> >>   File "/usr/lib/python3.5/logging/config.py", line 566, in configure
>> >>     '%r: %s' % (name, e))
>> >> ValueError: Unable to configure handler 'console': Cannot resolve
>> >> 'airflow.utils.log.logging_mixin.RedirectStdHandler': No module named
>> >> 'airflow.utils.log.logging_mixin.RedirectStdHandler';
>> >> 'airflow.utils.log.logging_mixin' is not a package"
>> >>
>> >> Have you encountered this before?
>> >>
>> >> On Mon, Dec 18, 2017 at 8:53 PM, Feng Lu <[email protected]> wrote:
>> >>
>> >>> Hi Kevin,
>> >>>
>> >>> Kindly see my reply inline:
>> >>>
>> >>>> On Mon, Dec 18, 2017 at 3:28 PM, Kevin Lam <[email protected]> wrote:
>> >>>>
>> >>>> Hi,
>> >>>>
>> >>>> I'm trying to get airflow to use GCS for logging purposes and had a
>> >>>> few questions.
>> >>>>
>> >>>> We're currently using Airflow 1.9rc2, running in a Kubernetes Airflow
>> >>>> deployment (similar to https://github.com/mumoshu/kube-airflow)
>> >>>>
>> >>>> 1/ Seems like the logging code has been going through some changes in
>> >>>> the recent versions. What's the correct way to set up GCS for logging?
>> >>>> Is it by just specifying remote_base_log_folder and remote_log_conn_id
>> >>>> in airflow.cfg? Or by following this guide:
>> >>>> http://airflow.readthedocs.io/en/latest/integration.html#gcp, using
>> >>>> the python based logging config? Is there an Airflow version that we
>> >>>> should use to be most stable?
>> >>>>
>> >>> The python based logging config is the right place to make changes. In
>> >>> our test setup, we override the airflow_local_settings.py similarly to
>> >>> the link you pasted.
>> >>> You may also want to configure: [core] task_log_reader = gcs.task
>> >>>
>> >>>>
>> >>>> 2/ Is there a way to encode the connection for GCS in a file so that
>> >>>> one doesn't have to open the webserver and create it from the admin
>> >>>> panel? It'd be nice if the GCS connection would be automatically
>> >>>> created.
>> >>>>
>> >>> Unfortunately, a GCS connection is tied to a GCP project and is
>> >>> impossible to pre-populate.
>> >>> Airflow 1.9 should fix the gcp connection type issue (
>> >>> https://github.com/apache/incubator-airflow/commit/2f107d8a30910fd025774004d5c4c95407ed55c5),
>> >>> so you can use the airflow connections CLI directly.
>> >>>
>> >>>>
>> >>>> Thanks in advance for your help!
>> >>>>
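
For anyone hitting the same UnboundLocalError quoted in this thread: that error is the usual symptom of a name that is assigned only inside a `try` block and then read after an exception skipped the assignment. Below is a minimal sketch of that pattern and one possible fix. It is not the actual Airflow code; `read_remote`, `merge_log_buggy`, and `merge_log_fixed` are hypothetical names for illustration, and the failing read only stands in for the 404 returned when the remote log object does not exist yet.

```python
def read_remote(path):
    # Hypothetical stand-in for a remote read; always fails, like a 404
    # on a log object that has not been written yet.
    raise IOError("404 Not Found: " + path)


def merge_log_buggy(log, path):
    try:
        old_log = read_remote(path)  # assignment is skipped when the read raises
    except IOError:
        pass
    # old_log was never bound, so reading it raises UnboundLocalError
    return "\n".join([old_log, log]) if old_log else log


def merge_log_fixed(log, path):
    old_log = None  # bind the name before the try block
    try:
        old_log = read_remote(path)
    except IOError:
        pass
    # With no previous log, the new log is returned unchanged
    return "\n".join([old_log, log]) if old_log else log


if __name__ == "__main__":
    try:
        merge_log_buggy("new line", "gs://bucket/1.log")
    except UnboundLocalError as err:
        print("buggy:", type(err).__name__)
    print("fixed:", merge_log_fixed("new line", "gs://bucket/1.log"))
```

Binding the name before the `try` keeps the "append to the old log if there is one" logic intact while letting the first write succeed when nothing exists remotely.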

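
On the "Unable to configure handler 'console'" error quoted in this thread: `logging.config.dictConfig` resolves each handler's `class` string by importing the dotted path, and raises `ValueError` when the last component is not defined in that module. One plausible reading, which is an assumption and not confirmed in the thread, is that the logging config names a class the installed Airflow version does not ship. The sketch below reproduces the same failure mode with the standard library only; `logging.handlers.NoSuchHandler` is a deliberately nonexistent, hypothetical class name, and `try_configure` is a helper introduced just for this example.

```python
import logging.config

# Handler config whose class path points at a name that does not exist.
config = {
    "version": 1,
    "handlers": {
        "console": {
            # Hypothetical class: logging.handlers defines no NoSuchHandler
            "class": "logging.handlers.NoSuchHandler",
        },
    },
    "root": {"handlers": ["console"], "level": "INFO"},
}


def try_configure(cfg):
    """Return the error message if dictConfig rejects cfg, else None."""
    try:
        logging.config.dictConfig(cfg)
    except ValueError as err:
        return str(err)
    return None


if __name__ == "__main__":
    print(try_configure(config))
```

If the message names a class under a module that does import cleanly, checking whether that class exists in the installed version of the package is a reasonable first step.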