[
https://issues.apache.org/jira/browse/AIRFLOW-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ash Berlin-Taylor updated AIRFLOW-3449:
---------------------------------------
Fix Version/s: 1.10.4
> Airflow DAG parsing logs aren't written when using S3 logging
> -------------------------------------------------------------
>
> Key: AIRFLOW-3449
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3449
> Project: Apache Airflow
> Issue Type: Bug
> Components: logging, scheduler
> Affects Versions: 1.10.0, 1.10.1
> Reporter: James Meickle
> Assignee: Ash Berlin-Taylor
> Priority: Critical
> Fix For: 1.10.4
>
>
> The default Airflow logging config sends some logs to stdout, some to
> "task" folders, and some to "processor" folders (generated during DAG
> parsing). The 1.10.0 logging update broke this, but only for users who are
> also using S3 logging. This is because of this feature in the default
> logging config file:
> {code:python}
> if REMOTE_LOGGING and REMOTE_BASE_LOG_FOLDER.startswith('s3://'):
>     DEFAULT_LOGGING_CONFIG['handlers'].update(REMOTE_HANDLERS['s3'])
> {code}
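> To make the effect concrete, here is a minimal sketch (plain Python with
> illustrative stand-in values, not the real config) of why that one line
> swaps out *both* handler entries at once:
> {code:python}
> # dict.update() replaces existing keys wholesale, so the S3 block
> # overwrites both the 'task' and the 'processor' handlers together.
> handlers = {
>     'task': {'class': 'FileTaskHandler'},
>     'processor': {'class': 'FileProcessorHandler'},
> }
> remote_s3 = {
>     'task': {'class': 'S3TaskHandler'},
>     'processor': {'class': 'S3TaskHandler'},  # same class for both!
> }
> handlers.update(remote_s3)
> assert handlers['processor']['class'] == 'S3TaskHandler'
> {code}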
> That replaces this functioning handlers block:
> {code:python}
>     'task': {
>         'class': 'airflow.utils.log.file_task_handler.FileTaskHandler',
>         'formatter': 'airflow',
>         'base_log_folder': os.path.expanduser(BASE_LOG_FOLDER),
>         'filename_template': FILENAME_TEMPLATE,
>     },
>     'processor': {
>         'class': 'airflow.utils.log.file_processor_handler.FileProcessorHandler',
>         'formatter': 'airflow',
>         'base_log_folder': os.path.expanduser(PROCESSOR_LOG_FOLDER),
>         'filename_template': PROCESSOR_FILENAME_TEMPLATE,
>     },
> {code}
> With this non-functioning block:
> {code:python}
>     'task': {
>         'class': 'airflow.utils.log.s3_task_handler.S3TaskHandler',
>         'formatter': 'airflow',
>         'base_log_folder': os.path.expanduser(BASE_LOG_FOLDER),
>         's3_log_folder': REMOTE_BASE_LOG_FOLDER,
>         'filename_template': FILENAME_TEMPLATE,
>     },
>     'processor': {
>         'class': 'airflow.utils.log.s3_task_handler.S3TaskHandler',
>         'formatter': 'airflow',
>         'base_log_folder': os.path.expanduser(PROCESSOR_LOG_FOLDER),
>         's3_log_folder': REMOTE_BASE_LOG_FOLDER,
>         'filename_template': PROCESSOR_FILENAME_TEMPLATE,
>     },
> {code}
> The key issue here is that both "task" and "processor" are given the same
> "S3TaskHandler" class to use for logging. But that is not a generic S3
> class; it's actually a subclass of FileTaskHandler!
> https://github.com/apache/incubator-airflow/blob/1.10.1/airflow/utils/log/s3_task_handler.py#L26
> Since the processor's template variables don't match what FileTaskHandler
> supplies, the log path evaluates to garbage. The handler then silently
> fails to log anything at all.
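> To illustrate the mismatch (a hedged sketch; the template string
> approximates the 1.10 default PROCESSOR_FILENAME_TEMPLATE rather than
> quoting it exactly):
> {code:python}
> from jinja2 import Template
>
> # Processor-style template: expects a 'filename' variable.
> PROCESSOR_FILENAME_TEMPLATE = '{{ filename }}.log'
>
> # FileTaskHandler renders with task-instance variables instead, so the
> # undefined 'filename' silently renders to an empty string:
> path = Template(PROCESSOR_FILENAME_TEMPLATE).render(try_number=1)
> print(repr(path))  # '.log' -- a garbage path; the handler fails quietly
> {code}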
> It is likely that anyone using a default-like logging config, plus the
> remote S3 logging feature, stopped getting DAG parsing logs (either locally
> *or* in S3) as of 1.10.0.
> Commenting out the DAG parsing ("processor") section of the S3 block fixed
> this on my instance; a sketch of that workaround follows.
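> A hedged sketch of that workaround in a copied logging config (e.g. a
> custom airflow_local_settings.py): apply the S3 handler to 'task' only,
> leaving 'processor' on the local FileProcessorHandler.
> {code:python}
> if REMOTE_LOGGING and REMOTE_BASE_LOG_FOLDER.startswith('s3://'):
>     DEFAULT_LOGGING_CONFIG['handlers'].update({
>         'task': REMOTE_HANDLERS['s3']['task'],
>         # 'processor' intentionally omitted: keep FileProcessorHandler so
>         # DAG parsing logs are still written locally.
>     })
> {code}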
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)