[
https://issues.apache.org/jira/browse/AIRFLOW-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16601577#comment-16601577
]
Apache Spark commented on AIRFLOW-2325:
---------------------------------------
User 'fangpenlin' has created a pull request for this issue:
https://github.com/apache/incubator-airflow/pull/3229
> Task logging with AWS CloudWatch
> --------------------------------
>
> Key: AIRFLOW-2325
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2325
> Project: Apache Airflow
> Issue Type: New Feature
> Components: logging
> Reporter: Fang-Pen Lin
> Priority: Minor
>
> In many cases, it's ideal to use remote logging when running Airflow in
> production, since workers can easily be scaled down or up, or may be
> running in containers, where local storage is not meant to persist.
> In that case, the S3 task logging handler could be used:
> [https://github.com/apache/incubator-airflow/blob/master/airflow/utils/log/s3_task_handler.py]
> However, it comes with a drawback: the S3 logging handler only uploads the
> log when the task has completed or failed. For long-running tasks, it's hard
> to know what's going on with the process until it finishes.
> To get closer to real-time logging, I built a logging handler based on AWS
> CloudWatch. It uses a third-party Python package, `watchtower`:
>
> [https://github.com/kislyuk/watchtower/tree/master/watchtower]
>
> I created a PR here [https://github.com/apache/incubator-airflow/pull/3229].
> I basically just copy-pasted the code I wrote for my own project. It works
> fine with the 1.9 release, but I have never tested it against the master
> branch. Also, there is a bug in watchtower that causes the task runner to
> hang forever when it completes. I created an issue in their repo
> [https://github.com/kislyuk/watchtower/issues/57]
> and a PR addressing that issue
> [https://github.com/kislyuk/watchtower/pull/58]
>
> The PR is still far from ready to be reviewed, but I want to get some
> feedback before I spend more time on it. I would like to know whether you
> want this CloudWatch handler to go into the main repo, or whether you would
> prefer it to be a standalone third-party module. If it's the latter, I can
> close this ticket and create a standalone repo on my own. If the PR is
> welcome, I can spend more time polishing it based on your feedback and
> adding tests, documentation, and other things.
>
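For readers skimming the thread, the difference between the two approaches described above can be sketched with the standard library alone. This is only an illustration, not the code from the PR and not the actual S3 or watchtower handler: the class names `UploadOnCloseHandler` and `StreamingHandler`, the `sink` callable, and the `batch_size` parameter are all hypothetical stand-ins, and no AWS calls are made.

```python
import logging
from typing import Callable, List, Tuple

Sink = Callable[[List[str]], None]  # hypothetical stand-in for a remote upload


class UploadOnCloseHandler(logging.Handler):
    """Buffers every record and ships one batch when the handler is closed."""

    def __init__(self, sink: Sink) -> None:
        super().__init__()
        self.sink = sink
        self.buffer: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.buffer.append(self.format(record))

    def close(self) -> None:
        # Nothing reaches the sink until the task is done.
        if self.buffer:
            self.sink(self.buffer)
            self.buffer = []
        super().close()


class StreamingHandler(logging.Handler):
    """Ships records in small batches while the task is still running."""

    def __init__(self, sink: Sink, batch_size: int = 2) -> None:
        super().__init__()
        self.sink = sink
        self.batch_size = batch_size
        self.buffer: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.buffer.append(self.format(record))
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if self.buffer:
            self.sink(self.buffer)
            self.buffer = []

    def close(self) -> None:
        self.flush()  # ship whatever is left over
        super().close()


def run_demo() -> Tuple[Tuple[int, int], List[List[str]], List[List[str]]]:
    """Log three lines and report how many batches each sink saw mid-run."""
    delivered_late: List[List[str]] = []
    delivered_live: List[List[str]] = []
    late = UploadOnCloseHandler(delivered_late.append)
    live = StreamingHandler(delivered_live.append, batch_size=2)

    logger = logging.getLogger("demo_task")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(late)
    logger.addHandler(live)

    for step in range(3):
        logger.info("step %d", step)
    mid_run = (len(delivered_late), len(delivered_live))

    for handler in (late, live):
        logger.removeHandler(handler)
        handler.close()
    return mid_run, delivered_late, delivered_live


if __name__ == "__main__":
    print(run_demo())
```

While the simulated task is still running, the upload-on-close sink has received nothing, whereas the streaming sink already holds the first batch. A real handler would presumably also flush on a timer so that a slow task does not sit on a partial batch.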