IAL32 commented on PR #30886:
URL: https://github.com/apache/airflow/pull/30886#issuecomment-1558701030
Even though this is closed, I would like to add a note on Glue logging.
I ran into these issues as well, so I wrote a helper method that creates a
logger specifically for Glue jobs. It works for both glueetl and pythonshell
jobs:
```
import logging
import sys
from typing import Any, Optional

# Assumed default: the original comment does not show this constant.
DEFAULT_LOG_FORMAT = "%(asctime)s %(levelname)s %(name)s: %(message)s"


def get_logger(
    name: Optional[str] = None,
    level: Any = logging.INFO,
    log_format: str = DEFAULT_LOG_FORMAT,
) -> logging.Logger:
    """Returns a logger configured for Glue jobs"""
    formatter = logging.Formatter(fmt=log_format)

    # glue sets its own handlers by default, but they suck.
    # this handler sends DEBUG, INFO and WARNING to sys.stdout
    stdout_handler = logging.StreamHandler(sys.stdout)
    stdout_handler.setLevel(logging.DEBUG)
    stdout_handler.addFilter(lambda record: record.levelno < logging.ERROR)
    stdout_handler.setFormatter(formatter)

    # this handler sends ERROR and above (CRITICAL) to sys.stderr
    stderr_handler = logging.StreamHandler(sys.stderr)
    stderr_handler.setLevel(logging.ERROR)
    stderr_handler.setFormatter(formatter)

    # name=None returns the root logger; drop the handlers already attached
    logger = logging.getLogger(name=name)
    logger.handlers.clear()
    logger.setLevel(level)
    logger.addHandler(stdout_handler)
    logger.addHandler(stderr_handler)
    return logger
```
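In effect, this sends all DEBUG, INFO and WARNING records to /output and all
ERROR (and CRITICAL) records to /error. Records below the logger's level
(INFO by default) are dropped before they reach either handler.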
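To check the routing outside of Glue, here is a minimal sketch that writes to in-memory buffers instead of sys.stdout and sys.stderr. The name `build_split_logger`, its parameters, the format string and the `glue_demo` logger name are hypothetical and only for illustration; they are not part of the helper above.

```python
import io
import logging


def build_split_logger(name, out_stream, err_stream, level=logging.DEBUG):
    """Hypothetical variant of the helper that takes explicit streams."""
    formatter = logging.Formatter("%(levelname)s:%(message)s")

    # Everything below ERROR goes to the "stdout" stream.
    out_handler = logging.StreamHandler(out_stream)
    out_handler.setLevel(logging.DEBUG)
    out_handler.addFilter(lambda record: record.levelno < logging.ERROR)
    out_handler.setFormatter(formatter)

    # ERROR and CRITICAL go to the "stderr" stream.
    err_handler = logging.StreamHandler(err_stream)
    err_handler.setLevel(logging.ERROR)
    err_handler.setFormatter(formatter)

    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.setLevel(level)
    # Assumption for the demo: keep records away from root handlers.
    logger.propagate = False
    logger.addHandler(out_handler)
    logger.addHandler(err_handler)
    return logger


fake_stdout, fake_stderr = io.StringIO(), io.StringIO()
log = build_split_logger("glue_demo", fake_stdout, fake_stderr)

log.debug("d")
log.info("i")
log.warning("w")
log.error("e")
log.critical("c")

out_lines = fake_stdout.getvalue().splitlines()
err_lines = fake_stderr.getvalue().splitlines()
print(out_lines)  # ['DEBUG:d', 'INFO:i', 'WARNING:w']
print(err_lines)  # ['ERROR:e', 'CRITICAL:c']
```

Each record lands in exactly one stream, because the filter on the first handler excludes everything the second handler accepts.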
The solution proposed by this PR still helps anyone who wants neither
boilerplate code in their jobs nor an internal dependency to install.