anteverse opened a new issue, #38105:
URL: https://github.com/apache/airflow/issues/38105
### Apache Airflow version
main (development)
### If "Other Airflow 2 version" selected, which one?
_No response_
### What happened?
Hello!
While running a KubernetesPodOperator task with logs exported to S3, we
sometimes get logs like these:
```
[2024-02-29, 18:28:18 UTC] {pod_manager.py:483} INFO - [base] None
[2024-02-29, 18:33:19 UTC] {pod_manager.py:483} INFO - [base] None
[2024-02-29, 18:38:20 UTC] {pod_manager.py:483} INFO - [base] None
[2024-02-29, 18:43:21 UTC] {pod_manager.py:483} INFO - [base] None
```
This happens when the container produces no logs during the `read_timeout`
window (5 minutes): the `pod_manager.py` module still emits a log record even
though there is nothing to log, and the `None` value ends up formatted into
the INFO message.
Although a small `if message_to_log:` guard at `pod_manager.py:483` looks like
a promising fix, I wonder if there could be any side effects, as I only see
the issue when logs are exported to external storage (AWS S3 in my case).
Let me know if I need to provide anything else.
### What you think should happen instead?
As no lines were produced in the container, I feel like `pod_manager.py`
should not produce any.
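To illustrate the expected behaviour, here is a minimal, self-contained sketch of the guard. It is not the actual `pod_manager.py` code: the function `emit_container_logs` and its signature are hypothetical, and only the variable name `message_to_log` and the `[container] message` format are taken from the report above. It assumes the loop buffers one line at a time and flushes the last one when the stream ends.

```python
import logging
from typing import Iterable, Optional


def emit_container_logs(
    lines: Iterable[str], container_name: str, logger: logging.Logger
) -> int:
    """Hypothetical stand-in for the log-following loop.

    Returns the number of log records emitted.
    """
    emitted = 0
    message_to_log: Optional[str] = None
    try:
        for line in lines:
            # A new line arrived, so the previously buffered one is complete.
            if message_to_log is not None:
                logger.info("[%s] %s", container_name, message_to_log)
                emitted += 1
            message_to_log = line
    finally:
        # Flush the last buffered line, but only if something was actually
        # read. Without this check an empty read window logs "[base] None".
        if message_to_log is not None:
            logger.info("[%s] %s", container_name, message_to_log)
            emitted += 1
    return emitted
```

With an empty iterable, which is what a silent `read_timeout` window looks like in this sketch, nothing is logged. The check uses `is not None` rather than plain truthiness so that a container that really prints an empty line still gets it logged; whether the real fix should make that distinction is an open question.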
### How to reproduce
With remote logging on S3 enabled, run a KubernetesPodOperator task that
does not produce any logs during a time window of at least 5 minutes.
### Operating System
Airflow on Kubernetes
### Versions of Apache Airflow Providers
_No response_
### Deployment
Official Apache Airflow Helm Chart
### Deployment details
_No response_
### Anything else?
This can also be reproduced in unit-tests, as such:
```
@mock.patch("airflow.providers.cncf.kubernetes.utils.pod_manager.PodManager.container_is_running")
def test_fetch_container_logs_do_not_log_none(self, mock_container_is_running, caplog):
    MockWrapper.reset()
    caplog.set_level(logging.INFO)

    def consumer_iter():
        """This will simulate a container that hasn't produced any logs
        in the last read_timeout window"""
        yield from ()

    with mock.patch.object(PodLogsConsumer, "__iter__") as mock_consumer_iter:
        mock_consumer_iter.side_effect = consumer_iter
        mock_container_is_running.side_effect = [True, True, False]
        self.pod_manager.fetch_container_logs(mock.MagicMock(), "container-name", follow=True)
        assert "[container-name] None" not in (record.message for record in caplog.records)
```
This test currently fails because the record `[container-name] None` is
logged twice.
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)