The GitHub Actions job "Tests (ARM)" on airflow.git/main has failed. Run started by GitHub user potiuk (triggered by potiuk).
Head commit for run: fd950357d6b0485a962b66707be88d947c81bd06 / Kaxil Naik <[email protected]> Fix KubernetesPodOperator emitting orphan timestamps for empty container writes (#67652) When a container running under `KubernetesPodOperator` writes an empty line, kubelet streams it back (with `timestamps=True`) as `"<rfc3339-ts> \n"` -- a timestamp followed by a separator space and an empty message. `parse_log_line` in `pod_manager.py` called `line.strip().partition(" ")`, which removed the trailing separator space before partitioning, so the function returned `timestamp=None` and the caller treated the line as a continuation of the previous buffered log record. The bare RFC3339 string was then appended onto the previous message and emitted as a multi-line log where only the first line carried the Airflow `[ts] {pod_manager.py:N} INFO -` prefix, leaving unprefixed timestamp rows interleaved in task logs: ``` [2026-05-28T13:07:50.160+0000] {pod_manager.py:520} INFO - [base] first test line 2026-05-28T13:07:57.030578889Z 2026-05-28T13:07:57.030581518Z 2026-05-28T13:07:57.030642740Z [2026-05-28T13:07:57.034+0000] {pod_manager.py:520} INFO - [base] last test line ``` Downstream that breaks [`airflow.utils.log.file_task_handler._parse_timestamp`](https://github.com/apache/airflow/blob/main/airflow-core/src/airflow/utils/log/file_task_handler.py#L201-L203), which feeds the line to `pendulum.parse` after stripping `[]`: malformed fragments from these orphan rows can raise `ValueError: month must be in 1..12` and fail the task entirely. Closes #36571. ## Root cause and history Regressed in [#33675](https://github.com/apache/airflow/pull/33675) (merged 2023-08-24, shipped in cncf-kubernetes **7.5.0**) which replaced the original `line.find(\" \")` split with a `line.strip().partition(\" \")` pattern under the banner of a refactor: ```diff - split_at = line.find(\" \") - if split_at == -1: - ... - timestamp = line[:split_at] - message = line[split_at + 1 :].rstrip() + timestamp, sep, message = line.strip().partition(\" \") + if not sep: + ... ``` The pre-refactor implementation correctly handled `<ts> \n` because `find(\" \")` matched the separator space directly and the message-side `.rstrip()` produced an empty string. The new code strips the separator off before partitioning, so the function loses its only signal that the line is well-formed. This matches the regression window the original reporter described in [#36571](https://github.com/apache/airflow/issues/36571): the bug appeared after upgrading cncf-kubernetes from 7.4.2 (pre-refactor) to 7.12.0+ (post-refactor) and is still reproducible on current `main` (10.17.x). ## Fix * `parse_log_line` no longer pre-strips the line; it `rstrip(\"\\n\")` only and partitions on the original separator, so empty container writes are recognised as `(timestamp, \"\")` rather than as continuations. If the partition yields no separator the whole line is tried as a bare timestamp (some kubelet versions emit `<ts>\\n` with no trailing space), and parse failures fall through to the original return-the-raw-line path. It also catches `ValueError`, not just `ParserError`, so a malformed timestamp can never escape into Airflow's downstream parsers. * The sync (`PodManager.fetch_container_logs.consume_logs`) and async (`AsyncPodManager.fetch_container_logs_before_current_sec`) log consumer loops skip emit for empty messages -- the resume marker still advances in the sync path so reconnect-since-time stays correct, but no noisy `[base] ` row is written. ## Tests * Parametrized `test_parse_log_line_handles_empty_container_writes` covers `<ts> \\n`, `<ts>\\n`, and `<ts> ` (no newline). Verified RED on `main`, GREEN with the fix. * End-to-end `test_empty_container_lines_do_not_pollute_previous_message` drives `fetch_container_logs` with the exact log sequence from the issue and asserts no orphan timestamps land in `caplog`. Also RED on `main`, GREEN with the fix. ## Gotchas * Truly empty container output (just `\\n`) is no longer surfaced as a `[base]` row. That output carries no information for the task log reader and was previously the trigger for downstream pendulum failures, so dropping it is a net improvement; if a future use case needs to count blank container lines, that's separable work. Report URL: https://github.com/apache/airflow/actions/runs/26599604326 With regards, GitHub Actions via GitBox --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
