safaehar opened a new pull request, #68584:
URL: https://github.com/apache/airflow/pull/68584

   ### What
   
   `TriggerRunnerSupervisor._process_log_messages_from_subprocess` primes 
itself by calling `airflow.sdk.log.configure_logging()` with **no arguments**. 
`json_output` defaults to `False`, so this call reconfigures structlog 
**globally** and installs the text `WriteLogger` factory — overwriting the 
bytes `BytesLogger` factory that startup set up when `[logging] json_logs = 
True`.
   
   The stdout/stderr forwarders (`_create_log_forwarder` → `forward_to_log`) 
are already wrapped with the JSON (bytes) processor chain but bind their 
underlying logger **lazily**. As soon as a trigger subprocess writes to 
stdout/stderr — e.g. an import-time `DeprecationWarning` from a provider 
trigger that imports a heavy client such as `KubernetesPodTrigger` (kubernetes) 
or `S3KeyTrigger` (boto3) — the lazy bind resolves against the now-text 
factory, and `WriteLogger.msg` does `message + "\n"` on bytes produced by the 
JSON renderer:
   
   ```
   TypeError: can't concat str to bytes
     File ".../structlog/_output.py", line 217, in msg
       self._write(message + "\n")
   ```
   
   This crashes the `TriggerRunnerSupervisor.run` loop and the triggerer enters 
CrashLoopBackOff. It only manifests when `json_logs` is enabled **and** the 
triggerer actually runs a trigger whose subprocess emits to stdout/stderr — 
idle triggerers and those running only quiet triggers (`DateTimeTrigger`, etc.) 
never reach `forward_to_log`, which is why the crash looks intermittent across 
deployments.
   
   ### Fix
   
   Pass `json_output` from the `[logging] json_logs` config (the same 
`conf.getboolean("logging", "json_logs", fallback=False)` used elsewhere, e.g. 
`airflow/logging_config.py`) so the global structlog factory stays consistent 
with the rest of the process. This matches the already-hardcoded 
`logging_processors(json_output=True)` a few lines below.
   
   ### Tests
   
   Adds a parametrized regression test asserting 
`_process_log_messages_from_subprocess` calls 
`configure_logging(json_output=<json_logs>)` for both `True` and `False`.
   
   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   -->
   
   ---
   
   ^ Add meaningful description above. Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#pull-request-guidelines)
 for more information.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to