Hi all,

I would like to propose a change to the get_log API [1]: removing the
log_pos metadata and discontinuing the JSON response format.

Following the update “Update useLog to support application/x-ndjson #54445”
[2], the frontend will adopt the application/x-ndjson format, which is more
efficient for streaming logs. Additionally, with the upcoming fix “Support
streaming log to the end for get_log API #54552” [3] (currently still a
draft PR), the get_log API will support streaming logs to completion. This
enhancement makes the log_pos metadata and the continuation_token logic [4]
unnecessary.
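
For readers less familiar with application/x-ndjson, here is a minimal
Python sketch of why it streams well: each line is a complete JSON
document, so neither side has to hold the whole log in memory. The helper
names and record fields below are hypothetical illustrations, not the
actual Airflow implementation.

```python
import json
from typing import Iterable, Iterator


def to_ndjson(records: Iterable[dict]) -> Iterator[str]:
    """Yield one JSON document per line; only one record is serialized at a time."""
    for record in records:
        yield json.dumps(record) + "\n"


def read_ndjson(lines: Iterable[str]) -> Iterator[dict]:
    """Parse a newline-delimited JSON stream incrementally, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical log records; the field names are illustrative only.
sample = [
    {"timestamp": "2025-01-01T00:00:00Z", "event": "task started"},
    {"timestamp": "2025-01-01T00:00:01Z", "event": "task finished"},
]

chunks = list(to_ndjson(sample))    # what a server could yield chunk by chunk
parsed = list(read_ndjson(chunks))  # what a client could rebuild line by line
```

Because the stream simply ends when the log ends, a client reading it this
way does not need a position marker to request the next page.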

In the previous update, “Resolve OOM When Reading Large Logs in Webserver
#49470” [5], I introduced LogStreamAccumulator [6] to handle the log_pos
metadata by flushing the log stream to temporary files and reading them
back. However, with the new streaming support, we can now yield the log
stream directly to the end in a single API call, improving performance and
reducing complexity.

Benchmark results [7] show that, even after the #49470 refactor, using the
JSON format with get_log can still cause OOM issues when reading large
logs. Instead of continuing to support the JSON format, I suggest *replacing
the JSON format with a Zip format*. This would allow users to conveniently
download full logs while maintaining memory efficiency for the API server.
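
As a rough illustration of the Zip idea, the sketch below compresses a
stream of log lines into an archive one line at a time using only the
Python standard library. The function name, the member name "task.log",
and the fake log generator are assumptions made for this example; how the
API server would actually expose the archive is left open.

```python
import tempfile
import zipfile
from typing import BinaryIO, Iterable, Iterator


def write_log_zip(lines: Iterable[str], dest: BinaryIO, member_name: str = "task.log") -> None:
    """Compress a stream of log lines into a zip archive without joining them first."""
    with zipfile.ZipFile(dest, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
        # ZipFile.open(..., mode="w") returns a writable file-like object, so
        # each line is compressed as it arrives. force_zip64=True allows a
        # member of unknown size to grow past 2 GiB.
        with archive.open(member_name, mode="w", force_zip64=True) as member:
            for line in lines:
                member.write(line.encode("utf-8"))


def fake_log_lines(count: int) -> Iterator[str]:
    """Hypothetical stand-in for a real log stream."""
    for i in range(count):
        yield f"line {i}\n"


with tempfile.TemporaryFile() as tmp:
    write_log_zip(fake_log_lines(1000), tmp)
    tmp.seek(0)
    # Read the archive back to check that the content round-trips.
    with zipfile.ZipFile(tmp) as archive:
        names = archive.namelist()
        restored = archive.read("task.log").decode("utf-8")
```

In this sketch the writer only ever holds one line plus the compressor's
internal buffer, which is the property the proposal is after.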

I appreciate your consideration of this proposal and look forward to your
feedback.

Thank you!

Best regards,
Jason

[1]: https://github.com/apache/airflow/blob/main/airflow-core/src/airflow/api_fastapi/core_api/routes/public/log.py#L75
[2]: https://github.com/apache/airflow/pull/54445
[3]: https://github.com/apache/airflow/pull/54552
[4]: https://github.com/apache/airflow/blob/main/airflow-core/src/airflow/api_fastapi/core_api/routes/public/log.py#L166
[5]: https://github.com/apache/airflow/pull/49470
[6]: https://github.com/apache/airflow/blob/main/airflow-core/src/airflow/utils/log/log_stream_accumulator.py
[7]: https://github.com/apache/airflow/pull/49470#issuecomment-2908306229
