antonio-mello-ai commented on PR #63467:
URL: https://github.com/apache/airflow/pull/63467#issuecomment-4170342953

   @bbovenzi — I've been thinking about this and dug into the log architecture 
to understand the trade-offs.
   
   You're right that backend search is the more complete approach. But looking 
at the current codebase, **all existing log filters (log level, source/logger) 
are already client-side** — the API returns the full log and the browser 
filters. So the client-side search in this PR follows the same pattern that's 
already established.
   
   That said, I think there's a natural phased path here:
   
   **Phase 1 (this PR):** Client-side search — covers the most common case 
(reviewing completed task logs, moderate size). Same architectural pattern as 
the existing level/source filters. Immediate value, zero backend changes.
   
   **Phase 2:** Add a `query` parameter to `GET /logs/{try_number}`. The 
`FileTaskHandler` already streams logs line-by-line via 
`_stream_lines_by_chunk`, so injecting a filter predicate there would be a 
small, localized change rather than an architectural rewrite. The continuation 
token mechanism already supports paginated reads; it's just not used by the UI today.
   
   **Phase 3 (if needed):** Extend search to remote log handlers. 
Elasticsearch/OpenSearch have native query support; S3/GCS would need 
server-side download + grep.
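   
   To make Phase 2 more concrete, here is a minimal Python sketch of the 
filter-predicate idea. It is not Airflow code: `search_lines`, its parameters, 
and the use of a plain line index as the continuation token are assumptions 
made for illustration, and the real `FileTaskHandler` internals and token 
format may well differ.

```python
from typing import Iterable, List, Optional, Tuple


def search_lines(
    lines: Iterable[str],
    query: str,
    start: int = 0,
    limit: int = 100,
) -> Tuple[List[Tuple[int, str]], Optional[int]]:
    """Hypothetical helper: filter a stream of log lines by substring.

    Returns (matches, next_token). Each match is (line_index, text).
    next_token is the index to resume scanning from, or None once the
    stream is exhausted. A line index stands in for the real token here.
    """
    needle = query.casefold()  # case-insensitive match (an assumption)
    matches: List[Tuple[int, str]] = []
    for index, line in enumerate(lines):
        if index < start:
            continue  # skip lines already covered by an earlier page
        if len(matches) == limit:
            return matches, index  # page is full; resume from this line
        if needle in line.casefold():
            matches.append((index, line.rstrip("\n")))
    return matches, None


if __name__ == "__main__":
    log = ["INFO start\n", "ERROR boom\n", "INFO ok\n", "error again\n"]
    page, token = search_lines(log, "error", limit=1)
    print(page, token)
```

   Because the helper only iterates over `lines`, it works with a generator 
as well as a list, so the full log never has to be held in memory. That is the 
property that would matter when filtering inside a streaming reader.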
   
   If you'd like to go down this path, we'd be happy to implement phases 2 and 
3 as follow-up PRs.
   
   Of course, if you'd rather skip straight to backend search or prefer a 
different direction entirely, happy to adjust.

