heng-kuang-777 opened a new issue, #18077:
URL: https://github.com/apache/pinot/issues/18077

   ## Problem:
   
   The controller endpoint `GET /tables/{tableName}/size` performs a 
synchronous HTTP fan-out to every server hosting segments for the requested 
table. Each server computes sizes by recursively walking segment directories on 
disk via `FileUtils.sizeOfDirectory()`. The overall latency is determined by the 
slowest server to respond, making this endpoint slow for tables with many 
segments spread across many servers.
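
   To illustrate why the slowest server sets the latency, here is a minimal 
sketch of a parallel fan-out. `FanOutSketch`, `totalSize`, and the 
`fetchSizeFromServer` callback are hypothetical names for illustration, not 
Pinot's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.ToLongFunction;

/** Hypothetical illustration of a synchronous fan-out; not Pinot's implementation. */
final class FanOutSketch {
  /** Asks every server for its size in parallel and blocks until all have answered. */
  static long totalSize(List<String> servers, ToLongFunction<String> fetchSizeFromServer) {
    List<CompletableFuture<Long>> futures = new ArrayList<>();
    for (String server : servers) {
      // One asynchronous request per server (stand-in for the real HTTP call).
      futures.add(CompletableFuture.supplyAsync(() -> fetchSizeFromServer.applyAsLong(server)));
    }
    long total = 0;
    for (CompletableFuture<Long> future : futures) {
      // join() blocks, so this loop finishes only after the slowest server responds.
      total += future.join();
    }
    return total;
  }
}
```

   Even with all requests issued in parallel, the caller cannot return before 
the last future completes.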
   
   Meanwhile, SegmentStatusChecker — a periodic task running on every 
controller — already calls `TableSizeReader.getTableSizeDetails()` for every 
table that controller owns, performing the exact same server fan-out. However, 
it discards the computed TableSizeDetails result after emitting metrics, so the 
work is wasted from the REST API's perspective.
   
   For clients that don't need real-time size data, paying the full fan-out 
cost on every API call is unnecessary.
   
   ## Proposed Solution:
   
   1. Add an in-memory cache to `TableSizeReader`: Store the `TableSizeDetails` 
result after each computation. Since `SegmentStatusChecker` already calls 
`getTableSizeDetails()` periodically for its owned tables, the cache is 
automatically kept warm with staleness bounded by 
`controller.statuschecker.frequencyInSeconds`.
   2. Add a `mode=cache` query parameter to the REST endpoint: When `mode=cache` is 
specified, the endpoint returns the cached result instead of triggering a live 
fan-out. Default behavior (no `mode` parameter) remains unchanged for backward 
compatibility.
   3. Redirect to the owning controller: Tables are partitioned across 
controllers via the lead controller resource, so each controller owns only ~1/N 
of the tables and runs `SegmentStatusChecker` only for those tables. When a 
`mode=cache` request lands on a controller that doesn't own the requested table, 
it proxies the request to the owning controller, which has the warm cache. Owning 
controller resolution uses existing infrastructure 
(`LeadControllerUtils.getPartitionIdForTable()` + Helix external view).
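
   Steps 1 and 2 could look roughly like the sketch below. `SizeSnapshot` and 
`CachingTableSizeReader` are hypothetical stand-ins for `TableSizeDetails` and 
`TableSizeReader`, and the live fan-out is passed in as a function. Returning an 
empty result on a cold cache is an assumption; the proposal does not specify 
cache-miss behavior.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical stand-in for TableSizeDetails; holds only what this sketch needs. */
final class SizeSnapshot {
  final String tableName;
  final long sizeInBytes;
  final long computedAtMs;

  SizeSnapshot(String tableName, long sizeInBytes, long computedAtMs) {
    this.tableName = tableName;
    this.sizeInBytes = sizeInBytes;
    this.computedAtMs = computedAtMs;
  }
}

/** Hypothetical wrapper showing where a cache could sit around the live fan-out. */
final class CachingTableSizeReader {
  private final Map<String, SizeSnapshot> cache = new ConcurrentHashMap<>();
  private final Function<String, SizeSnapshot> liveFanOut;

  CachingTableSizeReader(Function<String, SizeSnapshot> liveFanOut) {
    this.liveFanOut = liveFanOut;
  }

  /** Live path: used by the periodic checker and by requests without mode=cache. */
  SizeSnapshot getLive(String tableName) {
    SizeSnapshot snapshot = liveFanOut.apply(tableName);
    cache.put(tableName, snapshot); // every live computation refreshes the cache
    return snapshot;
  }

  /** Cached path: never triggers a fan-out; empty if nothing has been computed yet. */
  Optional<SizeSnapshot> getCached(String tableName) {
    return Optional.ofNullable(cache.get(tableName));
  }

  /** Mirrors the proposed query parameter: mode=cache reads the cache, anything else goes live. */
  Optional<SizeSnapshot> handle(String tableName, String mode) {
    if ("cache".equals(mode)) {
      return getCached(tableName);
    }
    return Optional.of(getLive(tableName));
  }
}
```

   Because the periodic task goes through the live path, it keeps the cache warm 
without any extra scheduling.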
   
   ## Benefits:
     - Eliminates redundant server fan-out for clients that can tolerate 
slightly stale data (by default, the cache refreshes every 5 min)
     - No new threads, timers, or background tasks — piggybacks on existing 
SegmentStatusChecker infrastructure
     - Fully backward-compatible (opt-in via query parameter)
     - Redirect pattern ensures the request always reaches the controller with 
the warm cache, regardless of which controller the VIP routes to
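
   The routing decision behind the redirect pattern can be sketched as follows. 
`OwnerRouting`, the hash-modulo partitioning, the partition count of 24, and the 
partition-to-leader map are illustrative assumptions; a real implementation 
would rely on `LeadControllerUtils.getPartitionIdForTable()` and the Helix 
external view instead.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical routing helper; not part of Pinot. */
final class OwnerRouting {
  // Assumed partition count, for illustration only.
  static final int NUM_PARTITIONS = 24;

  /** Assumed mapping from table name to partition: non-negative hash modulo partition count. */
  static int partitionIdFor(String rawTableName) {
    return (rawTableName.hashCode() & Integer.MAX_VALUE) % NUM_PARTITIONS;
  }

  /**
   * Returns the URL to forward a mode=cache request to, or empty when the local
   * controller is the owner (or the owner is unknown) and should answer itself.
   */
  static Optional<String> forwardTarget(
      String rawTableName, String localControllerUrl, Map<Integer, String> partitionToLeaderUrl) {
    String owner = partitionToLeaderUrl.get(partitionIdFor(rawTableName));
    if (owner == null || owner.equals(localControllerUrl)) {
      return Optional.empty();
    }
    return Optional.of(owner + "/tables/" + rawTableName + "/size?mode=cache");
  }
}
```

   Answering locally when the owner cannot be resolved is one possible fallback; 
the proposal leaves that case open.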
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

