jasonk000 commented on issue #11140: URL: https://github.com/apache/druid/issues/11140#issuecomment-843615701
I have profiled our stack; my analysis follows. We are configured with `HeapMemoryTaskStorage`. Issuing repeated SQL requests against the broker (for example with `ab`) increases the workload on the `overlord`. A CPU profile of the overlord host, focused on the CPU spent in the `/tasks` endpoint, shows that over 50% of the CPU load is in `HeapMemoryTaskStorage::getTasks`, and only a small percentage of the time is in serialization.

In the before/after profiles below, note the share of time (width of the bar) taken by `getCompletedTaskInfo...` (highlighted in a magenta-ish colour), and that the bulk of that time is in `sortedCopy`.

Before: [profile image]

After: [profile image]

After the changes, `getCompletedTaskInfo...` is a significantly smaller share of the overall CPU time, so much so that serialization now takes far longer than the query itself.
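To illustrate why a `sortedCopy`-style call can dominate a hot endpoint, here is a minimal sketch comparing two ways of returning the most recent completed tasks. It is not the Druid code and not necessarily the change that was made; the class `RecentTasksSketch`, the `Task` type, and both method names are hypothetical, and it assumes the caller only needs the newest `n` entries.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; none of these names come from Druid.
final class RecentTasksSketch {

    // Stand-in for a completed-task entry: an id plus a completion timestamp.
    static final class Task {
        final String id;
        final long completedAtMillis;

        Task(String id, long completedAtMillis) {
            this.id = id;
            this.completedAtMillis = completedAtMillis;
        }
    }

    private static final Comparator<Task> OLDEST_FIRST =
        Comparator.comparingLong(t -> t.completedAtMillis);
    private static final Comparator<Task> NEWEST_FIRST = OLDEST_FIRST.reversed();

    // Approach A: copy and sort every task on each request, then keep the first n.
    // Costs O(m log m) per call for m stored tasks, regardless of n.
    static List<Task> fullSortThenLimit(List<Task> tasks, int n) {
        List<Task> copy = new ArrayList<>(tasks);
        copy.sort(NEWEST_FIRST);
        return new ArrayList<>(copy.subList(0, Math.max(0, Math.min(n, copy.size()))));
    }

    // Approach B: keep only the n newest tasks in a min-heap while scanning once.
    // Costs O(m log n) per call and never materialises a full sorted copy.
    static List<Task> boundedHeap(List<Task> tasks, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        PriorityQueue<Task> heap = new PriorityQueue<>(n, OLDEST_FIRST);
        for (Task t : tasks) {
            if (heap.size() < n) {
                heap.add(t);
            } else if (t.completedAtMillis > heap.peek().completedAtMillis) {
                heap.poll(); // evict the oldest of the current candidates
                heap.add(t);
            }
        }
        List<Task> result = new ArrayList<>(heap);
        result.sort(NEWEST_FIRST); // sorts only n elements
        return result;
    }
}
```

Both methods return the same tasks when completion timestamps are distinct; the difference is how much work each request does when the stored task list is large and `n` is small.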
