FrankChen021 commented on issue #11140:
URL: https://github.com/apache/druid/issues/11140#issuecomment-825352836


   Yeah, I forgot that. 
   
   I checked the code, and I think LIMIT/OFFSET might do nothing to solve 
the problem. 
   
   The web console issues a SQL query to the Druid router to get all tasks. 
That query is translated into an HTTP call to the overlord to get ALL tasks, 
and the overlord then searches the `druid_tasks` table in metadata storage. 
During this process, LIMIT/OFFSET is not passed down to narrow the search on 
metadata storage, so a full scan is performed on metadata storage and 
`payload` and `status_payload` are deserialized for every returned row.
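
   To illustrate the difference, here is a minimal sketch in Python using an 
in-memory SQLite table as a stand-in for metadata storage. It is not Druid 
code: the simplified `druid_tasks` schema, the JSON payloads, and the helper 
names `make_store`, `fetch_then_limit` and `fetch_with_pushdown` are all 
assumptions made for the example. It only shows how many rows get 
deserialized when the limit is applied after the fetch versus inside the 
query.

```python
import json
import sqlite3


def make_store(n_rows):
    """Build a toy stand-in for the metadata store (hypothetical, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE druid_tasks ("
        "id TEXT PRIMARY KEY, created_date TEXT, active INTEGER, "
        "payload TEXT, status_payload TEXT)"
    )
    rows = [
        (
            f"task_{i:05d}",
            f"2021-04-{1 + i % 28:02d}T00:00:00.000Z",
            0,
            json.dumps({"type": "index", "n": i}),
            json.dumps({"status": "SUCCESS"}),
        )
        for i in range(n_rows)
    ]
    conn.executemany("INSERT INTO druid_tasks VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def fetch_then_limit(conn, limit):
    """Read and deserialize every row, then trim in memory (the behaviour described above)."""
    cur = conn.execute(
        "SELECT id, payload, status_payload FROM druid_tasks ORDER BY id"
    )
    tasks = [(tid, json.loads(p), json.loads(s)) for tid, p, s in cur]
    return tasks[:limit], len(tasks)


def fetch_with_pushdown(conn, limit):
    """Let the database apply the limit, so only `limit` rows are deserialized."""
    cur = conn.execute(
        "SELECT id, payload, status_payload FROM druid_tasks ORDER BY id LIMIT ?",
        (limit,),
    )
    tasks = [(tid, json.loads(p), json.loads(s)) for tid, p, s in cur]
    return tasks, len(tasks)
```

   Both helpers return the selected tasks together with the number of rows 
that were deserialized, so the two counts can be compared directly for the 
same limit.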
   
   
   
   Based on the current logic, I need more information from you to confirm 
that deserialization is the bottleneck.
   1. How many records are in your `druid_tasks` table?
   2. How many records in `druid_tasks` satisfy `created_date > now() - 
druid.indexer.storage.recentlyFinishedThreshold AND active = 0`?
   3. How long does it take to run a SQL query that selects all fields from 
`druid_tasks` using the condition in 2 above?
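
   The three measurements could be collected with something like the sketch 
below. It is only an illustration: it uses Python's `sqlite3` module, while a 
real metadata store is usually MySQL, PostgreSQL or Derby, so the connection 
and parameter style would differ. The helper name `diagnose`, the timestamp 
format, and the assumption that timestamps are stored as ISO-8601 strings 
(so that string comparison orders them correctly) are mine, and the threshold 
must be supplied from your own configuration.

```python
import sqlite3
import time
from datetime import datetime, timedelta, timezone


def diagnose(conn, threshold, now):
    """Answer questions 1-3 against a `druid_tasks` table (hypothetical helper).

    conn:      DB-API connection that uses `?` placeholders (sqlite3 here)
    threshold: timedelta matching druid.indexer.storage.recentlyFinishedThreshold
    now:       timezone-aware datetime used as the reference time
    """
    # Assumes timestamps are ISO-8601 strings in UTC with millisecond precision.
    cutoff = (now - threshold).strftime("%Y-%m-%dT%H:%M:%S.000Z")
    where = "created_date > ? AND active = 0"

    # 1. Total number of records.
    total = conn.execute("SELECT COUNT(*) FROM druid_tasks").fetchone()[0]

    # 2. Recently finished (inactive) records.
    recent = conn.execute(
        f"SELECT COUNT(*) FROM druid_tasks WHERE {where}", (cutoff,)
    ).fetchone()[0]

    # 3. Time to fetch all fields for those records, without deserializing them.
    start = time.perf_counter()
    rows = conn.execute(
        f"SELECT * FROM druid_tasks WHERE {where}", (cutoff,)
    ).fetchall()
    elapsed = time.perf_counter() - start

    return {
        "total": total,
        "recent_finished": recent,
        "fetched": len(rows),
        "fetch_seconds": elapsed,
    }
```

   If the fetch in step 3 is fast while the task list is still slow to load, 
that points towards deserialization rather than the query itself.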




