julienlafont-tabmo opened a new issue #11741:
URL: https://github.com/apache/druid/issues/11741


   Hello,
   
   I noticed that lookups (druid-lookups-cached-global) are loaded at the 
start of every task, regardless of the task type.
   
   I have the impression (though I may be wrong) that they are loaded in 
contexts where:
    - they are never needed: kill tasks, compact tasks
    - they may be needed, but usually are not: index tasks
   
   For kill tasks, it seems obvious to me that lookups are not needed.
   
   For indexing tasks, some lookups may be needed if they are referenced in 
the spec; otherwise none are. It might be possible to load only the lookups a 
task actually references, and only in the tasks that need them.
   
   And for compaction tasks, I'm not sure. I suppose lookups could be used in 
the DimensionSpec, so it'll be like the indexing case.
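   
   To make the idea concrete, here is a rough sketch of the selection logic, 
not Druid's actual code. It assumes lookups are referenced through 
`lookup(expr, 'name')` expressions somewhere in the task spec JSON. The names 
`required_lookups` and `TASK_TYPES_WITHOUT_LOOKUPS` are hypothetical, and the 
regex only handles simple (non-nested) first arguments:

```python
import json
import re

# Hypothetical policy: task types assumed never to evaluate lookups.
TASK_TYPES_WITHOUT_LOOKUPS = {"kill"}

# Captures the quoted lookup name in lookup(expr, 'name').
# Limitation: the first argument must not contain parentheses.
LOOKUP_CALL = re.compile(r"lookup\s*\([^()]*,\s*'([^']+)'\s*\)")


def required_lookups(task_spec_json):
    """Return the set of lookup names a task spec references (illustrative only)."""
    spec = json.loads(task_spec_json)
    if spec.get("type") in TASK_TYPES_WITHOUT_LOOKUPS:
        return set()

    names = set()

    def walk(node):
        # Recursively visit every string value in the spec.
        if isinstance(node, dict):
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for value in node:
                walk(value)
        elif isinstance(node, str):
            names.update(LOOKUP_CALL.findall(node))

    walk(spec)
    return names
```

   With something like this, a task would load only the returned lookups 
(possibly none) at startup instead of all of them.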
   
   Loading lookups can be very expensive. In my case, the initial load (50 
SQL lookups) takes several minutes and puts a huge strain on my RDS replica. 
Avoiding unnecessary loads would save a great deal of time and resources.
   
   Do you think this could be optimized?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


