[GitHub] [incubator-druid] capistrant commented on issue #6988: [Improvement] historical fast restart by lazy load columns metadata(20X faster)

GitBox Fri, 20 Sep 2019 08:10:57 -0700

capistrant commented on issue #6988: [Improvement] historical fast restart by 
lazy load columns metadata(20X faster)
URL: https://github.com/apache/incubator-druid/pull/6988#issuecomment-533593456
 
 
   @clintropolis @gianm @pzhdfy We love this patch. It makes our lives much 
easier when rolling out changes and running through upgrades. 
   
   > @pzhdfy I did similar lazy load thing in my code base, but I found there 
is a consequence. If historical node is force killed(kill -9) when it is asked 
to download new segment, it is very likely unzip failure and left a corrupted 
segment folder. When lazy load=false, this segment can be ignored during 
historical startup(loading will fail), but when lazy load=true, this corruption 
is only known when a query comes in. Currently there is no interface to tell 
SegmentLoaderLocalCacheManager to unload this segment or reload this segment, 
thus queries always fail. Currently the only solution is to delete this segment 
folder and restart historical node to ignore this segment(because folder is 
gone).
   > 
   > Better introduce unload/reload interface together in this PR.
   
   I am nervous about this though. As of now we have accepted the risk and are 
willing to intervene when needed. Do you have any plans of addressing this 
comment in this PR or would this be an additional PR. I do think that it would 
get more support if we did analysis on this and solved it if it is indeed a 
problem.
   
   
   
   > Interesting idea, I too have felt the pain of multi-day rollouts for tiers 
of densely loaded historical servers.
   > 
   > I've only scanned changes so far, I'll try to do a full review later. In 
the mean time, if possible could you run query benchmarks before this patch and 
after this patch with `lazyLoadOnStart = false`, as a sanity check to make sure 
that there are not any odd performance effects of introducing the memoized 
suppliers? I don't really _expect_ any noticeable effect, but it would make buy 
in to this change easier. (No worries if you don't have a chance to get to it, 
i'll try to do this myself once I get around to full review).
   > 
   > I would also be interested in the performance cost for queries that _do_ 
have to eat the deserialization when `lazyLoadOnStart = true`, and whether it 
opens up scenarios where the memoizing supplier becomes a sort of choke point 
for the processing pool if the cost is high.
   
   Regarding Clint's comment on performance. We have been running it for nearly 
a week on a cluster with hundreds of thousands of segments and hundreds of 
thousands of queries a day and our metrics collection shows negligible change 
in performance (but this is vs druid 11. If druid 15 had large performance 
gains across the board before this patch there could be a bigger change). We 
added this as a part of our upgrade to druid 15 though so we have not seen 
performance with it off in druid 15, just in druid 11.
   
   Regardless, we are eager to get this reviewed and accepted upstream. Our 
experience with it so far has been great and we'd love to help in whatever way 
possible to get it merged.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org

[GitHub] [incubator-druid] capistrant commented on issue #6988: [Improvement] historical fast restart by lazy load columns metadata(20X faster)

Reply via email to