imply-cheddar commented on PR #12571: URL: https://github.com/apache/druid/pull/12571#issuecomment-1206683385
I'm approving this. While the blog post does show a bimodal distribution where it looks like queries are faster, what we've seen most recently is that this bimodal distribution is effectively non-deterministic, i.e. some queries get impacted, others don't. When we are running Druid, supporting Druid, or using Druid in product contexts, non-determinism basically equates to an inability to reason about how the system will behave. So, we have fallen into a common pattern:

1) Be told that a cluster experiences slow queries sometimes.
2) Look at metrics and identify that there is a non-deterministic case of queries being randomly delayed in query/wait, but only for a subset of segments and not holistically.
3) Be like, "omg, is it fifo again?" and then be like, "oh yeah, we have no clue because it's non-deterministic".
4) Well, let's set it to fifo and then see how things work out.
5) Now that fifo is set to true, we can actually see what's going on, yay!
6) Improve the cluster.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
