gianm commented on issue #5709: Broker resiliency to misbehaving historical nodes
URL: https://github.com/apache/incubator-druid/issues/5709#issuecomment-414200342

> Don't these issues already exist today when druid.broker.retryPolicy.numTries > 1? The idea would be to respect this parameter, but in a smarter way, by trying a different replica (if available) on each try.

IIRC, RetryQueryRunner only retries in a specific situation: when a segment has moved to another server since the time the broker made the query. Specifically it looks at the `X-Druid-Response-Context` header for the key that lists missing segments (as reported by ReportTimelineMissingSegmentQueryRunner). I don't think it retries on errors or anything else like that. So it shouldn't have those issues I mentioned, because it's not as general as what this issue aims to accomplish.

> BTW, looking at RetryQueryRunner, it seems like we're trying numTries + 1 times total (one initial try + numTries retries), contrary to what the docs and the name of the parameter indicate. If there's indeed a bug there then everyone may already have retries enabled by default. I need to run a few unit tests to make sure I'm not missing anything there though.

Interesting, well, either way more unit tests are valuable!
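
To make the retry-count question concrete, here is a minimal Python sketch of the behavior described above. It is not Druid's RetryQueryRunner code. The names `run_with_retries`, `make_attempt`, and the segment IDs are hypothetical, and the loop shape is only an assumption based on the quoted observation (one initial try, then up to numTries retries). It retries only when segments are reported missing, never on errors.

```python
from typing import Callable, List, Set, Tuple

# Hypothetical stand-in for one query attempt: given the segments to query,
# it returns (results, segments reported missing). Not Druid's actual API.
Attempt = Callable[[Set[str]], Tuple[List[str], Set[str]]]


def run_with_retries(
    attempt: Attempt, segments: Set[str], num_tries: int
) -> Tuple[List[str], Set[str], int]:
    """Run one initial attempt, then up to num_tries retries for missing segments."""
    results, missing = attempt(segments)
    attempts = 1
    for _ in range(num_tries):
        if not missing:
            break  # nothing was reported missing, so there is nothing to retry
        more, missing = attempt(missing)  # retry only the missing segments
        results.extend(more)
        attempts += 1
    return results, missing, attempts


def make_attempt(always_missing: Set[str]) -> Attempt:
    """Build a fake attempt that never finds the given segments."""
    def attempt(segs: Set[str]) -> Tuple[List[str], Set[str]]:
        found = sorted(s for s in segs if s not in always_missing)
        still_missing = {s for s in segs if s in always_missing}
        return found, still_missing
    return attempt
```

With this loop shape, `num_tries=1` results in two attempts in total whenever a segment stays missing, which is the kind of off-by-one a unit test on the real class could confirm or rule out.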
