churromorales commented on issue #5709:
URL: https://github.com/apache/druid/issues/5709#issuecomment-1332739289

   @gianm What we have noticed in our k8s environment is that queries fail 
when pods restart and/or nodes go down.  I was thinking a fix could be 
this: 
   
   1. You have the list of servers you are going to send the query to. 
   2. You make the query; if there is a timeout, you check the list of 
currently available servers.
   3. If your original list is not fully contained in the list of current 
servers, just make the query again; it will go to a replica and you will be 
okay. 
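
   Roughly, the steps above could be sketched like this. This is only an 
illustration, not existing Druid code: `RetryOnServerLoss`, `QueryRunner`, 
`shouldRetry`, `queryWithRetry` and the `maxRetries` cap are all hypothetical 
names, and servers are represented as plain host strings.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

public class RetryOnServerLoss {

    /** Hypothetical stand-in for whatever actually sends the query to the servers. */
    public interface QueryRunner {
        List<String> run(Set<String> servers) throws TimeoutException;
    }

    /** Step 3: retry only if at least one originally targeted server has disappeared. */
    public static boolean shouldRetry(Set<String> original, Set<String> current) {
        return !current.containsAll(original);
    }

    public static List<String> queryWithRetry(
            Supplier<Set<String>> currentServers,
            QueryRunner runner,
            int maxRetries) throws TimeoutException {
        // Step 1: remember which servers the query is being sent to.
        Set<String> targets = currentServers.get();
        for (int attempt = 0; ; attempt++) {
            try {
                // Step 2: make the query.
                return runner.run(targets);
            } catch (TimeoutException e) {
                // On timeout, look at the servers that are available now.
                Set<String> current = currentServers.get();
                if (attempt >= maxRetries || !shouldRetry(targets, current)) {
                    // All original servers are still there (or retries are used up):
                    // this is a genuine timeout, so surface it.
                    throw e;
                }
                // Some server went away; query again against the current set
                // so the work lands on a replica.
                targets = current;
            }
        }
    }
}
```

   The `maxRetries` cap is an addition beyond the three steps; it is there so 
a cluster that keeps losing pods cannot make a single query retry forever.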
   
   I think this would solve a lot of the issues our Druid users currently 
have when pods restart and queries fail.  


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

