mounikanakkala opened a new issue #10939:
URL: https://github.com/apache/druid/issues/10939


   ### Affected Version
   
   0.20.0
   
   ### Description
   
   {
       "query/time": 60,
       "query/bytes": -1,
       "success": false,
       "identity": "allowAll",
       "exception": "QueryInterruptedException{msg=java.io.IOException: Connection reset by peer, code=Unknown exception, class=java.util.concurrent.ExecutionException, host=xx.xx.xx.xx:8105}",
       "interrupted": true,
       "reason": "QueryInterruptedException{msg=java.io.IOException: Connection reset by peer, code=Unknown exception, class=java.util.concurrent.ExecutionException, host=host=xx.xx.xx.xx:8105}"
   }
   
   - Cluster size configuration
       - 4 Router and 4 Broker processes, and 6 MiddleManager instances. Each 
MiddleManager appears to run 4 Peon tasks (we have not configured this 
explicitly).
       
   - Steps to reproduce the problem
       - It happens intermittently.
       - What we observed: when a Peon has just finished an ingestion task and 
the Broker simultaneously runs a query that fetches realtime data from the 
MiddleManager, the Broker seems to query that particular Peon. Since the Peon 
has finished its ingestion job, its process may already have exited, so the 
Broker cannot connect to it on that port.
   - Unfortunately, we lost the ingestion task logs in the Druid console because 
this happened yesterday. If we can fetch any logs from the instances, please let 
us know.
   
   ### Summary
   - Our understanding is that Peons are processes that run on MiddleManager 
instances. When a Peon task finishes, the Broker should be aware of it.
   - It looks like the Broker does not know that the Peon is no longer 
available, so it sends the query to that Peon and the query fails with a 
connection exception.
   - Maybe the Broker should be designed to retry against another available 
Peon that serves the same data.
   
   If this summary is not correct, please help us understand the issue.
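
As a stopgap on the client side, the failed query could simply be retried. The sketch below is not Druid behavior or a Druid API: `run_query`, `QueryFailed`, and `is_transient` are hypothetical names, and it assumes the failure reaches the caller with the "Connection reset by peer" message shown in the log above.

```python
import time


class QueryFailed(Exception):
    """Hypothetical error raised by the caller's own query function."""


def is_transient(error_message: str) -> bool:
    # Assumption: the failure we want to retry mentions a connection reset,
    # as in the request log entry above.
    return "Connection reset by peer" in error_message


def query_with_retry(run_query, max_attempts=3, backoff_seconds=0.5, sleep=time.sleep):
    """Call run_query() until it succeeds or max_attempts is reached.

    run_query is a hypothetical zero-argument callable that sends the query
    to the Broker and raises QueryFailed on error.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_query()
        except QueryFailed as exc:
            # Re-raise non-transient errors, or the last error once attempts run out.
            if attempt == max_attempts or not is_transient(str(exc)):
                raise
            # Linear backoff, in the hope that the finished task is no longer
            # targeted by the time the query is sent again.
            sleep(backoff_seconds * attempt)
```

This only hides the symptom from the caller; whether the Broker itself should avoid or retry around a Peon that has exited is the actual question of this issue.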




