pchang388 opened a new issue, #12701:
URL: https://github.com/apache/druid/issues/12701

   Apologies if this breaks any rules. I asked on the Druid forums without 
much success, so I am trying here to reach a different audience. Relevant 
information is below, with more details in the Druid forum post. 
   
   * Druid Version: 0.22.1
   * Kafka Ingestion (idempotent producer)
   * Overlord type: remote
   
   
https://www.druidforum.org/t/kafka-ingestion-peon-tasks-success-but-overlord-shows-failure/7374
   
   In general, when we run all our tasks, we start seeing issues between the 
Overlord and the MM/Peons. Often the Peon shows that the task was successful, 
but the Overlord believes it failed and tries to shut it down. The Overlord 
also becomes sluggish: it takes a while to recognize completed tasks and 
tasks that are trying to start, which seems to point at a 
communication/coordination failure between the Overlord and the MM/Peons. We 
even see task assignment timeouts between the Overlord and MM (PT10M; the 
default is PT5M). 
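
   For reference, a minimal sketch of the Overlord settings this describes. 
The property names are taken from the Druid configuration reference and are 
assumed, not copied from our actual runtime.properties; PT10M is assumed to 
be the value we configured in place of the PT5M default. 

```properties
# Overlord runtime.properties (illustrative fragment, not our full config)

# Run tasks on MiddleManagers rather than locally ("Overlord type: remote").
druid.indexer.runner.type=remote

# How long the Overlord waits after assigning a task to a MiddleManager
# before reporting an error. ISO-8601 duration; the default is PT5M.
druid.indexer.runner.taskAssignmentTimeout=PT10M
```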
   
   The only thing that seems to help is reducing the number of tasks running 
concurrently by suspending certain supervisors, which also suggests that the 
3 Druid services cannot handle the load of our current ingestion. But 
according to system metrics, resource usage is not hitting any limits and 
there is still spare compute available. This is odd, since there are probably 
many users ingesting more data per hour than we do, and we don't see this 
type of issue in their discussions/white papers.
   
   Any help will definitely be appreciated.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

