tanisdlj opened a new issue #11841:
URL: https://github.com/apache/druid/issues/11841


   ### Affected Version
   
   0.22.2
   
   ### Description
   
   - Cluster size: 2 brokers, 2 routers, 2 coordinators, 37 historicals (15 
hot, 21 cold, 1 frozen), 2 overlords, 43 middlemanagers
   - Steps to reproduce the problem: One morning we found that, during the
night, a massive rebalance had happened, leaving many servers at 100% disk usage
while others in the same tier were left empty. After we restarted the coordinator,
segments were better balanced, but we started noticing this issue: many servers
in two different tiers had full disks while reporting that they were not full.
   
   Coordinator log:
   
   ```
   Oct 25 08:59:53 druid-master-1 java[19246]: 2021-10-25T08:59:53,185 ERROR [Master-PeonExec--0] org.apache.druid.server.coordinator.HttpLoadQueuePeon - Server[http://stde2-hhot-10.stde2] Failed segment[datasource_2021-09-04T11:00:00.000Z_2021-09-04T12:00:00.000Z_2021-09-04T11:00:00.016Z_226] request[SegmentChangeRequestLoad] with cause [Exception loading segment[datasource_2021-09-04T11:00:00.000Z_2021-09-04T12:00:00.000Z_2021-09-04T11:00:00.016Z_226]].
   Oct 25 08:59:53 druid-master-1 java[19246]: 2021-10-25T08:59:53,475 ERROR [Master-PeonExec--0] org.apache.druid.server.coordinator.HttpLoadQueuePeon - Server[http://stde2-hhot-01.stde2] Failed segment[datasource_2021-08-31T19:00:00.000Z_2021-08-31T20:00:00.000Z_2021-08-31T19:00:00.019Z_449] request[SegmentChangeRequestLoad] with cause [Exception loading segment[datasource_2021-08-31T19:00:00.000Z_2021-08-31T20:00:00.000Z_2021-08-31T19:00:00.019Z_449]].
   Oct 25 08:59:55 druid-master-1 java[19246]: 2021-10-25T08:59:55,918 ERROR [Master-PeonExec--0] org.apache.druid.server.coordinator.HttpLoadQueuePeon - Server[http://stde2-hhot-01.stde2] Failed segment[datasource_2021-10-24T17:00:00.000Z_2021-10-24T18:00:00.000Z_2021-10-24T17:00:00.012Z_339] request[SegmentChangeRequestLoad] with cause [Exception loading segment[datasource_2021-10-24T17:00:00.000Z_2021-10-24T18:00:00.000Z_2021-10-24T17:00:00.012Z_339]].
   ```
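
   To see which historicals are rejecting loads, the failures in a log like the one above can be tallied per server. This is only a minimal sketch: it assumes the `Server[...] Failed segment[...]` wording shown in the excerpt, and the name `count_load_failures` is hypothetical, not part of Druid.

```python
import re
from collections import Counter

# Matches the "Server[...] Failed segment[...]" part of an HttpLoadQueuePeon
# error line. \s+ also matches line breaks, so entries that were hard-wrapped
# by a mail client or terminal still match.
FAILED_LOAD = re.compile(r"Server\[([^\]]+)\]\s+Failed\s+segment\[([^\]]+)\]")


def count_load_failures(log_text):
    """Return a Counter mapping server URL -> number of failed segment requests."""
    return Counter(server for server, _segment in FAILED_LOAD.findall(log_text))
```

   Applied to the three entries above, this would count two failures for `http://stde2-hhot-01.stde2` and one for `http://stde2-hhot-10.stde2`.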
   
   
![image](https://user-images.githubusercontent.com/1453135/138671787-c80dcd97-ce28-4e29-93ae-a47c501e6e66.png)
   
   
![image](https://user-images.githubusercontent.com/1453135/138672344-a186a940-9be9-4d24-b1d3-c9ba1261fd70.png)
   




