DomGarguilo commented on issue #1791:
URL: https://github.com/apache/accumulo/issues/1791#issuecomment-764745745


   The issue may be caused by this section of code:
   ```java
         // Wait for all of the tablets to be hosted ...
         log.info("Waiting on hosting and balance");
         TabletLocations ds;
         for (ds = TabletLocations.retrieve(ctx, tableName); ds.hostedCount != TABLETS;
             ds = TabletLocations.retrieve(ctx, tableName)) {
           Thread.sleep(1000);
         }
   
         // ... and balanced.
         ctx.instanceOperations().waitForBalance();
         do {
           // Give at least another 5 seconds for migrations to finish up
           Thread.sleep(5000);
           ds = TabletLocations.retrieve(ctx, tableName);
         } while (ds.hostedCount != TABLETS);
   
         // Pray all of our tservers have at least 1 tablet.
         assertEquals(TSERVERS, ds.hosted.keySet().size());
   
         // Kill two tablet servers hosting our tablets. This should put tablets into suspended state,
         // and thus halt balancing.
   
         TabletLocations beforeDeathState = ds;
   
         serverStopper.eliminateTabletServers(ctx, beforeDeathState, 2);
   ```
   The problem may come from how long the test waits for migrations to
   finish. The for loop and the do-while loop share the same exit condition
   (`ds.hostedCount != TABLETS`), so once the for loop has exited, the
   do-while body runs exactly once unless the hosted count changes during its
   sleep. The do-while is therefore effectively a single fixed 5-second sleep,
   and migrations get only those 5 seconds to complete before the
   pre-suspension state of the tablets is recorded. After that, two servers
   are killed, which sometimes takes a while. Tablets may migrate unnoticed
   during that window.
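
One possible alternative to a longer fixed sleep, sketched here under assumptions: keep polling until several consecutive snapshots are identical, with a timeout. `StableWait` and `waitUntilStable` are hypothetical names, not part of Accumulo, and the sketch assumes the snapshot type has a meaningful `equals`. I have not checked whether `TabletLocations` does, so the test might need to compare something like the `ds.hosted` map instead.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of Accumulo.
final class StableWait {

  /**
   * Polls the supplier until it returns an equal value on requiredStable
   * consecutive polls, then returns that value. Throws if the timeout
   * elapses first. Assumes T implements equals() sensibly.
   */
  static <T> T waitUntilStable(Supplier<T> snapshot, int requiredStable,
      long pollMillis, long timeoutMillis) throws InterruptedException {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    T previous = snapshot.get();
    int stable = 0;
    while (stable < requiredStable) {
      if (System.nanoTime() - deadline > 0) {
        throw new IllegalStateException("snapshot did not stabilize in time");
      }
      Thread.sleep(pollMillis);
      T current = snapshot.get();
      // Any change resets the count, so only an unbroken run qualifies.
      stable = current.equals(previous) ? stable + 1 : 0;
      previous = current;
    }
    return previous;
  }
}
```

In the test, the supplier would wrap the existing `TabletLocations.retrieve(ctx, tableName)` call (handling whatever checked exceptions it declares), and the returned snapshot would serve as the pre-death state. This narrows the window but cannot close it: a migration can still start after the last poll.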
   
   I have tried increasing this wait and haven't seen the error since.
   However, the error is infrequent to begin with, so it's hard to tell
   whether this is definitely the cause.

