neils-dev opened a new pull request #2183:
URL: https://github.com/apache/ozone/pull/2183


   ## What changes were proposed in this pull request?
   
   Fixes intermittent unit test failures in hdds.container-service, in 
_TestSchemaOneBackwardsCompatibility_ and _TestBlockDeletingService_.  
These unit tests use the _BackgroundService_ for block deletion.  The service 
runs its worker tasks asynchronously on a thread pool executor.  The tests 
'start' the _PeriodicalTask_ of the _BackgroundService_ and expect the workers 
to have finished deleting blocks by the time they check for expected results.  
_However_, the executor returns control to the test before the worker tasks 
have run to completion.  This causes an **intermittent error**: the results 
are sometimes not ready when the test checks them.
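
The race described above can be illustrated with a minimal, self-contained sketch. It does not use any Ozone classes: `AsyncRaceSketch`, `observe`, and the `deleted` counter are hypothetical names, and a latch stands in for a slow worker so that the early check is deterministic rather than intermittent.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a test that checks results right after starting
// an asynchronous worker. Not the Ozone implementation.
class AsyncRaceSketch {

  /** Returns {count seen right after submit, count seen after waiting}. */
  static int[] observe() throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    AtomicInteger deleted = new AtomicInteger();
    CountDownLatch gate = new CountDownLatch(1);
    try {
      // submit() returns immediately; the worker is still blocked on the gate.
      Future<?> worker = pool.submit(() -> {
        gate.await();               // simulates a worker that has not finished yet
        deleted.incrementAndGet();  // simulates deleting one block
        return null;
      });

      int early = deleted.get();    // checked too early: work is not done

      gate.countDown();             // let the worker proceed
      worker.get();                 // block until the worker has completed
      int late = deleted.get();     // checked after completion

      return new int[] {early, late};
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    int[] seen = observe();
    System.out.println("early=" + seen[0] + " late=" + seen[1]);
  }
}
```

In the real tests the worker is not gated, so the early check sometimes sees the finished result and sometimes does not, which is why the failure is intermittent.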
   
   The patch _**extends**_ the _PeriodicalTask_ of the background service so 
that the asynchronous worker tasks run to completion before control returns to 
the unit test caller.  This is implemented in the _BlockDeletingServiceTestImpl_ 
_**PeriodicalTaskTestImpl**_ implementation.
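
As a rough illustration of the run-to-completion idea, and not the actual _PeriodicalTaskTestImpl_ code, the sketch below submits tasks to a thread pool and blocks until every one has finished. `RunToCompletionSketch`, `runToCompletion`, and `demo` are hypothetical names, and `ExecutorService.invokeAll` is used here only as one standard way to wait for all tasks; the patch itself may wait differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: run worker tasks on a pool, but do not return to the
// caller until all of them have completed.
class RunToCompletionSketch {

  /** Runs all tasks on the pool and returns the sum of their results. */
  static int runToCompletion(ExecutorService pool,
      List<Callable<Integer>> tasks) throws Exception {
    int total = 0;
    // invokeAll blocks until every task has completed.
    for (Future<Integer> done : pool.invokeAll(tasks)) {
      total += done.get();
    }
    return total;
  }

  /** Three slow tasks, each pretending to delete two blocks. */
  static int demo() throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      List<Callable<Integer>> tasks = new ArrayList<>();
      for (int i = 0; i < 3; i++) {
        tasks.add(() -> {
          TimeUnit.MILLISECONDS.sleep(100); // artificial delay per task
          return 2;
        });
      }
      return runToCompletion(pool, tasks);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println("blocks deleted: " + demo());
  }
}
```

Because the caller only regains control after all tasks are done, a delay inside a task slows the test down but no longer changes its outcome.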
   
   
   ## What is the link to the Apache JIRA
   https://issues.apache.org/jira/browse/HDDS-5099
   
   ## How was this patch tested?
   
   The patch was tested with the unit tests 
_**TestSchemaOneBackwardsCompatibility**_ and _**TestBlockDeletingService**_.
   
   The error is intermittent.  To reproduce it reliably, force the failure by 
injecting a fault into the _BlockDeletingService.java_ implementation: add a 
small delay to the BackgroundTask call():
   ```
       public BackgroundTaskResult call() throws Exception {
         ContainerBackgroundTaskResult crr;
         final Container container = ozoneContainer.getContainerSet()
             .getContainer(containerData.getContainerID());
         container.writeLock();
         File dataDir = new File(containerData.getChunksPath());
         long startTime = Time.monotonicNow();
         // Scan container's db and get list of under deletion blocks
         try (ReferenceCountedDB meta = BlockUtils.getDB(containerData, conf)) {
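           // Injected fault: delay the task so the test checks results too early
           TimeUnit.MILLISECONDS.sleep(100);
           // ... rest of call() unchanged ...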
   ```
   **pre-patch**
   
   `hadoop-hdds/container-service$ mvn -Dtest=TestSchemaOneBackwardsCompatibility#testDelete test`
   
![TestSchemaOneBackwardsCompatibility_prepatch](https://user-images.githubusercontent.com/81126310/116017324-86fcfb00-a5fc-11eb-8cd7-06924ea9b567.png)
   
   `hadoop-hdds/container-service$ mvn -Dtest=TestBlockDeletingService test`
   
![testBlockDeletingService_prepatch](https://user-images.githubusercontent.com/81126310/116017350-9714da80-a5fc-11eb-8f86-840727dc1c8e.png)
   
   **after patch is applied**
   
   `hadoop-hdds/container-service$ mvn -Dtest=TestSchemaOneBackwardsCompatibility#testDelete test`
   
![TestSchemaOneBackwardsCompatibility_afterpatch](https://user-images.githubusercontent.com/81126310/116017336-8cf2dc00-a5fc-11eb-8a6d-a72d2166789b.png)
   
   `hadoop-hdds/container-service$ mvn -Dtest=TestBlockDeletingService test`
   
   
![TestBlockDeletingService_afterpatch](https://user-images.githubusercontent.com/81126310/116017380-a85de700-a5fc-11eb-829c-bd0973a900b4.png)
   
   

