Gargi-jais11 opened a new pull request, #9465:
URL: https://github.com/apache/ozone/pull/9465

   ## What changes were proposed in this pull request?
   
   - EstimatedBytesToMove and EstimatedTimeLeft should not show non-zero values 
when no container movement happens.
   - Improve threshold validation error message. When running the DiskBalancer 
update command with a threshold value of 100.0, the operation fails on all 
datanodes with the following error:
   ```
   bash> ozone admin datanode diskbalancer update -t 100.0 --in-service-datanodes
   Error on node [DN-1]: Threshold must be a percentage(double) in the range 0 to 100.
   ```
   A threshold of 0 means that any deviation from ideal usage (even 0.01%) 
triggers container movement. This leads to excessive, continuous balancing 
and causes unnecessary I/O overhead and resource consumption.
   
   A threshold can never be 100.0% either, as that would allow moving 100% of 
a disk's contents, effectively emptying one disk.
   
   Suggested improvement: the error message should clarify that both 0 and 100 
are excluded. The validation is updated to also exclude 0, requiring the 
threshold to be in the open range (0, 100).
   
   New error message:
   ```
   Error on node [DN-1]: Threshold must be a percentage(double) in the range 0 to 100 both exclusive.
   ```
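
As a rough illustration of the open-interval check described above, a minimal Java sketch might look like the following. The class and method names (`ThresholdCheck`, `validate`) are hypothetical and not taken from the patch; only the error message text comes from the output shown above.

```java
// Hypothetical sketch; not the actual Ozone DiskBalancer code.
final class ThresholdCheck {
  // Message text copied from the new error output in this description.
  static final String ERROR =
      "Threshold must be a percentage(double) in the range 0 to 100 both exclusive.";

  /** Accepts only values strictly between 0 and 100; NaN is rejected as well. */
  static double validate(double threshold) {
    // Written as a negated conjunction so that NaN fails the check.
    if (!(threshold > 0.0 && threshold < 100.0)) {
      throw new IllegalArgumentException(ERROR);
    }
    return threshold;
  }
}
```

Under this sketch, `ThresholdCheck.validate(0.001)` passes, while `0`, `100`, and negative values throw `IllegalArgumentException`.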
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-14110
   
   ## How was this patch tested?
   
   Added check for estimatedBytes in unit test `TestDiskBalancerService`.
   Tested manually:
   before patch:
   ```
   bash-5.1$ ozone admin datanode diskbalancer status --in-service-datanodes
   Status result:
   Datanode                            Status          Threshold(%)    BandwidthInMB   Threads      StopAfterDiskEven    SuccessMove  FailureMove  BytesMoved(MB)  EstBytesToMove(MB) EstTimeLeft(min)
   ozone-datanode-5.ozone_default      RUNNING         0.0001          10              5            false                0            0            0               638                2
   ozone-datanode-3.ozone_default      RUNNING         0.0001          10              5            false                0            0            0               1                  1
   ozone-datanode-4.ozone_default      RUNNING         0.0001          10              5            false                0            0            0               1                  1
   ozone-datanode-2.ozone_default      RUNNING         0.0001          10              5            false                0            0            0               698                2
   ozone-datanode-1.ozone_default      RUNNING         0.0001          10              5            false                0            0            0               3                  1
   
   Note: Estimated time left is calculated based on the estimated bytes to move and the configured disk bandwidth.
   ```
   After the code changes, the output is fixed:
   ```
   bash-5.1$ ozone admin datanode diskbalancer report --in-service-datanodes
   Report result:
   Datanode                                           VolumeDensity
   ozone-datanode-2.ozone_default                     8.413243594594944E-4
   ozone-datanode-5.ozone_default                     8.296842069073773E-4
   ozone-datanode-3.ozone_default                     7.682500684380311E-4
   ozone-datanode-1.ozone_default                     7.585499413112762E-4
   ozone-datanode-4.ozone_default                     7.507898396098833E-4
   
   bash-5.1$ ozone admin datanode diskbalancer status --in-service-datanodes
   Status result:
   Datanode                            Status          Threshold(%)    BandwidthInMB   Threads      StopAfterDiskEven    SuccessMove  FailureMove  BytesMoved(MB)  EstBytesToMove(MB) EstTimeLeft(min)
   ozone-datanode-1.ozone_default      RUNNING         0.0001          10              5            false                0            0            0               0                  0
   ozone-datanode-4.ozone_default      RUNNING         0.0001          10              5            false                0            0            0               0                  0
   ozone-datanode-3.ozone_default      RUNNING         0.0001          10              5            false                0            0            0               0                  0
   ozone-datanode-5.ozone_default      RUNNING         0.0001          10              5            false                0            0            0               0                  0
   ozone-datanode-2.ozone_default      RUNNING         0.0001          10              5            false                0            0            0               0                  0
   
   Note: Estimated time left is calculated based on the estimated bytes to move and the configured disk bandwidth.
   ```
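
The note in the status output says the estimated time left is derived from the estimated bytes to move and the configured disk bandwidth. One reading that is consistent with the numbers shown above (638 MB at bandwidth 10 gives 2 min, 1 MB gives 1 min, 0 MB gives 0 min) is a rounded-up division, assuming `BandwidthInMB` means MB per second. This is an assumption for illustration only: the source does not state the actual formula, and the class and method names below are hypothetical.

```java
// Hypothetical sketch; the real DiskBalancer formula is not given in this description.
final class EstimateSketch {
  /**
   * Estimated minutes left, rounded up to a whole minute.
   * Assumes bandwidth is expressed in MB per second (an assumption, not confirmed).
   */
  static long estTimeLeftMinutes(long estBytesToMoveMb, long bandwidthMbPerSec) {
    // Nothing to move (or no usable bandwidth value): report zero.
    if (estBytesToMoveMb <= 0 || bandwidthMbPerSec <= 0) {
      return 0L;
    }
    double seconds = (double) estBytesToMoveMb / bandwidthMbPerSec;
    return (long) Math.ceil(seconds / 60.0);
  }
}
```

With these assumptions, an estimate of zero bytes to move yields zero minutes left, which matches the post-patch status output.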
   
   Threshold error output:
   ```
   bash-5.1$ ozone admin datanode diskbalancer start -t 0 --in-service-datanodes
   Error on node [172.18.0.11:19864]: Threshold must be a percentage(double) in the range 0 to 100 both exclusive.
   Error on node [172.18.0.10:19864]: Threshold must be a percentage(double) in the range 0 to 100 both exclusive.
   Error on node [172.18.0.8:19864]: Threshold must be a percentage(double) in the range 0 to 100 both exclusive.
   Error on node [172.18.0.9:19864]: Threshold must be a percentage(double) in the range 0 to 100 both exclusive.
   Error on node [172.18.0.7:19864]: Threshold must be a percentage(double) in the range 0 to 100 both exclusive.
   Failed to start DiskBalancer on nodes: [172.18.0.11:19864, 172.18.0.10:19864, 172.18.0.8:19864, 172.18.0.9:19864, 172.18.0.7:19864]
   bash-5.1$ ozone admin datanode diskbalancer start -t 100 --in-service-datanodes
   Error on node [172.18.0.11:19864]: Threshold must be a percentage(double) in the range 0 to 100 both exclusive.
   Error on node [172.18.0.10:19864]: Threshold must be a percentage(double) in the range 0 to 100 both exclusive.
   Error on node [172.18.0.8:19864]: Threshold must be a percentage(double) in the range 0 to 100 both exclusive.
   Error on node [172.18.0.9:19864]: Threshold must be a percentage(double) in the range 0 to 100 both exclusive.
   Error on node [172.18.0.7:19864]: Threshold must be a percentage(double) in the range 0 to 100 both exclusive.
   Failed to start DiskBalancer on nodes: [172.18.0.11:19864, 172.18.0.10:19864, 172.18.0.8:19864, 172.18.0.9:19864, 172.18.0.7:19864]
   bash-5.1$ ozone admin datanode diskbalancer start -t 0.001 --in-service-datanodes
   Started DiskBalancer on all IN_SERVICE nodes.
   ```

