[ https://issues.apache.org/jira/browse/HDFS-10904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anu Engineer resolved HDFS-10904.
---------------------------------
    Resolution: Not A Problem

> Need a new Result state for DiskBalancerWorkStatus to indicate the final Plan
> step errors and stuck rebalancing
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-10904
>                 URL: https://issues.apache.org/jira/browse/HDFS-10904
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: balancer & mover
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Manoj Govindassamy
>            Assignee: Manoj Govindassamy
>             Fix For: 2.9.0
>
>
> * A DiskBalancer {{NodePlan}} might include a single {{MoveStep}} or a list
> of MoveSteps to perform the requested disk balancing operation.
> * {{DiskBalancerWorkStatus}} tracks the current disk balancing operation
> status for the {{Plan}} just submitted.
> * {{DiskBalancerWorkStatus#Result}} has the following states, and the state
> transitions of {{currentResult}} do not seem to be driven entirely by the
> disk balancing operation. In particular, the transition to DONE happens only
> upon QueryResult, which can be improved.
> {code}
>     /** Various result values. **/
>     public enum Result {
>       NO_PLAN(0),
>       PLAN_UNDER_PROGRESS(1),
>       PLAN_DONE(2),
>       PLAN_CANCELLED(3);
>
> DiskBalancer
>   cancelPlan(String)
>     this.currentResult = Result.PLAN_CANCELLED;
>   DiskBalancer(String, Configuration, BlockMover)
>     this.currentResult = Result.NO_PLAN;
>   queryWorkStatus()
>     this.currentResult = Result.PLAN_DONE;
>   shutdown()
>     this.currentResult = Result.NO_PLAN;
>     this.currentResult = Result.PLAN_CANCELLED;
>   submitPlan(String, long, String, String, boolean)
>     this.currentResult = Result.PLAN_UNDER_PROGRESS;
> {code}
> * More importantly, when the final {{MoveStep}} of the {{NodePlan}} fails,
> the currentResult state is stuck in {{PLAN_UNDER_PROGRESS}} forever. A user
> querying the status will assume the operation is in progress when in reality
> it is not making any progress. The user can also run the {{Query}} command
> with the _verbose_ option, which displays more details about the operation,
> including details about errors encountered.
> ** Query Output:
> {code}
> Plan File: <_file_path_>
> Plan ID: <_plan_hash_>
> Result: PLAN_UNDER_PROGRESS
> {code}
> **
> {code}
> "sourcePath" : "/data/disk2/hdfs/dn",
> "destPath" : "/data/disk3/hdfs/dn",
> "workItem" :
> .. .. ..
> "errorCount" : 0,
> "errMsg" : null,
> .. ..
> "maxDiskErrors" : 5,
> .. .. ..
> {code}
> ** But the user has to decipher these details to work out that the disk
> balancing operation is stuck, as the top-level Result still says
> {{PLAN_UNDER_PROGRESS}}. So, we want the DiskBalancer to differentiate
> between an in-progress operation and one that is stuck or has ended with a
> final-step error.
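
The distinction the description asks for can be illustrated with a minimal, self-contained sketch. This is not the Hadoop DiskBalancer code, and the issue was resolved as Not A Problem, so nothing like it was necessarily adopted. The class name PlanStatusSketch, the PLAN_FAILED value, and the runPlan method are all hypothetical; only the first four enum values and the name queryWorkStatus come from the description. As a simplification, any failing step ends the plan here, and each boolean stands in for the outcome of one MoveStep.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch only; not the Hadoop DiskBalancer implementation. */
public class PlanStatusSketch {

    /** The first four values appear in the issue; PLAN_FAILED is a hypothetical addition. */
    public enum Result { NO_PLAN, PLAN_UNDER_PROGRESS, PLAN_DONE, PLAN_CANCELLED, PLAN_FAILED }

    private Result currentResult = Result.NO_PLAN;

    /**
     * Runs a plan whose steps are modeled as booleans (true = step succeeded).
     * The worker itself moves the state to a terminal value, so the state
     * never stays at PLAN_UNDER_PROGRESS once the steps have finished.
     */
    public Result runPlan(List<Boolean> stepOutcomes) {
        currentResult = Result.PLAN_UNDER_PROGRESS;
        for (boolean stepSucceeded : stepOutcomes) {
            if (!stepSucceeded) {
                // Simplification: the first failing step ends the plan.
                currentResult = Result.PLAN_FAILED;
                return currentResult;
            }
        }
        currentResult = Result.PLAN_DONE;
        return currentResult;
    }

    /** Read-only query: reports the state without changing it. */
    public Result queryWorkStatus() {
        return currentResult;
    }

    public static void main(String[] args) {
        PlanStatusSketch sketch = new PlanStatusSketch();
        sketch.runPlan(Arrays.asList(true, false));
        System.out.println(sketch.queryWorkStatus());
    }
}
```

In this sketch the query is a pure read, so a caller sees a terminal state as soon as the work ends, whether or not anyone has queried in between.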