keith-turner commented on issue #4687:
URL: https://github.com/apache/accumulo/issues/4687#issuecomment-2253101413

   @kevinrr888 here are some of the things I was thinking about.  
   
   Each check could have a well-defined name, so that a user can list all 
available checks and run just one of them.  Some checks may take a long time 
to run, and a user who wants a specific check should not have to wait on 
everything else.
   
   Was also thinking that internally the checks could all implement a common 
interface and declare dependencies on other checks, so that a check's 
dependencies can be run first.
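
A minimal sketch of that idea, assuming a hypothetical `DependentCheck` interface and `CheckOrdering` helper (neither name exists in Accumulo). It orders checks with a depth-first walk so that every check comes after the checks it depends on:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical interface; the names are illustrative only.
interface DependentCheck {
  String name();

  // Names of the checks that must run before this one.
  List<String> dependsOn();
}

final class CheckOrdering {
  // Returns check names ordered so that each check follows its dependencies.
  static List<String> runOrder(Map<String, DependentCheck> checks) {
    Set<String> done = new LinkedHashSet<>();
    Set<String> inProgress = new LinkedHashSet<>();
    for (String name : checks.keySet()) {
      visit(name, checks, done, inProgress);
    }
    return new ArrayList<>(done);
  }

  private static void visit(String name, Map<String, DependentCheck> checks,
      Set<String> done, Set<String> inProgress) {
    if (done.contains(name)) {
      return;
    }
    if (!inProgress.add(name)) {
      throw new IllegalStateException("Dependency cycle at " + name);
    }
    DependentCheck check = checks.get(name);
    if (check == null) {
      throw new IllegalArgumentException("Unknown check " + name);
    }
    for (String dep : check.dependsOn()) {
      visit(dep, checks, done, inProgress);
    }
    inProgress.remove(name);
    done.add(name);
  }

  // Small helper that builds a check for examples and tests.
  static DependentCheck check(String name, String... deps) {
    List<String> depList = List.of(deps);
    return new DependentCheck() {
      @Override
      public String name() {
        return name;
      }

      @Override
      public List<String> dependsOn() {
        return depList;
      }
    };
  }
}
```

Running a single named check would then amount to calling `visit` for just that name, which pulls in only the checks it depends on.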
   
   
   So maybe we could have something like the following to list checks that 
could be run.
   
   ```
   accumulo admin check --list
     Check Name        Description                                                                      Depends on
   
     system_config     Validate the system config stored in zookeeper
     root_metadata     Checks integrity of the root tablet metadata stored in zookeeper                 system_config
     root_table        Scans all the tablet metadata stored in the root table and checks integrity      root_metadata
     metadata_table    Scans all the tablet metadata stored in the metadata table and checks integrity  root_table
     system_files      Checks that files in system tablet metadata exist in DFS                         root_table
     user_files        Checks that files in user tablet metadata exist in DFS                           metadata_table
   ```
   
   Then to run the checks, we could do something like the following, which 
runs all checks in dependency order and skips any check whose dependency 
failed.  Was also thinking the command could print high-level status to stdout 
and details about failures to stderr (so in the example below the failure 
details would go to check.log).  Ideally what ends up in check.log would be 
some machine-readable format like JSON, but not really sure how this would 
play out.
   
   ```
   accumulo admin check run 2> check.log
   
     Check Name       Status
   
     system_config    OK
     root_metadata    OK
     root_table       OK
     metadata_table   FAILED
     system_files     OK
     user_files       SKIPPED_DEPENDENCY_FAILED
   ```
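
The run behavior described above could be sketched roughly as follows. All class, enum, and method names here are hypothetical, the `Predicate` stands in for the real check logic, and the one-line JSON written to the error stream is only an assumed format. The dependencies are copied from the proposed `--list` output.

```java
import java.io.PrintStream;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class CheckRunSketch {
  enum Status { OK, FAILED, SKIPPED_DEPENDENCY_FAILED }

  enum Check {
    SYSTEM_CONFIG, ROOT_METADATA, ROOT_TABLE, METADATA_TABLE, SYSTEM_FILES, USER_FILES;

    // Static dependency graph, taken from the proposed --list output.
    List<Check> dependencies() {
      switch (this) {
        case ROOT_METADATA: return List.of(SYSTEM_CONFIG);
        case ROOT_TABLE: return List.of(ROOT_METADATA);
        case METADATA_TABLE: return List.of(ROOT_TABLE);
        case SYSTEM_FILES: return List.of(ROOT_TABLE);
        case USER_FILES: return List.of(METADATA_TABLE);
        default: return List.of();
      }
    }
  }

  // Runs every check after its dependencies. Status lines go to 'out' and
  // failure details go to 'err'. The predicate returns true when a check passes.
  static Map<Check, Status> runAll(Predicate<Check> passes, PrintStream out,
      PrintStream err) {
    Map<Check, Status> results = new EnumMap<>(Check.class);
    for (Check check : Check.values()) {
      run(check, passes, results, err);
    }
    results.forEach((check, status) ->
        out.printf("  %-16s %s%n", check.name().toLowerCase(), status));
    return results;
  }

  private static Status run(Check check, Predicate<Check> passes,
      Map<Check, Status> results, PrintStream err) {
    Status known = results.get(check);
    if (known != null) {
      return known; // already ran (or was skipped) as someone's dependency
    }
    Status status = Status.OK;
    for (Check dep : check.dependencies()) {
      if (run(dep, passes, results, err) != Status.OK) {
        status = Status.SKIPPED_DEPENDENCY_FAILED;
      }
    }
    if (status == Status.OK && !passes.test(check)) {
      status = Status.FAILED;
      err.println("{\"check\":\"" + check.name().toLowerCase()
          + "\",\"status\":\"FAILED\"}");
    }
    results.put(check, status);
    return status;
  }
}
```

Calling `CheckRunSketch.runAll(c -> c != CheckRunSketch.Check.METADATA_TABLE, System.out, System.err)` simulates a failing `metadata_table` check: `user_files` is skipped, while `system_files` still runs because it only depends on `root_table`.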
   
   Could also support a regex pattern for selecting which checks to run by 
their well-known names.
   
   ```
   accumulo admin check --name_pattern ".*_files"  2>check.log
   
     Check Name       Status
   
     system_files     OK
     user_files       OK
   ```
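
Selecting by name could be a thin layer over `java.util.regex`. A minimal sketch, assuming full-match semantics and a hypothetical `CheckSelection` helper (neither is an Accumulo API). Note that a glob-style pattern such as `*.files` is not a valid Java regex, whereas something like `.*_files` selects both file checks.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class CheckSelection {
  // Well-known check names from the proposed --list output.
  static final List<String> CHECK_NAMES = List.of("system_config", "root_metadata",
      "root_table", "metadata_table", "system_files", "user_files");

  // Returns the check names that fully match the given regular expression.
  // Pattern.compile throws PatternSyntaxException if the regex is malformed.
  static List<String> select(String regex) {
    Pattern pattern = Pattern.compile(regex);
    return CHECK_NAMES.stream()
        .filter(name -> pattern.matcher(name).matches())
        .collect(Collectors.toList());
  }
}
```

Whether the dependencies of the selected checks should also run is left open here; the sketch only covers the name matching.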
   
   
   In the code, the set of checks will be static, so we could use an enum to 
specify the top-level set of all checks.  Then the list and run commands could 
just operate on all the enum values.  The enum could offer methods to get 
check objects and to get dependencies.  Thinking the enum will make specifying 
the dependency graph (which is also static) in code really easy.
   
   
   ```
   // Not sure what to name this
   enum Check {
      SYSTEM_CONFIG,
      ROOT_TABLE,
      ROOT_METADATA,
      METADATA_TABLE,
      SYSTEM_FILES,
      USER_FILES;
   
      // returns the list of other checks this check depends on
      List<Check> getDependencies();
   
      // returns a well-defined interface for running a check
      CheckRunner getCheckRunner();
   }
   
   ```
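
One way the enum outline above could be made to compile, as a sketch only. It assumes a hypothetical single-method `CheckRunner` interface and a placeholder runner that always passes. The constants are declared in dependency order (unlike the outline) so that each one can pass earlier constants to its constructor.

```java
import java.util.List;

// Hypothetical interface; the outline only says the enum returns "a well
// defined interface for running a check".
interface CheckRunner {
  // Returns true when the check passes.
  boolean run();
}

enum Check {
  // Declared in dependency order so each constant refers only to earlier ones.
  SYSTEM_CONFIG,
  ROOT_METADATA(SYSTEM_CONFIG),
  ROOT_TABLE(ROOT_METADATA),
  METADATA_TABLE(ROOT_TABLE),
  SYSTEM_FILES(ROOT_TABLE),
  USER_FILES(METADATA_TABLE);

  private final List<Check> dependencies;

  Check(Check... dependencies) {
    this.dependencies = List.of(dependencies);
  }

  // Returns the other checks this check depends on.
  List<Check> getDependencies() {
    return dependencies;
  }

  // Placeholder runner that always passes. A real runner would inspect
  // ZooKeeper, the root and metadata tables, or DFS.
  CheckRunner getCheckRunner() {
    return () -> true;
  }
}
```

With this shape, a list command could iterate over `Check.values()` and print each constant's name and `getDependencies()`.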
   
   Not sure if this overall structure is workable.  If we can get a structure 
for the command laid down along with a few initial checks, then we can start 
adding more checks as individual PRs.
   

