keith-turner commented on issue #4687:
URL: https://github.com/apache/accumulo/issues/4687#issuecomment-2253101413
@kevinrr888 here are some of the things I was thinking about.
Each check could have a well-defined name, so that a user can run only that
check and can list all available checks. Some checks may take a long time to
run, and a user who wants one specific check should not have to wait on
everything else.
Was also thinking that internally the checks could all implement a common
interface and declare dependencies on other checks, so that a check's
dependencies are run first.
So maybe we could have something like the following to list checks that
could be run.
```
accumulo admin check --list
Check Name      Description                                                                      Depends on
system_config   Validate the system config stored in zookeeper
root_metadata   Checks integrity of the root tablet metadata stored in zookeeper                 system_config
root_table      Scans all the tablet metadata stored in the root table and checks integrity      root_metadata
metadata_table  Scans all the tablet metadata stored in the metadata table and checks integrity  root_table
system_files    Checks that files in system tablet metadata exist in DFS                         root_table
user_files      Checks that files in user tablet metadata exist in DFS                           metadata_table
```
Then to run checks, we could do something like the following, which runs all
checks in dependency-graph order and skips any check whose dependency failed.
Was also thinking the command could print high-level status to stdout and
details about failures to stderr (so in the example below the failure details
would go to check.log). Ideally what ends up in check.log would be some
machine-readable format like JSON, but not really sure how this would play
out.
```
accumulo admin check run 2> check.log
Check Name      Status
system_config   OK
root_metadata   OK
root_table      OK
metadata_table  FAILED
system_files    OK
user_files      SKIPPED_DEPENDENCY_FAILED
```
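A rough sketch of the skip-on-failed-dependency behavior described above. Everything here is hypothetical: the `CheckRunSketch` class, the `runAll` method, the `Predicate<String>` standing in for real check logic, and the shape of the JSON line written to stderr are assumptions for illustration only, not existing Accumulo code.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class CheckRunSketch {
  // Check name -> names of the checks it depends on. Entries are inserted so
  // that every dependency appears before the checks that need it.
  static final Map<String, List<String>> DEPS = new LinkedHashMap<>();
  static {
    DEPS.put("system_config", List.of());
    DEPS.put("root_metadata", List.of("system_config"));
    DEPS.put("root_table", List.of("root_metadata"));
    DEPS.put("metadata_table", List.of("root_table"));
    DEPS.put("system_files", List.of("root_table"));
    DEPS.put("user_files", List.of("metadata_table"));
  }

  // Runs each check through the supplied predicate (true means the check
  // passed) and returns check name -> status in execution order.
  static Map<String, String> runAll(Predicate<String> check) {
    Map<String, String> status = new LinkedHashMap<>();
    for (Map.Entry<String, List<String>> e : DEPS.entrySet()) {
      String name = e.getKey();
      // A check only runs when every dependency finished with OK.
      boolean depsOk = e.getValue().stream().allMatch(d -> "OK".equals(status.get(d)));
      if (!depsOk) {
        status.put(name, "SKIPPED_DEPENDENCY_FAILED");
      } else if (check.test(name)) {
        status.put(name, "OK");
      } else {
        status.put(name, "FAILED");
        // Failure details go to stderr (assumed JSON shape).
        System.err.println("{\"check\":\"" + name + "\",\"status\":\"FAILED\"}");
      }
      // High-level status goes to stdout.
      System.out.printf("%-15s %s%n", name, status.get(name));
    }
    return status;
  }
}
```

Calling `CheckRunSketch.runAll(name -> !name.equals("metadata_table"))` simulates a failing `metadata_table` check: `system_files` still runs because it only depends on `root_table`, while `user_files` is skipped.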
Could also support a regex pattern for selecting which checks to run by
their well-known names.
```
accumulo admin check --name_pattern ".*_files" 2>check.log
Check Name      Status
system_files    OK
user_files      OK
```
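A minimal sketch of how name selection by regex might work, assuming `java.util.regex` and full-string matching. The `CheckSelectSketch` class and `select` method are hypothetical names. Note that `java.util.regex` rejects a pattern that starts with `*` (it is a dangling quantifier), so a suffix match on the file checks is written `.*_files` rather than glob-style.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class CheckSelectSketch {
  // The well-known check names from the listing above, in dependency order.
  static final List<String> NAMES = List.of("system_config", "root_metadata",
      "root_table", "metadata_table", "system_files", "user_files");

  // Returns the check names that fully match the given regular expression,
  // preserving the order of NAMES.
  static List<String> select(String regex) {
    Pattern p = Pattern.compile(regex);
    return NAMES.stream().filter(n -> p.matcher(n).matches()).collect(Collectors.toList());
  }
}
```

Whether selecting a check should also pull in its dependencies is left open here; this sketch only filters by name.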
In the code the set of checks will be static, so we could use an enum to
specify the top-level set of all checks. Then the list and run commands could
just operate on all the enum values. The enum could offer methods to get
check objects and to get dependencies. Thinking the enum will make specifying
the dependency graph (which is also static) in code really easy.
```
// Not sure what to name this
enum Check {
  SYSTEM_CONFIG,
  ROOT_TABLE,
  ROOT_METADATA,
  METADATA_TABLE,
  SYSTEM_FILES,
  USER_FILES;

  // returns the list of other checks this check depends on
  // (method bodies omitted in this sketch)
  List<Check> getDependencies();

  // returns a well-defined interface for running a check
  CheckRunner getCheckRunner();
}
```
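To make the idea a bit more concrete, here is one way the enum could carry both the runner and the static dependency graph. This is only a sketch under assumptions: the `CheckSketch` name, the single-method shape of `CheckRunner` (a `runCheck()` returning a boolean), and the placeholder lambdas that always pass are all made up for illustration.

```java
import java.util.List;

// Hypothetical runner interface; a real one would likely take a server
// context and report details rather than a bare boolean.
interface CheckRunner {
  boolean runCheck();
}

enum CheckSketch {
  // Placeholder runners that always pass; real checks would go here.
  SYSTEM_CONFIG(() -> true),
  ROOT_METADATA(() -> true),
  ROOT_TABLE(() -> true),
  METADATA_TABLE(() -> true),
  SYSTEM_FILES(() -> true),
  USER_FILES(() -> true);

  private final CheckRunner runner;

  CheckSketch(CheckRunner runner) {
    this.runner = runner;
  }

  // Returns the checks this check depends on. A switch keeps the graph in
  // one place and does not depend on the declaration order of the constants.
  List<CheckSketch> getDependencies() {
    switch (this) {
      case SYSTEM_CONFIG:
        return List.of();
      case ROOT_METADATA:
        return List.of(SYSTEM_CONFIG);
      case ROOT_TABLE:
        return List.of(ROOT_METADATA);
      case METADATA_TABLE:
        return List.of(ROOT_TABLE);
      case SYSTEM_FILES:
        return List.of(ROOT_TABLE);
      case USER_FILES:
        return List.of(METADATA_TABLE);
      default:
        throw new IllegalStateException("unknown check " + this);
    }
  }

  // Returns the object that actually performs this check.
  CheckRunner getCheckRunner() {
    return runner;
  }
}
```

With this shape, a list command could iterate `CheckSketch.values()` and print each name with `getDependencies()`, and a run command could call `getCheckRunner().runCheck()` once a check's dependencies have passed.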
Not sure if this overall structure is workable. If we can get a structure
for the command laid down along with a few initial checks, then we can start
adding more checks as individual PRs.