keith-turner commented on issue #5051:
URL: https://github.com/apache/accumulo/issues/5051#issuecomment-2466837949
I am not proposing any changes to the goal of this issue in this comment, just
sharing some thoughts. I like the goal of making the default view a summary
view. This issue made me think about summarizing more generally, and I am
sharing those thoughts here.
There are different questions I sometimes want to answer about the servers,
and they can be answered by different summarizations of the data. In 4.0 the
new getServers API plus jshell can be a pretty powerful tool to answer these
questions and summarize the data in different ways. For example, take the
following single-node Accumulo cluster.
```
$ accumulo admin serviceStatus
Report time: 2024-11-10T17:56:46.492Z
ZooKeeper read errors: 0
Managers: count: 1
resource group: (default)
localhost:9999
Monitors: count: 1
resource group: (default)
localhost:9995
Garbage Collectors: count: 1
resource group: (default)
localhost:9998
Tablet Servers: count: 1
resource group: (default)
localhost:9997
Scan Servers: count: 1
resource group: (default)
localhost:9996
COORDINATOR: unavailable
Compactors: count: 5
resource groups:
cg1
default
hosts (by group):
cg1 (3):
localhost:9135
localhost:9136
localhost:9137
default (2):
localhost:9133
localhost:9134
```
The following `accumulo jshell` command counts the number of servers per
`<server type>:<resource group>:<hostname>`, where hostname does not include the
port. This lets me know how many servers of each type are running per RG+host.
```
jshell> Stream.of(ServerId.Type.values())
...>   .flatMap(serverType->client.instanceOperations().getServers(serverType).stream())
...>   .collect(Collectors.groupingBy(
...>     serverId->serverId.getType()+":"+serverId.getResourceGroup()+":"+serverId.getHost(),
...>     Collectors.counting()))
$11 ==> {SCAN_SERVER:default:localhost=1, COMPACTOR:cg1:localhost=3,
TABLET_SERVER:default:localhost=1, MANAGER:default:localhost=1,
COMPACTOR:default:localhost=2}
```
The following is an example of using jshell to answer the question: are there
any hosts that run servers of the same type for more than one RG, and if so,
what are the RGs?
```
jshell> Stream.of(ServerId.Type.values())
...>   .flatMap(serverType->client.instanceOperations().getServers(serverType).stream())
...>   .collect(Collectors.groupingBy(
...>     serverId->serverId.getType()+":"+serverId.getHost(),
...>     Collectors.mapping(serverId -> serverId.getResourceGroup(), Collectors.toSet())))
...>   .forEach((server, groups)-> {
...>     if(groups.size() > 1){
...>       System.out.println(server+" "+groups);
...>     }
...>   })
COMPACTOR:localhost [default, cg1]
```
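A variation on the grouping above, sketched as plain Java so it can be tried without a cluster: it groups by host only (ignoring server type) and collects the resource groups seen on each host. The `Server` record is a hypothetical stand-in for `ServerId`, and the sample entries are a subset of the servers in the `serviceStatus` output above. Records need Java 16 or later.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class GroupsPerHost {
  // Hypothetical stand-in for ServerId so the sketch runs without a cluster.
  record Server(String type, String resourceGroup, String host, int port) {}

  // Maps each host to the set of resource groups that have at least one server on it.
  static Map<String, Set<String>> groupsPerHost(List<Server> servers) {
    return servers.stream()
        .collect(Collectors.groupingBy(
            Server::host,
            Collectors.mapping(Server::resourceGroup, Collectors.toSet())));
  }

  public static void main(String[] args) {
    List<Server> servers = List.of(
        new Server("TABLET_SERVER", "default", "localhost", 9997),
        new Server("SCAN_SERVER", "default", "localhost", 9996),
        new Server("COMPACTOR", "cg1", "localhost", 9135),
        new Server("COMPACTOR", "default", "localhost", 9133));

    // Print only hosts that serve more than one resource group.
    groupsPerHost(servers).forEach((host, groups) -> {
      if (groups.size() > 1) {
        // TreeSet sorts the group names so the printed order is stable.
        System.out.println(host + " " + new TreeSet<>(groups));
      }
    });
  }
}
```

In jshell against a real instance, the same collector should work with `ServerId::getHost` and `ServerId::getResourceGroup` in place of the record accessors.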
There are many questions that someone may want to answer using the raw
server data, and one way this can be done is via jshell in 4.0. However, jshell
may not be as accessible as summarizing with bash commands. I
wanted to take the JSON output and convert it to CSV-like data of the format
`<server type>,<resource group>,<hostname>,<port>` and then use sort, awk, etc.
on the data. However, I could not figure out how to make jq do this
mapping.
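To make the target format concrete, here is a minimal Java sketch of the mapping to `<server type>,<resource group>,<hostname>,<port>` lines. It is not the jq solution and not an existing Accumulo feature. The `Server` record and its field names are hypothetical stand-ins for `ServerId`, and the sketch assumes no field contains a comma, so it does no quoting or escaping.

```java
import java.util.List;
import java.util.stream.Collectors;

public class ServersToCsv {
  // Hypothetical stand-in for ServerId; field names are assumptions for this sketch.
  record Server(String type, String resourceGroup, String host, int port) {}

  // Formats one server as "<server type>,<resource group>,<hostname>,<port>".
  static String toCsvLine(Server s) {
    return String.join(",", s.type(), s.resourceGroup(), s.host(), Integer.toString(s.port()));
  }

  // Renders one CSV line per server, sorted lexicographically so the output is stable.
  static List<String> toCsv(List<Server> servers) {
    return servers.stream()
        .map(ServersToCsv::toCsvLine)
        .sorted()
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Server> servers = List.of(
        new Server("TABLET_SERVER", "default", "localhost", 9997),
        new Server("COMPACTOR", "cg1", "localhost", 9135),
        new Server("COMPACTOR", "default", "localhost", 9133));
    toCsv(servers).forEach(System.out::println);
  }
}
```

Lines in this shape can then be piped through standard tools, for example `cut -d, -f1,2 | sort | uniq -c` to count servers per type and resource group.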
So I like the idea of having some summary view as the default output for the
command. I am also wondering how we can make it easy for an admin to
summarize the data in ways other than the summary view we choose. I am not
finding a great way to do this with the current command. This is out of scope for
this issue, but in general, if there is no easy way to make the JSON usable on
the command line to answer basic questions about the data, I am wondering if we
should add a `--csv` option to the command that just outputs data like
`<server type>,<resource group>,<hostname>,<port>`, which is the structure of
the data in jshell.