keith-turner commented on issue #5051:
URL: https://github.com/apache/accumulo/issues/5051#issuecomment-2466837949

   Not proposing any changes to the goal of this issue in this comment, just 
sharing some thoughts.  I like the goal of making the default view a summary 
view.  This issue just made me think about summarizing more generally, and I am 
sharing those thoughts here.
   
   There are different questions I sometimes want to answer about the servers, 
and they can be answered by different summarizations of the data.  In 4.0 the 
new getServers API plus jshell can be a pretty powerful tool for answering these 
questions and summarizing the data in different ways.  For example, take the 
following single-node Accumulo cluster.
   
   ```
   $ accumulo admin serviceStatus
   Report time: 2024-11-10T17:56:46.492Z
   ZooKeeper read errors: 0
   Managers: count: 1
     resource group: (default)
       localhost:9999
   Monitors: count: 1
     resource group: (default)
       localhost:9995
   Garbage Collectors: count: 1
     resource group: (default)
       localhost:9998
   Tablet Servers: count: 1
     resource group: (default)
       localhost:9997
   Scan Servers: count: 1
     resource group: (default)
       localhost:9996
   COORDINATOR: unavailable
   Compactors: count: 5
     resource groups:
       cg1
       default
     hosts (by group):
       cg1 (3):
         localhost:9135
         localhost:9136
         localhost:9137
       default (2):
         localhost:9133
         localhost:9134
   ```
   
   The following `accumulo jshell` command counts the number of servers per 
`<server type>:<resource group>:<hostname>`, where hostname does not include the 
port.  This lets me know how many servers of each type are running per RG+host.
   
   ```
   jshell> Stream.of(ServerId.Type.values())
      ...>         .flatMap(serverType->client.instanceOperations().getServers(serverType).stream())
      ...>         .collect(Collectors.groupingBy(
      ...>                 serverId->serverId.getType()+":"+serverId.getResourceGroup()+":"+serverId.getHost(),
      ...>                 Collectors.counting()))
   $11 ==> {SCAN_SERVER:default:localhost=1, COMPACTOR:cg1:localhost=3, TABLET_SERVER:default:localhost=1, MANAGER:default:localhost=1, COMPACTOR:default:localhost=2}
   ```
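
   For anyone who wants to try the grouping idea without a running cluster, 
here is a minimal standalone sketch of the same `groupingBy` + `counting` 
pattern.  The `Server` record and the `ServerSummary` class are hypothetical 
stand-ins, not Accumulo's `ServerId`, and the sample rows are copied by hand 
from the compactor and tablet server sections of the output above.  It assumes 
Java 16 or later for records.

   ```java
   import java.util.List;
   import java.util.Map;
   import java.util.TreeMap;
   import java.util.stream.Collectors;

   public class ServerSummary {
       // Hypothetical stand-in for Accumulo's ServerId; not the real class.
       record Server(String type, String resourceGroup, String host, int port) {}

       // Sample rows copied by hand from the serviceStatus output above.
       static List<Server> sample() {
           return List.of(
               new Server("COMPACTOR", "cg1", "localhost", 9135),
               new Server("COMPACTOR", "cg1", "localhost", 9136),
               new Server("COMPACTOR", "cg1", "localhost", 9137),
               new Server("COMPACTOR", "default", "localhost", 9133),
               new Server("COMPACTOR", "default", "localhost", 9134),
               new Server("TABLET_SERVER", "default", "localhost", 9997));
       }

       // Count servers per type:resourceGroup:host. The port is left out of the
       // key so servers on the same host collapse into one bucket. A TreeMap is
       // used only to make the printed order deterministic.
       static Map<String, Long> countPerGroupHost(List<Server> servers) {
           return servers.stream().collect(Collectors.groupingBy(
               s -> s.type() + ":" + s.resourceGroup() + ":" + s.host(),
               TreeMap::new,
               Collectors.counting()));
       }

       public static void main(String[] args) {
           // prints {COMPACTOR:cg1:localhost=3, COMPACTOR:default:localhost=2, TABLET_SERVER:default:localhost=1}
           System.out.println(countPerGroupHost(sample()));
       }
   }
   ```

   Changing the key function is all it takes to get a different summary, for 
example dropping `s.type()` to count per RG+host regardless of server type.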
   
   The following is an example of using jshell to answer the question: are there 
any hosts that have servers of the same type in more than one RG, and if so, what 
are the RGs?
   
   ```
   jshell> Stream.of(ServerId.Type.values())
      ...>         .flatMap(serverType->client.instanceOperations().getServers(serverType).stream())
      ...>         .collect(Collectors.groupingBy(
      ...>                 serverId->serverId.getType()+":"+serverId.getHost(),
      ...>                 Collectors.mapping(serverId -> serverId.getResourceGroup(), Collectors.toSet())))
      ...>         .forEach((server, groups)-> {
      ...>             if(groups.size() > 1){
      ...>                 System.out.println(server+" "+groups);
      ...>             }
      ...>         })
   COMPACTOR:localhost [default, cg1]
   ```
   
   There are many questions that someone may want to answer using the raw 
server data, and one way to do this in 4.0 is via jshell.  However, jshell 
may not be as accessible as summarizing with bash commands.  I 
wanted to take the JSON output and convert it to CSV-like data of the format 
`<server type>,<resource group>,<hostname>,<port>` and then use sort, awk, etc. 
on the data.  However, I could not figure out how to make jq do this 
mapping.
   
   So I like the idea of having some summary view as the default output for the 
command.  I am also wondering how we can make it easy for an admin to 
summarize the data in ways other than the summary view we choose.  I am not 
finding a great way to do this with the current command.  This is out of scope for 
this issue, but in general, if there is no easy way to make the JSON usable on 
the command line to answer basic questions about the data, I am wondering if we 
should add a `--csv` option to the command that just outputs data like 
`<server type>,<resource group>,<hostname>,<port>`, which is the structure of the data in 
jshell.
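
   To make the row format concrete, here is a rough sketch of what emitting 
`<server type>,<resource group>,<hostname>,<port>` lines could look like.  This 
is only an illustration: the `Server` record and `ServerCsv` class are 
hypothetical, nothing here is existing Accumulo code, and it assumes no field 
contains a comma, so no quoting or escaping is done.  It assumes Java 16 or 
later for records.

   ```java
   import java.util.List;
   import java.util.stream.Collectors;

   public class ServerCsv {
       // Hypothetical stand-in for the per-server data; not Accumulo's ServerId.
       record Server(String type, String resourceGroup, String host, int port) {}

       // One line per server: <server type>,<resource group>,<hostname>,<port>.
       // Assumes fields never contain commas, so nothing is quoted or escaped.
       static String toCsvLine(Server s) {
           return String.join(",", s.type(), s.resourceGroup(), s.host(),
               Integer.toString(s.port()));
       }

       // Join all rows with newlines so the result can be piped to sort, awk, etc.
       static String toCsv(List<Server> servers) {
           return servers.stream()
               .map(ServerCsv::toCsvLine)
               .collect(Collectors.joining("\n"));
       }

       public static void main(String[] args) {
           // Sample rows copied by hand from the serviceStatus output earlier.
           List<Server> servers = List.of(
               new Server("COMPACTOR", "cg1", "localhost", 9135),
               new Server("COMPACTOR", "default", "localhost", 9133),
               new Server("TABLET_SERVER", "default", "localhost", 9997));
           System.out.println(toCsv(servers));
       }
   }
   ```

   With one server per line, something like `cut -d, -f1,2 | sort | uniq -c` 
would give a count per server type and RG without needing jshell.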
   

