dlmarion commented on issue #2290:
URL: https://github.com/apache/accumulo/issues/2290#issuecomment-934312920


   IIRC the external compactions *are* included currently in the number of 
running and queued compactions displayed on the monitor. This is because the 
external compactions are queued up from the tserver and the tserver knows when 
they are running/finished/failed. There are several places where external 
compaction information is stored.
   
   1. When an external compaction is running an `ecomp` column is added in the 
metadata table for the Tablet. It lists the files included in the compaction, 
the compactor address, the compaction settings, etc.
   
   2. When a compaction is finished the entry for the tablet in the metadata 
table is removed and an `~ecomp` entry is added to signify that the external 
compaction is finished (succeeded or failed).
   
   The 
[blog](https://accumulo.apache.org/blog/2021/07/08/external-compactions.html#compaction-coordinator)
 provides a description for both 1 and 2. However, I'm not sure that this 
provides a complete picture. There are two data structures in the 
CompactionCoordinator that may also be of use. Specifically the 
[QueueSummaries](https://github.com/apache/accumulo/blob/main/server/compaction-coordinator/src/main/java/org/apache/accumulo/coordinator/CompactionCoordinator.java#L90)
 and the 
[RUNNING](https://github.com/apache/accumulo/blob/main/server/compaction-coordinator/src/main/java/org/apache/accumulo/coordinator/CompactionCoordinator.java#L93)
 collections. As their names imply, the QueueSummaries collection contains a 
summary of the external compaction queues for each tserver and the RUNNING 
collection has information about what is active now.
   
   To populate the QueueSummaries, the CompactionCoordinator asks each TServer 
about the external compactions that are queued up. When a Compactor is free to 
do work, the CompactionCoordinator assigns it the highest priority compaction 
for the queue that it is working on. An entry is put into the RUNNING 
collection for that compaction and then the Compactor calls back periodically 
to update the state using 
[this](https://github.com/apache/accumulo/blob/main/server/compactor/src/main/java/org/apache/accumulo/compactor/Compactor.java#L358)
 method. If you look at where this is used, you will see that it tells the 
CompactionCoordinator when the compaction starts/succeeds/fails and also 
updates it periodically on how many entries it has written out to show progress 
information.
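   
   The flow described above can be sketched roughly as follows. This is a 
simplified, hypothetical model and not the actual CompactionCoordinator code: 
the class, method, and field names (`CoordinatorSketch`, `reportQueued`, 
`assign`, `updateStatus`, `Job`, `RunningInfo`) are all made up for 
illustration. It also assumes that the coordinator holds the queued jobs 
themselves, whereas the real QueueSummaries only holds per-tserver summaries, 
and that finished jobs stay in the running map with a terminal state.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.PriorityQueue;

// Hypothetical model of the coordinator's bookkeeping; not Accumulo's API.
class CoordinatorSketch {
  enum State { ASSIGNED, STARTED, IN_PROGRESS, SUCCEEDED, FAILED }

  static final class Job {
    final String id;
    final String queue;
    final long priority;

    Job(String id, String queue, long priority) {
      this.id = id;
      this.queue = queue;
      this.priority = priority;
    }
  }

  static final class RunningInfo {
    final String compactor;
    State state = State.ASSIGNED;
    long entriesWritten = 0;

    RunningInfo(String compactor) {
      this.compactor = compactor;
    }
  }

  // Stand-in for QueueSummaries: queued work per external compaction queue,
  // ordered so that the highest priority job is polled first.
  private final Map<String, PriorityQueue<Job>> queued = new HashMap<>();
  // Stand-in for RUNNING: job id -> latest status reported by the compactor.
  private final Map<String, RunningInfo> running = new HashMap<>();

  // Assumption: models the result of asking a tserver what it has queued.
  synchronized void reportQueued(Job job) {
    queued.computeIfAbsent(job.queue, q -> new PriorityQueue<Job>(
        Comparator.comparingLong((Job j) -> j.priority).reversed())).add(job);
  }

  // A free compactor asks for work on its queue; the highest priority job
  // is handed out and recorded in the running map.
  synchronized Optional<Job> assign(String queue, String compactor) {
    PriorityQueue<Job> pq = queued.get(queue);
    Job next = (pq == null) ? null : pq.poll();
    if (next != null) {
      running.put(next.id, new RunningInfo(compactor));
    }
    return Optional.ofNullable(next);
  }

  // The compactor calls back with state changes and progress.
  synchronized void updateStatus(String jobId, State state, long entriesWritten) {
    RunningInfo info = running.get(jobId);
    if (info == null) {
      throw new IllegalArgumentException("unknown job " + jobId);
    }
    info.state = state;
    info.entriesWritten = entriesWritten;
  }

  synchronized RunningInfo runningInfo(String jobId) {
    return running.get(jobId);
  }
}
```

   In this sketch, anything that wants to show external compaction progress 
would read `runningInfo(...)`, which is roughly the role the RUNNING 
collection plays in the tests mentioned below.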
   
   I don't think we have exposed the information in the CompactionCoordinator 
yet, except for in the tests. In some of the tests I start up the 
[TestCompactionCoordinator](https://github.com/apache/accumulo/blob/main/test/src/main/java/org/apache/accumulo/test/compaction/TestCompactionCoordinator.java)
 so that I can use the information in the RUNNING collection to determine 
external compaction success / failure.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

