keith-turner opened a new issue, #4978:
URL: https://github.com/apache/accumulo/issues/4978

   **Is your feature request related to a problem? Please describe.**
   
   Currently in the Accumulo 4.0-SNAPSHOT branch, a compactor process walks 
through the following steps to execute a compaction.
   
    1. Request a compaction job from the compaction coordinator, with the 
coordinator doing the following
       1. Get a job from an in-memory priority queue
       2. Generate the next filename and create the directory if needed
       3. Use conditional mutations to reserve the files for compaction in the 
metadata table
       4. Return the job to the compactor
    2. Run the compaction
    3. Report compaction completion to the coordinator which does the following
       1. Create a fate operation to commit the compaction (because it is a 
multistep process)
       2. Another thread in the manager will eventually run this fate operation
   
   When compactors do this, many compactors talk to a single manager, and that 
single manager talks to many tservers to execute the reservations and commits.
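The coordinator's in-memory priority queue in step 1.1 can be sketched with stdlib types. This is a minimal illustration of the queuing behavior only; `CompactionJob` and its `priority` field are hypothetical stand-ins, not Accumulo's actual classes.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical sketch of the coordinator's in-memory job queue (step 1.1).
// CompactionJob is a stand-in record, not Accumulo's real class.
public class JobQueueSketch {
    record CompactionJob(String tabletId, short priority) {}

    public static void main(String[] args) {
        // Higher priority value = more urgent, so order descending by priority.
        PriorityQueue<CompactionJob> queue = new PriorityQueue<>(
                Comparator.comparingInt((CompactionJob j) -> j.priority()).reversed());

        queue.add(new CompactionJob("t1", (short) 5));
        queue.add(new CompactionJob("t2", (short) 30));
        queue.add(new CompactionJob("t3", (short) 10));

        // A compactor requesting a job receives the highest-priority one first.
        System.out.println(queue.poll().tabletId()); // t2
        System.out.println(queue.poll().tabletId()); // t3
    }
}
```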
   
   **Describe the solution you'd like**
   
   The reservation and compaction commit could move from the coordinator to the 
compactor process. This would result in many compactors talking to many tablet 
servers with the compactor doing the following.
   
    1. Request a compaction job from the compaction coordinator, with the 
coordinator doing the following
       1. Get a job from an in-memory priority queue
       2. Generate the next filename and create the directory if needed (these 
operations work on in-memory data in the manager and are best done there)
       3. Return the job, output file name, and current steady time to the 
compactor
    2. Use conditional mutations to reserve the files for compaction in the 
metadata table
    3. Run the compaction
    4. Create and execute the fate operation to commit the compaction in the 
compactor process.  This is possible because of the changes in #4524.  For this 
to work well, the fate table would need to be presplit to allow multiple tablet 
servers to host it.
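The reservation in step 2 relies on conditional mutations, which succeed only if the metadata row is still in the expected state. The compare-and-set semantics can be sketched with a plain map; this is an assumption-laden illustration (the map stands in for the metadata table, and the method for a conditional mutation), not Accumulo's actual `ConditionalWriter` API.

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch of reserving files for compaction via compare-and-set semantics.
// The map stands in for the metadata table; keys stand in for file entries
// in a tablet's metadata row. Not Accumulo's actual ConditionalWriter API.
public class ReservationSketch {
    // Absent key = file unreserved; a compactor id = reserved by that compactor.
    static final ConcurrentHashMap<String, String> reservations = new ConcurrentHashMap<>();

    // Returns true iff this compactor won the reservation (analogous to a
    // conditional mutation being accepted); a racing second attempt fails.
    static boolean reserve(String file, String compactorId) {
        return reservations.putIfAbsent(file, compactorId) == null;
    }

    public static void main(String[] args) {
        System.out.println(reserve("F0001.rf", "compactor-1")); // true
        System.out.println(reserve("F0001.rf", "compactor-2")); // false, already reserved
    }
}
```

Moving this step to the compactor means each compactor issues its own conditional writes directly against the metadata table, rather than funneling them through the manager.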
   
   These changes would work well with #4664.  The amount of work the manager 
does on behalf of a compactor would be greatly reduced, making the async 
handling much simpler.  Also, if the compactor immediately runs the fate 
operation to commit a compaction, it could reduce the overall latency of a 
small compaction.
   
   One risk with this change is that if a large number of compactors compact 
tablets whose entries fall on the same metadata tablet, it could create a lot 
of load on that metadata tablet.  In some ways the manager currently limits the 
number of concurrent compaction commits and reservation operations.  This 
should not be a problem for the fate table, as it is keyed on a UUID, so access 
from many compactors would always spread evenly across its tablets.
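The even-spread claim for UUID row keys can be illustrated numerically: if the fate table were presplit at hex-digit boundaries into 16 ranges, random (version 4) UUID keys land nearly uniformly across them. A rough illustration under that assumed split scheme, not Accumulo's actual split code:

```java
import java.util.UUID;

// Rough illustration of why UUID row keys spread evenly across a presplit
// table: bucket random UUIDs by their first hex digit (i.e., 16 hypothetical
// presplit ranges) and observe that every range gets close to its fair share.
public class UuidSpreadSketch {
    public static void main(String[] args) {
        int[] buckets = new int[16];
        int n = 16_000;
        for (int i = 0; i < n; i++) {
            char first = UUID.randomUUID().toString().charAt(0);
            buckets[Character.digit(first, 16)]++;
        }
        int min = Integer.MAX_VALUE, max = 0;
        for (int count : buckets) {
            min = Math.min(min, count);
            max = Math.max(max, count);
        }
        // Each of the 16 ranges gets close to n/16 = 1000 rows.
        System.out.println("min=" + min + " max=" + max);
    }
}
```

By contrast, metadata-table row keys are derived from table id and end row, so a burst of compactions on one table can concentrate on a single metadata tablet.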
   
   **Describe alternatives you've considered**
   
   There was discussion of running multiple managers, each executing fate 
operations; #4524 was done as a prerequisite for this.  The use cases for this 
were compaction commit and system-initiated split operations.  If this change 
were made it would remove one of those use cases, leaving system splits as 
something that may still need that functionality.  However, a single manager 
process may be able to work through all of the split operations that 
realistically happen.
   
   

