keith-turner opened a new pull request, #3513:
URL: https://github.com/apache/accumulo/pull/3513
This commit makes the following changes:
* Adds a new selected files set to tablet metadata. See the javadoc on the
new column for more details.
* These changes leverage the existing tablet column for storing running
external compactions and remove some fields from that object that are no longer
needed because that info is in the new selected set stored in the tablet
metadata.
* Updates the Fate code that runs user compactions to select files for each
tablet and write the selected files using conditional mutations. See the
CompactionDriver class. The code no longer sends RPCs to tablet servers asking
them to compact.
* Updates the compaction coordinator to reserve and commit compactions
using conditional mutations. During commit the set of selected files is
adjusted based on the files compacted. When the set of selected files is empty
the tablet is marked as done for the user compaction.
* Updates the tablet management iterator that looks for tablets that need
compactions to support user compactions. When a tablet has a selected set of
files the compaction planner is called to generate compaction jobs based on the
selected files. Most changes related to this are in the CompactionJobGenerator
class.
* A new conditional mutation abstraction called requireSame was added to
Ample. This new method allows code to pass in a tablet metadata object
and require that updates are only made if the tablet matches the tablet
metadata object at the time of update.
* Added a new TabletMetadata object builder. This new builder reuses code
in Ample used to make tablet updates. The builder was needed to test the new
requireSame method.
* Made some minor updates to mini accumulo related to running compactors.
This was done so that some ITs could run.
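As an illustration of the commit step described for the compaction coordinator, here is a minimal sketch of how a selected set might shrink as compactions commit. It is not the coordinator's code: the class and method names are hypothetical, files are modeled as plain strings, and it assumes that "adjusted" means the compacted input files are removed from the selected set. How the compaction's output file is treated is not covered here.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the commit step; not the coordinator's actual code.
class SelectedFilesCommitSketch {

  // Outcome of committing one compaction against a tablet's selected set.
  static final class CommitResult {
    final Set<String> remainingSelected;
    final boolean userCompactionDone;

    CommitResult(Set<String> remainingSelected) {
      this.remainingSelected = Set.copyOf(remainingSelected);
      // An empty selected set marks the tablet as done for the user compaction.
      this.userCompactionDone = this.remainingSelected.isEmpty();
    }
  }

  // Assumption: the compacted input files simply leave the selected set.
  static CommitResult commit(Set<String> selected, Set<String> compactedInputs) {
    Set<String> remaining = new HashSet<>(selected);
    remaining.removeAll(compactedInputs);
    return new CommitResult(remaining);
  }
}
```

In this model a tablet with selected files a, b, and c that commits a compaction of a and b is left with c selected and is not yet done. A later commit that compacts c empties the set and marks the tablet done.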
The code for making decisions about tablets requires examining a few file
sets stored in the tablet and then making updates. The new requireSame method
makes it easy to write code like the following.
1. Read a tablet's current files, selected files, and compacting files.
2. Examine the three sets of files and generate updates to the tablet
based on that examination.
3. Only make the updates if the current files, selected files, and
compacting files are the same as in step 1. If they are not the same, go back to step 1.
These compaction related metadata updates are the most complex so far in the
elasticity branch. The new requireSame ample primitive made this complexity
easy to manage.
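The three-step pattern can be sketched with a small in-memory model. This is only an illustration of the read, examine, and conditionally update loop. It does not use Ample or conditional mutations, and every name in it (TabletFiles, TabletRow, updateIfSame, updateWithRetry) is hypothetical.

```java
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical in-memory model; none of these names are Accumulo or Ample APIs.
class RequireSameSketch {

  // Immutable snapshot of the three file sets of one tablet.
  static final class TabletFiles {
    final Set<String> current;
    final Set<String> selected;
    final Set<String> compacting;

    TabletFiles(Set<String> current, Set<String> selected, Set<String> compacting) {
      this.current = Set.copyOf(current);
      this.selected = Set.copyOf(selected);
      this.compacting = Set.copyOf(compacting);
    }

    boolean sameAs(TabletFiles other) {
      return current.equals(other.current) && selected.equals(other.selected)
          && compacting.equals(other.compacting);
    }
  }

  // Stand-in for the metadata row of a single tablet.
  static final class TabletRow {
    private TabletFiles files;

    TabletRow(TabletFiles initial) {
      this.files = initial;
    }

    synchronized TabletFiles read() {
      return files;
    }

    // Applies the update only if the row still matches the snapshot the caller read.
    synchronized boolean updateIfSame(TabletFiles expected, TabletFiles updated) {
      if (!files.sameAs(expected)) {
        return false;
      }
      files = updated;
      return true;
    }
  }

  // Steps 1 to 3 from the text: read, decide, conditionally write, retry on conflict.
  static TabletFiles updateWithRetry(TabletRow row, UnaryOperator<TabletFiles> decide) {
    while (true) {
      TabletFiles seen = row.read();
      TabletFiles updated = decide.apply(seen);
      if (row.updateIfSame(seen, updated)) {
        return updated;
      }
    }
  }
}
```

The decision function only ever sees a consistent snapshot, and a concurrent change to any of the three sets causes the write to be rejected and the decision to be recomputed from fresh data.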
Not all functionality related to user compactions is done in this commit.
The major missing areas are:
* Compaction cancellation was not implemented. This was intentional and
a follow-on issue will be opened with the specifics.
* The refreshing of a hosted tablet's file metadata is currently not
guaranteed before the compact user API call returns. A follow-on issue exists,
but needs some updates.
* The compactionSelector plugin that decides which files to compact does
not currently support examining file metadata like samples and summaries. A
follow-on issue will be opened.
* When the FATE op selects files for user compaction it waits until there
are no other FATE ops with selected files and no other running compactions. In
the case where a tablet is always compacting the FATE operation could wait
forever. A new metadata column is needed that indicates a FATE op is waiting to
select and prevents new system compactions from starting. A follow-on issue
will be opened. The tablet server had functionality like this in its in-memory
state.
A subset of the compaction integration tests are passing with these changes.
The following tests in CompactionIT were successfully run.
* testCompactionWithTableIterator()
* testPartialCompaction()
* testConfigurer()
* testSuccessfulCompaction()
* testMultiStepCompactionThatDeletesAll()
* testSelectNoFiles()
Other tests were not attempted as the functionality is known to be missing.
fixes #3464