keith-turner opened a new issue #1648:
URL: https://github.com/apache/accumulo/issues/1648
After the changes in #1605
TableOperationsIT.testCompactEmptyTableWithGeneratorIterator started to fail.
This is because the test initiates a compaction with user iterators on an
empty table. Before the changes in #1605, the compaction would run even though
there was no data, and the iterators would generate data. After the changes in
#1605, a tablet with no data will not run a compaction.
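
To make the behavior change concrete, here is a minimal model in plain Java. It is not Accumulo code: `EmptyCompactionModel`, `generator`, `compactBefore`, and `compactAfter` are hypothetical names, and a "file" is just a list of strings. It only illustrates why a data-generating iterator produced output on an empty tablet before #1605 and produces nothing afterwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.UnaryOperator;

// Simplified model only; every name here is hypothetical, not Accumulo API.
class EmptyCompactionModel {

  // Stands in for a user iterator that emits an entry regardless of its input.
  static List<String> generator(List<String> input) {
    List<String> out = new ArrayList<>(input);
    out.add("generated-row");
    return out;
  }

  // Behavior described for the old code: the compaction runs even when the
  // tablet has no files, so the iterator still gets a chance to emit data.
  static Optional<List<String>> compactBefore(
      List<List<String>> files, UnaryOperator<List<String>> userIterator) {
    List<String> merged = new ArrayList<>();
    files.forEach(merged::addAll);
    return Optional.of(userIterator.apply(merged));
  }

  // Behavior described after #1605: a tablet with no files runs no compaction,
  // so the iterator is never invoked and no output is produced.
  static Optional<List<String>> compactAfter(
      List<List<String>> files, UnaryOperator<List<String>> userIterator) {
    if (files.isEmpty()) {
      return Optional.empty();
    }
    return compactBefore(files, userIterator);
  }
}
```

With no files, `compactBefore` yields the generated entry while `compactAfter` yields nothing, which is the difference the failing test observes.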
If empty compactions were to be supported, it would require multiple special
cases. While starting to look into implementing empty compactions, I found the
following cases. There may be more.
1. Only allow a single concurrent empty compaction to run. The changes in
#1605 enable multiple concurrent compactions to run against a single tablet as
long as they are compacting disjoint sets of files. Two empty compactions have
disjoint sets of files, so the current per-tablet tracking of compacting files
would not be sufficient to prevent concurrent empty compactions. This would
need special handling just for the empty set.
2. Only commit an empty compaction when there are no files in the tablet. It's
possible that while an empty compaction is running, a new file arrives
for the tablet. Given that empty compactions are special cases, should this
compaction be allowed to complete in this case? The only reason the special
empty compaction was started was because there were no files, so it seems it
should only complete successfully if there are no files when the compaction
finishes. I don't think the old code handled this case in any way, but it could
have happened in the old code. The old code would have committed an empty
compaction where a file arrived after it started.
3. User compactions can select a subset of files to compact. When a tablet
has files, but the user selects none for compaction, should an empty compaction
run? I don't think so.
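
The three cases above can be sketched as guards in a small, self-contained Java model. This is not the Accumulo implementation: `EmptyCompactionGuards`, `tryReserve`, and `commitEmpty` are hypothetical names, files are plain strings, and the sketch only shows where the extra special-case logic would have to live.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical per-tablet bookkeeping; names and structure are assumptions.
class EmptyCompactionGuards {
  private final Set<String> tabletFiles = new HashSet<>();
  private final Set<String> compactingFiles = new HashSet<>();
  private boolean emptyCompactionRunning = false;

  // Models a new file arriving for the tablet (for example from a flush).
  synchronized void addFile(String file) {
    tabletFiles.add(file);
  }

  // Returns true if a compaction over the selected files may start.
  synchronized boolean tryReserve(Set<String> selected) {
    if (selected.isEmpty()) {
      // Case 3: the tablet has files but none were selected, so do nothing.
      if (!tabletFiles.isEmpty()) {
        return false;
      }
      // Case 1: two empty sets are always disjoint, so the file tracking
      // below cannot detect a second empty compaction. A flag is needed.
      if (emptyCompactionRunning) {
        return false;
      }
      emptyCompactionRunning = true;
      return true;
    }
    // Normal path: selected files must exist and not already be compacting.
    if (!tabletFiles.containsAll(selected)
        || !Collections.disjoint(compactingFiles, selected)) {
      return false;
    }
    compactingFiles.addAll(selected);
    return true;
  }

  // Case 2: an empty compaction only commits if the tablet still has no files.
  synchronized boolean commitEmpty() {
    if (!emptyCompactionRunning) {
      throw new IllegalStateException("no empty compaction is running");
    }
    emptyCompactionRunning = false;
    return tabletFiles.isEmpty();
  }
}
```

Even in this toy form, the empty set needs its own flag, its own commit check, and its own selection rule, none of which the disjoint-file tracking provides.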
These special cases will require specialized code, increasing the complexity
of the compaction code. I am not sure the complexity is worth it. I was
unaware of this capability until I ran into the test. I have not yet found any
user-facing documentation that advertises this feature.