keith-turner opened a new issue #1648:
URL: https://github.com/apache/accumulo/issues/1648


   After the changes in #1605 
TableOperationsIT.testCompactEmptyTableWithGeneratorIterator started to fail.  
This is because this test initiates a compaction with user iterators on an 
empty table.  Before the changes in #1605 the compaction would run even though 
there was no data and the iterators generated data.  After the changes in #1605 
a tablet with no data will not run a compaction.
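
A minimal sketch of why the test started failing, assuming a simplified model in which a compaction is just a function applied to a tablet's entries. The class and method names here are hypothetical and are not Accumulo code; a plain `UnaryOperator` stands in for the user iterator stack.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for a tablet compaction; not Accumulo code.
class EmptyCompactionModel {
  // Old behavior (as described above): the iterator runs even with no input.
  static List<String> compactAlways(List<String> input,
      UnaryOperator<List<String>> iterator) {
    return iterator.apply(input);
  }

  // New behavior (as described above): a tablet with no data is skipped,
  // so the iterator never gets a chance to generate anything.
  static List<String> compactSkippingEmpty(List<String> input,
      UnaryOperator<List<String>> iterator) {
    if (input.isEmpty()) {
      return List.of();
    }
    return iterator.apply(input);
  }

  // A generator "iterator" that emits one entry regardless of its input.
  static List<String> generator(List<String> input) {
    List<String> out = new ArrayList<>(input);
    out.add("generated-row");
    return out;
  }
}
```

In this model, `compactAlways(List.of(), EmptyCompactionModel::generator)` yields the generated entry, while `compactSkippingEmpty` with the same arguments yields nothing, which mirrors the difference the test observes.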
   
   If empty compactions were to be supported again, it would require multiple 
special cases. While starting to look into implementing empty compactions, I 
found the following cases. There may be more.
   
    1. Only allow a single concurrent empty compaction to run. The changes in 
#1605 enable multiple concurrent compactions to run against a single tablet as 
long as they are compacting disjoint sets of files.  Two empty compactions have 
disjoint sets of files, so the current per tablet tracking of compacting files 
would not be sufficient to prevent concurrent empty compactions.  This would 
need special handling just for the empty set. 
    2. Only commit an empty compaction when there are no files in the tablet. 
It's possible that a new file arrives for the tablet while an empty compaction 
is running.  Given that empty compactions are special cases, should this 
compaction be allowed to complete in this case?  The only reason the special 
empty compaction was started was because there were no files, so it seems it 
should only complete successfully if there are no files when the compaction 
finishes.  I don't think the old code handled this case in any way, but it 
could have happened in the old code.  The old code would have committed an 
empty compaction where a file arrived after it started.
    3. User compactions can select a subset of files to compact.  When a tablet 
has files, but the user selects none for compaction, should an empty compaction 
run? I don't think so.
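
The three cases above can be sketched roughly as follows. This is only an illustration under stated assumptions: `TabletCompactionTracker` and all of its methods are hypothetical names, files are modeled as plain strings, and none of this reflects the actual tracking code from #1605. It mainly shows that a disjointness check alone never rejects a second empty compaction, and what the extra special handling might look like.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of per-tablet compaction tracking; names are illustrative.
class TabletCompactionTracker {
  private final Set<String> tabletFiles = new HashSet<>();
  private final Set<String> compactingFiles = new HashSet<>();
  private boolean emptyCompactionRunning = false;

  void addFile(String file) {
    tabletFiles.add(file);
  }

  // Disjointness check alone: the empty set is disjoint from every set,
  // so this can never reject an empty compaction.
  boolean isDisjointFromRunning(Set<String> files) {
    return Collections.disjoint(compactingFiles, files);
  }

  boolean tryStart(Set<String> selected) {
    if (selected.isEmpty()) {
      // Case 3: the tablet has files but none were selected, so do not run.
      // Case 1: allow only one empty compaction at a time.
      if (!tabletFiles.isEmpty() || emptyCompactionRunning) {
        return false;
      }
      emptyCompactionRunning = true;
      return true;
    }
    if (!tabletFiles.containsAll(selected) || !isDisjointFromRunning(selected)) {
      return false;
    }
    compactingFiles.addAll(selected);
    return true;
  }

  // Case 2: commit an empty compaction only if the tablet still has no files.
  boolean tryCommitEmpty() {
    if (!emptyCompactionRunning) {
      return false;
    }
    emptyCompactionRunning = false;
    return tabletFiles.isEmpty();
  }
}
```

Even in this toy form, the empty set needs its own flag and its own commit check on top of the normal file reservation path, which is the added complexity in question.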
   
   These special cases will require specialized code, increasing the complexity 
of the compaction code.  I am not sure the complexity is worth it.  I was 
unaware of this capability until I ran into the test.  I have not yet found any 
user-facing documentation that advertises this feature.
   
   
   

