cshannon commented on code in PR #3640:
URL: https://github.com/apache/accumulo/pull/3640#discussion_r1298545382


##########
server/manager/src/main/java/org/apache/accumulo/manager/TabletGroupWatcher.java:
##########
@@ -789,8 +801,77 @@ private void mergeMetadataRecords(MergeInfo info) throws 
AccumuloException {
       for (Entry<Key,Value> entry : scanner) {
         Key key = entry.getKey();
         Value value = entry.getValue();
+
+        final KeyExtent keyExtent = KeyExtent.fromMetaRow(key.getRow());
+
+        // Keep track of the last Key Extent seen so we can use it to fence
+        // of RFiles when merging the metadata
+        if (lastExtent != null && !keyExtent.equals(lastExtent)) {
+          previousKeyExtent = lastExtent;
+        }
+
+        // Special case for now to handle the highest/stop tablet which is 
where files are
+        // merged to so we need to handle the deletes on update here as it 
won't be handled later
+        // TODO: Can this be re-written to not have a special edge case and 
make it simpler?
+        if (keyExtent.equals(stopExtent)) {
+          if (previousKeyExtent != null
+              && key.getColumnFamily().equals(DataFileColumnFamily.NAME)) {
+
+            // Fence off existing files by the end row of the previous tablet 
and current tablet
+            final StoredTabletFile existing = 
StoredTabletFile.of(key.getColumnQualifier());
+            // The end row should be inclusive for the current tablet and the 
previous end row
+            // should be exclusive for the start row
+            Range fenced = new Range(previousKeyExtent.endRow(), false, 
keyExtent.endRow(), true);
+
+            // Clip range if exists
+            fenced = existing.hasRange() ? existing.getRange().clip(fenced, 
true) : fenced;
+
+            // If null the range is disjoint with the new tablet range so need 
to handle
+            if (fenced == null) {

Review Comment:
   The root cause of this is the fact that the second split I'm doing in my 
test causes a disjoint file in the Tablet, so based on your comment I think you 
are right that we should probably try and just prevent this. The following 
scenario shows it happening:
   
   This case happens when you do multiple merges/splits as shown in the test I 
have called noChopMergeTestMultipleMerges() in MergeIt which demonstrates the 
issue. 
   
   **In the test, the initial split/merge is done. After the first merge the 
following is in metadata with one tablet and ranged files:**
   
   ```
   Extent: 1<<; File Name: F0000001.rf; Range: [row_0000000500%00; : [] 
9223372036854775807 false,row_0000000750%00; : [] 9223372036854775807 false); 
Entries: 222, Size: 114
   Extent: 1<<; File Name: F0000001.rf; Range: [row_0000000750%00; : [] 
9223372036854775807 false,+inf); Entries: 445, Size: 229
   Extent: 1<<; File Name: A0000007.rf; Range: [row_0000000250%00; : [] 
9223372036854775807 false,row_0000000500%00; : [] 9223372036854775807 false); 
Entries: 150, Size: 354
   Extent: 1<<; File Name: A0000008.rf; Range: (-inf,row_0000000250%00; : [] 
9223372036854775807 false); Entries: 150, Size: 343
   ```
   
   **I then add 3 splits at 250, 500, 750 and we end up with the following 
fenced files and their tablets:**
   ```
   Extent: 1;row_0000000250<; File Name: A0000008.rf; Range: 
(-inf,row_0000000250%00; : [] 9223372036854775807 false); Entries: 150, Size: 
343
   Extent: 1;row_0000000500;row_0000000250; File Name: A0000007.rf; Range: 
[row_0000000250%00; : [] 9223372036854775807 false,row_0000000500%00; : [] 
9223372036854775807 false); Entries: 150, Size: 354
   Extent: 1;row_0000000750;row_0000000500; File Name: F0000001.rf; Range: 
[row_0000000500%00; : [] 9223372036854775807 false,row_0000000750%00; : [] 
9223372036854775807 false); Entries: 74, Size: 38
   Extent: 1<;row_0000000750; File Name: F0000001.rf; Range: 
[row_0000000500%00; : [] 9223372036854775807 false,row_0000000750%00; : [] 
9223372036854775807 false); Entries: 149, Size: 77
   Extent: 1<;row_0000000750; File Name: F0000001.rf; Range: 
[row_0000000750%00; : [] 9223372036854775807 false,+inf); Entries: 445, Size: 
229
   ```
   
   **And here lies the problem, when the split ran it kept an extra file around 
that is disjoint so that when we then perform a merge again and we end up with 
the disjoint issue.**
   
   The following file is disjoint:
   
   `Extent: 1<;row_0000000750; File Name: F0000001.rf; Range: 
[row_0000000500%00; : [] 9223372036854775807 false,row_0000000750%00; : [] 
9223372036854775807 false); Entries: 149, Size: 77`
   
   As you can see with that Tablet, the range of the tablet is 750 (exclusive) 
to infinite. But the file that was added during the split's range is 500 
(exclusive) to 750 (inclusive), so we either need to remove it during the split 
or handle it later during the merge.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to