cshannon commented on code in PR #3640:
URL: https://github.com/apache/accumulo/pull/3640#discussion_r1298876396


##########
server/manager/src/main/java/org/apache/accumulo/manager/TabletGroupWatcher.java:
##########
@@ -789,8 +801,77 @@ private void mergeMetadataRecords(MergeInfo info) throws 
AccumuloException {
       for (Entry<Key,Value> entry : scanner) {
         Key key = entry.getKey();
         Value value = entry.getValue();
+
+        final KeyExtent keyExtent = KeyExtent.fromMetaRow(key.getRow());
+
+        // Keep track of the last Key Extent seen so we can use it to fence
+        // of RFiles when merging the metadata
+        if (lastExtent != null && !keyExtent.equals(lastExtent)) {
+          previousKeyExtent = lastExtent;
+        }
+
+        // Special case for now to handle the highest/stop tablet which is 
where files are
+        // merged to so we need to handle the deletes on update here as it 
won't be handled later
+        // TODO: Can this be re-written to not have a special edge case and 
make it simpler?
+        if (keyExtent.equals(stopExtent)) {
+          if (previousKeyExtent != null
+              && key.getColumnFamily().equals(DataFileColumnFamily.NAME)) {
+
+            // Fence off existing files by the end row of the previous tablet 
and current tablet
+            final StoredTabletFile existing = 
StoredTabletFile.of(key.getColumnQualifier());
+            // The end row should be inclusive for the current tablet and the 
previous end row
+            // should be exclusive for the start row
+            Range fenced = new Range(previousKeyExtent.endRow(), false, 
keyExtent.endRow(), true);
+
+            // Clip range if exists
+            fenced = existing.hasRange() ? existing.getRange().clip(fenced, 
true) : fenced;
+
+            // If null the range is disjoint with the new tablet range so need 
to handle
+            if (fenced == null) {
+              // Mark this file for possible deletion since it is disjoint 
with this tablet
+              // It will be removed later if not marked in the 
movedMetadataEntries set
+              // We need to check later after processing before deleting as 
it's possible this same
+              // metadata entry (file with range) is not disjoint across other 
tablets being merged
+              // into this highest tablet so we can't delete until the end
+              possibleMetadataDeletes.add(existing.getMetadata());
+            } else {
+              final StoredTabletFile newFile = 
StoredTabletFile.of(existing.getPath(), fenced);
+              // If the existing metadata does not match then we need to 
delete the old
+              // and replace with a new range
+              if (!existing.equals(newFile)) {
+                possibleMetadataDeletes.add(existing.getMetadata());
+                movedMetadataEntries.add(newFile.getMetadata());
+                m.put(key.getColumnFamily(), newFile.getMetadataText(), value);
+              }
+            }
+
+            fileCount++;
+          }
+          // For the highest tablet we only care about the DataFileColumnFamily
+          continue;
+        }
+
         if (key.getColumnFamily().equals(DataFileColumnFamily.NAME)) {
-          m.put(key.getColumnFamily(), key.getColumnQualifier(), value);
+          final StoredTabletFile existing = 
StoredTabletFile.of(key.getColumnQualifier());
+          // TODO: Should we try and be smart and eventually collapse 
overlapping ranges to reduce
+          // the metadata? The fenced reader will already collapse ranges when 
reading.
+
+          // Fence off files by the previous tablet and current tablet that is 
being merged
+          // The end row should be inclusive for the current tablet and the 
previous end row should
+          // be exclusive for the start row.
+          Range fenced = new Range(previousKeyExtent != null ? 
previousKeyExtent.endRow() : null,
+              false, keyExtent.endRow(), true);
+
+          // Clip range with the tablet range if the range already exists
+          fenced = existing.hasRange() ? existing.getRange().clip(fenced, 
true) : fenced;
+
+          // If null then don't need to forward as it is disjoint and doesn't 
fall in the current
+          // merged range and will just get removed later when the tablet is 
deleted after merge
+          if (fenced != null) {
+            StoredTabletFile newFile = StoredTabletFile.of(existing.getPath(), 
fenced);
+            m.put(key.getColumnFamily(), newFile.getMetadataText(), value);
+            movedMetadataEntries.add(newFile.getMetadata());
+          }

Review Comment:
   This bug with the split is due to getLastKey() returning a key outside the 
range (the key after since the last key is exlusive) so the split incorrectly 
includes the file in the wrong tablet. The simplest thing was to just require 
the end key of the range contain the 0x00 byte (which is generated 
automatically in some cases) so we can figure out the last real key in the 
range. This fix is part of #3710 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to