keith-turner commented on code in PR #3640:
URL: https://github.com/apache/accumulo/pull/3640#discussion_r1297890513


##########
server/manager/src/main/java/org/apache/accumulo/manager/TabletGroupWatcher.java:
##########
@@ -803,8 +884,17 @@ private void mergeMetadataRecords(MergeInfo info) throws 
AccumuloException {
           var allVolumesDir = new AllVolumesDirectory(range.tableId(), 
value.toString());
           
bw.addMutation(manager.getContext().getAmple().createDeleteMutation(allVolumesDir));
         }
+
+        lastExtent = keyExtent;
       }
 
+      // Process deletes for the highest Tablet for any files that are not 
part of
+      // movedMetadataEntries
+      Sets.difference(possibleMetadataDeletes, 
movedMetadataEntries).forEach(delete -> {

Review Comment:
   What input data would cause possibleMetadataDeletes and movedMetadataEntries 
to not be disjoint sets?



##########
server/manager/src/main/java/org/apache/accumulo/manager/TabletGroupWatcher.java:
##########
@@ -789,8 +801,77 @@ private void mergeMetadataRecords(MergeInfo info) throws 
AccumuloException {
       for (Entry<Key,Value> entry : scanner) {
         Key key = entry.getKey();
         Value value = entry.getValue();
+
+        final KeyExtent keyExtent = KeyExtent.fromMetaRow(key.getRow());
+
+        // Keep track of the last Key Extent seen so we can use it to fence
+        // of RFiles when merging the metadata
+        if (lastExtent != null && !keyExtent.equals(lastExtent)) {
+          previousKeyExtent = lastExtent;
+        }
+
+        // Special case for now to handle the highest/stop tablet which is 
where files are
+        // merged to so we need to handle the deletes on update here as it 
won't be handled later
+        // TODO: Can this be re-written to not have a special edge case and 
make it simpler?
+        if (keyExtent.equals(stopExtent)) {
+          if (previousKeyExtent != null
+              && key.getColumnFamily().equals(DataFileColumnFamily.NAME)) {
+
+            // Fence off existing files by the end row of the previous tablet 
and current tablet
+            final StoredTabletFile existing = 
StoredTabletFile.of(key.getColumnQualifier());
+            // The end row should be inclusive for the current tablet and the 
previous end row
+            // should be exclusive for the start row
+            Range fenced = new Range(previousKeyExtent.endRow(), false, 
keyExtent.endRow(), true);
+
+            // Clip range if exists
+            fenced = existing.hasRange() ? existing.getRange().clip(fenced, 
true) : fenced;
+
+            // If null the range is disjoint with the new tablet range so need 
to handle
+            if (fenced == null) {
+              // Mark this file for possible deletion since it is disjoint 
with this tablet
+              // It will be removed later if not marked in the 
movedMetadataEntries set
+              // We need to check later after processing before deleting as 
it's possible this same
+              // metadata entry (file with range) is not disjoint across other 
tablets being merged
+              // into this highest tablet so we can't delete until the end
+              possibleMetadataDeletes.add(existing.getMetadata());
+            } else {
+              final StoredTabletFile newFile = 
StoredTabletFile.of(existing.getPath(), fenced);
+              // If the existing metadata does not match then we need to 
delete the old
+              // and replace with a new range
+              if (!existing.equals(newFile)) {
+                possibleMetadataDeletes.add(existing.getMetadata());
+                movedMetadataEntries.add(newFile.getMetadata());
+                m.put(key.getColumnFamily(), newFile.getMetadataText(), value);
+              }
+            }
+
+            fileCount++;
+          }
+          // For the highest tablet we only care about the DataFileColumnFamily
+          continue;
+        }
+
         if (key.getColumnFamily().equals(DataFileColumnFamily.NAME)) {
-          m.put(key.getColumnFamily(), key.getColumnQualifier(), value);
+          final StoredTabletFile existing = 
StoredTabletFile.of(key.getColumnQualifier());
+          // TODO: Should we try and be smart and eventually collapse 
overlapping ranges to reduce
+          // the metadata? The fenced reader will already collapse ranges when 
reading.
+
+          // Fence off files by the previous tablet and current tablet that is 
being merged
+          // The end row should be inclusive for the current tablet and the 
previous end row should
+          // be exclusive for the start row.
+          Range fenced = new Range(previousKeyExtent != null ? 
previousKeyExtent.endRow() : null,
+              false, keyExtent.endRow(), true);
+
+          // Clip range with the tablet range if the range already exists
+          fenced = existing.hasRange() ? existing.getRange().clip(fenced, 
true) : fenced;
+
+          // If null then don't need to forward as it is disjoint and doesn't 
fall in the current
+          // merged range and will just get removed later when the tablet is 
deleted after merge
+          if (fenced != null) {
+            StoredTabletFile newFile = StoredTabletFile.of(existing.getPath(), 
fenced);
+            m.put(key.getColumnFamily(), newFile.getMetadataText(), value);
+            movedMetadataEntries.add(newFile.getMetadata());
+          }

Review Comment:
   Maybe there should be an else that logs something or throws an exception.



##########
server/manager/src/main/java/org/apache/accumulo/manager/TabletGroupWatcher.java:
##########
@@ -789,8 +801,77 @@ private void mergeMetadataRecords(MergeInfo info) throws 
AccumuloException {
       for (Entry<Key,Value> entry : scanner) {
         Key key = entry.getKey();
         Value value = entry.getValue();
+
+        final KeyExtent keyExtent = KeyExtent.fromMetaRow(key.getRow());
+
+        // Keep track of the last Key Extent seen so we can use it to fence
+        // of RFiles when merging the metadata
+        if (lastExtent != null && !keyExtent.equals(lastExtent)) {
+          previousKeyExtent = lastExtent;
+        }
+
+        // Special case for now to handle the highest/stop tablet which is 
where files are
+        // merged to so we need to handle the deletes on update here as it 
won't be handled later
+        // TODO: Can this be re-written to not have a special edge case and 
make it simpler?
+        if (keyExtent.equals(stopExtent)) {
+          if (previousKeyExtent != null

Review Comment:
   It seems like previousKeyExtent being null would only happen if merging a 
single tablet? Does the merge code not have special handling for that case an 
exit much earlier?



##########
server/manager/src/main/java/org/apache/accumulo/manager/TabletGroupWatcher.java:
##########
@@ -789,8 +801,77 @@ private void mergeMetadataRecords(MergeInfo info) throws 
AccumuloException {
       for (Entry<Key,Value> entry : scanner) {
         Key key = entry.getKey();
         Value value = entry.getValue();
+
+        final KeyExtent keyExtent = KeyExtent.fromMetaRow(key.getRow());
+
+        // Keep track of the last Key Extent seen so we can use it to fence
+        // of RFiles when merging the metadata
+        if (lastExtent != null && !keyExtent.equals(lastExtent)) {
+          previousKeyExtent = lastExtent;
+        }
+
+        // Special case for now to handle the highest/stop tablet which is 
where files are
+        // merged to so we need to handle the deletes on update here as it 
won't be handled later
+        // TODO: Can this be re-written to not have a special edge case and 
make it simpler?
+        if (keyExtent.equals(stopExtent)) {
+          if (previousKeyExtent != null
+              && key.getColumnFamily().equals(DataFileColumnFamily.NAME)) {
+
+            // Fence off existing files by the end row of the previous tablet 
and current tablet
+            final StoredTabletFile existing = 
StoredTabletFile.of(key.getColumnQualifier());
+            // The end row should be inclusive for the current tablet and the 
previous end row
+            // should be exclusive for the start row
+            Range fenced = new Range(previousKeyExtent.endRow(), false, 
keyExtent.endRow(), true);
+
+            // Clip range if exists
+            fenced = existing.hasRange() ? existing.getRange().clip(fenced, 
true) : fenced;
+
+            // If null the range is disjoint with the new tablet range so need 
to handle
+            if (fenced == null) {

Review Comment:
   For this case to happen, would a tablet have to contain a file with a 
disjoint range?  Thinking Accumulo should never create this situation in the 
first place.  It feels like this case could hide bugs, wondering if it should 
throw an exception or log something instead. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to