keith-turner commented on code in PR #3640:
URL: https://github.com/apache/accumulo/pull/3640#discussion_r1297890513
##########
server/manager/src/main/java/org/apache/accumulo/manager/TabletGroupWatcher.java:
##########
@@ -803,8 +884,17 @@ private void mergeMetadataRecords(MergeInfo info) throws
AccumuloException {
var allVolumesDir = new AllVolumesDirectory(range.tableId(),
value.toString());
bw.addMutation(manager.getContext().getAmple().createDeleteMutation(allVolumesDir));
}
+
+ lastExtent = keyExtent;
}
+ // Process deletes for the highest Tablet for any files that are not
part of
+ // movedMetadataEntries
+ Sets.difference(possibleMetadataDeletes,
movedMetadataEntries).forEach(delete -> {
Review Comment:
What input data would cause possibleMetadataDeletes and movedMetadataEntries
to not be disjoint sets?
##########
server/manager/src/main/java/org/apache/accumulo/manager/TabletGroupWatcher.java:
##########
@@ -789,8 +801,77 @@ private void mergeMetadataRecords(MergeInfo info) throws
AccumuloException {
for (Entry<Key,Value> entry : scanner) {
Key key = entry.getKey();
Value value = entry.getValue();
+
+ final KeyExtent keyExtent = KeyExtent.fromMetaRow(key.getRow());
+
+ // Keep track of the last Key Extent seen so we can use it to fence
+ // of RFiles when merging the metadata
+ if (lastExtent != null && !keyExtent.equals(lastExtent)) {
+ previousKeyExtent = lastExtent;
+ }
+
+ // Special case for now to handle the highest/stop tablet which is
where files are
+ // merged to so we need to handle the deletes on update here as it
won't be handled later
+ // TODO: Can this be re-written to not have a special edge case and
make it simpler?
+ if (keyExtent.equals(stopExtent)) {
+ if (previousKeyExtent != null
+ && key.getColumnFamily().equals(DataFileColumnFamily.NAME)) {
+
+ // Fence off existing files by the end row of the previous tablet
and current tablet
+ final StoredTabletFile existing =
StoredTabletFile.of(key.getColumnQualifier());
+ // The end row should be inclusive for the current tablet and the
previous end row
+ // should be exclusive for the start row
+ Range fenced = new Range(previousKeyExtent.endRow(), false,
keyExtent.endRow(), true);
+
+ // Clip range if exists
+ fenced = existing.hasRange() ? existing.getRange().clip(fenced,
true) : fenced;
+
+ // If null the range is disjoint with the new tablet range so need
to handle
+ if (fenced == null) {
+ // Mark this file for possible deletion since it is disjoint
with this tablet
+ // It will be removed later if not marked in the
movedMetadataEntries set
+ // We need to check later after processing before deleting as
it's possible this same
+ // metadata entry (file with range) is not disjoint across other
tablets being merged
+ // into this highest tablet so we can't delete until the end
+ possibleMetadataDeletes.add(existing.getMetadata());
+ } else {
+ final StoredTabletFile newFile =
StoredTabletFile.of(existing.getPath(), fenced);
+ // If the existing metadata does not match then we need to
delete the old
+ // and replace with a new range
+ if (!existing.equals(newFile)) {
+ possibleMetadataDeletes.add(existing.getMetadata());
+ movedMetadataEntries.add(newFile.getMetadata());
+ m.put(key.getColumnFamily(), newFile.getMetadataText(), value);
+ }
+ }
+
+ fileCount++;
+ }
+ // For the highest tablet we only care about the DataFileColumnFamily
+ continue;
+ }
+
if (key.getColumnFamily().equals(DataFileColumnFamily.NAME)) {
- m.put(key.getColumnFamily(), key.getColumnQualifier(), value);
+ final StoredTabletFile existing =
StoredTabletFile.of(key.getColumnQualifier());
+ // TODO: Should we try and be smart and eventually collapse
overlapping ranges to reduce
+ // the metadata? The fenced reader will already collapse ranges when
reading.
+
+ // Fence off files by the previous tablet and current tablet that is
being merged
+ // The end row should be inclusive for the current tablet and the
previous end row should
+ // be exclusive for the start row.
+ Range fenced = new Range(previousKeyExtent != null ?
previousKeyExtent.endRow() : null,
+ false, keyExtent.endRow(), true);
+
+ // Clip range with the tablet range if the range already exists
+ fenced = existing.hasRange() ? existing.getRange().clip(fenced,
true) : fenced;
+
+ // If null then don't need to forward as it is disjoint and doesn't
fall in the current
+ // merged range and will just get removed later when the tablet is
deleted after merge
+ if (fenced != null) {
+ StoredTabletFile newFile = StoredTabletFile.of(existing.getPath(),
fenced);
+ m.put(key.getColumnFamily(), newFile.getMetadataText(), value);
+ movedMetadataEntries.add(newFile.getMetadata());
+ }
Review Comment:
Maybe there should be an else that logs something or throws an exception.
##########
server/manager/src/main/java/org/apache/accumulo/manager/TabletGroupWatcher.java:
##########
@@ -789,8 +801,77 @@ private void mergeMetadataRecords(MergeInfo info) throws
AccumuloException {
for (Entry<Key,Value> entry : scanner) {
Key key = entry.getKey();
Value value = entry.getValue();
+
+ final KeyExtent keyExtent = KeyExtent.fromMetaRow(key.getRow());
+
+ // Keep track of the last Key Extent seen so we can use it to fence
+ // of RFiles when merging the metadata
+ if (lastExtent != null && !keyExtent.equals(lastExtent)) {
+ previousKeyExtent = lastExtent;
+ }
+
+ // Special case for now to handle the highest/stop tablet which is
where files are
+ // merged to so we need to handle the deletes on update here as it
won't be handled later
+ // TODO: Can this be re-written to not have a special edge case and
make it simpler?
+ if (keyExtent.equals(stopExtent)) {
+ if (previousKeyExtent != null
Review Comment:
It seems like previousKeyExtent being null would only happen if merging a
single tablet? Does the merge code not have special handling for that case an
exit much earlier?
##########
server/manager/src/main/java/org/apache/accumulo/manager/TabletGroupWatcher.java:
##########
@@ -789,8 +801,77 @@ private void mergeMetadataRecords(MergeInfo info) throws
AccumuloException {
for (Entry<Key,Value> entry : scanner) {
Key key = entry.getKey();
Value value = entry.getValue();
+
+ final KeyExtent keyExtent = KeyExtent.fromMetaRow(key.getRow());
+
+ // Keep track of the last Key Extent seen so we can use it to fence
+ // of RFiles when merging the metadata
+ if (lastExtent != null && !keyExtent.equals(lastExtent)) {
+ previousKeyExtent = lastExtent;
+ }
+
+ // Special case for now to handle the highest/stop tablet which is
where files are
+ // merged to so we need to handle the deletes on update here as it
won't be handled later
+ // TODO: Can this be re-written to not have a special edge case and
make it simpler?
+ if (keyExtent.equals(stopExtent)) {
+ if (previousKeyExtent != null
+ && key.getColumnFamily().equals(DataFileColumnFamily.NAME)) {
+
+ // Fence off existing files by the end row of the previous tablet
and current tablet
+ final StoredTabletFile existing =
StoredTabletFile.of(key.getColumnQualifier());
+ // The end row should be inclusive for the current tablet and the
previous end row
+ // should be exclusive for the start row
+ Range fenced = new Range(previousKeyExtent.endRow(), false,
keyExtent.endRow(), true);
+
+ // Clip range if exists
+ fenced = existing.hasRange() ? existing.getRange().clip(fenced,
true) : fenced;
+
+ // If null the range is disjoint with the new tablet range so need
to handle
+ if (fenced == null) {
Review Comment:
For this case to happen, would a tablet have to contain a file with a
disjoint range? Thinking Accumulo should never create this situation in the
first place. It feels like this case could hide bugs, wondering if it should
throw an exception or log something instead.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]