keith-turner commented on code in PR #5044:
URL: https://github.com/apache/accumulo/pull/5044#discussion_r1834551469
##########
server/manager/src/main/java/org/apache/accumulo/manager/tableOps/bulkVer2/LoadFiles.java:
##########
@@ -338,11 +338,15 @@ private long loadFiles(TableId tableId, Path bulkDir, LoadMappingIterator loadMa
         TabletsMetadata.builder(manager.getContext()).forTable(tableId).overlapping(startRow, null)
             .checkConsistency().fetch(PREV_ROW, LOCATION, LOADED, TIME).build()) {
+      // The tablet iterator and load mapping iterator are both iterating over data that is sorted
+      // in the same way. The two iterators are each independently advanced to find common points in
+      // the sorted data.
Review Comment:
> Is there a possibility that an admin could "fix" the bulk import file
> under some failure condition that would cause it to not be sorted? Should we
> re-sort after reading it back in?
There is a check in PrepBulkImport.validateLoadMapping() that runs in the
initial fate step. However, subsequent fate steps do not validate the load
mapping again. We could add validation to LoadMappingIterator that checks that
the extents it reads are in sorted order. That would probably be a simple
change to make and could be done in 2.1.
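
To make the suggestion concrete, here is a minimal sketch of that kind of check. It is not the actual LoadMappingIterator code: `SortedCheckingIterator` is a hypothetical generic wrapper, and it assumes the extents are comparable and must be strictly ascending (no duplicates). It throws as soon as an element does not sort after the one read before it.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

/**
 * Hypothetical wrapper, not part of Accumulo. Passes elements through unchanged
 * and fails fast if an element does not sort strictly after its predecessor.
 */
class SortedCheckingIterator<T> implements Iterator<T> {
  private final Iterator<T> source;
  private final Comparator<? super T> order;
  private T previous;
  private boolean hasPrevious = false;

  SortedCheckingIterator(Iterator<T> source, Comparator<? super T> order) {
    this.source = source;
    this.order = order;
  }

  @Override
  public boolean hasNext() {
    return source.hasNext();
  }

  @Override
  public T next() {
    T current = source.next();
    // Assumption: strictly ascending order is required, so equal elements are rejected too.
    if (hasPrevious && order.compare(previous, current) >= 0) {
      throw new IllegalStateException(
          "Out of order: " + current + " does not sort after " + previous);
    }
    previous = current;
    hasPrevious = true;
    return current;
  }

  public static void main(String[] args) {
    // Strings stand in for extents here; "c" after "m" violates the ordering.
    Iterator<String> rows = new SortedCheckingIterator<>(
        List.of("a", "m", "c").iterator(), Comparator.<String>naturalOrder());
    try {
      while (rows.hasNext()) {
        System.out.println("read " + rows.next());
      }
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Doing the comparison inside next() keeps the check lazy, so it adds one comparison per extent and does not require reading the whole mapping up front or re-sorting it.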