keith-turner commented on code in PR #5044:
URL: https://github.com/apache/accumulo/pull/5044#discussion_r1834551469


##########
server/manager/src/main/java/org/apache/accumulo/manager/tableOps/bulkVer2/LoadFiles.java:
##########
@@ -338,11 +338,15 @@ private long loadFiles(TableId tableId, Path bulkDir, LoadMappingIterator loadMa
         TabletsMetadata.builder(manager.getContext()).forTable(tableId).overlapping(startRow, null)
             .checkConsistency().fetch(PREV_ROW, LOCATION, LOADED, TIME).build()) {
 
+      // The tablet iterator and load mapping iterator are both iterating over data that is sorted
+      // in the same way. The two iterators are each independently advanced to find common points in
+      // the sorted data.

Review Comment:
   > Is there a possibility that an admin could "fix" the bulk import file under some failure condition that would cause it to not be sorted? Should we re-sort after reading it back in?
   
   There is a check in PrepBulkImport.validateLoadMapping() that runs in the initial fate step. However, subsequent fate steps do not validate. We could add validation to the LoadMappingIterator that checks that the extents it reads are in sorted order. That would probably be a simple change to make and could be done in 2.1.
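
   A rough sketch of what that validation could look like, written as a generic wrapper rather than against the real LoadMappingIterator. The class name SortedCheckingIterator, the comparator parameter, and the choice to tolerate equal neighbours are all assumptions made for illustration; the actual change would presumably compare KeyExtent values inside the existing iterator.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

// Hypothetical wrapper (not an Accumulo class): passes elements through
// unchanged and fails fast if an element sorts before its predecessor.
final class SortedCheckingIterator<T> implements Iterator<T> {
  private final Iterator<T> source;
  private final Comparator<? super T> order;
  private T previous;
  private boolean hasPrevious = false;

  SortedCheckingIterator(Iterator<T> source, Comparator<? super T> order) {
    this.source = source;
    this.order = order;
  }

  @Override
  public boolean hasNext() {
    return source.hasNext();
  }

  @Override
  public T next() {
    T current = source.next();
    // Assumption: equal neighbours are tolerated; only a strictly smaller
    // element after a larger one is treated as corruption.
    if (hasPrevious && order.compare(previous, current) > 0) {
      throw new IllegalStateException(
          "Entries are not in sorted order: " + previous + " was followed by " + current);
    }
    previous = current;
    hasPrevious = true;
    return current;
  }
}
```

   Failing with an exception instead of re-sorting keeps the later fate steps from silently proceeding with a file that was modified after the initial validation.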



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
