sodonnel commented on a change in pull request #3548:
URL: https://github.com/apache/hadoop/pull/3548#discussion_r728414217
##########
File path: hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/util/StripedBlockUtil.java
##########
@@ -245,8 +245,7 @@ public static long getSafeLength(ErasureCodingPolicy ecPolicy,
Arrays.sort(cpy);
// full stripe is a stripe has at least dataBlkNum full cells.
// lastFullStripeIdx is the index of the last full stripe.
- int lastFullStripeIdx =
- (int) (cpy[cpy.length - dataBlkNum] / cellSize);
+ long lastFullStripeIdx = cpy[cpy.length - dataBlkNum] / cellSize;
Review comment:
I know this is existing code, but I'd like to understand what is happening
here in order to review this change.
This method receives an array of internal block lengths, so for a 3-2 policy
it will have 5 entries, for 6-3 it will have 9, and so on. It sorts the
lengths from smallest to largest and then selects the one at position
num_blocks - numDataUnits.
Why does it not just pick the first one, which would be the smallest, since
the smallest data block in the group indicates the last full stripe?
Why is the safe length based on the last full stripe, and not on a
potentially partial last stripe?
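
To make the question concrete, here is a minimal standalone sketch of the
selection arithmetic as I read it. The RS-6-3 layout, 1 MB cell size, block
lengths and the `SafeLengthSketch` class are made-up values for illustration
only, not the real `StripedBlockUtil` code:

```java
import java.util.Arrays;

/**
 * Standalone sketch of the safe-length selection as I read it.
 * The policy (RS-6-3), cell size and block lengths are assumed values
 * for illustration only; this is not the real StripedBlockUtil method.
 */
public class SafeLengthSketch {
  public static void main(String[] args) {
    final int dataBlkNum = 6;          // RS-6-3: 6 data units, 3 parity
    final long cellSize = 1024 * 1024; // assumed 1 MB cells

    // Hypothetical lengths of the 9 internal blocks in one block group.
    long[] blockLens = {
        5 * cellSize, 5 * cellSize, 4 * cellSize, 4 * cellSize, 4 * cellSize,
        4 * cellSize, 4 * cellSize, 3 * cellSize, 3 * cellSize};

    long[] cpy = Arrays.copyOf(blockLens, blockLens.length);
    Arrays.sort(cpy); // smallest to largest

    // cpy[cpy.length - dataBlkNum] is the dataBlkNum-th largest length,
    // i.e. the largest length that at least dataBlkNum internal blocks
    // reach. Dividing by cellSize gives the number of cells that at least
    // dataBlkNum internal blocks fully cover.
    long lastFullStripeIdx = cpy[cpy.length - dataBlkNum] / cellSize;
    long safeLength = lastFullStripeIdx * cellSize * dataBlkNum;

    System.out.println("lastFullStripeIdx = " + lastFullStripeIdx); // 4
    System.out.println("safeLength bytes  = " + safeLength);        // 4 stripes
  }
}
```

With those made-up numbers the sorted array is [3, 3, 4, 4, 4, 4, 4, 5, 5]
cells, the entry at index 9 - 6 = 3 is 4 cells, and the computed safe length
is 4 full stripes, even though some blocks extend further, which is what the
questions above are getting at.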