sodonnel commented on a change in pull request #2838:
URL: https://github.com/apache/hadoop/pull/2838#discussion_r604765744
##########
File path:
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java
##########
@@ -1161,10 +1176,12 @@ public Void call() throws IOException {
*/
@Override
public int compare(LinkArgs a, LinkArgs b) {
+ File asrc = a.src();
+ File bsrc = b.src();
return ComparisonChain.start().
- compare(a.src.getName(), b.src.getName()).
- compare(a.src, b.src).
- compare(a.dst, b.dst).
+ compare(asrc.getName(), bsrc.getName()).
+ compare(asrc, bsrc).
Review comment:
This method already had the two compares, but your comment prompted me to
look at it more closely.
What it appears to be doing is sorting by the block file name, then by the
source path, then by the dst.
E.g., if we had these paths as src:
/a/blk1
/b/blk2
/c/blk1
The natural sort order by path is as above; however, the comparator orders them as:
/a/blk1
/c/blk1
/b/blk2
This puts duplicate blocks beside each other in the list, even though they
live in different paths. Later in the method, that ordering is used to detect
duplicate blocks so they can be handled.
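To make the ordering concrete, here is a minimal, self-contained sketch (not the `DataStorage` code itself) that reproduces the example above with plain `File` objects and Guava's `ComparisonChain`:

```java
import com.google.common.collect.ComparisonChain;

import java.io.File;
import java.util.Arrays;
import java.util.List;

public class BlockSortDemo {
  public static void main(String[] args) {
    // The three src paths from the example above.
    List<File> srcs = Arrays.asList(
        new File("/a/blk1"), new File("/b/blk2"), new File("/c/blk1"));

    // Sort by block file name first, then by the full path, mirroring the
    // chained compares in the patch; duplicate block names land next to
    // each other regardless of which directory they live in.
    srcs.sort((x, y) -> ComparisonChain.start()
        .compare(x.getName(), y.getName())
        .compare(x, y)
        .result());

    // Prints: [/a/blk1, /c/blk1, /b/blk2]
    System.out.println(srcs);
  }
}
```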
Therefore I think we need the chained compares like this. However, rather
than `a.src.getName()` I could use `a.blockFile()`, which should be slightly
more efficient; a rough sketch of that follows below.
I will push another commit with that change.
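For reference, here is a rough, hedged sketch of what the `blockFile()` version could look like. `LinkArgsSketch` is a hypothetical stand-in for `LinkArgs`, and `blockFile()` returning the cached file name as a `String` is my assumption, not the committed code:

```java
import com.google.common.collect.ComparisonChain;
import java.util.Comparator;

// Hypothetical stand-in for LinkArgs; the real class lives in DataStorage.java.
// blockFile() returning the cached file name as a String is an assumption.
final class LinkArgsSketch {
  private final String blockFile; // e.g. "blk1", cached up front
  private final String src;       // e.g. "/a/blk1"
  private final String dst;

  LinkArgsSketch(String blockFile, String src, String dst) {
    this.blockFile = blockFile;
    this.src = src;
    this.dst = dst;
  }

  String blockFile() { return blockFile; }
  String src() { return src; }
  String dst() { return dst; }

  // Same chained ordering as the patch, but the first key comes from
  // blockFile() instead of re-deriving it with File#getName().
  static final Comparator<LinkArgsSketch> ORDER = (a, b) ->
      ComparisonChain.start()
          .compare(a.blockFile(), b.blockFile())
          .compare(a.src(), b.src())
          .compare(a.dst(), b.dst())
          .result();
}
```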