[ 
https://issues.apache.org/jira/browse/HDFS-17098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853852#comment-17853852
 ] 

ASF GitHub Bot commented on HDFS-17098:
---------------------------------------

slfan1989 commented on PR #6840:
URL: https://github.com/apache/hadoop/pull/6840#issuecomment-2159600932

   > @slfan1989 I only disabled the [Apache Hadoop Multibranch pipeline for Windows 10](https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-multibranch-windows-10/).
   > 
   > I didn't touch the [CI for Linux](https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-multibranch/) and it's enabled.
   
   Thank you for your reply! After discussion, we found the cause of the CI 
failure: it appears to be related to the surefire upgrade (#6664), and that 
PR has been rolled back.




> DatanodeManager does not handle null storage type properly
> ----------------------------------------------------------
>
>                 Key: HDFS-17098
>                 URL: https://issues.apache.org/jira/browse/HDFS-17098
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: ConfX
>            Priority: Critical
>              Labels: pull-request-available
>         Attachments: reproduce.sh
>
>
> h2. What happened:
> Got a {{NullPointerException}} with no message when sorting datanodes in 
> {{NetworkTopology}}.
> h2. Where's the bug:
> In line 654 of {{DatanodeManager}}, the manager creates a secondary sorter 
> using the standard {{Comparator}} interface:
> {noformat}
> Comparator<DatanodeInfoWithStorage> comp =
>         Comparator.comparing(DatanodeInfoWithStorage::getStorageType);
> secondarySort = list -> Collections.sort(list, comp);{noformat}
> This comparator is then used in {{NetworkTopology}} as a secondary sort to 
> break ties:
> {noformat}
> if (secondarySort != null) {
>         // a secondary sort breaks the tie between nodes.
>         secondarySort.accept(nodesList);
> }{noformat}
> However, if the storage type is {{null}}, a {{NullPointerException}} 
> is thrown, because the comparator returned by {{Comparator.comparing}} uses 
> natural ordering and does not accept null keys.
> h2. How to reproduce:
> (1) Set {{dfs.heartbeat.interval}} to {{1753310367}}, and 
> {{dfs.namenode.read.considerStorageType}} to {{true}}
> (2) Run test: 
> {{org.apache.hadoop.hdfs.server.blockmanagement.TestSortLocatedBlock#testAviodStaleAndSlowDatanodes}}
> h2. Stacktrace:
> {noformat}
> java.lang.NullPointerException
>     at java.base/java.util.Comparator.lambda$comparing$77a9974f$1(Comparator.java:469)
>     at java.base/java.util.TimSort.countRunAndMakeAscending(TimSort.java:355)
>     at java.base/java.util.TimSort.sort(TimSort.java:220)
>     at java.base/java.util.Arrays.sort(Arrays.java:1515)
>     at java.base/java.util.ArrayList.sort(ArrayList.java:1750)
>     at java.base/java.util.Collections.sort(Collections.java:179)
>     at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.lambda$createSecondaryNodeSorter$0(DatanodeManager.java:654)
>     at org.apache.hadoop.net.NetworkTopology.sortByDistance(NetworkTopology.java:983)
>     at org.apache.hadoop.net.NetworkTopology.sortByDistanceUsingNetworkLocation(NetworkTopology.java:946)
>     at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.sortLocatedBlock(DatanodeManager.java:637)
>     at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.sortLocatedBlocks(DatanodeManager.java:554)
>     at org.apache.hadoop.hdfs.server.blockmanagement.TestSortLocatedBlock.testAviodStaleAndSlowDatanodes(TestSortLocatedBlock.java:144){noformat}
> For an easy reproduction, run the reproduce.sh in the attachment. We are 
> happy to provide a patch if this issue is confirmed.
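
For readers following the issue, the failure mode described above can be reproduced outside Hadoop with a minimal sketch. The names `NullSafeStorageSort`, `StorageKind`, and `Node` are hypothetical stand-ins for illustration only, not HDFS classes, and the null-tolerant comparator shown is one possible approach using `Comparator.nullsLast` from the JDK, not necessarily the fix applied in the pull request.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class NullSafeStorageSort {
    // Hypothetical stand-in for the HDFS storage type enum.
    enum StorageKind { RAM_DISK, SSD, DISK, ARCHIVE }

    // Hypothetical stand-in for a datanode entry whose storage type may be null.
    static final class Node {
        final String name;
        final StorageKind storage;
        Node(String name, StorageKind storage) {
            this.name = name;
            this.storage = storage;
        }
        String getName() { return name; }
        StorageKind getStorage() { return storage; }
    }

    // Same shape as the comparator quoted in the issue: natural ordering on the
    // extracted key, which throws NullPointerException when a key is null.
    static final Comparator<Node> UNSAFE = Comparator.comparing(Node::getStorage);

    // Null-tolerant variant (an assumption, not the project's patch):
    // entries with a null storage type sort after all others.
    static final Comparator<Node> NULL_SAFE =
        Comparator.comparing(Node::getStorage,
            Comparator.nullsLast(Comparator.naturalOrder()));

    // Sorts a copy of the list and returns the node names in sorted order.
    static List<String> sortedNames(List<Node> nodes, Comparator<Node> comp) {
        List<Node> copy = new ArrayList<>(nodes);
        copy.sort(comp);
        List<String> names = new ArrayList<>();
        for (Node n : copy) {
            names.add(n.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
            new Node("a", StorageKind.DISK),
            new Node("b", null),
            new Node("c", StorageKind.SSD));
        try {
            sortedNames(nodes, UNSAFE);
        } catch (NullPointerException e) {
            System.out.println("unsafe comparator threw NullPointerException");
        }
        // Enum natural order follows declaration order, so SSD precedes DISK.
        System.out.println(sortedNames(nodes, NULL_SAFE));
    }
}
```

Whether null entries should sort first or last is a design choice; `Comparator.nullsFirst` is the JDK counterpart if the opposite ordering is preferred.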



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
