[ https://issues.apache.org/jira/browse/HDFS-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18053743#comment-18053743 ]
ASF GitHub Bot commented on HDFS-17870:
---------------------------------------
teamconfx opened a new pull request, #8201:
URL: https://github.com/apache/hadoop/pull/8201
### Description of PR
Fix for [HDFS-17870](https://issues.apache.org/jira/browse/HDFS-17870):
MiniDFSCluster.restartDataNode() throws ArrayIndexOutOfBoundsException when
attempting to restart a DataNode that was added to the cluster without
specifying storage capacities.
The bug occurs because MiniDFSCluster.setDataNodeStorageCapacities() does not
validate that the DataNode index is within the bounds of the storageCapacities
array before accessing it.
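In outline, the fix adds a bounds check to the existing early-return guard. A
simplified sketch mirroring the diff quoted below (surrounding logic elided):
```java
private synchronized void setDataNodeStorageCapacities(
    final int curDnIdx, final DataNode curDn,
    long[][] storageCapacities) throws IOException {
  // DataNodes added without explicit storageCapacities have no entry in
  // the array, so also return when the DataNode index is out of bounds.
  if (storageCapacities == null || storageCapacities.length == 0
      || curDnIdx >= storageCapacities.length) {
    return;
  }
  // ... existing capacity-setting logic unchanged ...
}
```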
### How was this patch tested?
In
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestMiniDFSCluster.java,
a new test method, testRestartDataNodeWithoutStorageCapacities(), was added.
Without the fix, this test throws ArrayIndexOutOfBoundsException; with the
fix, it passes.
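For reference, a minimal sketch of what such a test can look like, built from
the reproduction steps in the JIRA below. It is illustrative only: the CAPACITY
value, JUnit flavor, and exact cluster setup are assumptions, not the committed
test.
```java
// Illustrative sketch for TestMiniDFSCluster; not the committed test.
@Test
public void testRestartDataNodeWithoutStorageCapacities() throws Exception {
  final long CAPACITY = 1024L * 1024 * 1024; // hypothetical 1 GB capacity
  Configuration conf = new HdfsConfiguration();
  try (MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
      .numDataNodes(1)
      .storageCapacities(new long[] { CAPACITY })
      .build()) {
    cluster.waitActive();
    // Add a second DataNode without specifying storage capacities.
    cluster.startDataNodes(conf, 1, true, null, null);
    cluster.waitActive();
    // Without the fix, this call throws ArrayIndexOutOfBoundsException.
    cluster.restartDataNode(1);
    cluster.waitActive();
  }
}
```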
### For code changes:
- [x] Does the title of this PR start with the corresponding JIRA issue id
(e.g. 'HADOOP-17799. Your PR title ...')?
- [ ] Object storage: have the integration tests been executed and the
endpoint declared according to the connector-specific documentation?
- [ ] If adding new dependencies to the code, are these dependencies
licensed in a way that is compatible for inclusion under [ASF
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`,
`NOTICE-binary` files?
### AI Tooling
If an AI tool was used:
- [ ] The PR includes the phrase "Contains content generated by <tool>"
where <tool> is the name of the AI tool used.
- [ ] My use of AI contributions follows the ASF legal policy
https://www.apache.org/legal/generative-tooling.html
> ArrayIndexOutOfBoundsException when DataNode was restarted without
> storageCapacities
> ------------------------------------------------------------------------------------
>
> Key: HDFS-17870
> URL: https://issues.apache.org/jira/browse/HDFS-17870
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: test
> Reporter: rstest
> Priority: Major
>
> h2. Summary
> `MiniDFSCluster.restartDataNode()` throws `ArrayIndexOutOfBoundsException`
> when attempting to restart a DataNode that was added to the cluster without
> specifying storage capacities.
>
> h2. Description
> When a DataNode is added to a MiniDFSCluster using `startDataNodes()` with
> `storageCapacities = null`, and later that DataNode is restarted using
> `restartDataNode()`, an `ArrayIndexOutOfBoundsException` is thrown.
> The root cause is that `MiniDFSCluster.setDataNodeStorageCapacities()` does
> not validate that the DataNode index is within the bounds of the
> `storageCapacities` array before accessing it.
>
> h2. Steps to Reproduce
> 1. Create a MiniDFSCluster with explicit storage capacities:
> {code:java}
> cluster = new MiniDFSCluster.Builder(conf)
>     .numDataNodes(1)
>     .storageCapacities(new long[] { CAPACITY })
>     .build();
> {code}
> 2. Add another DataNode without specifying storage capacities:
> {code:java}
> cluster.startDataNodes(conf, 1, true, null, null); // storageCapacities is null
> {code}
> 3. Restart the second DataNode:
> {code:java}
> cluster.restartDataNode(1); // Throws ArrayIndexOutOfBoundsException
> {code}
> h2. Expected Behavior
> `restartDataNode()` should successfully restart the DataNode regardless of
> whether storage capacities were specified when the DataNode was originally
> added.
> h2. Actual Behavior
> {code:java}
> java.lang.ArrayIndexOutOfBoundsException: 1
>   at org.apache.hadoop.hdfs.MiniDFSCluster.setDataNodeStorageCapacities(MiniDFSCluster.java:1882)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2557)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2596)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2576)
> {code}
> h2. Proposed Fix
> Add a bounds check in `setDataNodeStorageCapacities()` to handle the case
> where the DataNode index falls outside the storage capacities array.
> {code:java}
> private synchronized void setDataNodeStorageCapacities(
>     final int curDnIdx,
>     final DataNode curDn,
>     long[][] storageCapacities) throws IOException {
> -  if (storageCapacities == null || storageCapacities.length == 0) {
> +  // Check for null/empty array AND ensure index is within bounds.
> +  // DataNodes added without explicit storageCapacities won't have
> +  // an entry in the storageCap list.
> +  if (storageCapacities == null || storageCapacities.length == 0
> +      || curDnIdx >= storageCapacities.length) {
>      return;
> {code}
> I'm happy to send a PR for this issue.