[ https://issues.apache.org/jira/browse/HDFS-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated HDFS-17870:
----------------------------------
Labels: pull-request-available (was: )
> ArrayIndexOutOfBoundsException when DataNode was restarted without storageCapacities
> ------------------------------------------------------------------------------------
>
> Key: HDFS-17870
> URL: https://issues.apache.org/jira/browse/HDFS-17870
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: test
> Reporter: rstest
> Priority: Major
> Labels: pull-request-available
>
> h2. Summary
> `MiniDFSCluster.restartDataNode()` throws `ArrayIndexOutOfBoundsException`
> when attempting to restart a DataNode that was added to the cluster without
> specifying storage capacities.
>
> h2. Description
> When a DataNode is added to a MiniDFSCluster using `startDataNodes()` with
> `storageCapacities = null`, and later that DataNode is restarted using
> `restartDataNode()`, an `ArrayIndexOutOfBoundsException` is thrown.
> The root cause is that `MiniDFSCluster.setDataNodeStorageCapacities()` does
> not validate that the DataNode index is within the bounds of the
> `storageCapacities` array before accessing it.
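>
> To illustrate, the failing access reduces to roughly the following (a minimal sketch; the variable names mirror the `setDataNodeStorageCapacities()` parameters and the capacity value is illustrative):
> {code:java}
> // The cluster was built with capacities for exactly one DataNode.
> long CAPACITY = 500L * 1024 * 1024;
> long[][] storageCapacities = new long[][] { { CAPACITY } }; // length 1
> // The DataNode added later without capacities sits at index 1.
> int curDnIdx = 1;
> // The existing null/empty check passes, so the array is indexed and fails:
> long[] dnCapacities = storageCapacities[curDnIdx]; // ArrayIndexOutOfBoundsException: 1 {code}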
>
> h2. Steps to Reproduce
> 1. Create a MiniDFSCluster with explicit storage capacities:
> {code:java}
> cluster = new MiniDFSCluster.Builder(conf)
>     .numDataNodes(1)
>     .storageCapacities(new long[] { CAPACITY })
>     .build(); {code}
> 2. Add another DataNode without specifying storage capacities:
> {code:java}
> cluster.startDataNodes(conf, 1, true, null, null); // storageCapacities is null {code}
> 3. Restart the second DataNode:
> {code:java}
> cluster.restartDataNode(1); // Throws ArrayIndexOutOfBoundsException {code}
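> Combining the three steps above, a minimal self-contained reproduction looks roughly like the sketch below (the test class name and `CAPACITY` value are illustrative, and the harness details may differ from the existing MiniDFSCluster tests):
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hdfs.HdfsConfiguration;
> import org.apache.hadoop.hdfs.MiniDFSCluster;
> import org.junit.Test;
>
> public class TestRestartDataNodeWithoutStorageCapacities {
>   private static final long CAPACITY = 500L * 1024 * 1024;
>
>   @Test
>   public void testRestartDataNodeAddedWithoutCapacities() throws Exception {
>     Configuration conf = new HdfsConfiguration();
>     MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
>         .numDataNodes(1)
>         .storageCapacities(new long[] { CAPACITY })
>         .build();
>     try {
>       cluster.waitActive();
>       // Add a second DataNode without specifying storage capacities.
>       cluster.startDataNodes(conf, 1, true, null, null);
>       // Restart the newly added DataNode; this currently throws
>       // ArrayIndexOutOfBoundsException: 1.
>       cluster.restartDataNode(1);
>     } finally {
>       cluster.shutdown();
>     }
>   }
> } {code}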
> h2. Expected Behavior
> `restartDataNode()` should successfully restart the DataNode regardless of
> whether storage capacities were specified when the DataNode was originally
> added.
> h2. Actual Behavior
> {code:java}
> java.lang.ArrayIndexOutOfBoundsException: 1
>     at org.apache.hadoop.hdfs.MiniDFSCluster.setDataNodeStorageCapacities(MiniDFSCluster.java:1882)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2557)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2596)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2576)
> {code}
> h2. Proposed Fix
> Add a bounds check in `setDataNodeStorageCapacities()` to handle the case
> where the DataNode index exceeds the storage capacities array length.
> {code:java}
> private synchronized void setDataNodeStorageCapacities(
>     final int curDnIdx,
>     final DataNode curDn,
>     long[][] storageCapacities) throws IOException {
> -  if (storageCapacities == null || storageCapacities.length == 0) {
> +  // Check for null/empty array AND ensure index is within bounds.
> +  // DataNodes added without explicit storageCapacities won't have
> +  // an entry in the storageCap list.
> +  if (storageCapacities == null || storageCapacities.length == 0
> +      || curDnIdx >= storageCapacities.length) {
>      return; {code}
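> With this guard in place, `restartDataNode()` simply skips the capacity adjustment for a DataNode that was added without explicit capacities, which mirrors how such a DataNode is handled on its initial start (where `storageCapacities` is also null).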
> I'm happy to send a PR for this issue.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)