[jira] [Comment Edited] (HDFS-11251) ConcurrentModificationException during DataNode#refreshVolumes
[ https://issues.apache.org/jira/browse/HDFS-11251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15781823#comment-15781823 ]

Yiqun Lin edited comment on HDFS-11251 at 12/28/16 2:21 AM:

Thanks [~manojg] for updating the patch. The latest patch looks pretty good now. Two minor comments:

* Can we define a constant named {{DEFAULT_STORAGES_PER_DATANODE}} to replace the literal {{2}}? That would be easier to understand.
{code}
   private void startDFSCluster(int numNameNodes, int numDataNodes)
       throws IOException {
+    startDFSCluster(numNameNodes, numDataNodes, 2);
+  }
{code}
* The delay in {{addVolume}} is a little short. I ran your patch locally many times, and most of the runs still passed even with the {{ArrayList}}.
{code}
if (r.nextInt(10) > 4) {
  int s = r.nextInt(10) + 1;
  Thread.sleep(s);
}
{code}
I increased the delay here by changing {{Thread.sleep(s)}} to {{Thread.sleep(s * 100)}}, and then the tests ran as we expected.

+1 once these are addressed. Thanks.
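The two review points above can be sketched together in Java. This is only an illustration under stated assumptions, not the patch itself: the class name {{HotSwapTestSketch}}, the constant {{DELAY_SCALE}}, and the helper {{nextDelayMillis}} are hypothetical, and {{startDFSCluster}} here returns a storage count instead of starting a cluster so that the snippet stays self-contained.

```java
import java.util.Random;

// Hypothetical stand-in for the test class; none of this is copied from the patch.
class HotSwapTestSketch {
    // Named constant suggested in the review, replacing the bare literal 2.
    static final int DEFAULT_STORAGES_PER_DATANODE = 2;

    // Hypothetical name for the x100 factor proposed in the review.
    static final int DELAY_SCALE = 100;

    // Stand-in for the three-argument overload: returns the total number of
    // storages that would be configured rather than starting a real cluster.
    static int startDFSCluster(int numNameNodes, int numDataNodes,
                               int storagesPerDataNode) {
        return numDataNodes * storagesPerDataNode;
    }

    // Two-argument overload delegating with the named default.
    static int startDFSCluster(int numNameNodes, int numDataNodes) {
        return startDFSCluster(numNameNodes, numDataNodes,
                DEFAULT_STORAGES_PER_DATANODE);
    }

    // Mirrors the quoted snippet: in roughly half of the calls, pick a delay
    // of 1..10 units, scaled so the race window is wide enough to observe.
    // Returns 0 when no delay should be injected.
    static long nextDelayMillis(Random r) {
        if (r.nextInt(10) > 4) {
            int s = r.nextInt(10) + 1;
            return (long) s * DELAY_SCALE;
        }
        return 0L;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("storages: " + startDFSCluster(1, 3));
        long delay = nextDelayMillis(new Random());
        System.out.println("injected delay (ms): " + delay);
        Thread.sleep(delay);
    }
}
```

Separating the delay computation from {{Thread.sleep}} is a design choice of this sketch only; it makes the scaled range (0, or 100 to 1000 ms in steps of 100) easy to check without actually sleeping.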
> ConcurrentModificationException during DataNode#refreshVolumes
> --
>
> Key: HDFS-11251
> URL: https://issues.apache.org/jira/browse/HDFS-11251
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.0.0-alpha2
> Reporter: Jason Lowe
> Assignee: Manoj Govindassamy
> Attachments: HDFS-11251.01.patch, HDFS-11251.02.patch
>
> The testAddVolumesDuringWrite case failed with a ReconfigurationException
> which appears to have been caused by a ConcurrentModificationException.
> Stacktrace details to follow.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-11251) ConcurrentModificationException during DataNode#refreshVolumes
[ https://issues.apache.org/jira/browse/HDFS-11251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15765882#comment-15765882 ]

Yiqun Lin edited comment on HDFS-11251 at 12/21/16 2:26 AM:

Thanks [~manojg] for the analysis. I think that is the cause of the failure. Adding or removing a volume is an asynchronous operation, so there is a chance that it leads to the CME.
{quote}
Want to look at logs to find the parallel operations on the storageDir
{quote}
Here it is the {{addVolume}} operation that caused this, as you can see from the stack trace that [~jlowe] provided above. Hope this helps.
{code}
org.apache.hadoop.conf.ReconfigurationException: Could not change property dfs.datanode.data.dir from '[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data1,[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data2,[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data4' to '[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data1,[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data2,[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data3,[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data4'
	at org.apache.hadoop.hdfs.server.datanode.DataNode.refreshVolumes(DataNode.java:777)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.reconfigurePropertyImpl(DataNode.java:532)
	at org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes.addVolumes(TestDataNodeHotSwapVolumes.java:310)
	at org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes.testAddVolumesDuringWrite(TestDataNodeHotSwapVolumes.java:404)
{code}
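The failure mode described in this comment, a list being modified while another code path is still iterating over it, can be reproduced outside HDFS. The following is a minimal sketch under stated assumptions: the class and method names are hypothetical, the strings merely echo the data directory names from the stack trace, and {{CopyOnWriteArrayList}} is shown only as one standard thread-safe alternative; this thread does not say which collection the patch uses in place of {{ArrayList}}.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical illustration; not code from HDFS or from the patch.
class StorageDirListSketch {

    // Begins iterating over dirs, lets a second thread add an entry, then
    // resumes the iteration. Returns true if the iteration fails with a
    // ConcurrentModificationException.
    static boolean failsWhenAddedDuringIteration(List<String> dirs) {
        Iterator<String> it = dirs.iterator();
        it.next(); // the reader is now in the middle of its iteration

        // Simulates the asynchronous add-volume path touching the same list.
        Thread adder = new Thread(() -> dirs.add("data3"));
        adder.start();
        try {
            adder.join(); // wait, so the outcome does not depend on timing
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        try {
            while (it.hasNext()) {
                it.next();
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> plain =
                new ArrayList<>(Arrays.asList("data1", "data2", "data4"));
        List<String> copyOnWrite =
                new CopyOnWriteArrayList<>(Arrays.asList("data1", "data2", "data4"));
        System.out.println("ArrayList fails: "
                + failsWhenAddedDuringIteration(plain));
        System.out.println("CopyOnWriteArrayList fails: "
                + failsWhenAddedDuringIteration(copyOnWrite));
    }
}
```

The {{join()}} call removes the timing dependence that makes the real test flaky: the add always lands between two iteration steps. An {{ArrayList}} iterator is fail-fast and detects the structural change on its next step, while a {{CopyOnWriteArrayList}} iterator walks a snapshot taken when it was created and never throws this exception.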