[jira] [Comment Edited] (HDFS-11251) ConcurrentModificationException during DataNode#refreshVolumes

2016-12-27 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15781823#comment-15781823
 ] 

Yiqun Lin edited comment on HDFS-11251 at 12/28/16 2:21 AM:


Thanks [~manojg] for updating the patch. The latest patch looks pretty good 
now. Two minor comments:

* Can we define a variable named {{DEFAULT_STORAGES_PER_DATANODE}} to replace 
the magic number {{2}}? That would be easier to understand.

{code}
   private void startDFSCluster(int numNameNodes, int numDataNodes)
       throws IOException {
+    startDFSCluster(numNameNodes, numDataNodes, 2);
+  }
{code}

* The delay time in {{addVolume}} is a little short. I tested your patch 
locally many times, and most of the runs still passed with the 
{{ArrayList}}.

{code}
if (r.nextInt(10) > 4) {
  int s = r.nextInt(10) + 1;
  Thread.sleep(s);
}
{code}

I increased the delay here, changing {{Thread.sleep(s)}} to 
{{Thread.sleep(s * 100)}}, and then the test ran as we expected.

+1 once these are addressed. Thanks.
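For reference, here is a minimal single-threaded sketch (not the DataNode code; the class name and list contents are made up for illustration) of why structurally modifying an {{ArrayList}} while iterating it trips the fail-fast check and throws a {{ConcurrentModificationException}}:

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

public class CmeDemo {
    public static void main(String[] args) {
        List<String> dirs = new ArrayList<>();
        dirs.add("data1");
        dirs.add("data2");
        boolean caught = false;
        try {
            for (String d : dirs) {
                // Structurally modifying the list during iteration bumps
                // modCount, so the iterator's next() fails fast.
                dirs.add("data3");
            }
        } catch (ConcurrentModificationException e) {
            caught = true;
        }
        System.out.println(caught);  // prints "true"
    }
}
```

In the test above the same effect depends on the asynchronous {{addVolume}} landing inside the iteration window, which is why a longer random delay makes the failure reproducible.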


was (Author: linyiqun):
Thanks [~manojg] for updating the patch. The latest patch looks pretty good 
now. Two minor comments:

* Can we define a variable named {{DEFAULT_STORAGES_PER_DATANODE}} to replace 
the magic number {{2}}? That would be easier to understand.
{code}
   private void startDFSCluster(int numNameNodes, int numDataNodes)
       throws IOException {
+    startDFSCluster(numNameNodes, numDataNodes, 2);
+  }
{code}
* The delay time in {{addVolume}} is a little short. I tested your patch 
locally many times, and most of the runs still passed with the 
{{ArrayList}}.
{code}
if (r.nextInt(10) > 4) {
  int s = r.nextInt(10) + 1;
  Thread.sleep(s);
}
{code}
I increased the delay here, changing {{Thread.sleep(s)}} to 
{{Thread.sleep(s * 100)}}, and then the test ran as we expected.

+1 once these are addressed. Thanks.

> ConcurrentModificationException during DataNode#refreshVolumes
> --
>
> Key: HDFS-11251
> URL: https://issues.apache.org/jira/browse/HDFS-11251
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha2
>Reporter: Jason Lowe
>Assignee: Manoj Govindassamy
> Attachments: HDFS-11251.01.patch, HDFS-11251.02.patch
>
>
> The testAddVolumesDuringWrite case failed with a ReconfigurationException 
> which appears to have been caused by a ConcurrentModificationException.  
> Stacktrace details to follow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-11251) ConcurrentModificationException during DataNode#refreshVolumes

2016-12-20 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15765882#comment-15765882
 ] 

Yiqun Lin edited comment on HDFS-11251 at 12/21/16 2:26 AM:


Thanks [~manojg] for the analysis. I think that's the reason for the failure. 
Here the add-volume and remove-volume operations are asynchronous, so there 
is a chance they lead to the CME.
{quote}
Want to look at logs to find the parallel operations on the storageDir
{quote}
Here it is the {{addVolume}} operation that caused this, as you can see in 
the stack trace that [~jlowe] provided above. Hope this helps you.
{code}
org.apache.hadoop.conf.ReconfigurationException: Could not change property dfs.datanode.data.dir from '[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data1,[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data2,[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data4' to '[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data1,[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data2,[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data3,[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data4'
	at org.apache.hadoop.hdfs.server.datanode.DataNode.refreshVolumes(DataNode.java:777)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.reconfigurePropertyImpl(DataNode.java:532)
	at org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes.addVolumes(TestDataNodeHotSwapVolumes.java:310)
	at org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes.testAddVolumesDuringWrite(TestDataNodeHotSwapVolumes.java:404)
{code}
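As a general illustration of one way such races are avoided (not necessarily the approach taken in the patch; the class name and list contents here are made up), iterators of {{java.util.concurrent.CopyOnWriteArrayList}} traverse a snapshot of the backing array, so a concurrent add during iteration neither throws nor changes the elements the loop sees:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class SnapshotDemo {
    public static void main(String[] args) {
        List<String> dirs = new CopyOnWriteArrayList<>();
        dirs.add("data1");
        dirs.add("data2");
        int seen = 0;
        for (String d : dirs) {
            // The iterator traverses the array snapshot taken when the
            // loop started; this add copies the backing array instead
            // of mutating the one being iterated.
            dirs.add("data3");
            seen++;
        }
        System.out.println(seen + " " + dirs.size());  // prints "2 4"
    }
}
```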


was (Author: linyiqun):
Thanks [~manojg] for the analysis. I think that's the reason for the failure.
{quote}
Want to look at logs to find the parallel operations on the storageDir
{quote}
Here it is the {{addVolume}} operation that caused this, as you can see in 
the stack trace that [~jlowe] provided above. Hope this helps you.
{code}
org.apache.hadoop.conf.ReconfigurationException: Could not change property dfs.datanode.data.dir from '[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data1,[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data2,[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data4' to '[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data1,[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data2,[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data3,[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data4'
	at org.apache.hadoop.hdfs.server.datanode.DataNode.refreshVolumes(DataNode.java:777)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.reconfigurePropertyImpl(DataNode.java:532)
	at org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes.addVolumes(TestDataNodeHotSwapVolumes.java:310)
	at org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes.testAddVolumesDuringWrite(TestDataNodeHotSwapVolumes.java:404)
{code}



