[ https://issues.apache.org/jira/browse/HDFS-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086415#comment-13086415 ]
Eric Payne commented on HDFS-1257:
----------------------------------
Hi Nicholas. Thanks for your patience in getting through the reviews of this.
I'm confused as to why 1) you are seeing this error and 2) it is timing out for
you. I'm not seeing that error in my environment. As for the timeout, even
earlier, when the test was taking 3 minutes, it should not have timed out. There
are a lot of unit tests that take longer than 3 minutes.
Anyway, as for taking it out, the reason for doing so would be that the test is
not sufficient to thoroughly test the race condition. A unit test just can't
stress the namenode in the MiniDFSCluster enough to exercise this race
condition. To hit this race condition, a test must run on a large cluster with a
very active set of DFS actions happening over an extended period of time. There
just isn't enough memory on a single host to create enough DNs in the
MiniDFSCluster. And, even if there were enough memory, a unit test should not
be running for a very long period.
> Race condition on FSNamesystem#recentInvalidateSets introduced by HADOOP-5124
> -----------------------------------------------------------------------------
>
> Key: HDFS-1257
> URL: https://issues.apache.org/jira/browse/HDFS-1257
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 0.23.0
> Reporter: Ramkumar Vadali
> Assignee: Eric Payne
> Fix For: 0.23.0
>
> Attachments: HDFS-1257.1.20110810.patch, HDFS-1257.2.20110812.patch,
> HDFS-1257.3.20110815.patch, HDFS-1257.4.20110816.patch, HDFS-1257.patch
>
>
> HADOOP-5124 provided some improvements to FSNamesystem#recentInvalidateSets.
> But it introduced unprotected access to the data structure
> recentInvalidateSets. Specifically, FSNamesystem.computeInvalidateWork
> accesses recentInvalidateSets without read-lock protection. If there is
> concurrent activity (like reducing replication on a file) that adds to
> recentInvalidateSets, the name-node crashes with a
> ConcurrentModificationException.
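
For illustration only, the failure mode described in the issue summary above
can be sketched in a few lines of Java. This is not the FSNamesystem code and
not any of the attached patches. The class InvalidateSetsSketch, its method
bodies, and the datanode names are hypothetical stand-ins, and the map is
assumed to be an ordinary fail-fast java.util collection. The first method
shows, deterministically and in a single thread, how a structural change
during iteration raises ConcurrentModificationException. The remaining
methods show one possible way to guard the same access pattern with a
read/write lock.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in; not the real FSNamesystem.
class InvalidateSetsSketch {
    // Assumed shape: datanode ID -> block IDs waiting to be invalidated.
    private final Map<String, List<Long>> recentInvalidateSets = new TreeMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Writer path (e.g. replication being reduced): takes the write lock.
    void addToInvalidates(String datanode, long blockId) {
        lock.writeLock().lock();
        try {
            recentInvalidateSets
                .computeIfAbsent(datanode, k -> new ArrayList<>())
                .add(blockId);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Reader path: iterates only while holding the read lock, so no
    // writer can change the map in the middle of the loop.
    int computeInvalidateWork() {
        lock.readLock().lock();
        try {
            int pending = 0;
            for (List<Long> blocks : recentInvalidateSets.values()) {
                pending += blocks.size();
            }
            return pending;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Unprotected pattern: the map is modified while it is being iterated.
    // Done in one thread here so the failure is reproducible every time.
    static boolean unprotectedIterationFails() {
        Map<String, List<Long>> sets = new TreeMap<>();
        sets.put("dn1", new ArrayList<>());
        sets.put("dn2", new ArrayList<>());
        try {
            for (String dn : sets.keySet()) {
                sets.put("dn3-" + dn, new ArrayList<>()); // structural change
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // One writer thread adds blocks while the caller keeps reading.
    static int runConcurrently(int blocks) throws InterruptedException {
        InvalidateSetsSketch s = new InvalidateSetsSketch();
        Thread writer = new Thread(() -> {
            for (long i = 0; i < blocks; i++) {
                s.addToInvalidates("dn" + (i % 4), i);
            }
        });
        writer.start();
        while (writer.isAlive()) {
            s.computeInvalidateWork(); // safe: guarded by the read lock
        }
        writer.join();
        return s.computeInvalidateWork();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("unprotected iteration fails: " + unprotectedIterationFails());
        System.out.println("pending after locked run: " + runConcurrently(1000));
    }
}
```

The single-threaded reproduction is deliberate. The two-thread variant of the
unprotected pattern only fails when the timing happens to line up, which is
why a short unit test is a poor way to catch it.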
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira