Possible race condition in BlocksMap.NodeIterator.
--------------------------------------------------

                 Key: HDFS-889
                 URL: https://issues.apache.org/jira/browse/HDFS-889
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: name-node
    Affects Versions: 0.22.0
            Reporter: Steve Loughran


Hudson's test run for HDFS-165 is showing an NPE in 
{{org.apache.hadoop.hdfs.server.namenode.TestNodeCount.testNodeCount()}}.
One problem could be in {{BlocksMap.NodeIterator}}. Its {{hasNext()}} method 
checks that the next entry isn't null. But what if, between the {{hasNext()}} 
call and the {{next()}} operation, the map changes and an entry goes away? In 
that situation, the node returned from {{next()}} will be null. 
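
To make the gap concrete, here is a minimal sketch of the pattern described above. {{SlotIterator}} and {{HasNextGapDemo}} are hypothetical stand-ins written for this illustration, not the real {{BlocksMap.NodeIterator}}, and the "concurrent" removal is simulated on a single thread so the outcome is deterministic.

```java
import java.util.Iterator;

// Simplified stand-in for the iterator described above; NOT the real Hadoop class.
class SlotIterator implements Iterator<String> {
    private final String[] slots; // hypothetical backing storage; null means "no entry"
    private int nextIdx = 0;

    SlotIterator(String[] slots) {
        this.slots = slots;
    }

    @Override
    public boolean hasNext() {
        // Only checks that the upcoming slot is non-null at this instant.
        return nextIdx < slots.length && slots[nextIdx] != null;
    }

    @Override
    public String next() {
        // No re-check: whatever is in the slot now is returned, even null.
        return slots[nextIdx++];
    }
}

class HasNextGapDemo {
    /** Returns what next() yields when the entry vanishes after hasNext(). */
    static String nextAfterRemoval() {
        String[] slots = {"dn1", "dn2"};
        SlotIterator it = new SlotIterator(slots);
        boolean sawNext = it.hasNext(); // true: slot 0 holds "dn1"
        slots[0] = null;                // stands in for a removal by another thread
        return sawNext ? it.next() : "unreachable";
    }

    public static void main(String[] args) {
        System.out.println(nextAfterRemoval()); // prints "null"
    }
}
```

Any caller that dereferences the returned value without a null check would then hit an NPE of the kind seen in the test run.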

This is potentially serious as a quick look through the code shows that the 
iterator gets retrieved a lot and everywhere Hadoop does so, it assumes the 
value is not null. It's also one of those problems that doesn't have a simple 
"make it go away" fix.

Options
# Ignore it, hope it doesn't happen very often and that the test failure was a 
one-off that will never happen in a production datacentre. This is the default. 
The iterator is only used in the namenode, so while it does depend on the 
number of datanodes, it isn't running on 4000 machines in a big cluster.
# Leave the iterator as is, and have all the in-Hadoop code check for a null 
value and break out of the loop.
# Patch the {{NodeIterator}} to be consistent with the {{Iterator}} 
specification and throw a {{NoSuchElementException}} if the next value is null. 
This does not make the problem go away, but now it is handled by having every 
use in-Hadoop catching the exception at the right point and exiting the loop. 
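
A rough sketch of what the third option could look like, together with the kind of handling each caller would need. All names here ({{CheckedSlotIterator}}, {{CheckedIterationDemo}}, the array-backed storage) are assumptions made for illustration and are not taken from the Hadoop source.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in, not the real BlocksMap.NodeIterator.
class CheckedSlotIterator implements Iterator<String> {
    private final String[] slots; // assumed storage; null means "no entry"
    private int nextIdx = 0;

    CheckedSlotIterator(String[] slots) {
        this.slots = slots;
    }

    @Override
    public boolean hasNext() {
        return nextIdx < slots.length && slots[nextIdx] != null;
    }

    @Override
    public String next() {
        // Read the slot once, so the value checked is the value returned.
        String value = nextIdx < slots.length ? slots[nextIdx] : null;
        if (value == null) {
            throw new NoSuchElementException("entry " + nextIdx + " is gone");
        }
        nextIdx++;
        return value;
    }
}

class CheckedIterationDemo {
    /** Collects entries, treating a vanished entry as the end of the loop. */
    static List<String> collect(Iterator<String> it) {
        List<String> seen = new ArrayList<>();
        try {
            while (it.hasNext()) {
                seen.add(it.next());
            }
        } catch (NoSuchElementException e) {
            // The entry disappeared between hasNext() and next(); stop iterating.
        }
        return seen;
    }
}
```

This only turns a silent null into a well-defined exception. As noted above, the race itself is still there, and each call site has to decide where to catch it.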

Testing. This should be possible.
# Create a block map
# Iterate over a block
# While the iterator is in progress, remove the next block in the list. Expect 
the next call to {{next()}} to fail in whatever way you choose. 
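
The steps above could be sketched roughly as follows, in plain Java rather than as a real Hadoop unit test. {{TinyBlockMap}} is a hypothetical miniature invented for this sketch; its removal behaviour (clearing the slot) and its choice of {{NoSuchElementException}} as the failure mode are assumptions, not a description of {{BlocksMap}}.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical miniature of a block map: block id -> fixed array of datanode names.
class TinyBlockMap {
    private final Map<String, String[]> blocks = new HashMap<>();

    void addBlock(String blockId, String... nodes) {
        blocks.put(blockId, nodes.clone());
    }

    // Clears the slot holding the node; assumed removal behaviour for this sketch.
    void removeNode(String blockId, String node) {
        String[] slots = blocks.get(blockId);
        for (int i = 0; i < slots.length; i++) {
            if (node.equals(slots[i])) {
                slots[i] = null;
            }
        }
    }

    Iterator<String> nodeIterator(String blockId) {
        final String[] slots = blocks.get(blockId);
        return new Iterator<String>() {
            private int nextIdx = 0;

            @Override
            public boolean hasNext() {
                return nextIdx < slots.length && slots[nextIdx] != null;
            }

            @Override
            public String next() {
                String node = nextIdx < slots.length ? slots[nextIdx] : null;
                if (node == null) {
                    throw new NoSuchElementException();
                }
                nextIdx++;
                return node;
            }
        };
    }
}

class RemovalDuringIterationCheck {
    /** Returns true if next() fails after the upcoming entry is removed. */
    static boolean nextFailsAfterRemoval() {
        TinyBlockMap map = new TinyBlockMap();           // 1. create a block map
        map.addBlock("blk_1", "dn1", "dn2", "dn3");
        Iterator<String> it = map.nodeIterator("blk_1"); // 2. iterate over a block
        it.next();                                       //    consume "dn1"
        if (!it.hasNext()) {
            return false;                                //    "dn2" should be visible
        }
        map.removeNode("blk_1", "dn2");                  // 3. remove the next entry
        try {
            it.next();
            return false;                                // no failure was signalled
        } catch (NoSuchElementException expected) {
            return true;
        }
    }
}
```

Because the removal happens on the same thread between {{hasNext()}} and {{next()}}, the check is deterministic and does not rely on thread timing.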

