zhaoyunjiong created HDFS-5944:
----------------------------------

             Summary: LeaseManager:findLeaseWithPrefixPath doesn't handle paths 
like /a/b/ correctly, causing SecondaryNameNode checkpoint to fail
                 Key: HDFS-5944
                 URL: https://issues.apache.org/jira/browse/HDFS-5944
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: namenode
    Affects Versions: 2.2.0, 1.2.0
            Reporter: zhaoyunjiong
            Assignee: zhaoyunjiong


In our cluster, we encountered an error like this:
java.io.IOException: saveLeases found path /XXX/20140206/04_30/_SUCCESS.slc.log but is not under construction.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:6217)
        at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Saver.save(FSImageFormat.java:607)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.saveCurrent(FSImage.java:1004)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:949)

What happened:
Client A opened file /XXX/20140206/04_30/_SUCCESS.slc.log for write
and kept renewing its lease.
Client B deleted /XXX/20140206/04_30/
Client C opened file /XXX/20140206/04_30/_SUCCESS.slc.log for write.
Client C closed the file /XXX/20140206/04_30/_SUCCESS.slc.log
Then the SecondaryNameNode tried to do a checkpoint and failed, because the 
lease held by Client A was not deleted when Client B deleted /XXX/20140206/04_30/.

The cause is a bug in findLeaseWithPrefixPath:
     int srclen = prefix.length();
     if (p.length() == srclen || p.charAt(srclen) == Path.SEPARATOR_CHAR) {
       entries.put(entry.getKey(), entry.getValue());
     }
Here, when prefix is /XXX/20140206/04_30/ and p is 
/XXX/20140206/04_30/_SUCCESS.slc.log, p.charAt(srclen) is '_' rather than '/', 
so the lease is not matched and is never removed.
The fix is simple; I'll upload a patch later.
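
To illustrate the problem, here is a minimal stand-alone sketch of the prefix check. It is not the patch for this issue and not the real LeaseManager code: the class LeasePrefixSketch, the helpers stripTrailingSeparator, isUnderOriginal, isUnderFixed and findLeasesWithPrefix, and the use of a plain SortedMap<String, String> (path to lease holder) are all assumptions made for illustration. One possible fix, shown here, is to ignore a trailing '/' on the prefix before comparing.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical model of the prefix check; NOT the real LeaseManager.
class LeasePrefixSketch {
    // Same value as Path.SEPARATOR_CHAR in Hadoop.
    static final char SEPARATOR_CHAR = '/';

    // Assumed helper: "/a/b/" becomes "/a/b"; "/" becomes "" (matches every absolute path).
    static String stripTrailingSeparator(String prefix) {
        int len = prefix.length();
        if (len > 0 && prefix.charAt(len - 1) == SEPARATOR_CHAR) {
            return prefix.substring(0, len - 1);
        }
        return prefix;
    }

    // The check as quoted in the report, guarded by startsWith so charAt is in range.
    static boolean isUnderOriginal(String prefix, String p) {
        if (!p.startsWith(prefix)) {
            return false;
        }
        int srclen = prefix.length();
        return p.length() == srclen || p.charAt(srclen) == SEPARATOR_CHAR;
    }

    // Possible fix: compare against the prefix without its trailing separator.
    static boolean isUnderFixed(String prefix, String p) {
        return isUnderOriginal(stripTrailingSeparator(prefix), p);
    }

    // Collect every lease whose path equals the prefix or lies below it.
    static Map<String, String> findLeasesWithPrefix(
            String prefix, SortedMap<String, String> sortedLeases) {
        String normalized = stripTrailingSeparator(prefix);
        Map<String, String> entries = new TreeMap<>();
        for (Map.Entry<String, String> entry
                : sortedLeases.tailMap(normalized).entrySet()) {
            String p = entry.getKey();
            if (!p.startsWith(normalized)) {
                break; // keys are sorted, so no later key can match
            }
            if (isUnderOriginal(normalized, p)) {
                entries.put(p, entry.getValue());
            }
        }
        return entries;
    }
}
```

With the prefix /XXX/20140206/04_30/ and the path /XXX/20140206/04_30/_SUCCESS.slc.log, isUnderOriginal returns false while isUnderFixed returns true. A sibling such as /a/bc is still not treated as being under /a/b.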



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
