[ https://issues.apache.org/jira/browse/HDFS-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon resolved HDFS-2632.
-------------------------------
    Resolution: Duplicate

> existing in_use.lock file is removed after failing to lock this file
> --------------------------------------------------------------------
>
>                 Key: HDFS-2632
>                 URL: https://issues.apache.org/jira/browse/HDFS-2632
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.21.0
>         Environment: Scientific Linux 5.3
>            Reporter: Dan Bradley
>
> If an attempt is made to start the namenode when it is already running, an
> exception is generated on failure to lock in_use.lock. However, there is a
> bug: in_use.lock is deleted! After that, if another attempt is made to start
> the namenode, there is no in_use.lock file, so the new instance goes ahead
> and starts messing with the namenode state files. It eventually fails to
> bind to the TCP port, but it has already done damage by that time.
> Specifically, the 'edits' file being written to by the running instance is
> moved to 'previous.checkpoint' so all further transactions are lost when the
> HDFS service is next restarted. We observed a case of data loss because of
> this.
>
> This issue relates to HDFS-1690, but the problem in HDFS-1690 was stated in a
> way that is specific to -format.
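
The behaviour the report asks for can be illustrated with a small sketch. This is not the Hadoop code and not the fix applied for HDFS-1690; it is a hypothetical Python illustration that assumes a POSIX system with flock(2). The function names try_lock and unlock are made up for this example; only the file name in_use.lock comes from the report. The point it shows is that a failed lock attempt closes its own handle and leaves the lock file on disk, so a second start attempt is refused as well, and only the lock owner ever removes the file.

```python
import fcntl
import os
import tempfile


def try_lock(storage_dir):
    """Try to take an exclusive lock on <storage_dir>/in_use.lock.

    Returns the open file object that holds the lock, or None if another
    holder already has it. The lock file is never deleted on failure.
    (Hypothetical helper, not part of Hadoop.)
    """
    lock_path = os.path.join(storage_dir, "in_use.lock")
    lock_file = open(lock_path, "a+")
    try:
        # Non-blocking exclusive lock; raises BlockingIOError if already held.
        fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Someone else owns the lock: close our handle only, and leave the
        # file in place so later start attempts are refused too.
        lock_file.close()
        return None
    return lock_file


def unlock(lock_file):
    """Remove the lock file and release the lock; only the owner calls this."""
    os.remove(lock_file.name)   # remove while the lock is still held
    lock_file.close()           # closing the handle releases the lock


if __name__ == "__main__":
    storage_dir = tempfile.mkdtemp()
    owner = try_lock(storage_dir)            # first instance: gets the lock
    print("first attempt locked:", owner is not None)
    print("second attempt locked:", try_lock(storage_dir) is not None)
    print("third attempt locked:", try_lock(storage_dir) is not None)
    unlock(owner)
    os.rmdir(storage_dir)
```

In the scenario from the report, the second and third calls correspond to the two extra start attempts: both must fail at the lock step, before any state files are touched.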