[
https://issues.apache.org/jira/browse/HDFS-2994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729713#comment-13729713
]
Konstantin Shvachko commented on HDFS-2994:
-------------------------------------------
Looks like the problem is still there.
When a file is opened for append and the soft limit has expired,
recoverLeaseInternal() may finalize the file and replace myFile with the closed
one. prepareFileForWrite() then tries to replace the same file again, which
fails because myFile is an outdated, invalid reference to the old inode.
The right fix is to refresh myFile after recoverLeaseInternal() rather than
setting its parent field as proposed in the attached patch.
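
The stale-reference failure and the refresh fix can be illustrated with a toy
model. This is only a sketch under stated assumptions, not the FSNamesystem
code: the Inode and Directory classes, recoverLease(), openForAppend(), and the
refresh flag are all hypothetical stand-ins, and replaceNode() is assumed to
fail whenever the node it is handed is no longer the one stored in the tree.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the failure mode. None of these classes are Hadoop's.
class StaleInodeSketch {

    /** Hypothetical stand-in for an inode; object identity is what matters. */
    static final class Inode {
        final String path;
        final boolean underConstruction;

        Inode(String path, boolean underConstruction) {
            this.path = path;
            this.underConstruction = underConstruction;
        }
    }

    /** Hypothetical stand-in for the directory tree: path -> current inode. */
    static final class Directory {
        private final Map<String, Inode> nodes = new HashMap<>();

        void add(Inode node) {
            nodes.put(node.path, node);
        }

        Inode get(String path) {
            return nodes.get(path);
        }

        /** Replaces oldNode only if it is still the node stored at its path. */
        boolean replaceNode(Inode oldNode, Inode newNode) {
            if (nodes.get(oldNode.path) != oldNode) {
                return false; // stale reference, analogous to "failed to remove"
            }
            nodes.put(newNode.path, newNode);
            return true;
        }
    }

    /** Models lease recovery that finalizes the file by swapping in a closed inode. */
    static void recoverLease(Directory dir, String path) {
        Inode current = dir.get(path);
        if (current != null && current.underConstruction) {
            dir.replaceNode(current, new Inode(path, false));
        }
    }

    /** Reopens the file for write; refresh decides whether myFile is re-fetched. */
    static boolean openForAppend(Directory dir, String path, boolean refresh) {
        Inode myFile = dir.get(path);
        recoverLease(dir, path);      // may replace the inode behind myFile
        if (refresh) {
            myFile = dir.get(path);   // re-resolve the reference after recovery
        }
        return dir.replaceNode(myFile, new Inode(path, true));
    }

    public static void main(String[] args) {
        Directory stale = new Directory();
        stale.add(new Inode("/f", true));
        System.out.println("without refresh: " + openForAppend(stale, "/f", false));

        Directory fresh = new Directory();
        fresh.add(new Inode("/f", true));
        System.out.println("with refresh: " + openForAppend(fresh, "/f", true));
    }
}
```

In this model the call without refresh returns false, because the second
replaceNode() is handed the inode that recovery already swapped out, while the
call with refresh returns true.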
> If lease is recovered successfully inline with create, create can fail
> ----------------------------------------------------------------------
>
> Key: HDFS-2994
> URL: https://issues.apache.org/jira/browse/HDFS-2994
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 0.24.0
> Reporter: Todd Lipcon
> Assignee: amith
> Attachments: HDFS-2994_1.patch, HDFS-2994_1.patch
>
>
> I saw the following logs on my test cluster:
> {code}
> 2012-02-22 14:35:22,887 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: startFile: recover lease
> [Lease. Holder: DFSClient_attempt_1329943893604_0007_m_000376_0_453973131_1,
> pendingcreates: 1], src=/benchmarks/TestDFSIO/io_data/test_io_6 from client
> DFSClient_attempt_1329943893604_0007_m_000376_0_453973131_1
> 2012-02-22 14:35:22,887 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering lease=[Lease.
> Holder: DFSClient_attempt_1329943893604_0007_m_000376_0_453973131_1,
> pendingcreates: 1], src=/benchmarks/TestDFSIO/io_data/test_io_6
> 2012-02-22 14:35:22,888 WARN org.apache.hadoop.hdfs.StateChange: BLOCK*
> internalReleaseLease: All existing blocks are COMPLETE, lease removed, file
> closed.
> 2012-02-22 14:35:22,888 WARN org.apache.hadoop.hdfs.StateChange: DIR*
> FSDirectory.replaceNode: failed to remove
> /benchmarks/TestDFSIO/io_data/test_io_6
> 2012-02-22 14:35:22,888 WARN org.apache.hadoop.hdfs.StateChange: DIR*
> NameSystem.startFile: FSDirectory.replaceNode: failed to remove
> /benchmarks/TestDFSIO/io_data/test_io_6
> {code}
> It seems like, if {{recoverLeaseInternal}} succeeds in {{startFileInternal}},
> then the INode will be replaced with a new one, meaning the later
> {{replaceNode}} call can fail.