[
https://issues.apache.org/jira/browse/HDFS-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295796#comment-13295796
]
Kihwal Lee commented on HDFS-3534:
----------------------------------
0.23/2.0 is no different from 0.20.205/1.0.x in this respect. If you don't see
it in your test case, that's because the timing is different. I have seen the
same thing happen on 0.23.

This is commonly seen when users forget to close the output streams of their
task output. In that case, the parent directory is renamed from its temporary
name to the final one before the shutdown hook completes the last block.
Depending on the timing, there can be data loss as well. The NameNode will
eventually notice the state of the final block and take care of it.
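
A minimal sketch of the same ordering at the shell level, assuming the
`hadoop fs -put` and `hadoop fs -mv` commands used in the reproduction script
from the issue description: wait for the writer to exit (so the file is
closed) before renaming. `put_then_move` and the `HADOOP_CMD` override are
hypothetical names introduced here for illustration; they are not part of
Hadoop.

```shell
#!/bin/bash
# Sketch only: upload a file, wait for the writer to finish, then rename it.
# HADOOP_CMD is a hypothetical override so the function can be tried without a cluster.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

put_then_move() {
    local src="$1" dest="$2" dest2="$3"

    # Start the upload in the background, as the reproduction script does.
    "$HADOOP_CMD" fs -put "$src" "$dest" &
    local put_pid=$!

    # Block until the writer exits; skip the rename if the upload failed.
    if ! wait "$put_pid"; then
        echo "put failed; not moving $dest" >&2
        return 1
    fi

    # The writer has exited, so the rename no longer races with block allocation.
    "$HADOOP_CMD" fs -mv "$dest" "$dest2"
}
```

Called as `put_then_move big_file.txt /user/mitesh/temp/big_file.txt /user/mitesh/xyz/big_file.txt`,
this issues the move only after the put has returned, which is the opposite of
what the reproduction script deliberately does.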
> LeaseExpiredException on NameNode if file is moved while being created.
> -----------------------------------------------------------------------
>
> Key: HDFS-3534
> URL: https://issues.apache.org/jira/browse/HDFS-3534
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 0.20.2, 0.20.205.0
> Reporter: Mitesh Singh Jat
>
> If a file (big_file.txt, size=512MB) is renamed (fs -mv) while it is still
> being created (or uploaded) on HDFS, the following exception occurs:
> {noformat}
> 12/06/13 08:56:42 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /user/mitesh/temp/big_file.txt File does not exist. [Lease. Holder: DFSClient_-2105467303, pendingcreates: 1]
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1604)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1595)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1511)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:685)
> at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1082)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
> at org.apache.hadoop.ipc.Client.call(Client.java:1066)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
> at $Proxy6.addBlock(Unknown Source)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
> at $Proxy6.addBlock(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3324)
> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3188)
> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2300(DFSClient.java:2406)
> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2646)
> 12/06/13 08:56:42 WARN hdfs.DFSClient: Error Recovery for block blk_-5525713112321593595_679317395 bad datanode[0] nodes == null
> 12/06/13 08:56:42 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/mitesh/temp/big_file.txt" - Aborting...
> ...
> {noformat}
> This issue is not seen on *Hadoop 0.23*.
> I used the following shell script to reproduce the issue.
> {code:title=run_parallely.sh}
> #!/bin/bash
> hadoop="hadoop"
> filename=big_file.txt
> dest=/user/mitesh/temp/$filename
> dest2=/user/mitesh/xyz/$filename
> ## Clean up
> "$hadoop" fs -rm -skipTrash "$dest"
> "$hadoop" fs -rm -skipTrash "$dest2"
> ## Copy big_file.txt onto hdfs in the background
> "$hadoop" fs -put "$filename" "$dest" > cmd1.log 2>&1 &
> ## Sleep until the entry is created, hoping copying is not finished
> until "$hadoop" fs -test -e "$dest"
> do
>     sleep 1
> done
> ## Now move the file while it is still being written
> "$hadoop" fs -mv "$dest" "$dest2" > cmd2.log 2>&1 &
> {code}