You are likely hitting this: https://issues.apache.org/jira/browse/HDFS-3848

On Mon, Jun 16, 2014 at 10:17 PM, Bogdan Raducanu <lrd...@gmail.com> wrote:
> Thanks. I tried to call recoverLease before doing fs.append. Now I'm getting
> only the AlreadyBeingCreatedException
> ("org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException):
> failed to create file /lease_fix for DFSClient_NONMAPREDUCE_394503315_1 for
> client 10.0.0.1 because current leaseholder is trying to recreate file.")
> once and then it seems to work.
>
> But it's curious why I'm getting that exception now. I traced it to this
> code, in FSNamesystem.java:
>
>     //
>     // We found the lease for this file. And surprisingly the original
>     // holder is trying to recreate this file. This should never occur.
>     //
>     if (!force && lease != null) {
>       Lease leaseFile = leaseManager.getLeaseByPath(src);
>       if ((leaseFile != null && leaseFile.equals(lease)) ||
>           lease.getHolder().equals(holder)) {
>         throw new AlreadyBeingCreatedException(
>           "failed to create file " + src + " for " + holder +
>           " for client " + clientMachine +
>           " because current leaseholder is trying to recreate file.");
>       }
>     }
>
> It seems to me that that exception will always be thrown because
> lease.getHolder().equals(holder) is always true. It should've been
> leaseFile.getHolder().equals(holder) perhaps.
>
>
> On Mon, Jun 16, 2014 at 5:47 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>> Please take a look at the following method in DFSClient:
>>
>>   /**
>>    * Recover a file's lease
>>    * @param src a file's path
>>    * @return true if the file is already closed
>>    * @throws IOException
>>    */
>>   boolean recoverLease(String src) throws IOException {
>>
>> Cheers
>>
>>
>> On Mon, Jun 16, 2014 at 8:26 AM, Anonymous <lrdbgy+h...@gmail.com> wrote:
>>>
>>> Hello,
>>>
>>> I have a long running application that opens a file and periodically
>>> appends to it. If this application is killed and then restarted it cannot
>>> open the same file again for some time (~ 1minute). First, it gets the
>>> AlreadyBeingCreated exception (which I guess means namenode doesn't yet know
>>> the program crashed) and then the RecoveryInProgress exception (which I
>>> guess means the namenode proceeded to close and release the file after
>>> inactivity). After about 1 minute it starts to work again.
>>>
>>> What is the correct way to recover from this? Is there API for recovering
>>> the lease and resuming appending faster? DFSClient sets a randomized client
>>> name. If it were to send the same client name as before the crash, would it
>>> receive a lease on the file faster?
>>>
>>> Thanks
>>
>>
>

--
Harsh J
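
For readers of the archive, the "call recoverLease, then append" sequence discussed in this thread can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from Hadoop: `LeaseClient`, `LeaseRecovery`, and `waitForLeaseRecovery` are hypothetical names, and the interface merely mirrors the `recoverLease(String src)` signature quoted above so that the sketch runs without a cluster. In a real client the interface would presumably be backed by the HDFS call (for example `DistributedFileSystem.recoverLease(Path)`), and the attempt count and pause are arbitrary values to tune.

```java
import java.io.IOException;

/** Hypothetical stand-in for the single HDFS call this sketch needs. */
interface LeaseClient {
    /** Mirrors the quoted DFSClient method: true once the file is closed. */
    boolean recoverLease(String src) throws IOException;
}

final class LeaseRecovery {
    private LeaseRecovery() {}

    /**
     * Requests lease recovery, then polls until the file is reported closed
     * or the attempts run out. Returns true if the file is closed.
     */
    static boolean waitForLeaseRecovery(LeaseClient client, String src,
                                        int maxAttempts, long pauseMillis)
            throws IOException, InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (client.recoverLease(src)) {
                return true; // file is closed; append can be attempted now
            }
            if (attempt < maxAttempts) {
                Thread.sleep(pauseMillis); // recovery still in progress
            }
        }
        return false; // caller decides whether to give up or keep waiting
    }

    public static void main(String[] args) throws Exception {
        // Fake client for demonstration only: reports "closed" on the third call.
        int[] calls = {0};
        LeaseClient fake = src -> ++calls[0] >= 3;
        boolean closed = waitForLeaseRecovery(fake, "/lease_fix", 5, 10L);
        System.out.println("closed=" + closed + " after " + calls[0] + " calls");
    }
}
```

As reported earlier in the thread, the first append after recovery may still fail once with AlreadyBeingCreatedException, so a caller would likely wrap the append itself in a similar bounded retry.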