[ 
https://issues.apache.org/jira/browse/HADOOP-11685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14968148#comment-14968148
 ] 

Chris Nauroth commented on HADOOP-11685:
----------------------------------------

Hello [~onpduo].

There is something I don't quite understand about this analysis.

bq. So it is possible there are two threads trying to create this folder at the 
same time, but in WASB there is no lock for this operation. This causes the 
"blob lease id" issue.

>From the stack trace, we are definitely in the {{mkdirs}} code path.  However, 
>my understanding is that leases are not acquired on {{mkdirs}} operations.  
>They are acquired in the {{rename}} code path if atomic rename is enabled for 
>the path (as it should be for an HBase WAL).  Therefore, I don't understand 
>the theory that the root cause was concurrent {{mkdirs}} calls.

If there was concurrent {{rename}} activity happening at the same time, then 
this explanation would make more sense.  The other thread would have held a 
lease throughout the {{rename}}, and then this thread would have failed on the 
{{mkdirs}} for lack of a lease.  Is there {{rename}} activity when this error 
occurs?

Please let me know your thoughts on this.

> StorageException complaining " no lease ID" during HBase distributed log 
> splitting
> ----------------------------------------------------------------------------------
>
>                 Key: HADOOP-11685
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11685
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: tools
>            Reporter: Duo Xu
>            Assignee: Duo Xu
>         Attachments: HADOOP-11685.01.patch, HADOOP-11685.02.patch, 
> HADOOP-11685.03.patch
>
>
> This is similar to HADOOP-11523, but in a different place. During HBase 
> distributed log splitting, multiple threads will access the same folder 
> called "recovered.edits". However, lots of places in our WASB code did not 
> acquire lease and simply passed null to Azure storage, which caused this 
> issue.
> {code}
> 2015-02-26 03:21:28,871 WARN 
> org.apache.hadoop.hbase.regionserver.SplitLogWorker: log splitting of 
> WALs/workernode4.hbaseproddm2001.g6.internal.cloudapp.net,60020,1422071058425-splitting/workernode4.hbaseproddm2001.g6.internal.cloudapp.net%2C60020%2C1422071058425.1424914216773
>  failed, returning error
> java.io.IOException: org.apache.hadoop.fs.azure.AzureException: 
> java.io.IOException
>       at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.checkForErrors(HLogSplitter.java:633)
>       at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.access$000(HLogSplitter.java:121)
>       at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$OutputSink.finishWriting(HLogSplitter.java:964)
>       at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.finishWritingAndClose(HLogSplitter.java:1019)
>       at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:359)
>       at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:223)
>       at 
> org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:142)
>       at 
> org.apache.hadoop.hbase.regionserver.handler.HLogSplitterHandler.process(HLogSplitterHandler.java:79)
>       at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.fs.azure.AzureException: java.io.IOException
>       at 
> org.apache.hadoop.fs.azurenative.AzureNativeFileSystemStore.storeEmptyFolder(AzureNativeFileSystemStore.java:1477)
>       at 
> org.apache.hadoop.fs.azurenative.NativeAzureFileSystem.mkdirs(NativeAzureFileSystem.java:1862)
>       at 
> org.apache.hadoop.fs.azurenative.NativeAzureFileSystem.mkdirs(NativeAzureFileSystem.java:1812)
>       at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1815)
>       at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getRegionSplitEditsPath(HLogSplitter.java:502)
>       at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.createWAP(HLogSplitter.java:1211)
>       at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.getWriterAndPath(HLogSplitter.java:1200)
>       at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1243)
>       at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:851)
>       at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:843)
>       at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:813)
> Caused by: java.io.IOException
>       at 
> com.microsoft.windowsazure.storage.core.Utility.initIOException(Utility.java:493)
>       at 
> com.microsoft.windowsazure.storage.blob.BlobOutputStream.close(BlobOutputStream.java:282)
>       at 
> org.apache.hadoop.fs.azurenative.AzureNativeFileSystemStore.storeEmptyFolder(AzureNativeFileSystemStore.java:1472)
>       ... 10 more
> Caused by: com.microsoft.windowsazure.storage.StorageException: There is 
> currently a lease on the blob and no lease ID was specified in the request.
>       at 
> com.microsoft.windowsazure.storage.StorageException.translateException(StorageException.java:163)
>       at 
> com.microsoft.windowsazure.storage.core.StorageRequest.materializeException(StorageRequest.java:306)
>       at 
> com.microsoft.windowsazure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:229)
>       at 
> com.microsoft.windowsazure.storage.blob.CloudBlockBlob.commitBlockList(CloudBlockBlob.java:248)
>       at 
> com.microsoft.windowsazure.storage.blob.BlobOutputStream.commit(BlobOutputStream.java:319)
>       at 
> com.microsoft.windowsazure.storage.blob.BlobOutputStream.close(BlobOutputStream.java:279)
>       ... 11 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to