[ 
https://issues.apache.org/jira/browse/HADOOP-12334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961464#comment-14961464
 ] 

Chris Nauroth commented on HADOOP-12334:
----------------------------------------

[~gouravk], thank you for taking a look.

I was hopeful that we might be able to hook in some failure simulation in a 
test, similar to the 
{{TestAzureFileSystemErrorConditions#injectTransientError}} method.  I just 
spent some time experimenting with this and trying to hook on to the various 
event listeners exposed by the Azure Storage SDK.  Unfortunately, these don't 
appear to give us a deep enough hook for the failure injection I had in mind.  
I don't see a way to rewrite the whole incoming HTTP response to make it look 
like a server-busy error.  I think your manual testing will have to suffice for 
this patch.

There is still one more piece of unresolved feedback on patch v06.  Please see 
my comment from 21/Sep/15 about guaranteeing that the streams get closed.  
After that is addressed, I expect this patch will be ready to go.  Thanks!
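
As an illustration of the stream-closing point, here is a minimal sketch, not the code from the patch. It assumes the client-side copy reads from one stream and writes to another; the class name {{StreamCloseSketch}} and the method {{copyAndClose}} are hypothetical. It relies on try-with-resources so that both streams are closed whether the copy succeeds or throws.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper for illustration only; not part of hadoop-azure.
public class StreamCloseSketch {

    /**
     * Copies all bytes from in to out and closes both streams,
     * even if a read or write throws partway through.
     */
    public static long copyAndClose(InputStream in, OutputStream out)
            throws IOException {
        // try-with-resources closes dst first, then src, on every exit path.
        // An exception thrown by close() during a failed copy is attached
        // to the original exception as a suppressed exception.
        try (InputStream src = in; OutputStream dst = out) {
            byte[] buffer = new byte[8192];
            long total = 0;
            int read;
            while ((read = src.read(buffer)) != -1) {
                dst.write(buffer, 0, read);
                total += read;
            }
            return total;
        }
    }
}
```

An equivalent approach on older Java versions would be a try/finally block that closes each stream in its own guarded step.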

> Change Mode Of Copy Operation of HBase WAL Archiving to bypass Azure Storage 
> Throttling after retries
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-12334
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12334
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: tools
>            Reporter: Gaurav Kanade
>            Assignee: Gaurav Kanade
>         Attachments: HADOOP-12334.01.patch, HADOOP-12334.02.patch, 
> HADOOP-12334.03.patch, HADOOP-12334.04.patch, HADOOP-12334.05.patch, 
> HADOOP-12334.06.patch
>
>
> HADOOP-11693 mitigated the problem of the HMaster aborting a region server 
> because of an Azure Storage throttling event during HBase WAL archival. It 
> did so by applying an intensive exponential retry when throttling 
> occurred.
> As a second level of mitigation, we will change the mode of the copy 
> operation if it still fails after all retries, i.e., we will copy the blob 
> on the client side and then copy it back to the destination. This operation 
> is not subject to throttling and hence should provide a stronger mitigation. 
> However, it is more expensive, so we do it only when the operation fails 
> after all retries.
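
The two-level mitigation described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patch itself: {{CopyFallbackSketch}}, {{CopyAttempt}} and {{copyWithFallback}} are hypothetical names, the retry loop omits the exponential backoff, and every {{IOException}} is treated as a throttling failure.

```java
import java.io.IOException;

// Hypothetical sketch for illustration only; not part of hadoop-azure.
public class CopyFallbackSketch {

    /** Hypothetical abstraction over one copy attempt. */
    public interface CopyAttempt {
        void run() throws IOException;
    }

    /**
     * Tries the server-side copy up to maxRetries + 1 times, then falls
     * back to the more expensive client-side copy. Returns which mode
     * finally succeeded.
     */
    public static String copyWithFallback(CopyAttempt serverSideCopy,
                                          CopyAttempt clientSideCopy,
                                          int maxRetries) throws IOException {
        IOException lastFailure = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                serverSideCopy.run();
                return "server-side";
            } catch (IOException e) {
                // Assumption: real code would check that this is a
                // throttling error and sleep with exponential backoff.
                lastFailure = e;
            }
        }
        try {
            // Second-level mitigation: only reached after all retries fail.
            clientSideCopy.run();
            return "client-side";
        } catch (IOException e) {
            if (lastFailure != null) {
                e.addSuppressed(lastFailure);
            }
            throw e;
        }
    }
}
```

Keeping the fallback outside the retry loop reflects the description's point that the client-side copy is more expensive and should run only once the retries are exhausted.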



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
