[
https://issues.apache.org/jira/browse/HADOOP-12334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961464#comment-14961464
]
Chris Nauroth commented on HADOOP-12334:
----------------------------------------
[~gouravk], thank you for taking a look.
I was hopeful that we might be able to hook in some failure simulation in a
test, similar to the
{{TestAzureFileSystemErrorConditions#injectTransientError}} method. I just
spent some time experimenting with this and trying to hook on to the various
event listeners exposed by the Azure Storage SDK. Unfortunately, these don't
appear to give us a deep enough hook for the failure injection I had in mind.
I don't see a way to rewrite the whole outbound HTTP response to make it look
like a server-busy error. I think your manual testing will have to suffice for
this patch.
There is still one more piece of unresolved feedback on patch v06. Please see
my comment from 21/Sep/15 about guaranteeing that the streams get closed.
After that is addressed, I expect this patch will be ready to go. Thanks!
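For reference, one common way to guarantee the close is try-with-resources. The following is only a minimal sketch using plain {{java.io}} streams; the class and method names are hypothetical and are not taken from the patch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamCopySketch {
    /**
     * Copies all bytes from in to out and returns the byte count.
     * Both streams are closed on every exit path, including when
     * read() or write() throws.
     */
    static long copyAndClose(InputStream in, OutputStream out) throws IOException {
        // try-with-resources closes resources in reverse order: 'o', then 'i'.
        try (InputStream i = in; OutputStream o = out) {
            byte[] buf = new byte[4096];
            long total = 0;
            int n;
            while ((n = i.read(buf)) != -1) {
                o.write(buf, 0, n);
                total += n;
            }
            return total;
        }
    }
}
```

An explicit finally block that closes each stream would give the same guarantee if try-with-resources is not an option.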
> Change Mode Of Copy Operation of HBase WAL Archiving to bypass Azure Storage
> Throttling after retries
> -----------------------------------------------------------------------------------------------------
>
> Key: HADOOP-12334
> URL: https://issues.apache.org/jira/browse/HADOOP-12334
> Project: Hadoop Common
> Issue Type: Improvement
> Components: tools
> Reporter: Gaurav Kanade
> Assignee: Gaurav Kanade
> Attachments: HADOOP-12334.01.patch, HADOOP-12334.02.patch,
> HADOOP-12334.03.patch, HADOOP-12334.04.patch, HADOOP-12334.05.patch,
> HADOOP-12334.06.patch
>
>
> HADOOP-11693 mitigated the problem of HMaster aborting the region server due
> to an Azure Storage throttling event during HBase WAL archival. This was
> achieved by applying an intensive exponential retry when throttling
> occurred.
> As a second level of mitigation, we will change the mode of the copy operation
> if it still fails after all retries, i.e., we will do a client-side copy of
> the blob and then copy it back to the destination. This operation will not be
> subject to throttling and hence should provide a stronger mitigation.
> However, it is more expensive, so we do it only when the copy fails after all
> retries.
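
The control flow described above (exponential retry first, client-side copy only as a last resort) could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the patch: the {{CopyOp}} interface, the method name, and the parameters are all hypothetical, and every IOException is treated as a throttling error for simplicity.

```java
import java.io.IOException;

public class CopyWithFallbackSketch {
    /** Hypothetical abstraction over one copy attempt; not an Azure or Hadoop API. */
    interface CopyOp {
        void run() throws IOException;
    }

    /**
     * Tries the server-side copy with exponential backoff. If it still fails
     * after maxRetries retries, falls back to the more expensive client-side copy.
     * Returns which mode succeeded.
     */
    static String copy(CopyOp serverSide, CopyOp clientSide, int maxRetries, long initialDelayMs)
            throws IOException, InterruptedException {
        long delay = initialDelayMs;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                serverSide.run();
                return "server-side";
            } catch (IOException e) {
                // Assumption: any IOException here stands in for a throttling error.
                if (attempt == maxRetries) {
                    break; // retries exhausted, fall through to the fallback
                }
                Thread.sleep(delay);
                delay *= 2; // exponential backoff between attempts
            }
        }
        // Fallback: download the blob and upload it to the destination.
        clientSide.run();
        return "client-side";
    }
}
```

A real implementation would presumably inspect the storage error code and fall back only on server-busy responses, rethrowing anything else.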