[ https://issues.apache.org/jira/browse/HADOOP-15703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16601727#comment-16601727 ]
Thomas Marquardt commented on HADOOP-15703:
-------------------------------------------
I addressed the feedback and am attaching the 002 patch.
[^HADOOP-15703-HADOOP-15407-002.patch]
In addition to addressing the above feedback, I made the following changes:
# Updated ClientThrottling timer to use daemon threads.
# Changed the config property name from "fs.azure.autothrottling.enable" to
"fs.azure.enable.autothrottling", with a default of true. Note that this
feature improves throughput by as much as 34% when the egress or ingress
account limits are exceeded.
# Simplified detection of the REST operation type. See AbfsRestOperationType.
# Fixed import ordering.
# Fixed formatting (we use a 2-space indent, not a 4-space indent).
Regarding the test case, please note that it passes reliably and mirrors
the similar WASB test case.
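For readers unfamiliar with the daemon-thread change in the first item, here is a minimal sketch using only java.util.Timer. The class and method names (DaemonTimerSketch, newDaemonTimer, ranOnDaemonThread) are hypothetical and are not taken from the patch. It only illustrates the general idea that a timer created with isDaemon=true does not keep the JVM alive at shutdown.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical illustration; not the code from the HADOOP-15703 patch. */
final class DaemonTimerSketch {

  /** Creates a timer whose background thread is a daemon thread. */
  static Timer newDaemonTimer(String name) {
    // The second argument (isDaemon = true) marks the timer thread as a daemon,
    // so a forgotten timer cannot prevent the JVM from exiting.
    return new Timer(name, true);
  }

  /** Runs one task on the timer and reports whether it ran on a daemon thread. */
  static boolean ranOnDaemonThread(Timer timer) {
    CountDownLatch done = new CountDownLatch(1);
    AtomicBoolean daemon = new AtomicBoolean(false);
    timer.schedule(new TimerTask() {
      @Override
      public void run() {
        daemon.set(Thread.currentThread().isDaemon());
        done.countDown();
      }
    }, 0L);
    try {
      if (!done.await(5, TimeUnit.SECONDS)) {
        throw new IllegalStateException("timer task did not run in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for timer", e);
    }
    return daemon.get();
  }
}
```

Calling DaemonTimerSketch.newDaemonTimer("sketch-timer") and scheduling a periodic task on it gives a background analysis loop that stops on its own when the process exits. Calling cancel() on the timer remains good practice when the owner is closed.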
All tests pass against my US storage account:
*mvn -T 1C -Pparallel-tests-wasb -Dscale -DtestsThreadCount=8 clean verify*
Tests run: 240, Failures: 0, Errors: 0, Skipped: 11
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
Tests run: 619, Failures: 0, Errors: 0, Skipped: 65
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Total time: 19:13 min (Wall Clock)
*mvn -T 1C -Pparallel-tests-abfs -Dscale -DtestsThreadCount=8 clean verify*
Tests run: 29, Failures: 0, Errors: 0, Skipped: 0
Tests run: 260, Failures: 0, Errors: 0, Skipped: 182
Tests run: 167, Failures: 0, Errors: 0, Skipped: 27
Total time: 03:53 min (Wall Clock)
> ABFS - Implement client-side throttling
> ----------------------------------------
>
> Key: HADOOP-15703
> URL: https://issues.apache.org/jira/browse/HADOOP-15703
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Sneha Varma
> Assignee: Sneha Varma
> Priority: Major
> Attachments: HADOOP-15703-HADOOP-15407-001.patch,
> HADOOP-15703-HADOOP-15407-002.patch
>
>
> Big data workloads frequently exceed the AzureBlobFS max ingress and egress
> limits
> (https://docs.microsoft.com/en-us/azure/storage/common/storage-scalability-targets).
> For example, the max ingress limit for a GRS account in the United States is
> currently 10 Gbps. When the limit is exceeded, the AzureBlobFS service fails
> a percentage of incoming requests, and this causes the client to initiate the
> retry policy. The retry policy delays requests by sleeping, but the sleep
> duration is independent of the client throughput and account limit. This
> results in low throughput, due to the high number of failed requests and
> thrashing caused by the retry policy.
> To fix this, we introduce a client-side throttle which minimizes failed
> requests and maximizes throughput.
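As a rough illustration of the idea in the description above, the following sketch adjusts a per-request delay from the failure rate observed in the previous period. It is not the algorithm in the attached patch. The class name ThrottleSketch, the thresholds, and the doubling/halving rule are all assumptions chosen to keep the example short.

```java
/** Hypothetical sketch of a client-side throttle; not the HADOOP-15703 implementation. */
final class ThrottleSketch {
  // Assumed tuning values, chosen only for illustration.
  private static final double TOLERATED_ERROR_RATE = 0.01;
  private static final long MIN_BACKOFF_MS = 10;
  private static final long MAX_BACKOFF_MS = 5_000;

  // Delay applied before each request; volatile so request threads see updates.
  private volatile long sleepMs = 0;

  /**
   * Updates the delay from the last period's request counts and returns it.
   * Too many failures doubles the delay (within bounds); otherwise it is halved.
   */
  synchronized long update(long succeeded, long failed) {
    long total = succeeded + failed;
    if (total == 0) {
      return sleepMs; // no traffic in this period, keep the current delay
    }
    double errorRate = (double) failed / total;
    if (errorRate > TOLERATED_ERROR_RATE) {
      sleepMs = Math.min(MAX_BACKOFF_MS, Math.max(MIN_BACKOFF_MS, sleepMs * 2));
    } else {
      sleepMs = sleepMs / 2;
    }
    return sleepMs;
  }

  /** Called before sending a request; sleeps for the current delay, if any. */
  void beforeRequest() {
    long delay = sleepMs;
    if (delay > 0) {
      try {
        Thread.sleep(delay);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
      }
    }
  }
}
```

In this sketch a periodic task would call update(...) with the counts gathered since the last call, and each request thread would call beforeRequest() before issuing an operation. Unlike a fixed retry sleep, the delay here follows the observed failure rate, which is the property the description asks for.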