[jira] [Commented] (HADOOP-15703) ABFS - Implement client-side throttling

Thomas Marquardt (JIRA) Sun, 02 Sep 2018 18:47:44 -0700


    [ 
https://issues.apache.org/jira/browse/HADOOP-15703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16601727#comment-16601727
 ]


Thomas Marquardt commented on HADOOP-15703:
-------------------------------------------

I addressed the feedback and am attaching the 002 patch.
[^HADOOP-15703-HADOOP-15407-002.patch]

In addition to addressing the above feedback, I made the following changes:
 # Updated ClientThrottling timer to use daemon threads.
 # Renamed the config property from "fs.azure.autothrottling.enable" to
 "fs.azure.enable.autothrottling", with a default of true. Note that this
 feature improves throughput by as much as 34% when
 the egress or ingress account limits are exceeded.
 # Simplified detection of the REST operation type. See AbfsRestOperationType.
 # Fixed import ordering.
 # Fixed formatting (we use a 2-space indent, not a 4-space indent).
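
For readers unfamiliar with the daemon-thread change in the first item, here is a minimal sketch using only java.util.Timer. The class and method names are hypothetical and are not taken from the patch; the sketch only shows that passing isDaemon = true to the Timer constructor makes the timer's worker thread a daemon, so a periodic analysis timer would not by itself keep the JVM alive at shutdown.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; this is not the ClientThrottling class from the patch.
class DaemonTimerSketch {

  // Starts a Timer, runs one task on it, and reports whether the
  // timer's worker thread is a daemon thread.
  static boolean timerThreadIsDaemon(boolean daemon) {
    // Second constructor argument: true -> the timer thread is a daemon.
    Timer timer = new Timer("throttling-timer-sketch", daemon);
    CompletableFuture<Boolean> result = new CompletableFuture<>();
    timer.schedule(new TimerTask() {
      @Override
      public void run() {
        // Runs on the timer's own thread.
        result.complete(Thread.currentThread().isDaemon());
      }
    }, 0);
    try {
      return result.get(5, TimeUnit.SECONDS);
    } catch (Exception e) {
      throw new IllegalStateException("timer task did not complete", e);
    } finally {
      // Cancel so a non-daemon timer thread does not linger.
      timer.cancel();
    }
  }
}
```

With a non-daemon timer, forgetting to call cancel() can keep a process from exiting; the daemon flag avoids that failure mode for background bookkeeping.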

Regarding the test case, please note that it passes reliably and duplicates 
the similar WASB test case.

All tests pass against my US storage account:

*mvn -T 1C -Pparallel-tests-wasb -Dscale -DtestsThreadCount=8 clean verify*
 Tests run: 240, Failures: 0, Errors: 0, Skipped: 11
 Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
 Tests run: 619, Failures: 0, Errors: 0, Skipped: 65
 Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
 Total time: 19:13 min (Wall Clock)

*mvn -T 1C -Pparallel-tests-abfs -Dscale -DtestsThreadCount=8 clean verify*
 Tests run: 29, Failures: 0, Errors: 0, Skipped: 0
 Tests run: 260, Failures: 0, Errors: 0, Skipped: 182
 Tests run: 167, Failures: 0, Errors: 0, Skipped: 27
 Total time: 03:53 min (Wall Clock)

 

> ABFS - Implement client-side throttling 
> ----------------------------------------
>
>                 Key: HADOOP-15703
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15703
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Sneha Varma
>            Assignee: Sneha Varma
>            Priority: Major
>         Attachments: HADOOP-15703-HADOOP-15407-001.patch, 
> HADOOP-15703-HADOOP-15407-002.patch
>
>
> Big data workloads frequently exceed the AzureBlobFS max ingress and egress 
> limits 
> (https://docs.microsoft.com/en-us/azure/storage/common/storage-scalability-targets).
>  For example, the max ingress limit for a GRS account in the United States is 
> currently 10 Gbps. When the limit is exceeded, the AzureBlobFS service fails 
> a percentage of incoming requests, and this causes the client to initiate the 
> retry policy. The retry policy delays requests by sleeping, but the sleep 
> duration is independent of the client throughput and account limit. This 
> results in low throughput, due to the high number of failed requests and 
> thrashing caused by the retry policy.
> To fix this, we introduce a client-side throttle which minimizes failed 
> requests and maximizes throughput. 
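
To make the idea in the description concrete, here is a minimal sketch of a throttle whose delay depends on the observed failure rate rather than on a fixed retry schedule. It is an illustration only, not the algorithm in the attached patch: the class name, the 1% tolerance, the 10 ms minimum step, the 1000 ms cap, and the double-on-failure / halve-on-success rule are all assumptions chosen for brevity.

```java
// Hypothetical sketch; names and constants are assumptions, not from the patch.
class ThrottleSketch {
  private static final double TOLERATED_ERROR_FRACTION = 0.01; // assumed
  private static final long MIN_STEP_MILLIS = 10;              // assumed
  private static final long MAX_SLEEP_MILLIS = 1000;           // assumed

  private long sleepMillis = 0;

  // Call once per analysis period with the bytes that succeeded and the
  // bytes that failed in that period. Returns the delay to apply before
  // each request in the next period.
  long update(long bytesSucceeded, long bytesFailed) {
    long total = bytesSucceeded + bytesFailed;
    double errorFraction = total == 0 ? 0.0 : (double) bytesFailed / total;
    if (errorFraction > TOLERATED_ERROR_FRACTION) {
      // Too many failures: back off harder, up to the cap.
      sleepMillis = Math.min(MAX_SLEEP_MILLIS,
          Math.max(MIN_STEP_MILLIS, sleepMillis * 2));
    } else {
      // Service is keeping up: relax the delay toward zero.
      sleepMillis = sleepMillis / 2;
    }
    return sleepMillis;
  }
}
```

Because the delay is driven by measured failures, it rises only while the account limit is being exceeded and decays once requests succeed again, which is the behavior the fixed retry sleep lacks.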



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

