[
https://issues.apache.org/jira/browse/HADOOP-15703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16596519#comment-16596519
]
Steve Loughran commented on HADOOP-15703:
-----------------------------------------
Looks good, excluding what I'm about to say about the test.
* needs docs
* not needed for this patch, but know in future that you can use {@code ClientThrottlingAnalyzer} for easier insertion of code into the javadocs; example below.
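Something like this (illustrative javadoc only, not text from the patch):
{code:java}
/**
 * Returns the recommended pause, in milliseconds, computed by
 * {@code AbfsClientThrottlingAnalyzer} from recent success/failure counts.
 */
{code}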
AbfsClientThrottlingAnalyzer
* import ordering should be java.*, other, org.apache.*, import statics
* L141: can you put braces around the innermost bit of the equation; I'm not sure of the ordering of that? I think it's {{0 : (percentageConversionFactor * bytesFailed / (bytesFailed + bytesSuccessful))}}, but would like the braces for all to see; see the sketch below.
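To show the grouping I'm reading into it, a rough sketch; the types, values and zero-guard are invented for the example, only the quoted names come from the patch:
{code:java}
// Illustrative only: declarations and the zero-guard are assumptions.
double percentageConversionFactor = 100.0;
long bytesFailed = 5;
long bytesSuccessful = 95;

// Explicit parentheses spell out the intended grouping:
// scale the failed bytes first, then divide by the total.
double failedPercentage = (bytesFailed + bytesSuccessful) == 0
    ? 0
    : (percentageConversionFactor * bytesFailed) / (bytesFailed + bytesSuccessful);
{code}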
AbfsClientThrottlingOperationDescriptor: add a newline at EOF
TestAbfsClientThrottlingAnalyzer
* again, import ordering.
-1 to the test as is; it's going to be way too brittle, especially in parallel
test runs, because of all its expectations that sleep durations stay below an
expected bound. This is especially the case on overloaded systems, like the ASF
Jenkins build VMs.
The throttling is testable without going into {{sleep()}} calls or relying on
elapsed times.
# Use {{org.apache.hadoop.util.Timer}} for time, creating it in some protected
method {{createTimer()}}
# have a subclass of AbfsClientThrottlingAnalyzer in the test suite which
returns {{new FakeTimer()}} as its timer, and whose (subclassed) {{sleep()}}
method simply increments that timer rather than sleeping; see the sketch below.
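Something like this, living in the test suite; the {{createTimer()}} factory, the protected {{sleep(long)}} hook and the constructor signature are all assumptions about how the patch could be restructured, while {{Timer}} and {{FakeTimer}} are the existing hadoop-common utilities:
{code:java}
import org.apache.hadoop.util.FakeTimer;
import org.apache.hadoop.util.Timer;

/**
 * Test-only analyzer: never sleeps and never reads the wall clock,
 * so assertions stay deterministic even on loaded build VMs.
 * Assumes it sits in the same package as AbfsClientThrottlingAnalyzer.
 */
class TimeFakingThrottlingAnalyzer extends AbfsClientThrottlingAnalyzer {
  private FakeTimer timer;

  TimeFakingThrottlingAnalyzer(String name) {
    super(name);                 // constructor signature assumed
  }

  @Override
  protected Timer createTimer() {
    // Lazily created: createTimer() may be invoked from the superclass
    // constructor, before subclass fields are initialized.
    if (timer == null) {
      timer = new FakeTimer();
    }
    return timer;
  }

  @Override
  protected void sleep(long milliseconds) {
    // "Sleep" by advancing the fake clock instead of blocking.
    timer.advance(milliseconds);
  }
}
{code}
The test can then drive the analyzer through its public methods and assert against {{timer.monotonicNow()}} rather than real elapsed time, which keeps it stable in parallel runs.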
> ABFS - Implement client-side throttling
> ----------------------------------------
>
> Key: HADOOP-15703
> URL: https://issues.apache.org/jira/browse/HADOOP-15703
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Sneha Varma
> Priority: Major
> Attachments: HADOOP-15703-HADOOP-15407-001.patch
>
>
> Big data workloads frequently exceed the AzureBlobFS max ingress and egress
> limits
> (https://docs.microsoft.com/en-us/azure/storage/common/storage-scalability-targets).
> For example, the max ingress limit for a GRS account in the United States is
> currently 10 Gbps. When the limit is exceeded, the AzureBlobFS service fails
> a percentage of incoming requests, and this causes the client to initiate the
> retry policy. The retry policy delays requests by sleeping, but the sleep
> duration is independent of the client throughput and account limit. This
> results in low throughput, due to the high number of failed requests and
> thrashing caused by the retry policy.
> To fix this, we introduce a client-side throttle which minimizes failed
> requests and maximizes throughput.