[ https://issues.apache.org/jira/browse/HADOOP-18146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17643864#comment-17643864 ]

ASF GitHub Bot commented on HADOOP-18146:
-----------------------------------------

anmolanmol1234 commented on code in PR #4039:
URL: https://github.com/apache/hadoop/pull/4039#discussion_r1040936094


##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsHttpOperation.java:
##########
@@ -314,18 +314,21 @@ public void sendRequest(byte[] buffer, int offset, int length) throws IOException
     if (this.isTraceEnabled) {
       startTime = System.nanoTime();
     }
-    try (OutputStream outputStream = this.connection.getOutputStream()) {
-      // update bytes sent before they are sent so we may observe
-      // attempted sends as well as successful sends via the
-      // accompanying statusCode
-      this.bytesSent = length;
+    OutputStream outputStream;
+    try {
+      try {
+        outputStream = this.connection.getOutputStream();
+      } catch (IOException e) {
+        // If getOutputStream fails with an exception due to 100-continue

Review Comment:
   1. The first point is valid. I have made the change so that getOutputStream still throws the exception when 100-continue is not enabled, and returns to the caller when it catches an IOException caused by 100-continue being enabled. This lets processResponse read the correct status code, after which the retry logic comes into play.
   
   2. We need to update the bytes sent for failed as well as successful cases. The current change will not swallow any exceptions.
   The handling of the various status codes with 100-continue enabled is as follows (see the sketch after the list):
   
   1. Case 1: getOutputStream does not throw an exception, the response is processed with a status code of 200, no retry is needed, and the request succeeds.
   2. Case 2: getOutputStream throws an exception, we return to the caller, and this.connection.getResponseCode() in processResponse gives a status code of 404 (user error). Exponential retry is not needed; we retry without 100-continue enabled.
   3. Case 3: getOutputStream throws an exception, we return to the caller, and processResponse gives a status code of 503, which indicates throttling, so we back off with exponential retry. Since each append request waits for the 100-continue response, the stress on the server is reduced.
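   
   A minimal sketch of that flow (not the exact patch; `expectHeaderEnabled` is an illustrative flag recording whether Expect: 100-continue was set on the connection):
   
   ```java
   OutputStream outputStream = null;
   try {
     try {
       outputStream = this.connection.getOutputStream();
     } catch (IOException e) {
       if (!expectHeaderEnabled) {
         // Without 100-continue this failure is unexpected; propagate it.
         throw e;
       }
       // With Expect: 100-continue the server may reject the request before any
       // payload is sent, and getOutputStream surfaces that as an IOException.
       // Return so that processResponse can read the real status code
       // (404 -> retry without 100-continue, 503 -> exponential backoff).
       return;
     }
     // Record bytes sent before writing so attempted sends are observable
     // alongside the status code of the accompanying response.
     this.bytesSent = length;
     outputStream.write(buffer, offset, length);
   } finally {
     if (outputStream != null) {
       outputStream.close();
     }
   }
   ```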





> ABFS: Add changes for expect hundred continue header with append requests
> -------------------------------------------------------------------------
>
>                 Key: HADOOP-18146
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18146
>             Project: Hadoop Common
>          Issue Type: Sub-task
>    Affects Versions: 3.3.1
>            Reporter: Anmol Asrani
>            Assignee: Anmol Asrani
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
>  Heavy load from a Hadoop cluster led to high resource utilization at the FE 
> nodes. Investigations from the server side indicate payload buffering at 
> Http.Sys as the cause. The payloads of requests that eventually fail due to 
> throttling limits are also getting buffered, because the buffering is 
> triggered before the FE can start request processing.
> Approach: The client sends the Append HTTP request with the Expect header, but 
> holds back the payload until the server replies with HTTP 100. We add this 
> header to all append requests so as to reduce payload buffering for requests 
> that will ultimately be rejected.
> We made several workload runs with and without hundred continue enabled, and 
> the overall observations are:
>  # The ratio of TCP SYN packet counts with and without expect hundred continue 
> enabled is 0.32 : 3 on average.
>  # The ingress into the machine at the TCP level is almost 3 times lower with 
> hundred continue enabled, which implies significant bandwidth savings.
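> As a minimal illustration of this approach (a sketch with illustrative names, 
> not the actual ABFS patch), the client sets the Expect header on the 
> HttpURLConnection and only then writes the payload; the JDK holds the body 
> back until the server answers with HTTP 100:
> {code:java}
> import java.io.IOException;
> import java.io.OutputStream;
> import java.net.HttpURLConnection;
> import java.net.URL;
>
> public class ExpectContinueAppendSketch {
>   // Sends an append-style PUT with Expect: 100-continue. If the server
>   // rejects the request before replying with HTTP 100 (e.g. 404 or 503),
>   // the payload is never transmitted; getOutputStream()/getResponseCode()
>   // surface the failure so the caller can decide whether to back off.
>   public static int appendWithExpectContinue(URL url, byte[] payload)
>       throws IOException {
>     HttpURLConnection conn = (HttpURLConnection) url.openConnection();
>     conn.setRequestMethod("PUT");
>     conn.setDoOutput(true);
>     conn.setFixedLengthStreamingMode(payload.length);
>     conn.setRequestProperty("Expect", "100-continue");
>     try (OutputStream out = conn.getOutputStream()) {
>       out.write(payload);
>     }
>     return conn.getResponseCode();
>   }
> }
> {code}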


