Rajesh Balamohan created HIVE-16692:
---------------------------------------

             Summary: LLAP: Keep alive connection in shuffle handler should not be closed until entire data is flushed out
                 Key: HIVE-16692
                 URL: https://issues.apache.org/jira/browse/HIVE-16692
             Project: Hive
          Issue Type: Bug
          Components: llap
            Reporter: Rajesh Balamohan
            Priority: Minor


In corner cases with keep-alive enabled, the server may write out the response
headers, and the downstream client may read them, before the response body is
ready.

If constructing the mapOutput then takes much longer than usual on the server
side (due to a slow disk or any other issue), the keep-alive timeout can fire in
the meantime and close the connection from the server side. In such cases the
downstream client can get "connection reset". Ideally, the keep-alive timeout
should take effect only after the entire response has been flushed downstream.
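
The intended behaviour can be sketched as a small guard that the idle timer
consults before closing a connection. This is only an illustration of the idea,
not the actual shuffle handler code: the class KeepAliveGuard and its method
names are hypothetical, and a real fix would hook into the handler's own
request and write-completion callbacks.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Hypothetical sketch (not the real shuffle handler): tracks whether the
 * keep-alive idle timeout is allowed to close a connection.
 */
final class KeepAliveGuard {
    // True from the moment a request is accepted until its response is fully flushed.
    private final AtomicBoolean responseInFlight = new AtomicBoolean(false);

    /** Call when the server starts handling a request, before any headers are written. */
    void onRequestStarted() {
        responseInFlight.set(true);
    }

    /** Call only after the last byte of the response has been flushed. */
    void onResponseFlushed() {
        responseInFlight.set(false);
    }

    /** Consulted by the idle timer: true if the connection may be closed now. */
    boolean mayCloseOnIdle() {
        return !responseInFlight.get();
    }
}
```

With such a guard, a timeout that fires after the headers are written but while
the mapOutput is still being built would see mayCloseOnIdle() return false and
leave the connection open.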

Example error message on the client side:
{noformat}
java.net.SocketException: Connection reset
        at java.net.SocketInputStream.read(SocketInputStream.java:209) ~[?:1.8.0_112]
        at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[?:1.8.0_112]
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) ~[?:1.8.0_112]
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) ~[?:1.8.0_112]
        at java.io.BufferedInputStream.read(BufferedInputStream.java:345) ~[?:1.8.0_112]
        at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:704) ~[?:1.8.0_112]
        at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:647) ~[?:1.8.0_112]
        at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:675) ~[?:1.8.0_112]
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1569) ~[?:1.8.0_112]
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474) ~[?:1.8.0_112]
        at org.apache.tez.http.HttpConnection.getInputStream(HttpConnection.java:260) ~[tez-runtime-library-0.8.4.2.6.1.0-11.jar:0.8.4.2.6.1.0-11]
        at org.apache.tez.runtime.library.common.shuffle.Fetcher.setupConnection(Fetcher.java:460) ~[tez-runtime-library-0.8.4.2.6.1.0-11.jar:0.8.4.2.6.1.0-11]
        at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:492) ~[tez-runtime-library-0.8.4.2.6.1.0-11.jar:0.8.4.2.6.1.0-11]
        at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:417) ~[tez-runtime-library-0.8.4.2.6.1.0-11.jar:0.8.4.2.6.1.0-11]
        at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:215) ~[tez-runtime-library-0.8.4.2.6.1.0-11.jar:0.8.4.2.6.1.0-11]
        at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:73) ~[tez-runtime-library-0.8.4.2.6.1.0-11.jar:0.8.4.2.6.1.0-11]
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) ~[tez-common-0.8.4.2.6.1.0-11.jar:0.8.4.2.6.1.0-11]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_112]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
{noformat}

The handling for this corner case was not pulled in earlier along with the MR handler fixes.



