[ 
https://issues.apache.org/jira/browse/TEZ-4174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106747#comment-17106747
 ] 

Rajesh Balamohan commented on TEZ-4174:
---------------------------------------

Committed to master. Thanks [~prasanth_j]

> [Kubernetes] Fetcher should report connection failure on SocketException
> ------------------------------------------------------------------------
>
>                 Key: TEZ-4174
>                 URL: https://issues.apache.org/jira/browse/TEZ-4174
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Major
>             Fix For: 0.10.1
>
>         Attachments: TEZ-4174.1.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Fetcher treats a fetch as a connection failure only when http.connect throws 
> an exception. In a Kubernetes environment, where there can be intermediate 
> proxies, getInputStream on the HTTP connection can throw a connection reset 
> error (5xx). These errors should be treated as connection failures as well.
> {code:java}
> 2020-05-08 17:03:54.080  WARN [Fetcher_B {Map_3} #3] shuffle.Fetcher: Fetch Failure while connecting from 10.117.155.27 to: 10.117.154.115:25551, attempt: InputAttemptIdentifier [inputIdentifier=0, attemptNumber=0, pathComponent=attempt_1588982534035_0000_1_00_000000_0_10030, spillType=0, spillId=-1] Informing ShuffleManager:
> java.net.SocketException: Connection reset
>         at java.net.SocketInputStream.read(SocketInputStream.java:210)
>         at java.net.SocketInputStream.read(SocketInputStream.java:141)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
>         at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735)
>         at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678)
>         at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:706)
>         at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1593)
>         at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1498)
>         at org.apache.tez.http.HttpConnection.getInputStream(HttpConnection.java:260)
>         at org.apache.tez.runtime.library.common.shuffle.Fetcher.setupConnection(Fetcher.java:530)
>         at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:563)
>         at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:487)
>         at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:285)
>         at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:76)
>         at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>         at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
>         at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
>         at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
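
The behaviour requested in the description can be illustrated with a small, self-contained sketch. It is not the committed Tez patch: `FetchFailureSketch`, `Connection`, `FakeConnection`, `Outcome`, and this `setupConnection` are hypothetical names made up for illustration. The only assumption carried over from the issue is that a `java.net.SocketException` raised while opening the input stream should be classified the same way as a failure in `connect`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.SocketException;

// Hypothetical sketch; none of these types are Tez APIs.
class FetchFailureSketch {

    /** Stand-in for an HTTP connection wrapper (assumed shape). */
    interface Connection {
        void connect() throws IOException;
        InputStream getInputStream() throws IOException;
    }

    enum Outcome { OK, CONNECT_FAILURE, OTHER_FAILURE }

    /** Test double that fails at connect time, at read time, or not at all. */
    static final class FakeConnection implements Connection {
        private final IOException connectError;
        private final IOException readError;

        FakeConnection(IOException connectError, IOException readError) {
            this.connectError = connectError;
            this.readError = readError;
        }

        @Override
        public void connect() throws IOException {
            if (connectError != null) {
                throw connectError;
            }
        }

        @Override
        public InputStream getInputStream() throws IOException {
            if (readError != null) {
                throw readError;
            }
            return new ByteArrayInputStream(new byte[0]);
        }
    }

    static Outcome setupConnection(Connection conn) {
        try {
            conn.connect();
        } catch (IOException e) {
            // Behaviour described in the issue: connect() errors are connection failures.
            return Outcome.CONNECT_FAILURE;
        }
        try (InputStream in = conn.getInputStream()) {
            return Outcome.OK;
        } catch (SocketException e) {
            // Requested change: e.g. "Connection reset" caused by an intermediate
            // proxy is also classified as a connection failure.
            return Outcome.CONNECT_FAILURE;
        } catch (IOException e) {
            // Any other I/O error stays a non-connection failure.
            return Outcome.OTHER_FAILURE;
        }
    }

    public static void main(String[] args) {
        Connection reset = new FakeConnection(null, new SocketException("Connection reset"));
        System.out.println(setupConnection(reset)); // prints CONNECT_FAILURE
    }
}
```

Catching `SocketException` before the broader `IOException` keeps the two classifications separate, so other read errors are still reported as they were before.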



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
