prasanthj opened a new pull request #65:
URL: https://github.com/apache/tez/pull/65
Fetcher treats a fetch as a connection failure only when http.connect throws
an exception. In a Kubernetes environment, where there can be intermediate
proxies, getInputStream on the HTTP connection can throw a connection reset
error (5xx). These errors should be treated as connection failures as well.
```
2020-05-08 17:03:54.080 WARN [Fetcher_B {Map_3} #3] shuffle.Fetcher: Fetch Failure while connecting from 10.117.155.27 to: 10.117.154.115:25551, attempt: InputAttemptIdentifier [inputIdentifier=0, attemptNumber=0, pathComponent=attempt_1588982534035_0000_1_00_000000_0_10030, spillType=0, spillId=-1] Informing ShuffleManager:
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:210)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:706)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1593)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1498)
at org.apache.tez.http.HttpConnection.getInputStream(HttpConnection.java:260)
at org.apache.tez.runtime.library.common.shuffle.Fetcher.setupConnection(Fetcher.java:530)
at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:563)
at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:487)
at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:285)
at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:76)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
```
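The behaviour change described above can be illustrated with a minimal sketch. This is not the actual patch: `ConnectFailureSketch`, `Connection`, `Outcome`, and `fake` are hypothetical names introduced only for illustration, and the real Fetcher and HttpConnection classes in Tez are not used. The sketch only assumes that an IOException raised while obtaining the input stream should be classified the same way as one raised by connect.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.SocketException;

class ConnectFailureSketch {

    /** Hypothetical stand-in for the fetcher's HTTP connection; not the Tez API. */
    interface Connection {
        void connect() throws IOException;
        InputStream getInputStream() throws IOException;
    }

    enum Outcome { CONNECTED, CONNECT_FAILED }

    /** Classifies an IOException from either stage as a connection failure. */
    static Outcome setupConnection(Connection conn) {
        try {
            conn.connect();
        } catch (IOException e) {
            // Failure while establishing the connection.
            return Outcome.CONNECT_FAILED;
        }
        try {
            conn.getInputStream();
        } catch (IOException e) {
            // For example "Connection reset" raised while the response
            // headers are being read through an intermediate proxy.
            return Outcome.CONNECT_FAILED;
        }
        return Outcome.CONNECTED;
    }

    /** Builds a fake connection whose getInputStream() optionally fails. */
    static Connection fake(final boolean resetOnRead) {
        return new Connection() {
            @Override
            public void connect() {
                // Connecting always succeeds in this fake.
            }

            @Override
            public InputStream getInputStream() throws IOException {
                if (resetOnRead) {
                    throw new SocketException("Connection reset");
                }
                return new ByteArrayInputStream(new byte[0]);
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(setupConnection(fake(false))); // prints CONNECTED
        System.out.println(setupConnection(fake(true)));  // prints CONNECT_FAILED
    }
}
```

Keeping the two stages in separate try blocks makes it explicit that both are classified the same way, while still leaving room to log them differently.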