Yongjun Zhang created HADOOP-14198:
--------------------------------------
Summary: Should have a way to let PingInputStream to abort
Key: HADOOP-14198
URL: https://issues.apache.org/jira/browse/HADOOP-14198
Project: Hadoop Common
Issue Type: Bug
Reporter: Yongjun Zhang
We observed a case where an RPC call gets stuck because PingInputStream does
the following:
{code}
/** This class sends a ping to the remote side when timeout on
* reading. If no failure is detected, it retries until at least
* a byte is read.
*/
private class PingInputStream extends FilterInputStream {
{code}
It seems that in this case no data is ever received, so it keeps pinging.
Should we ping forever here? Maybe we should introduce a config to stop
pinging after a certain number of attempts and report a timeout, letting
the caller retry the RPC.
I wonder whether the RPC could somehow get dropped by the server, so that no
response is ever received.
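To make the proposal concrete, here is a minimal sketch of a ping stream that gives up after a bounded number of pings. It is not the actual Hadoop code: the class name BoundedPingInputStream, the maxPings limit (standing in for the proposed config), and the pinger callback are all hypothetical names chosen for illustration. It assumes the wrapped stream signals a read timeout by throwing SocketTimeoutException.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.SocketTimeoutException;

/**
 * Hypothetical sketch: pings the remote side on each read timeout, but
 * rethrows the timeout once maxPings pings have been sent without any data
 * arriving, so the caller can fail or retry the RPC.
 */
class BoundedPingInputStream extends FilterInputStream {
  private final int maxPings;     // hypothetical limit, would come from a new config
  private final Runnable pinger;  // hypothetical stand-in for sending a ping

  BoundedPingInputStream(InputStream in, int maxPings, Runnable pinger) {
    super(in);
    this.maxPings = maxPings;
    this.pinger = pinger;
  }

  /** Sends a ping, or rethrows the timeout if the ping budget is used up. */
  private void handleTimeout(SocketTimeoutException e, int pingsSent)
      throws SocketTimeoutException {
    if (pingsSent >= maxPings) {
      throw e; // report the timeout instead of pinging forever
    }
    pinger.run();
  }

  @Override
  public int read() throws IOException {
    int pingsSent = 0; // the budget applies per read call
    while (true) {
      try {
        return super.read();
      } catch (SocketTimeoutException e) {
        handleTimeout(e, pingsSent);
        pingsSent++;
      }
    }
  }

  @Override
  public int read(byte[] buf, int off, int len) throws IOException {
    int pingsSent = 0; // the budget applies per read call
    while (true) {
      try {
        return super.read(buf, off, len);
      } catch (SocketTimeoutException e) {
        handleTimeout(e, pingsSent);
        pingsSent++;
      }
    }
  }
}
```

With this shape, a caller blocked in a read would see a SocketTimeoutException after roughly (maxPings + 1) socket timeouts and could then decide whether to retry the RPC.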
See the following stack trace:
{code}
Thread 16127: (state = BLOCKED)
- sun.nio.ch.SocketChannelImpl.readerCleanup() @bci=6, line=279 (Compiled frame)
- sun.nio.ch.SocketChannelImpl.read(java.nio.ByteBuffer) @bci=205, line=390 (Compiled frame)
- org.apache.hadoop.net.SocketInputStream$Reader.performIO(java.nio.ByteBuffer) @bci=5, line=57 (Compiled frame)
- org.apache.hadoop.net.SocketIOWithTimeout.doIO(java.nio.ByteBuffer, int) @bci=35, line=142 (Compiled frame)
- org.apache.hadoop.net.SocketInputStream.read(java.nio.ByteBuffer) @bci=6, line=161 (Compiled frame)
- org.apache.hadoop.net.SocketInputStream.read(byte[], int, int) @bci=7, line=131 (Compiled frame)
- java.io.FilterInputStream.read(byte[], int, int) @bci=7, line=133 (Compiled frame)
- java.io.FilterInputStream.read(byte[], int, int) @bci=7, line=133 (Compiled frame)
- org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(byte[], int, int) @bci=4, line=521 (Compiled frame)
- java.io.BufferedInputStream.fill() @bci=214, line=246 (Compiled frame)
- java.io.BufferedInputStream.read() @bci=12, line=265 (Compiled frame)
- java.io.DataInputStream.readInt() @bci=4, line=387 (Compiled frame)
- org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse() @bci=19, line=1081 (Compiled frame)
- org.apache.hadoop.ipc.Client$Connection.run() @bci=62, line=976 (Compiled frame)
{code}