[ 
https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17003154#comment-17003154
 ] 

Ayush Saxena commented on HDFS-15078:
-------------------------------------

Thanks [~ferhui] for the details. 
TBH I have reservations about this fix, and I don't consider it an 
Improvement either. It gets attention as a bug only in the scenario you 
described; otherwise I don't think the Router should ever receive a call and 
not send it to the Namenode. The Router is supposed to just receive the call, 
and if it has received a valid call, it should in any case send it to the 
Namenode. From the client's point of view, it is sending the request to the NN 
itself, so a call vanishing in between at the Router doesn't make sense to me.

I would rather fix this problem as a whole, which is what HDFS-15079 aims to 
do, than handle just one possibility, which only minimizes the effect. 
Moreover, a check at the Router on every call adds overhead to normal calls. 
We have recently faced performance issues with calls taking a non-trivial 
amount of time at the Router itself.

Anyway, I am not blocking this in any way. If others are OK with this, I have 
no objections. :)


> RBF: Should check connection channel before sending rpc to namenode
> -------------------------------------------------------------------
>
>                 Key: HDFS-15078
>                 URL: https://issues.apache.org/jira/browse/HDFS-15078
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: rbf
>    Affects Versions: 3.3.0
>            Reporter: Fei Hui
>            Assignee: Fei Hui
>            Priority: Major
>         Attachments: HDFS-15078.001.patch, HDFS-15078.002.patch
>
>
> dfsrouter logs show that
> {quote}
> 2019-12-20 04:11:26,724 WARN org.apache.hadoop.ipc.Server: IPC Server handler 6400 on 8888, call org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 10.83.164.11:56908 Call#2 Retry#0: output error
> 2019-12-20 04:11:26,724 INFO org.apache.hadoop.ipc.Server: IPC Server handler 125 on 8888 caught an exception
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:270)
>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:461)
>         at org.apache.hadoop.ipc.Server.channelWrite(Server.java:2731)
>         at org.apache.hadoop.ipc.Server.access$2100(Server.java:134)
>         at org.apache.hadoop.ipc.Server$Responder.processResponse(Server.java:1089)
>         at org.apache.hadoop.ipc.Server$Responder.doRespond(Server.java:1161)
>         at org.apache.hadoop.ipc.Server$Connection.sendResponse(Server.java:2109)
>         at org.apache.hadoop.ipc.Server$Connection.access$400(Server.java:1229)
>         at org.apache.hadoop.ipc.Server$Call.sendResponse(Server.java:631)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2245)
> {quote}
> Maybe it is better to check the connection between the client and the router 
> before sending the RPC to the namenode.
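
For illustration only, the kind of check proposed in the description could be sketched roughly as below. This is not the attached patch and does not use any Hadoop API: `ChannelGuard`, `forwardIfOpen`, and the `namenodeCall` supplier are hypothetical names standing in for the Router's forwarding path. Note also that `SocketChannel.isOpen()` only reports whether the channel has been closed on this side, so the sketch assumes the server has already closed the connection after noticing that the client went away.

```java
import java.io.IOException;
import java.nio.channels.SocketChannel;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical helper, not part of Hadoop: skips the downstream call when the
// channel back to the client has already been closed on this side.
final class ChannelGuard {
    private ChannelGuard() {}

    /**
     * Runs namenodeCall only if clientChannel is still open.
     * Returns Optional.empty() when the call was skipped.
     * The supplier is assumed to return a non-null result.
     */
    static <T> Optional<T> forwardIfOpen(SocketChannel clientChannel,
                                         Supplier<T> namenodeCall) {
        // isOpen() becomes false only after close() was called locally;
        // it does not probe whether the remote peer is still reachable.
        if (!clientChannel.isOpen()) {
            return Optional.empty();
        }
        return Optional.of(namenodeCall.get());
    }

    public static void main(String[] args) throws IOException {
        // An unconnected channel is enough to demonstrate the open/closed check.
        SocketChannel channel = SocketChannel.open();
        System.out.println("forwarded=" + forwardIfOpen(channel, () -> "created").isPresent());
        channel.close();
        System.out.println("forwarded=" + forwardIfOpen(channel, () -> "created").isPresent());
    }
}
```

The check itself is a single boolean read, but it still runs on every call, and a call that passes it can lose its client immediately afterwards, so it narrows the window rather than closing it.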



