[jira] [Commented] (HDFS-6569) OOB message can't be sent to the client when DataNode shuts down for upgrade

Brandon Li (JIRA) Wed, 25 Jun 2014 11:57:11 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-6569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043915#comment-14043915
 ]


Brandon Li commented on HDFS-6569:
----------------------------------

The current code looks logically correct: it tries not to close the streams 
before the OOB message is sent.

I think the problem is triggered by the NIO implementation. When the DataNode 
is shut down for restart, it interrupts all the DataXceiver threads. The NIO 
channels in NioInetPeer are bound to these threads, which do the block 
receiving. If these threads are interrupted, the stream/channel is closed for 
I/O safety reasons.

So once a DataXceiver thread is interrupted, the OOB message can rarely be sent 
before the NIO channel is closed automatically. 
One possible fix is to send the OOB message before interrupting the DataXceiver 
threads.
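
To illustrate the ordering issue outside of HDFS, here is a minimal standalone 
sketch. It is not DataNode code: the class InterruptBeforeSend, the method 
trySend and the OOB byte value are made up for the example. It relies only on 
the documented java.nio behavior that a blocking operation on an interruptible 
channel closes the channel and throws ClosedByInterruptException when the 
calling thread's interrupt status is set.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical demo class; not part of HDFS.
public class InterruptBeforeSend {

    // Placeholder payload standing in for an OOB message (assumption).
    private static final byte OOB = 1;

    /**
     * Opens a loopback connection and tries to write one byte on it.
     * If interruptFirst is true, the calling thread's interrupt status is set
     * before the write, mimicking "interrupt the thread, then send the OOB".
     * Returns true if the byte was written, false if the channel was closed
     * by the interrupt before the write could happen.
     */
    static boolean trySend(boolean interruptFirst) throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel peer = server.accept()) {
                if (interruptFirst) {
                    Thread.currentThread().interrupt();
                }
                try {
                    peer.write(ByteBuffer.wrap(new byte[] {OOB}));
                    return true;
                } catch (ClosedByInterruptException e) {
                    // NIO closed the channel because the thread was interrupted.
                    return false;
                } finally {
                    // Clear the interrupt status so cleanup is not affected.
                    Thread.interrupted();
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("interrupt, then send: sent=" + trySend(true));
        System.out.println("send, then interrupt: sent=" + trySend(false));
    }
}
```

With the interrupt delivered first, the write should fail and the channel 
should end up closed; with the write issued first, it should go through. That 
is the ordering the proposed fix depends on.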
Thoughts?

> OOB message can't be sent to the client when DataNode shuts down for upgrade
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-6569
>                 URL: https://issues.apache.org/jira/browse/HDFS-6569
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 3.0.0, 2.4.0
>            Reporter: Brandon Li
>
> The socket is closed too early, before the OOB message can be sent to the 
> client, which causes a write pipeline failure.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
