[
https://issues.apache.org/jira/browse/HDFS-5500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vinod Kumar Vavilapalli updated HDFS-5500:
------------------------------------------
Target Version/s: (was: 2.8.0)
There has been no activity here for a long time, so I am dropping this from
2.8.0. I am not setting a new target version either; let's pick one when there
is patch activity.
> Critical datanode threads may terminate silently on uncaught exceptions
> -----------------------------------------------------------------------
>
> Key: HDFS-5500
> URL: https://issues.apache.org/jira/browse/HDFS-5500
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Kihwal Lee
> Priority: Critical
>
> We've seen the refreshUsed (DU) thread disappear on uncaught exceptions. This
> can go unnoticed for a long time. If an OOM occurs, more things can go wrong.
> On one occasion, the Timer, multiple refreshUsed, and DataXceiverServer
> threads had all terminated.
> DataXceiverServer catches OutOfMemoryError and sleeps for 30 seconds, but I
> am not sure it is really helpful. In one case, the thread did this multiple
> times and then terminated. I suspect another OOM was thrown while in the catch
> block. As a result, the server socket was not closed and clients hung on
> connect. If it had at least closed the socket, the client side would have been
> impacted less.
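
For illustration only, here is a minimal Java sketch of the two mitigations the
description hints at: reporting a thread's death through an uncaught exception
handler, and closing the server socket in a finally block so clients fail fast
instead of hanging on connect. This is not the DataXceiverServer code. The
names GuardedAcceptLoop, startGuarded, handle, and LAST_FATAL are hypothetical,
and under a real OutOfMemoryError even the handler or the finally block may
fail to run cleanly.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical accept loop; not taken from the Hadoop code base.
class GuardedAcceptLoop implements Runnable {
    // Last throwable that terminated a guarded thread (for illustration).
    static final AtomicReference<Throwable> LAST_FATAL = new AtomicReference<>();

    private final ServerSocket serverSocket;

    GuardedAcceptLoop(ServerSocket serverSocket) {
        this.serverSocket = serverSocket;
    }

    @Override
    public void run() {
        try {
            while (!serverSocket.isClosed()) {
                // try-with-resources closes the peer even if handle() throws.
                try (Socket peer = serverSocket.accept()) {
                    handle(peer);
                }
            }
        } catch (IOException e) {
            // accept() throws once the socket is closed; fall through to cleanup.
        } finally {
            // Runs on every exit path, including RuntimeException and Error,
            // so a dying acceptor does not leave a listening socket behind.
            try {
                serverSocket.close();
            } catch (IOException ignored) {
                // Nothing more can be done here.
            }
        }
    }

    // Placeholder for per-connection work; subclasses may override.
    protected void handle(Socket peer) throws IOException {
        // No-op in this sketch.
    }

    // Starts a thread whose termination by an uncaught throwable is recorded
    // and logged instead of passing unnoticed.
    static Thread startGuarded(Runnable task, String name) {
        Thread t = new Thread(task, name);
        t.setUncaughtExceptionHandler((thread, err) -> {
            LAST_FATAL.set(err);
            System.err.println("Thread " + thread.getName() + " terminated: " + err);
        });
        t.start();
        return t;
    }
}
```

A monitoring thread or health check could poll something like LAST_FATAL (or
Thread.isAlive() on the returned thread) and restart the worker or shut the
process down, depending on how severe the failure is.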