[ https://issues.apache.org/jira/browse/HADOOP-6073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Konstantin Boudnik updated HADOOP-6073:
---------------------------------------

    Status: Patch Available  (was: Open)

Maybe something along these lines (see attachment)?

Moving the responder.interrupt() invocation into the finally {...} block doesn't do us much good, because the responder thread has to be stopped only in case of an error.

Catching Throwable and wrapping it in an IOException (as Raghu suggested) seems to do the trick. I can confirm that I no longer see any of the 'hanging' behavior in the tests with this patch applied.

> Unchecked exception thrown inside of BlockReceiver cause some threads hang
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-6073
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6073
>             Project: Hadoop Core
>          Issue Type: Bug
>            Reporter: Konstantin Boudnik
>         Attachments: copy.txt.log, HADOOP-6073.patch, x2
>
>
> One is able to inject all sorts of faults into Hadoop's classes using the new
> fault injection framework (HADOOP-6003).
> I've been injecting an unchecked exception (RuntimeException) into the
> BlockReceiver.receivePacket() method before any of the write() operations
> (e.g. line 401, 449, 463, 529) and running some of the existing HDFS tests.
> The injection of unchecked exceptions causes DataXceiver to die silently,
> without leaving any trace.
> From a debugger run it seems that some threads are left alive or are not
> notified about the exception.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
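
For illustration only, here is a minimal Java sketch of the approach discussed in the comment above: an unchecked exception raised while receiving a packet is caught as Throwable and rethrown as an IOException, so that the existing error path (and only the error path) interrupts the responder thread. This is not the attached patch. The class ReceiverSketch, the Responder stand-in, and the methods receivePacketSafely() and receiveBlock() are hypothetical names invented for this example, and the real BlockReceiver is considerably more involved.

```java
import java.io.IOException;

final class ReceiverSketch {

    // Hypothetical stand-in for the responder thread: idles until closed or interrupted.
    static final class Responder implements Runnable {
        private boolean running = true;
        volatile boolean interrupted = false;

        @Override
        public synchronized void run() {
            try {
                while (running) {
                    wait(); // releases the monitor so close() can get in
                }
            } catch (InterruptedException e) {
                interrupted = true; // stopped through the error path
            }
        }

        synchronized void close() {
            running = false; // normal shutdown, no interrupt needed
            notifyAll();
        }
    }

    // Hypothetical stand-in for receivePacket() with an injected unchecked fault.
    static void receivePacket(boolean injectFault) throws IOException {
        if (injectFault) {
            throw new RuntimeException("injected fault");
        }
    }

    // Wraps anything that is not already an IOException, so callers that only
    // handle IOException still see the failure.
    static void receivePacketSafely(boolean injectFault) throws IOException {
        try {
            receivePacket(injectFault);
        } catch (IOException e) {
            throw e;
        } catch (Throwable t) {
            throw new IOException("unexpected failure in receivePacket", t);
        }
    }

    // Returns "interrupted", "closed", or "hung" depending on how the responder ended.
    static String receiveBlock(boolean injectFault) {
        Responder responder = new Responder();
        Thread responderThread = new Thread(responder, "responder");
        responderThread.setDaemon(true);
        responderThread.start();
        try {
            receivePacketSafely(injectFault);
            responder.close();           // success: let the responder finish normally
        } catch (IOException e) {
            responderThread.interrupt(); // error: stop the responder explicitly
        }
        try {
            responderThread.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        if (responderThread.isAlive()) {
            return "hung";
        }
        return responder.interrupted ? "interrupted" : "closed";
    }

    public static void main(String[] args) {
        System.out.println(receiveBlock(true));
        System.out.println(receiveBlock(false));
    }
}
```

The interrupt sits in the catch block rather than in finally because, as noted above, the responder should be stopped this way only when something went wrong; on the success path the sketch shuts it down through close() instead.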