[
https://issues.apache.org/jira/browse/HADOOP-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650704#action_12650704
]
Raghu Angadi commented on HADOOP-4679:
--------------------------------------
After talking to Hairong:
# DataXceiverServer should handle SocketTimeoutException. Right now an idle
DN prints an exception every 10 seconds.
# The timeout for the server socket could be lower, so that the test finishes
faster.
# The unit test need not create files in a tight loop.
# immedateShutdown is not really necessary. The way shutdown() works, it
should only be called from the offerService() thread. I think the Javadoc
should state this explicitly.
# The reason the log was printed in a tight infinite loop (without any sleep)
is that the thread interrupts itself before calling sleep(), so sleep()
returns immediately!
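The last point can be reproduced outside Hadoop. The following is only a minimal sketch, not the DataNode code: the class and method names are made up for illustration. It relies on the documented behavior that Thread.sleep() throws InterruptedException right away when the calling thread's interrupt status is already set.

```java
public class InterruptSleepDemo {
    // Hypothetical helper: runs `rounds` iterations in which the current
    // thread interrupts itself and then tries to sleep, and returns the
    // total elapsed time in milliseconds.
    static long elapsedMillis(int rounds, long sleepMillis) {
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            // Set the interrupt status of the calling thread, as happens
            // when a thread interrupts a group that it belongs to.
            Thread.currentThread().interrupt();
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Thrown immediately because the status was already set;
                // the status is cleared, but the next round sets it again.
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long ms = elapsedMillis(3, 1000);
        System.out.println("3 sleeps of 1000 ms took about " + ms + " ms");
    }
}
```

With the interrupt() call removed, the same loop would take roughly rounds * sleepMillis, which is the pause the waiting loop was presumably meant to have between log messages.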
I think this should go into 0.18. No one likes disks filling up with these log
messages.
> Datanode prints tons of log messages: Waiting for threadgroup to exit, active
> threads is XX
> ------------------------------------------------------------------------------------------
>
> Key: HADOOP-4679
> URL: https://issues.apache.org/jira/browse/HADOOP-4679
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Reporter: Hairong Kuang
> Assignee: Hairong Kuang
> Attachments: diskError.patch, diskError1.patch
>
>
> When a data receiver thread sees a disk error, it immediately calls shutdown
> to shut down the DataNode. But the shutdown method does not return until all
> data receiver threads exit, which will never happen because the calling
> thread is itself one of them. Therefore the DataNode gets into a
> deadlock/livelock state, emitting tons of log messages: Waiting for
> threadgroup to exit, active threads is XX.
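The hang described above comes down to a thread waiting for its own thread group to become empty. A minimal sketch of that situation follows, assuming a made-up class and method names and a bounded number of checks so that the demo terminates; it is not the DataNode implementation.

```java
public class SelfWaitDemo {
    // Hypothetical stand-in for a shutdown routine: polls until the group
    // has no active threads, giving up after maxChecks so the demo ends.
    static boolean waitForGroup(ThreadGroup group, int maxChecks)
            throws InterruptedException {
        for (int i = 0; i < maxChecks; i++) {
            if (group.activeCount() == 0) {
                return true;
            }
            Thread.sleep(10);
        }
        return false;
    }

    // Starts one "receiver" thread inside the group and lets that thread
    // wait for the group to drain. Returns whether the group ever drained.
    static boolean shutdownFromInsideGroup() throws InterruptedException {
        ThreadGroup group = new ThreadGroup("receivers");
        boolean[] drained = new boolean[1];
        Thread receiver = new Thread(group, () -> {
            try {
                // The caller is a live member of the group, so the count
                // cannot reach zero while it is waiting here.
                drained[0] = waitForGroup(group, 20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        receiver.start();
        receiver.join();
        return drained[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("group drained: " + shutdownFromInsideGroup());
    }
}
```

Without the maxChecks bound the receiver would wait forever, which matches the reported behavior. Calling the wait from a thread outside the group avoids the problem.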