[ https://issues.apache.org/jira/browse/HDFS-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jakob Homan updated HDFS-1203:
------------------------------
    Status: Patch Available  (was: Open)

Re-submitting to Hudson to get another run of the tests, just for completeness, since the original run has expired. However, Hudson has not been around much lately, so it may be more expedient for Todd to run the tests locally and report the results here, if he wishes.

> DataNode should sleep before reentering service loop after an exception
> -----------------------------------------------------------------------
>
>                 Key: HDFS-1203
>                 URL: https://issues.apache.org/jira/browse/HDFS-1203
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.22.0
>
>         Attachments: hdfs-1203.txt
>
>
> When the DN gets an exception in response to a heartbeat, it logs it and
> continues, but there is no sleep. I've occasionally seen bugs produce a case
> where heartbeats continuously produce exceptions, and thus the DN floods the
> NN with bad heartbeats. Adding a 1 second sleep at least throttles the error
> messages for easier debugging and error isolation.
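
For readers unfamiliar with the pattern described in the issue, here is a minimal sketch of a service loop that sleeps after a failed heartbeat before retrying. It is not the attached patch and does not use the real DataNode code: the class `ThrottledServiceLoop`, the `Heartbeat` interface, and the `runLoop` method are hypothetical names chosen for illustration, and the loop is bounded only so that the example terminates.

```java
import java.io.IOException;

// Hypothetical illustration only; not taken from hdfs-1203.txt or the DataNode source.
public class ThrottledServiceLoop {

    // Stand-in for whatever sends a heartbeat to the NameNode (assumed interface).
    interface Heartbeat {
        void send() throws Exception;
    }

    // Runs the loop for a fixed number of iterations and returns how many
    // heartbeats failed. After each failure the loop logs the error and
    // sleeps for sleepMs, so repeated failures cannot flood the receiver.
    static int runLoop(Heartbeat heartbeat, int maxIterations, long sleepMs)
            throws InterruptedException {
        int failures = 0;
        for (int i = 0; i < maxIterations; i++) {
            try {
                heartbeat.send();
            } catch (Exception e) {
                failures++;
                System.err.println("Heartbeat failed: " + e.getMessage());
                // Throttle before reentering the loop (the issue suggests 1 second).
                Thread.sleep(sleepMs);
            }
        }
        return failures;
    }

    public static void main(String[] args) throws InterruptedException {
        // A heartbeat that always fails, to show the throttling in action.
        Heartbeat alwaysFails = () -> {
            throw new IOException("simulated bad heartbeat");
        };
        int failures = runLoop(alwaysFails, 3, 100L);
        System.out.println("Failed heartbeats: " + failures);
    }
}
```

Without the `Thread.sleep` call, a heartbeat that fails every time would be retried in a tight loop, which is the flooding behavior the issue describes. With it, the retry rate is capped at roughly one attempt per sleep interval.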