[
https://issues.apache.org/jira/browse/CHUKWA-487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12865923#action_12865923
]
Ari Rabkin commented on CHUKWA-487:
-----------------------------------
Ugh. We actually DO have code to do exactly that. It's even in 0.4. It calls
System.exit() and all.
Is some shutdown hook gumming up the works?
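
If a hook really is blocking, one way to keep it from wedging System.exit() is
to bound how long cleanup may run. What follows is only a sketch under that
assumption, not Chukwa's actual code: BoundedShutdown, runWithTimeout and
installHook are hypothetical names. It relies on the standard JVM behavior that
System.exit() waits for registered shutdown hooks, while Runtime.halt() does not.

```java
public class BoundedShutdown {

    /**
     * Runs cleanup on a daemon thread and waits at most timeoutMillis for it.
     * Returns true if cleanup finished in time, false if it is still running
     * or the wait was interrupted. timeoutMillis must be > 0, because
     * Thread.join(0) would wait forever.
     */
    static boolean runWithTimeout(Runnable cleanup, long timeoutMillis) {
        Thread worker = new Thread(cleanup, "bounded-cleanup");
        worker.setDaemon(true); // a stuck cleanup must not keep the JVM alive
        worker.start();
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }

    /**
     * Hypothetical helper: registers a shutdown hook that gives cleanup a
     * bounded amount of time and then forces the JVM down with halt(),
     * which skips any remaining hooks.
     */
    static void installHook(Runnable cleanup, long timeoutMillis) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            if (!runWithTimeout(cleanup, timeoutMillis)) {
                Runtime.getRuntime().halt(1);
            }
        }, "bounded-shutdown-hook"));
    }
}
```

A thread dump of the stuck collector (jstack on its pid) should confirm or rule
out the hook theory before anything like this is worth trying.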
> Collector left in a bad state after temporary NN outage
> -------------------------------------------------------
>
> Key: CHUKWA-487
> URL: https://issues.apache.org/jira/browse/CHUKWA-487
> Project: Hadoop Chukwa
> Issue Type: Bug
> Components: data collection
> Affects Versions: 0.4.0
> Reporter: Bill Graham
>
> When the name node returns errors to the collector, at some point the
> collector dies halfway. This behavior should be changed to either resemble
> the agents and keep trying, or to shut down completely. Instead, what I'm
> seeing is that the collector logs that it's shutting down, and the
> var/pidDir/Collector.pid file gets removed, but the collector continues to
> run, albeit not handling new data. Meanwhile, this log entry is repeated ad
> infinitum:
> 2010-05-06 17:35:06,375 INFO Timer-1 root -
> stats:ServletCollector,numberHTTPConnection:0,numberchunks:0
> 2010-05-06 17:36:06,379 INFO Timer-1 root -
> stats:ServletCollector,numberHTTPConnection:0,numberchunks:0
> 2010-05-06 17:37:06,384 INFO Timer-1 root -
> stats:ServletCollector,numberHTTPConnection:0,numberchunks:0