[
https://issues.apache.org/jira/browse/CHUKWA-534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920314#action_12920314
]
Bill Graham commented on CHUKWA-534:
------------------------------------
Looking more closely at DemuxManager, it seems {{globalErrorcounter}} is never
reset to 0, so more than 5 errors over the life of the daemon, even
non-consecutive ones, would kill the process. I propose we reset that counter
after each successful demux run.
Also, for consistency with other demux params, we should drop 'chukwa.' from
what I show above:
{noformat}
demux.max.error.count.before.shutdown
{noformat}
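A rough sketch of the reset behavior proposed above, in case it helps the discussion. This is not the actual DemuxManager code: the class {{DemuxErrorCounter}}, its methods and the {{RETRY_FOREVER}} constant are hypothetical names. It assumes the limit would come from the property above, that the daemon shuts down once the streak exceeds the limit (matching the current "more than 5" behavior), and that -1 means never shut down, as the issue description suggests.

```java
// Hypothetical helper for illustration only; not part of Chukwa.
final class DemuxErrorCounter {
    /** Sentinel taken from the issue description: -1 means retry forever. */
    static final int RETRY_FOREVER = -1;

    // Assumed to be read from demux.max.error.count.before.shutdown.
    private final int maxErrors;
    private int consecutiveErrors = 0;

    DemuxErrorCounter(int maxErrors) {
        this.maxErrors = maxErrors;
    }

    /** Call after a successful demux run: clears the error streak. */
    void recordSuccess() {
        consecutiveErrors = 0;
    }

    /** Call after a failed run; returns true if the daemon should shut down. */
    boolean recordFailure() {
        consecutiveErrors++;
        return maxErrors != RETRY_FOREVER && consecutiveErrors > maxErrors;
    }

    int getConsecutiveErrors() {
        return consecutiveErrors;
    }
}
```

With a limit of 2, two failures followed by a success and two more failures would not trigger a shutdown, because the success clears the streak. Only a third failure in a row would.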
> Improve fault-tolerance of DemuxManager.
> ----------------------------------------
>
> Key: CHUKWA-534
> URL: https://issues.apache.org/jira/browse/CHUKWA-534
> Project: Chukwa
> Issue Type: Improvement
> Reporter: Bill Graham
> Assignee: Bill Graham
>
> If the DemuxManager receives more than 5 consecutive errors, it dies with the
> message "Too many errors, Bail out!".
> Let's change this to introduce a configurable number of consecutive exceptions
> to tolerate before dying. If the value is set to -1, the expected behavior
> is to keep retrying ad infinitum.
> Also part of this issue is improving the logging of how many consecutive
> errors have occurred, as well as the time they started. A possible future
> enhancement could be to support an error time threshold as well as an
> absolute count.
> Suggesting the following new config setting. It's a bit verbose, but it's
> clear.
> {noformat}
> chukwa.demux.max.error.count.before.shutdown
> {noformat}
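For the logging part of the description, a minimal sketch of tracking how many consecutive errors have occurred and when the streak started. The class {{ErrorStreak}}, its methods and the message format are assumptions made for illustration, not existing Chukwa code. The caller passes in the timestamp and hands the returned text to whatever logger DemuxManager already uses.

```java
// Hypothetical helper for illustration only; not part of Chukwa.
final class ErrorStreak {
    private int count = 0;
    private long startedAtMillis = -1L; // -1 while no streak is in progress

    /** Records a failure at the given time and returns a log-ready summary. */
    String recordFailure(long nowMillis) {
        if (count == 0) {
            startedAtMillis = nowMillis; // first error of a new streak
        }
        count++;
        return count + " consecutive demux errors since epoch ms " + startedAtMillis;
    }

    /** Call after a successful run so the next failure starts a new streak. */
    void reset() {
        count = 0;
        startedAtMillis = -1L;
    }
}
```

Keeping the start time alongside the count would also leave room for the time-based threshold mentioned as a possible future enhancement.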