[
https://issues.apache.org/jira/browse/CHUKWA-534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bill Graham updated CHUKWA-534:
-------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
Thanks Ari, committed.
> Improve fault-tolerance of DemuxManager, PostProcessManager and
> ChukwaArchiveManager.
> -------------------------------------------------------------------------------------
>
> Key: CHUKWA-534
> URL: https://issues.apache.org/jira/browse/CHUKWA-534
> Project: Chukwa
> Issue Type: Improvement
> Reporter: Bill Graham
> Assignee: Bill Graham
> Attachments: CHUKWA-534_1.patch, CHUKWA-534_2.patch,
> CHUKWA-534_3.patch
>
>
> If any of these processes receives more than N consecutive errors, it
> dies with the message "Too many errors, Bail out!".
> Let's change this to introduce a configurable number of consecutive errors
> to be encountered before dying. If the value is set to -1, the expected
> behavior is to keep retrying ad infinitum.
> Also part of this issue is to improve logging of how many consecutive errors
> have occurred, as well as the time they started. A possible future
> enhancement could be to support an error time threshold as well as an
> absolute count.
> Suggesting the following new config settings. They're a bit verbose, but
> they're clear.
> {noformat}
> demux.max.error.count.before.shutdown
> post.process.max.error.count.before.shutdown
> archive.max.error.count.before.shutdown
> {noformat}
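
For illustration, the behavior described above could be sketched roughly as follows. This is not taken from the CHUKWA-534 patches: the class name ConsecutiveErrorTracker and all of its methods are hypothetical, and the sketch assumes each manager reads its limit from the matching setting (for example demux.max.error.count.before.shutdown) and passes it to the constructor. A limit of -1 means retry forever.

```java
// Hypothetical helper, not the code from the CHUKWA-534 patches.
// Tracks consecutive errors and when the current run of errors started.
final class ConsecutiveErrorTracker {
    // Assumed sentinel: a configured limit of -1 means "never shut down".
    static final int RETRY_FOREVER = -1;

    private final int maxErrors;
    private int errorCount = 0;
    private long firstErrorTimeMillis = -1L;

    ConsecutiveErrorTracker(int maxErrors) {
        this.maxErrors = maxErrors;
    }

    // Records one failure. Returns true when more than maxErrors
    // consecutive failures have occurred and the caller should shut down.
    boolean recordError(long nowMillis) {
        if (errorCount == 0) {
            firstErrorTimeMillis = nowMillis; // start of this error run
        }
        errorCount++;
        return maxErrors != RETRY_FOREVER && errorCount > maxErrors;
    }

    // A successful iteration ends the run of consecutive errors.
    void recordSuccess() {
        errorCount = 0;
        firstErrorTimeMillis = -1L;
    }

    int getErrorCount() {
        return errorCount;
    }

    long getFirstErrorTimeMillis() {
        return firstErrorTimeMillis;
    }

    // Log-friendly summary of the count and the time the errors started.
    String describe() {
        return errorCount + " consecutive error(s), first at "
                + firstErrorTimeMillis + " ms since epoch";
    }
}
```

A manager loop would call recordError(System.currentTimeMillis()) in its catch block, log describe(), and exit only when recordError returns true; any successful pass would call recordSuccess() to reset the count.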