[ https://issues.apache.org/jira/browse/HTRACE-200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Colin Patrick McCabe updated HTRACE-200:
----------------------------------------
Affects Version/s: 3.2.0
> Reduce rate of logged errors if Zipkin Collector service is down
> ----------------------------------------------------------------
>
> Key: HTRACE-200
> URL: https://issues.apache.org/jira/browse/HTRACE-200
> Project: HTrace
> Issue Type: Improvement
> Components: zipkin
> Affects Versions: 3.2.0
> Reporter: Andrew Olson
> Priority: Minor
>
> When our Zipkin Collector service is not running, the ZipkinSpanReceiver logs
> a flood of errors: about one every second or two from each of our processes
> that is instrumented with HTrace and configured to send traces to Zipkin.
> Exacerbating the problem for us, it seems that with commons-logging every
> line of the exception stack trace gets a prefix like
> "2015-06-29 09:03:25 zipkinSpanReceiver-0 STDIO [ERROR]", so Splunk parses
> each line as a separate error message. An example log file is at [1]. It
> would be nice if this error logging could be rate-limited to something like
> no more than one message per minute, or if only the initial occurrence were
> logged until a successful send resets the state.
> [1] http://pastebin.com/AieewfhF
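
The two options in the description (at most one message per interval, and resetting after a successful send) could be combined roughly as in the sketch below. This is only an illustration of the idea, not existing HTrace code: the class name ErrorLogThrottle and its methods are hypothetical, and the caller is assumed to pass in a monotonic timestamp such as System.nanoTime().

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (not part of HTrace): decides whether a failed send
 * should be logged, so that a prolonged collector outage produces at most
 * one error message per interval.
 */
final class ErrorLogThrottle {
  private final long minIntervalNanos;
  private boolean failing = false;   // true while we are in a failure streak
  private long lastLoggedNanos = 0;  // timestamp of the last logged failure
  private long suppressed = 0;       // failures skipped since the last logged one

  ErrorLogThrottle(long minInterval, TimeUnit unit) {
    this.minIntervalNanos = unit.toNanos(minInterval);
  }

  /**
   * Call on every failed send. Returns true if this failure should be
   * logged: either it is the first failure since the last success, or the
   * minimum interval has elapsed since the last logged failure.
   */
  synchronized boolean onFailure(long nowNanos) {
    if (!failing || nowNanos - lastLoggedNanos >= minIntervalNanos) {
      failing = true;
      lastLoggedNanos = nowNanos;
      suppressed = 0;
      return true;
    }
    suppressed++;
    return false;
  }

  /** Call on every successful send; the next failure will be logged again. */
  synchronized void onSuccess() {
    failing = false;
    suppressed = 0;
  }

  /** Number of failures suppressed since the last logged failure. */
  synchronized long suppressedCount() {
    return suppressed;
  }
}
```

A receiver's send loop would call onFailure(System.nanoTime()) in its catch block and log only when it returns true, and call onSuccess() after each successful send. Constructing the throttle with a very long interval approximates the "log only the initial occurrence" variant.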
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)