[jira] [Updated] (HTRACE-200) Reduce rate of logged errors if Zipkin Collector service is down

Colin Patrick McCabe (JIRA) Tue, 13 Oct 2015 18:04:34 -0700

     [ https://issues.apache.org/jira/browse/HTRACE-200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Colin Patrick McCabe updated HTRACE-200:
----------------------------------------
    Affects Version/s: 3.2.0

> Reduce rate of logged errors if Zipkin Collector service is down
> ----------------------------------------------------------------
>
>                 Key: HTRACE-200
>                 URL: https://issues.apache.org/jira/browse/HTRACE-200
>             Project: HTrace
>          Issue Type: Improvement
>          Components: zipkin
>    Affects Versions: 3.2.0
>            Reporter: Andrew Olson
>            Priority: Minor
>
> We see a flood of errors logged by the ZipkinSpanReceiver when our Zipkin 
> Collector service is not running: about one error every second or two from 
> each of our processes that is instrumented with HTrace and configured to 
> send traces to Zipkin. To make matters worse for us, it seems that with 
> commons-logging every line of the exception stack trace carries a prefix 
> like "2015-06-29 09:03:25 zipkinSpanReceiver-0 STDIO [ERROR]", so Splunk 
> parses each line as a separate error message. Here [1] is an example log 
> file. It would be nice if this error logging could be rate-limited to 
> something like no more than one per minute, or if only the initial 
> occurrence were logged until a successful send resets the state.
> [1] http://pastebin.com/AieewfhF
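
The two options in the description (at most one log entry per interval, and a reset after a successful send) can be combined in one small helper. The following is only a sketch of that idea, not the actual HTrace fix: the class name SendErrorThrottle and its methods are hypothetical, and how it would be wired into ZipkinSpanReceiver is an assumption.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (not part of HTrace): decides whether a send failure
 * should be logged. The first failure after a success is always logged;
 * later failures are logged at most once per configured interval.
 */
final class SendErrorThrottle {
    private final long minIntervalNanos;
    private boolean failing = false;   // true once a failure has been logged
    private long lastLoggedNanos = 0L; // timestamp of the last logged failure

    SendErrorThrottle(long minInterval, TimeUnit unit) {
        this.minIntervalNanos = unit.toNanos(minInterval);
    }

    /**
     * Call on every failed send. The caller passes in the current time
     * (for example System.nanoTime()) so the logic is easy to test.
     *
     * @return true if the caller should log this failure
     */
    synchronized boolean shouldLog(long nowNanos) {
        if (!failing || nowNanos - lastLoggedNanos >= minIntervalNanos) {
            failing = true;
            lastLoggedNanos = nowNanos;
            return true;
        }
        return false; // still inside the quiet interval: suppress
    }

    /** Call on every successful send; the next failure is logged at once. */
    synchronized void onSuccess() {
        failing = false;
    }
}
```

Under these assumptions, a receiver would call shouldLog(System.nanoTime()) in its catch block and log only when it returns true, and call onSuccess() after each successful send. Logging the exception message without the full stack trace on repeated failures would also reduce the number of lines Splunk sees, but that is a separate choice.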



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
