Josh Elser created ACCUMULO-3717:
------------------------------------

             Summary: Add trace instrumentation around recovery
                 Key: ACCUMULO-3717
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3717
             Project: Accumulo
          Issue Type: Improvement
          Components: tserver
            Reporter: Josh Elser
             Fix For: 1.8.0


Noticed this while looking into some tracing issues with Billie: recovery does 
not appear to be instrumented with tracing.

It would be nice to know what the long pole in the tent is for recovery since 
it typically represents a period of unavailability of some data for users. We 
should be aware of why it takes as long as it does and try to reduce it as much 
as possible.

Because spans are delivered via ZK, I *think* it will be ok if we're performing 
recovery on a WAL which contains updates for the trace table. As long as the 
serialization to the trace table doesn't cause problems (it should just create 
back-pressure in the tracer, but not throw exceptions), I think it should be 
fine. Some testing would be needed.
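
To illustrate the kind of instrumentation being asked for, here is a minimal, 
self-contained sketch of wrapping recovery phases in timed spans. It does not 
use Accumulo's actual tracing API: the Tracer, Span, and SpanRecord classes, 
the recover() method, and the phase names "sortLogs" and "replayMutations" are 
all hypothetical stand-ins chosen for this example. A real change would use 
the existing trace classes and deliver spans to the tracer rather than 
collecting them in a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

public class RecoveryTraceSketch {

    /** One finished span: a phase name plus how long the phase took. */
    public static final class SpanRecord {
        public final String name;
        public final long elapsedNanos;

        SpanRecord(String name, long elapsedNanos) {
            this.name = name;
            this.elapsedNanos = elapsedNanos;
        }
    }

    /** Hypothetical tracer that collects finished spans in memory. */
    public static final class Tracer {
        private final LongSupplier clock;
        private final List<SpanRecord> finished = new ArrayList<>();

        public Tracer(LongSupplier clock) {
            this.clock = clock;
        }

        /** Starts a span; closing it records the elapsed time. */
        public Span start(String name) {
            return new Span(this, name, clock.getAsLong());
        }

        void finish(String name, long startNanos) {
            finished.add(new SpanRecord(name, clock.getAsLong() - startNanos));
        }

        /** Finished spans in the order they were closed (inner spans first). */
        public List<SpanRecord> finished() {
            return finished;
        }
    }

    /** A running span, usable in try-with-resources. */
    public static final class Span implements AutoCloseable {
        private final Tracer tracer;
        private final String name;
        private final long startNanos;

        Span(Tracer tracer, String name, long startNanos) {
            this.tracer = tracer;
            this.name = name;
            this.startNanos = startNanos;
        }

        @Override
        public void close() {
            tracer.finish(name, startNanos);
        }
    }

    /** Runs two assumed recovery phases, each inside its own span. */
    public static void recover(Tracer tracer, Runnable sortLogs, Runnable replayMutations) {
        try (Span whole = tracer.start("recovery")) {
            try (Span sort = tracer.start("sortLogs")) {
                sortLogs.run();
            }
            try (Span replay = tracer.start("replayMutations")) {
                replayMutations.run();
            }
        }
    }

    public static void main(String[] args) {
        Tracer tracer = new Tracer(System::nanoTime);
        recover(tracer, () -> { }, () -> { });
        for (SpanRecord r : tracer.finished()) {
            System.out.println(r.name + ": " + r.elapsedNanos + " ns");
        }
    }
}
```

With per-phase spans like these, the slowest phase (the long pole) can be read 
directly from the recorded durations instead of being inferred from logs.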



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
