Josh Elser created ACCUMULO-3717:
------------------------------------
Summary: Add trace instrumentation around recovery
Key: ACCUMULO-3717
URL: https://issues.apache.org/jira/browse/ACCUMULO-3717
Project: Accumulo
Issue Type: Improvement
Components: tserver
Reporter: Josh Elser
Fix For: 1.8.0
While looking into some tracing things with Billie, I noticed that recovery
does not appear to be instrumented with tracing.
It would be nice to know what the long pole in the tent is for recovery, since
recovery typically represents a period during which some data is unavailable to
users. We should know why it takes as long as it does and try to reduce that
time as much as possible.
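
To make the "long pole" idea concrete, here is a minimal, self-contained sketch
of timing each step of a multi-step operation and reporting the slowest one. It
does not use Accumulo's tracing API: the class RecoveryTimingSketch, the
PhaseRecorder helper, and the phase names are all hypothetical placeholders, and
real instrumentation would presumably wrap the actual recovery steps in trace
spans instead.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch only; this is not Accumulo's tracing API. */
public class RecoveryTimingSketch {

  /** Accumulates elapsed nanoseconds per named phase, in insertion order. */
  public static class PhaseRecorder {
    private final Map<String, Long> elapsedNanos = new LinkedHashMap<>();

    /** Runs the work and records its wall-clock time, even if it throws. */
    public void time(String phase, Runnable work) {
      long start = System.nanoTime();
      try {
        work.run();
      } finally {
        record(phase, System.nanoTime() - start);
      }
    }

    /** Adds a duration to the running total for the given phase. */
    public void record(String phase, long nanos) {
      elapsedNanos.merge(phase, nanos, Long::sum);
    }

    /** Returns the phase with the largest total time, or null if empty. */
    public String longestPhase() {
      String longest = null;
      long max = -1;
      for (Map.Entry<String, Long> e : elapsedNanos.entrySet()) {
        if (e.getValue() > max) {
          max = e.getValue();
          longest = e.getKey();
        }
      }
      return longest;
    }

    /** Read-only view of the per-phase totals. */
    public Map<String, Long> totals() {
      return Collections.unmodifiableMap(elapsedNanos);
    }
  }

  public static void main(String[] args) {
    PhaseRecorder recorder = new PhaseRecorder();
    // Phase names are placeholders, not the real recovery steps.
    recorder.time("phaseA", () -> sleep(20));
    recorder.time("phaseB", () -> sleep(60));
    recorder.time("phaseC", () -> sleep(5));
    System.out.println("Totals (ns): " + recorder.totals());
    System.out.println("Longest phase: " + recorder.longestPhase());
  }

  private static void sleep(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

Recording in a finally block means a phase that fails partway through still
shows up in the totals, which matters if a slow failure is the long pole.
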
Because spans are delivered via ZK, I *think* it will be ok if we're performing
recovery on a WAL which contains updates for the trace table. As long as the
serialization to the trace table doesn't cause problems (it should just create
back-pressure in the tracer, but not throw exceptions), I think it should be
fine. Some testing would be needed.