Jeff Kubina wrote:
On Thu, Aug 13, 2015 at 2:52 PM, Josh Elser wrote:
1. Regarding the information above about Accumulo tracing, if more than
one server is listed in $ACCUMULO_HOME/conf/tracers, how do the clients
select the trace server to send their trace data to?
Tracers register themselves in ZooKeeper, and the client tracing
libraries know to look in ZooKeeper to find them. You as a user
shouldn't have to worry about it -- it should happen automagically
for you.
I wanted to know how well balanced the processing of tracing data is.
Is there a recommended system design with respect to the tracing
servers? Should we dedicate a few nodes to being just tracing servers or
is it best to have each tablet server also be a tracing server? If we
make each tablet server also a tracing server will each tablet server
just send its tracing data to the local tracing server?
Trace servers are chosen at random, per trace, from those available.
Clients cache the list of available trace servers, and as each new trace
comes in, they choose one of those hosts.
If you're just using Accumulo's tracing, I think one server goes a very
long way. If you're sending client traces (or have custom
applications also using it), you may want to add more. I don't have a
good way to quantify it, sorry.
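
The per-trace random selection described above can be sketched roughly as
follows. This is only an illustration of the idea, not Accumulo's actual
client code: the class name TraceServerPicker, its methods, and the
host:port strings are all made up for the example, and the real client
obtains the host list from ZooKeeper rather than from a constructor argument.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical illustration only; not Accumulo's actual client code.
final class TraceServerPicker {
    // Cached snapshot of the available trace servers
    // (in Accumulo this list would be discovered through ZooKeeper).
    private volatile List<String> cachedHosts;

    TraceServerPicker(List<String> hosts) {
        refresh(hosts);
    }

    // Replace the cached list, e.g. when the set of registered tracers changes.
    void refresh(List<String> hosts) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("no trace servers available");
        }
        this.cachedHosts = Collections.unmodifiableList(new ArrayList<>(hosts));
    }

    // Choose one host uniformly at random for each new trace.
    String pickForNewTrace() {
        List<String> hosts = cachedHosts; // read the snapshot once
        int index = ThreadLocalRandom.current().nextInt(hosts.size());
        return hosts.get(index);
    }
}
```

With uniform random choice, the load should even out across trace servers
over many traces, but any single large trace still lands entirely on one host.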
2. As an admin what is the best way to determine which tables have
recently been traced?
I'm not entirely sure what you mean by "[tables that have been
recently traced]". You can look at the "Recent Traces" page on the
monitor to get a list of the traces in the last X minutes.
Many operations going on in Accumulo will be getting traced. If you
have an active system, you'll constantly see new traces for minor
compactions and major compactions.
Sometimes a trace will cause very high system CPU utilization (90%) and
system load on the tracing server. When this becomes detrimental to the
server, I would like to determine which table was being traced at that
time (to get the user/developer to refine the trace).
Traces aren't tied to a specific table; perhaps that's where the confusion
is coming in. A trace is just _an operation_. If I have a client, I
might just want to time some general operation. Or, like I mentioned
before, maybe it's a compaction in a TabletServer.
I think that the traces include the client information (IP addr). Is
that sufficient for your case? If you have a collection of users sending
traces, you could consider enforcing that they all provide some
attribute on traces which includes some easily-identifiable information too.
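
One way to enforce such a convention is a small check that each trace carries
an identifying attribute before it is accepted. The sketch below is only an
assumption about how that could look: the key name "submitter", the class
TraceAttributePolicy, and the use of a plain Map for the attributes are all
hypothetical, and how attributes are actually attached to a span depends on
the tracing library in use.

```java
import java.util.Map;

// Hypothetical policy check; the attribute key and class are made up for illustration.
final class TraceAttributePolicy {
    // Assumed convention: every trace names who (or what application) submitted it.
    static final String REQUIRED_KEY = "submitter";

    private TraceAttributePolicy() {}

    // Returns true only if the required attribute is present and non-blank.
    static boolean isCompliant(Map<String, String> attributes) {
        String value = attributes.get(REQUIRED_KEY);
        return value != null && !value.trim().isEmpty();
    }
}
```

A trace tagged with, say, submitter=ingest-team would pass this check, while
one with no such attribute (or a blank value) would be flagged so the admin
can follow up with whoever is sending it.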