Re: Accumulo Tracer?

2020-03-02 Thread Christopher
From your screenshot, that's my interpretation, but I'm not that
familiar with the Zipkin interface or spark-shell.
However, it is possible that the tserver is doing some sub-task that isn't
traced, or whose trace isn't properly closed and sent to the span
receiver, and that could explain the time. It's very hard to say.
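
One way to put a number on that gap is to subtract the time covered by the child spans from the parent's duration. The following is only a rough sketch: the span intervals are made-up values rather than data from the screenshot, and the helper names (covered_seconds, untraced_seconds) are hypothetical and not part of Accumulo, HTrace, or Zipkin.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_seconds, end_seconds)


def covered_seconds(children: List[Interval]) -> float:
    """Total time covered by the child spans, counting overlaps only once."""
    total = 0.0
    current_start = current_end = None
    for start, end in sorted(children):
        if current_end is None or start > current_end:
            # A new, non-overlapping run begins: bank the previous one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlaps the current run: extend it if needed.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def untraced_seconds(parent: Interval, children: List[Interval]) -> float:
    """Parent duration minus the time accounted for by its child spans."""
    p_start, p_end = parent
    # Clip children to the parent's window and drop any that fall outside it.
    clipped = [
        (max(s, p_start), min(e, p_end))
        for s, e in children
        if e > p_start and s < p_end
    ]
    return (p_end - p_start) - covered_seconds(clipped)


# Hypothetical numbers: a 914-second parent with three short child spans.
parent = (0.0, 914.0)
children = [(1.0, 1.5), (1.2, 2.0), (400.0, 400.25)]
gap = untraced_seconds(parent, children)
print(f"time not covered by any child span: {gap:.2f} s")
```

A large gap only says that the time was spent somewhere that was not traced. That could be on the Spark side, but it could equally be in an untraced or unclosed span on the server side.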

On Sun, Mar 1, 2020 at 12:00 PM mhd wrk  wrote:
>
> Here's a follow-up question:
>
> Attached is the screenshot of a trace span called "count" starting from
> spark-shell. The child spans all have very short durations, which don't add up
> to anywhere close to the 914 seconds reported for the parent. Is it safe to say
> that the bottleneck is not inside Accumulo (including our own custom components)?
>
> Thanks,


Re: Accumulo Tracer?

2020-02-29 Thread mhd wrk
Added SpanReceiver for Zipkin and now trace entries make more sense.

Thanks,



Re: Accumulo Tracer?

2020-02-28 Thread Christopher
The tracing in Accumulo is instrumented using HTrace. You can
configure any HTrace sink for collecting the trace information. The
built-in one that writes to an Accumulo table (called
"ZooTraceClient") is the default, but you can easily change this by
editing the configuration property for "trace.span.receivers".

The built-in default trace sink was designed to emit data to an
Accumulo table by sending it first to a separate tracer service that
behaves as an Accumulo client and writes data back to a table. This
trace data can then be read and interpreted by the Accumulo monitor.
It may not be easy to read/interpret without using the monitor, as the
schema is not well-documented (and possibly not stable between
versions).

Although it might be suitable for some modest needs, I don't
personally recommend using the built-in default trace sink if you
are serious about tracing, and instead would advise you to use a more
well-tested and stable trace sink service for HTrace, perhaps one that
is designed specifically for that purpose. Others on the project may
have a different opinion.
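
To illustrate the idea of pluggable trace sinks, here is a toy model in Python. It is not the HTrace or Accumulo API: every class and name in it (Span, Tracer, InMemoryReceiver, RECEIVER_REGISTRY) is hypothetical. It assumes only the general pattern that receivers are chosen by a comma-separated configuration value and that a span reaches the receivers when it is stopped.

```python
import time
from typing import Dict, List, Optional


class Span:
    """A timed unit of work; it is handed to the receivers only on stop()."""

    def __init__(self, description: str, tracer: "Tracer") -> None:
        self.description = description
        self.start_time = time.monotonic()
        self.stop_time: Optional[float] = None
        self._tracer = tracer

    def stop(self) -> None:
        if self.stop_time is None:  # deliver at most once
            self.stop_time = time.monotonic()
            self._tracer.deliver(self)


class InMemoryReceiver:
    """Hypothetical sink that simply keeps the spans it receives."""

    def __init__(self) -> None:
        self.spans: List[Span] = []

    def receive_span(self, span: Span) -> None:
        self.spans.append(span)


# Stand-in for loading receiver classes by name from configuration.
RECEIVER_REGISTRY: Dict[str, type] = {"InMemoryReceiver": InMemoryReceiver}


class Tracer:
    def __init__(self, receiver_names: str) -> None:
        # receiver_names mimics a comma-separated configuration value.
        self.receivers = [
            RECEIVER_REGISTRY[name.strip()]()
            for name in receiver_names.split(",")
            if name.strip()
        ]

    def start_span(self, description: str) -> Span:
        return Span(description, self)

    def deliver(self, span: Span) -> None:
        for receiver in self.receivers:
            receiver.receive_span(span)


tracer = Tracer("InMemoryReceiver")
closed = tracer.start_span("scan")
leaked = tracer.start_span("custom iterator")  # never stopped
closed.stop()

delivered = [span.description for span in tracer.receivers[0].spans]
print(delivered)
```

In this model the span that is never stopped never shows up in the sink, which is one reason a trace can look incomplete regardless of which receiver is configured.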



Re: Accumulo Tracer?

2020-02-28 Thread Adam J. Shook
I've used the Accumulo Tracer API before to help identify bottlenecks in my
scans.  You can find the most recent traces in the Accumulo Monitor UI, and
there are also some tools you can use to view the contents of the trace
table.  See section 18.10.4 "Viewing Collected Traces" at
https://accumulo.apache.org/1.9/accumulo_user_manual.html.



Accumulo Tracer?

2020-02-28 Thread mhd wrk
Hi,

Our Accumulo deployment uses a custom Authenticator and Authorizer and also
attaches a few custom filters/iterators to tables at scan time. The
challenge is that we are seeing very slow response when loading the table
inside a spark shell and doing a simple count.
I was thinking of adding logging to all our custom components to collect
metrics, but then I came across Accumulo Tracer, which seems to target
the same concerns but requires its own custom coding. Also, so far, I
don't find the content of the trace table very easy to read/interpret.

Any suggestions/recommendations?

Thanks,