Re: Need guidance on getting detailed elapsed times in every stage of processing a request

Enis Söztutar Wed, 15 Mar 2017 11:39:07 -0700

If I remember the JIRA correctly, it got stuck on perf testing two
approaches: incrementing the histograms inline, versus emitting the
metrics inline to a multi-producer single-consumer queue (BlockingQueue,
Disruptor, etc.) and updating the histograms from a background thread.
Which approach is better still needs to be tested.
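
To make the two options concrete, here is a minimal sketch, not the
HBASE-15160 patch and not HBase's metrics classes. The class name
HistogramUpdateSketch, its methods, the bucket bounds and the queue
capacity are all hypothetical, chosen only for illustration. It uses a
plain java.util.concurrent.ArrayBlockingQueue as the queue; a Disruptor
would fill the same role.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical sketch: class and method names are illustrative, not HBase APIs.
class HistogramUpdateSketch {
    // Upper bounds (ms) of each bucket; one extra bucket holds larger values.
    private static final long[] BOUNDS_MS = {1, 10, 100, 1000};

    private final AtomicLongArray buckets = new AtomicLongArray(BOUNDS_MS.length + 1);
    // Bounded queue so a stalled consumer cannot exhaust memory.
    private final BlockingQueue<Long> pending = new ArrayBlockingQueue<>(1024);

    // Approach 1: the request thread pays the histogram update cost itself.
    void recordInline(long latencyMs) {
        int i = 0;
        while (i < BOUNDS_MS.length && latencyMs > BOUNDS_MS[i]) {
            i++;
        }
        buckets.incrementAndGet(i);
    }

    // Approach 2, producer side: request threads only enqueue the sample.
    // Returns false when the queue is full and the sample is dropped.
    boolean emit(long latencyMs) {
        return pending.offer(latencyMs);
    }

    // Approach 2, consumer side: one background thread would call this in a loop.
    // Returns the number of samples folded into the histogram.
    int drainPending() {
        List<Long> batch = new ArrayList<>();
        pending.drainTo(batch);
        for (long latencyMs : batch) {
            recordInline(latencyMs);
        }
        return batch.size();
    }

    long count(int bucket) {
        return buckets.get(bucket);
    }
}
```

The trade-off to measure is the per-request cost of recordInline against
the cost of emit plus the risk of dropped or delayed samples when the
queue fills up.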


Let's revive the jira.
Enis

On Tue, Mar 14, 2017 at 6:50 PM, Yu Li <[email protected]> wrote:

> Let me check with @Enis in JIRA and get back to you later (maybe in a few
> days, due to my schedule).
>
> Best Regards,
> Yu
>
> On 15 March 2017 at 05:57, jeff saremi <[email protected]> wrote:
>
> > What's involved in getting this change merged into the main branch? These
> > 2 counters (fsReadLatency, fsWriteLatency) are super important to us
> > understanding what goes on behind every request. These are the minimum we
> > need to have especially in the absence of HTrace.
> >
> > I just checked the latest code and it does not seem to have these.
> >
> > thanks
> >
> > Jeff
> >
> >
> >
> > ------------------------------
> > *From:* jeff saremi <[email protected]>
> > *Sent:* Sunday, March 5, 2017 3:41:05 PM
> > *To:* Yu Li; [email protected]
> > *Subject:* Re: Need guidance on getting detailed elapsed times in every
> > stage of processing a request
> >
> >
> > Hi Yu
> >
> > I applied your patch (the one that didn't have any V number) to 1.2.2
> > manually. I'm waiting to see the results. I should circle back if there
> > are issues.
> > ------------------------------
> > *From:* Yu Li <[email protected]>
> > *Sent:* Sunday, March 5, 2017 7:06:27 AM
> > *To:* [email protected]
> > *Cc:* [email protected]
> > *Subject:* Re: Need guidance on getting detailed elapsed times in every
> > stage of processing a request
> >
> > @jeff normally it's recommended to try the latest one, say v5 in this
> > case.
> >
> > btw, one thing to make clear: only #4 in my list is missing in current
> > code base, all others are already there for usage.
> >
> > Best Regards,
> > Yu
> >
> > On 4 March 2017 at 04:02, Sean Busbey <[email protected]> wrote:
> >
> >> -user@hbase to bcc
> >> +dev@hbase
> >>
> >> Jeff,
> >>
> >> These kinds of questions get into things that we only handle within
> >> the development community. I've copied in the appropriate dev@hbase
> >> mailing list, please ensure you are subscribed there so you can see
> >> any further discussion.
> >>
> >> On Fri, Mar 3, 2017 at 1:13 PM, jeff saremi <[email protected]>
> >> wrote:
> >> > Yu
> >> > Of the patches attached to HBASE-15160, do I need to apply all (v2,
> >> > v3, ...) or just HBASE-15160.patch?
> >> > Also how would I know against what version this patch was created?
> >> > thanks
> >> >
> >> >
> >> >
> >> >
> >> > ________________________________
> >> > From: jeff saremi <[email protected]>
> >> > Sent: Friday, March 3, 2017 10:34:00 AM
> >> > To: Hbase-User
> >> > Subject: Re: Need guidance on getting detailed elapsed times in every
> >> > stage of processing a request
> >> >
> >> > Thanks a lot Yu
> >> >
> >> > These are truly the metrics we care about at this point. It is
> >> > sad to see that such important metrics were removed from the code.
> >> >
> >> > I will try to apply your patch on my own to the version of HBase we
> >> > have. We definitely need these.
> >> >
> >> > Other solutions like HTrace are not as urgent as having these few
> >> > metrics you talked about here. So if we can get these merged with the
> >> > code we should be happy.
> >> >
> >> >
> >> > ________________________________
> >> > From: Yu Li <[email protected]>
> >> > Sent: Friday, March 3, 2017 9:54:29 AM
> >> > To: Hbase-User
> >> > Subject: Re: Need guidance on getting detailed elapsed times in every
> >> > stage of processing a request
> >> >
> >> > Hi Jeff,
> >> >
> >> > If the question is simply about monitoring HDFS read/write latencies,
> >> > please refer to HBASE-15160
> >> > <https://issues.apache.org/jira/browse/HBASE-15160>. There's a patch,
> >> > not yet committed, that probably won't apply cleanly to the current
> >> > code base, but it is still a good reference IMHO, so JFYI.
> >> >
> >> > To get an overview of how quickly the system could respond and what
> >> > might be the root cause of the spikes, we only need to monitor the
> >> > average/p99/p999 latency of the metrics (stages) below:
> >> > 1. totalCallTime: time from request arriving at server to sending
> >> >    response
> >> > 2. processCallTime: time for the server to process the call,
> >> >    regardless of the time the call spent queued
> >> > 3. queueCallTime: time the call has been queued
> >> > 4. HDFS read/pread/write time: time of HFile reading/writing, added in
> >> >    HBASE-15160
> >> > 5. WAL sync time: time of WAL sync to HDFS, on the critical path of
> >> >    write requests
> >> >
> >> > However, for your original question, how to monitor the whole trace
> >> > of a single request, I'm afraid there is no mature solution for the
> >> > time being, just as Stack mentioned.
> >> >
> >> > Hope my answer helps (smile).
> >> >
> >> > Best Regards,
> >> > Yu
> >> >
> >> > On 4 March 2017 at 00:48, jeff saremi <[email protected]> wrote:
> >> >
> >> >> anything would help. thanks
> >> >>
> >> >> ________________________________
> >> >> From: [email protected] <[email protected]> on behalf of Stack <
> >> >> [email protected]>
> >> >> Sent: Thursday, March 2, 2017 9:53:41 PM
> >> >> To: Hbase-User
> >> >> Subject: Re: Need guidance on getting detailed elapsed times in every
> >> >> stage of processing a request
> >> >>
> >> >> On Thu, Mar 2, 2017 at 10:26 PM, jeff saremi <[email protected]>
> >> >> wrote:
> >> >>
> >> >> > So i'd like to come back to my original question on how to go
> >> >> > about separating the latency of HDFS from HBase.
> >> >> >
> >> >> >
> >> >> That is a simple question to which we do not have an answer
> >> >> unfortunately (we should). If interested, I could describe how you
> >> >> might do it. I don't think it would take much work.
> >> >>
> >> >> St.Ack
> >> >>
> >> >>
> >> >>
> >> >> > Is there a most appropriate log4j TRACE option that could print out
> >> >> > this information to the logs?
> >> >> > Thanks
> >> >> >
> >> >> > ________________________________
> >> >> > From: jeff saremi <[email protected]>
> >> >> > Sent: Thursday, March 2, 2017 12:45:59 PM
> >> >> > To: Hbase-User
> >> >> > Subject: Re: Need guidance on getting detailed elapsed times in
> >> >> > every stage of processing a request
> >> >> >
> >> >> > Thanks so much for the advice! Looking forward to when Tracing gets
> >> >> > picked up again
> >> >> >
> >> >> > ________________________________
> >> >> > From: [email protected] <[email protected]> on behalf of Stack
> >> >> > <[email protected]>
> >> >> > Sent: Thursday, March 2, 2017 12:17:35 PM
> >> >> > To: Hbase-User
> >> >> > Subject: Re: Need guidance on getting detailed elapsed times in
> >> >> > every stage of processing a request
> >> >> >
> >> >> > HBase/HTrace integration once worked but has long since rotted.
> >> >> > Refactorings of internals without proper respect for trace
> >> >> > connections are the main culprit. Updates in htrace and hdfs that
> >> >> > need attention reconnecting spans, etc., are another. On top of
> >> >> > this, the zipkin project has seen a burst of effort of late that
> >> >> > would seem to offer much promise if one of us spent some time
> >> >> > rejiggering how HTrace and Zipkin relate.
> >> >> >
> >> >> > I would not waste any time on trying to set up HTrace for HBase at
> >> >> > least until after HBASE-14451 goes in, an issue that has been put
> >> >> > aside for a while now. Sorry if you've burned time on this to date.
> >> >> >
> >> >> > Yours,
> >> >> > St.Ack
> >> >> >
> >> >> > On Thu, Mar 2, 2017 at 6:28 AM, jeff saremi <[email protected]>
> >> >> > wrote:
> >> >> >
> >> >> > > Where would i seek help for issues revolving around HTrace and
> >> >> > > zipkin? Here? Because I have configured everything the way the
> >> >> > > documentation said but i see nothing in the zipkin server or in
> >> >> > > the logs. nothing at all
> >> >> > >
> >> >> > > ________________________________
> >> >> > > From: jeff saremi <[email protected]>
> >> >> > > Sent: Tuesday, February 28, 2017 12:52:32 PM
> >> >> > > To: [email protected]
> >> >> > > Subject: Re: Need guidance on getting detailed elapsed times in
> >> >> > > every stage of processing a request
> >> >> > >
> >> >> > > No I had not. but it looks like what i needed. Thanks Ted.
> >> >> > >
> >> >> > > I'll see if I have any more questions after reading this.
> >> >> > >
> >> >> > > ________________________________
> >> >> > > From: Ted Yu <[email protected]>
> >> >> > > Sent: Tuesday, February 28, 2017 12:47:08 PM
> >> >> > > To: [email protected]
> >> >> > > Subject: Re: Need guidance on getting detailed elapsed times in
> >> >> > > every stage of processing a request
> >> >> > >
> >> >> > > Have you looked at:
> >> >> > > http://hbase.apache.org/book.html#tracing
> >> >> > >
> >> >> > > On Tue, Feb 28, 2017 at 12:37 PM, jeff saremi
> >> >> > > <[email protected]> wrote:
> >> >> > >
> >> >> > > > I think we need to get detailed information from HBase
> >> >> > > > RegionServer logs on how a request (read or write) is
> >> >> > > > processed. Specifically, i need to know, of say 100 ms spent
> >> >> > > > processing a write, how much of it was spent waiting for HDFS?
> >> >> > > > What is the most efficient way of enabling this in log4j
> >> >> > > > properties? Are there better mechanisms to get this
> >> >> > > > information?
> >> >> > > >
> >> >> > > > If I can get this in the log, then I can process the logs
> >> >> > > > offline or in near real time and mount some dashboards on top.
> >> >> > > >
> >> >> > > > thanks
> >> >> > > >
> >> >> > > >
> >> >> > > > Jeff
> >> >> > > >
> >> >> > >
> >> >> >
> >> >>
> >>
> >
> >
>
