Re: [DISCUSS] Tracing in the Hadoop ecosystem

Andrew Purtell Tue, 21 Aug 2018 10:09:28 -0700

What if someone built an HTrace facade for Zipkin / Brave? Hadoop, HBase,
Phoenix, and other HTrace API users would still need to move away from
embedding HTrace instrumentation points to whatever is the normal API of
the accepted replacement, but such a facade would give you a drop-in
replacement requiring no code changes to currently shipping code lines, and
some time to do a hopefully coordinated replacement involving all upstreams
and downstreams. Just a thought. Zipkin / Brave already has widespread
adoption, and its impending incubation here at the ASF will make that
option quite attractive, I think.
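
To make the facade idea a bit more concrete, here is a minimal sketch in
Java. Every name in it (LegacyTracer, LegacyScope, BackendTracer,
BackendSpan, RecordingBackend) is a hypothetical stand-in invented for
illustration; none of them is the real HTrace or Brave API. The only
assumption is that existing call sites open a scope by name and close it
with try-with-resources, and that the replacement tracer can start and
finish a named span.

```java
import java.util.ArrayList;
import java.util.List;

public class TraceFacadeSketch {

    /** Hypothetical stand-in for a span in the replacement tracer. */
    interface BackendSpan {
        void finish();
    }

    /** Hypothetical stand-in for the replacement tracer itself. */
    interface BackendTracer {
        BackendSpan start(String name);
    }

    /** Legacy-shaped scope: call sites keep using try-with-resources. */
    static final class LegacyScope implements AutoCloseable {
        private final BackendSpan span;

        LegacyScope(BackendSpan span) {
            this.span = span;
        }

        @Override
        public void close() {
            // Closing the legacy scope finishes the backend span.
            span.finish();
        }
    }

    /** Legacy-shaped tracer: same method shape, delegates to the backend. */
    static final class LegacyTracer {
        private final BackendTracer backend;

        LegacyTracer(BackendTracer backend) {
            this.backend = backend;
        }

        LegacyScope newScope(String description) {
            return new LegacyScope(backend.start(description));
        }
    }

    /** Illustrative backend that records span lifecycle events in order. */
    static final class RecordingBackend implements BackendTracer {
        final List<String> events = new ArrayList<>();

        @Override
        public BackendSpan start(String name) {
            events.add("start:" + name);
            return () -> events.add("finish:" + name);
        }
    }

    public static void main(String[] args) {
        RecordingBackend backend = new RecordingBackend();
        LegacyTracer tracer = new LegacyTracer(backend);

        // An unchanged "legacy" call site: only the tracer wiring differs.
        try (LegacyScope outer = tracer.newScope("readBlock")) {
            try (LegacyScope inner = tracer.newScope("checksum")) {
                // instrumented work would go here
            }
        }
        // prints [start:readBlock, start:checksum, finish:checksum, finish:readBlock]
        System.out.println(backend.events);
    }
}
```

A real facade would also have to carry span ids, parent/child links and
annotations across RPC boundaries, which this sketch leaves out.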



On Tue, Aug 21, 2018 at 7:50 AM Stack <st...@duboce.net> wrote:

> On Tue, Aug 21, 2018 at 3:44 AM Tsuyoshi Ozawa <oz...@apache.org> wrote:
>
> > Thanks for starting discussion, Stack.
> >
> > Zipkin seems to be coming to the Apache Incubator. As Andrew
> > Purtell said on HADOOP-15566, it would be a good option since there
> > are no licensing problems.
> > https://wiki.apache.org/incubator/ZipkinProposal
> >
> >
> Yes. This is nice to see.
>
>
>
> > Stack, do you know of any differences between Zipkin and HTrace?
> > Might Zipkin still show measurable performance overhead?
> >
> >
> I've not measured to see if disabled trace points are friction-free.
> Perhaps someone else has?
>
>
>
> > To decrease the overhead, we need to do additional work like ftrace,
> > the well-known function tracer in the Linux kernel. If I understand
> > correctly, ftrace replaces its function calls with CPU NOP
> > instructions when it is disabled. This keeps the tracer's overhead
> > low. By replacing the tracing function calls with the JVM's NOP
> > operation, can we achieve the minimum overhead?
> >
> >
> That'd be ideal. Makes sense inside the kernel. But up in our sloppy Java
> context, we should be able to get away with something less exotic.
>
> Thanks Tsuyoshi,
> S
>
>
>
>
> > Regards
> > - Tsuyoshi
> > On Tue, Jul 31, 2018 at 9:59 AM Eric Yang <ey...@hortonworks.com> wrote:
> > >
> > > Most code coverage tools can instrument Java classes without making
> > > any source code changes, but tracing a distributed system is more
> > > involved because code execution across network interactions is not
> > > easy to match up. All interactions between sender and receiver have
> > > some form of session id or sequence id. Hadoop had some logic to
> > > assist the stitching of distributed interactions together in the
> > > clienttrace log. This information seems to have been lost in the
> > > last 5-6 years of Hadoop evolution. HTrace was invented to fill the
> > > void left behind by clienttrace, as a programmable API to send out
> > > useful tracing data for downstream analytical programs to visualize
> > > the interaction.
> > >
> > > Large companies commonly enforce logging of the session id, and
> > > write homebrew tools to stitch together debugging logic for a
> > > specific piece of software. There is also a growing set of
> > > analytical tools from Splunk and similar companies that stitch the
> > > views together. Hadoop does not seem to be at the top of the list
> > > for those companies to implement tracing because the Hadoop
> > > networking layer is complex and changes more frequently than
> > > desired.
> > >
> > > If we go back to a logging approach, instead of an API approach, it
> > > will help someone to write the analytical program someday. The
> > > danger of the logging approach is that it is boring to write
> > > LOG.debug() everywhere, we often forget about it, and log entries
> > > get removed.
> > >
> > > An API approach can work if real-time interactive tracing can be
> > > done. However, this is hard to realize in Hadoop because a massive
> > > amount of parallel data is difficult to aggregate in real time
> > > without hitting timeouts. It is also more likely to require changes
> > > to the network protocol, which might cause more headache than it's
> > > worth. I am in favor of removing HTrace support and redoing
> > > distributed tracing using a logging approach.
> > >
> > > Regards,
> > > Eric
> > >
> > > On 7/30/18, 3:06 PM, "Stack" <st...@duboce.net> wrote:
> > >
> > >     There is a healthy discussion going on over in HADOOP-15566 on
> > >     tracing in the Hadoop ecosystem. It would sit better on a mailing
> > >     list than in comments up on JIRA so here's an attempt at porting
> > >     the chat here.
> > >
> > >     Background/Context: Bits of Hadoop and HBase had Apache HTrace
> > >     trace points added. HTrace was formerly "incubating" at Apache but
> > >     has since been retired, moved to the Apache Attic. HTrace and the
> > >     efforts at instrumenting Hadoop wilted for want of
> > >     attention/resourcing. Our Todd Lipcon noticed that the HTrace
> > >     instrumentation can add friction on some code paths so can
> > >     actually be harmful even when disabled. The natural follow-on is
> > >     that we should rip out tracings of a "dead" project. This then
> > >     raises the question, should something replace it and if so what?
> > >     This is where HADOOP-15566 is at currently.
> > >
> > >     HTrace took two or three runs, led by various Heroes, at building
> > >     a trace lib for Hadoop (first). It was trying to build the trace
> > >     lib, a store, and a visualizer. Always, it had a mechanism for
> > >     dumping the traces out to external systems for storage and viewing
> > >     (e.g. Zipkin). HTrace started when there was little else but the,
> > >     you guessed it, Google paper that described the Dapper system they
> > >     had internally. Since then, the world of tracing has come on in
> > >     leaps and bounds with healthy alternatives, communities, and even
> > >     commercialization.
> > >
> > >     If interested, take a read over HADOOP-15566. Will try and
> > >     encourage participants to move the chat here.
> > >
> > >     Thanks,
> > >     St.Ack
> > >
> > >
> > >     ---------------------------------------------------------------------
> > >     To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> > >     For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> > >
> > >
> > >
> > >
> >
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk
