Re: [DISCUSS] Attic podling Apache HTrace?

Andrew Purtell Thu, 17 Aug 2017 14:28:31 -0700

> The different major versions of HTrace are indeed source code compatible.


Maybe the issue was going from 2 to 3. At the time it was a real problem:
once the change or removal of a span id constant, and another time something
to do with setting parent-child span relationships, IIRC. If this is better
between 3 and 4 then the point no longer applies.


On Thu, Aug 17, 2017 at 2:21 PM, Colin McCabe <cmcc...@apache.org> wrote:

> On Thu, Aug 17, 2017, at 12:25, Andrew Purtell wrote:
> > What about OpenTracing (http://opentracing.io/)? Is this the successor
> > project to Zipkin? In particular grpc-opentracing (
> > https://github.com/grpc-ecosystem/grpc-opentracing) seems to finally
> > fulfill in open source the tracing architecture described in the Dapper
> > paper.
>
> OpenTracing is essentially an API which sits on top of another tracing
> system.
>
> So you can instrument your code with the OpenTracing library, and then
> have that send the trace spans to OpenZipkin.
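
To make that layering concrete, here is a minimal sketch of the idea: application code talks to a thin tracing API, and a pluggable backend decides where spans go. This is not the real OpenTracing or Zipkin API; SpanSink, InMemorySink, Tracer and handle_request are all hypothetical names chosen for illustration.

```python
from typing import List, Protocol, Tuple


class SpanSink(Protocol):
    """Hypothetical backend interface: anything that can accept a finished span."""

    def report(self, name: str, duration_ms: float) -> None: ...


class InMemorySink:
    """Stand-in backend; a real one would ship spans to a collector."""

    def __init__(self) -> None:
        self.spans: List[Tuple[str, float]] = []

    def report(self, name: str, duration_ms: float) -> None:
        self.spans.append((name, duration_ms))


class Tracer:
    """The API layer the application is instrumented against."""

    def __init__(self, sink: SpanSink) -> None:
        self.sink = sink

    def record(self, name: str, duration_ms: float) -> None:
        # The tracer only forwards; the sink decides where the span ends up.
        self.sink.report(name, duration_ms)


def handle_request(tracer: Tracer) -> None:
    # Instrumented application code never names a concrete backend.
    tracer.record("handle_request", 1.5)
```

Swapping the backend means constructing Tracer with a different sink; handle_request does not change.
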
>
> Here are some thoughts about this topic from a Zipkin developer:
> https://gist.github.com/wu-sheng/b8d51dda09d3ce6742630d1484fd55c7#what-is-the-relationship-between-zipkin-and-opentracing
> Probably Adrian Cole can chime in here as well.
>
> In general the OpenTracing folks have been friendly and respectful.  (If
> any of them are reading this, I apologize for not following some of the
> discussions on gitter more thoroughly-- my time is just split so many
> ways right now!)
>
> >
> > If one takes a step back and looks at all of the hand-rolled RPC stacks
> > in the Hadoop ecosystem it's a mess. It is a heavier lift but getting
> > everyone migrated to a single RPC stack - gRPC - would provide the
> > unified tracing layer envisioned by HTrace. The tracing integration is
> > then done exactly in one place. In contrast HTrace requires all of the
> > components to sprinkle spans throughout the application code.
> >
>
> That's not the issue.  We already have HTrace integration with Hadoop
> RPC, such that a Hadoop RPC creates a span.  Integration with any RPC
> system is actually very straightforward-- you just add two fields to the
> base RPC request definition, and patch the RPC system to use them.
>
> Just instrumenting RPC is not sufficient.  You need programmers to add
> explicit span annotations to your code so that you can have useful
> information beyond what a program like Wireshark would find.  Things
> like what disk is a request hitting, what HBase PUT is an HDFS write
> associated with, and so forth.
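
As a rough sketch of what such an explicit annotation looks like in application code (hypothetical helper names, not the HTrace API): the span helper below records a description, key/value annotations such as the disk being hit, and a duration, and write_block is a stand-in for real storage code.

```python
import time
from contextlib import contextmanager

# Collected spans; a real tracer would hand these to a span receiver instead.
finished_spans = []


@contextmanager
def span(description, **annotations):
    """Open a span carrying application-level key/value annotations."""
    record = {"description": description, "annotations": dict(annotations)}
    start = time.monotonic()
    try:
        yield record
    finally:
        # Runs even if the traced code raises, so the span is always recorded.
        record["duration_s"] = time.monotonic() - start
        finished_spans.append(record)


def write_block(data, disk):
    # The disk name is something only the application knows; an RPC-level
    # trace or a packet capture would never see it.
    with span("write_block", disk=disk) as s:
        s["annotations"]["bytes"] = len(data)
        return len(data)
```
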
>
> Also, this is getting off topic, but there is a new RPC system every
> year or two.  Java-RMI, CORBA, Thrift, Akka, SOAP, KRPC, Finagle, gRPC,
> REST/JSON, etc.  They all have advantages and disadvantages.  For
> example, gRPC depends on protobuf-- and Hadoop has a lot of deployment
> and performance problems with the protobuf-java library.  I wish gRPC
> luck, but I think it's good for people to experiment with different
> libraries.  It doesn't make sense to try to force everyone to use one
> thing, even if we could.
>
> > The Hadoop ecosystem is always partially at odds with itself, if for no
> > other reason than there is no shared vision among the projects. There are
> > no coordinated releases. There isn't even agreement on which version of
> > shared dependencies to use (hence the recurring pain in various places
> > with downstream version changes of protobuf, guava, jackson, etc.).
> > Therefore HTrace is severely constrained on what API changes can be made.
> > Unfortunately the different major versions of HTrace do not interoperate
> > at all, and are not even source compatible. While this is not
> > unreasonable for a project in incubation, when combined with the
> > inability of the Hadoop ecosystem to coordinate releases as a
> > cross-cutting dependency ships a new version, this has reduced the
> > utility of HTrace to effectively nil for the average user. I am sorry to
> > say that. Only a commercial Hadoop vendor or power user can be expected
> > to patch and build a stack that actually works.
>
> One correction: The different major versions of HTrace are indeed source
> code compatible.  You can build an application that can use both HTrace
> 3 and HTrace 4.  This was absolutely essential for us because of the
> version skew issues you mention.
>
> > On Thu, Aug 17, 2017 at 11:04 AM, lewis john mcgibbney <lewi...@apache.org> wrote:
> >
> > > Hi Mike,
> > > I think this is a fair question. We've probably all been associated
> > > with projects which just don't really make it. It would appear that
> > > HTrace is one of them. This is not to say that there is nothing going
> > > on with the tracing effort generally (as there is) but it looks like
> > > HTrace as a project may be headed to the Attic.
> > > I suppose the response to this thread will determine what happens...
>
> Thanks, Lewis.
>
> I think maybe we should try to identify the top tracing priorities for
> HBase and HDFS and see how HTrace / OpenTracing / OpenZipkin could fit
> into those.  Just start from a nice crisp set of requirements, like
> Stack suggested, and think about how we could make those a reality.  If
> we can advance the state of tracing in Hadoop, that will be a good thing
> for our users, even if HTrace goes to the attic.  I've been mostly
> working on Apache Kafka these days but I could drop by to brainstorm.
>
> best,
> Colin
>
>
> > > Lewis
> > > 
> > >
> > >
> > > On Wed, Aug 16, 2017 at 10:01 AM, <
> > > dev-digest-h...@htrace.incubator.apache.org> wrote:
> > >
> > > >
> > > > From: Mike Drob <md...@apache.org>
> > > > To: dev@htrace.incubator.apache.org
> > > > Cc:
> > > > Bcc:
> > > > Date: Wed, 16 Aug 2017 12:00:49 -0500
> > > > Subject: [DISCUSS] Attic podling Apache HTrace?
> > > > Hi folks,
> > > >
> > > > Want to bring up a potentially uncomfortable topic for some. Is it
> > > > time to retire/attic the project?
> > > >
> > > > We've seen a minimal amount of activity in the past year. The last
> > > > release had two bug fixes, and had been pending for several months
> > > > before somebody reminded me to push the artifacts to subversion from
> > > > the staging directory.
> > > >
> > > > I'd love to see a renewed set of activity here, but I don't think
> > > > there is a ton of interest going on.
> > > >
> > > > HBase is still on version 3. So is Accumulo, I think. Hadoop is on
> > > > 4.1, which is a good sign, but I haven't heard much from them
> > > > recently. I definitely do not think we are at the point where a lack
> > > > of releases and activity is a sign of super advanced maturity and
> > > > stability.
> > > >
> > > > Your thoughts?
> > > >
> > > > Mike
> > > >
> > > >
> > >
> > >
> > > --
> > > http://home.apache.org/~lewismc/
> > > @hectorMcSpector
> > > http://www.linkedin.com/in/lmcgibbney
> > >
> >
> >
> >
> > --
> > Best regards,
> > Andrew
> >
> > Words like orphans lost among the crosstalk, meaning torn from truth's
> > decrepit hands
> >    - A23, Crosstalk
>



-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk
