Re: [DISCUSS] Attic podling Apache HTrace?

2017-08-18 Thread Adrian Cole
> Thanks Adrian for the editorial on the landscape. Helps, especially coming
> from yourself.
we aim to please

> Given current state of the project, a retrofit to come up on OT is not the
> solution to the topic-at-hand (and besides I have a colored opinion on
> taking on the API of another after spending a bunch of time recently
> undoing our mistake letting third-party Interfaces and Classes show through
> in hbase).
sensible for any API highly disconnected from the ecosystem, especially
one without practice behind it yet.

> I appreciate the higher-level point made by Andrew, that it is hard to
> thread a cross-cutting library across the Hadoop landscape whether because
> releases happen on the geologic time scale or that there is little by way
> of coordination.
I think this is indeed leading down a path towards focus, e.g. the H in HTrace :)

> Can we do a focused 'win' like Colin suggests? E.g. hook up hbase and hdfs
> end-to-end with connection to a viewer (zipkin? Or text dumps in a
> webpage?). A while back I had a go at the hbase side but it was burning up
> the hours just getting it hooked up w/ tests to scream if any spans were
> broken in a refactor. I had to put it aside.
Incidentally, I wouldn't necessarily say Zipkin is ready out of the box,
because htrace UI and query support are more advanced (in some ways due
to the data storage options we have available). So, something like this
could be a shift of focus which would require investment on the other
side to provide the features needed, or a discussion of how to upgrade
into them (e.g. if using hbase storage, certain queries would work). It
is fair to say zipkin has a great devops pipeline; we are good at fixing
things. At the same time, we are imperfect in implementation and
inexperienced in the hadoop ecosystem. Having some way to join together
could be really beneficial, at the cost of up-front effort (due to
model, UI and storage differences). I would be happy to direct time,
though I would need some help because of my inexperience in the data
services space (something this might correct!)

> Like the rest of you, my time is a little occupied elsewhere these times so
> I can't revive the project, not at the moment at least.
ack


Re: [DISCUSS] Attic podling Apache HTrace?

2017-08-17 Thread Adrian Cole
The thing that really held back HTrace 4.0
was that it was originally scheduled to be part of Hadoop 2.8-- and the
Hadoop 2.8 release was delayed for a really, really long time, to the
point when it almost became a punchline.  So people had to use vendor
releases to get HTrace 4, because those were the only releases with new
Hadoop code.

Colin


>
>
>
> On Thu, Aug 17, 2017 at 2:21 PM, Colin McCabe <cmcc...@apache.org> wrote:
>
> > On Thu, Aug 17, 2017, at 12:25, Andrew Purtell wrote:
> > > What about OpenTracing (http://opentracing.io/)? Is this the successor
> > > project to Zipkin? In particular, grpc-opentracing (
> > > https://github.com/grpc-ecosystem/grpc-opentracing) seems to finally
> > > fulfill in open source the tracing architecture described in the Dapper
> > > paper.
> >
> > OpenTracing is essentially an API which sits on top of another tracing
> > system.
> >
> > So you can instrument your code with the OpenTracing library, and then
> > have that send the trace spans to OpenZipkin.
> >
> > Here are some thoughts on this topic from a Zipkin developer:
> > https://gist.github.com/wu-sheng/b8d51dda09d3ce6742630d1484fd55c7#what-is-the-relationship-between-zipkin-and-opentracing
> > Probably Adrian Cole can chime in here as well.
> >
> > In general the OpenTracing folks have been friendly and respectful.  (If
> > any of them are reading this, I apologize for not following some of the
> > discussions on gitter more thoroughly-- my time is just split so many
> > ways right now!)
> >
> > >
> > > If one takes a step back and looks at all of the hand rolled RPC stacks
> > > in the Hadoop ecosystem it's a mess. It is a heavier lift but getting
> > > everyone migrated to a single RPC stack - gRPC - would provide the
> > > unified tracing layer envisioned by HTrace. The tracing integration is
> > > then done exactly in one place. In contrast HTrace requires all of the
> > > components to sprinkle spans throughout the application code.
> > >
> >
> > That's not the issue.  We already have HTrace integration with Hadoop
> > RPC, such that a Hadoop RPC creates a span.  Integration with any RPC
> > system is actually very straightforward-- you just add two fields to the
> > base RPC request definition, and patch the RPC system to use them.
> >
> > Just instrumenting RPC is not sufficient.  You need programmers to add
> > explicit span annotations to your code so that you can have useful
> > information beyond what a program like wireshark would find.  Things
> > like what disk is a request hitting, what HBase PUT is an HDFS write
> > associated with, and so forth.
> >
> > Also, this is getting off topic, but there is a new RPC system every
> > year or two.  Java-RMI, CORBA, Thrift, Akka, SOAP, KRPC, Finagle, GRPC,
> > REST/JSON, etc.  They all have advantages and disadvantages.  For
> > example, GRPC depends on protobuf-- and Hadoop has a lot of deployment
> > and performance problems with the protobuf-java library.  I wish gRPC
> > luck, but I think it's good for people to experiment with different
> > libraries.  It doesn't make sense to try to force everyone to use one
> > thing, even if we could.
> >
> > > The Hadoop ecosystem is always partially at odds with itself, if for no
> > > other reason than there is no shared vision among the projects. There are
> > > no coordinated releases. There isn't even agreement on which version of
> > > shared dependencies to use (hence the recurring pain in various places
> > > with downstream version changes of protobuf, guava, jackson, etc.).
> > > Therefore HTrace is severely constrained in what API changes can be made.
> > > Unfortunately the different major versions of HTrace do not interoperate
> > > at all, and are not even source compatible. While that is not
> > > unreasonable at all for a project in incubation, when combined with the
> > > inability of the Hadoop ecosystem to coordinate releases as a
> > > cross-cutting dependency ships a new version, this has reduced the
> > > utility of HTrace to effectively nil for the average user. I am sorry to
> > > say that. Only a commercial Hadoop vendor or power user can be expected
> > > to patch and build a stack that actually works.
> >
> > One correction: The different major versions of HTrace are indeed source
> > code compatible.  You can build an application that ca
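As an aside, Colin's "two fields on the base RPC request" approach might look something like the hypothetical sketch below. The class and field names here are invented for illustration; they are not Hadoop's actual RPC definitions.

```java
// Hypothetical sketch only: names are invented, not Hadoop's actual RPC code.
class RpcRequestHeader {
    final long traceId;      // identifies the whole trace; 0 means "not traced"
    final long parentSpanId; // the caller's span, so the server can parent to it

    RpcRequestHeader(long traceId, long parentSpanId) {
        this.traceId = traceId;
        this.parentSpanId = parentSpanId;
    }
}

class RpcServer {
    // On receipt, continue the caller's trace by creating a child span
    // from the two propagated fields, or do nothing if untraced.
    static String handle(RpcRequestHeader header) {
        if (header.traceId == 0) {
            return "untraced request";
        }
        return "child span of " + header.traceId + "/" + header.parentSpanId;
    }
}
```

The point is that the RPC layer only has to carry the identifiers; the useful annotations still have to come from the application code, as described above.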

Re: [DISCUSS] Attic podling Apache HTrace?

2017-08-17 Thread Adrian Cole
> What are the likely alternatives for downstream projects that want 
> distributed tracing?
There are alternatives for general-purpose or RPC tracing, but I think
HTrace is still positioned well for data services specifically.

> Do we think the field still has a big gap that HTrace can solve?
When I was at Twitter (a couple of years ago now), I know the data team
preferred htrace even though we had zipkin. Most of the tracing projects
out there do not focus on data services, or only recently do. While HTrace
may not be great at filling gaps in traditional RPC (as others do this
well enough), it probably does still have compelling advantages in
data services. I think the main holdback is getting the word out
and/or showing examples where the model and UI really shine in
HTrace's sweet spot (data services).

my 2p


Re: [DISCUSS] OpenTracing API implementation

2017-03-06 Thread Adrian Cole
> If nobody steps up to implement it before me or working on it yet, I'll try
> to get my hands into it. Unfortunately I'm not sure when I'd have spare
> time for it.

There will be some work needed in htrace, as the OT interface is wider
in most places but less precise in others. It should be easier than
zipkin, since htrace has a single-span-per-tracer model. That said,
there are some key concerns which may or may not play out well,
particularly htrace's lack of a trace ID.

For example, you'll have choices to make on how to handle nested data
structures that are sent via its "log" API, and on whether or how to map
"special" properties defined in opentracing such as "span.kind". You'll
also need to deal with the required propagation APIs (for example, how
to encode and decode a trace context in binary and text form). Sadly,
there's no compatibility kit or interop tests of any kind, so figuring
out whether things work will largely be up to testers.
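To make the propagation concern concrete, here is a hypothetical sketch of a text-form codec. The SpanContext shape and the carrier key names are invented for illustration; neither htrace nor opentracing defines these exact types.

```java
import java.util.Map;

// Hypothetical sketch only: the SpanContext shape and header names are
// invented; a real bridge would map to the library's actual types.
class SpanContext {
    final long traceId;
    final long spanId;

    SpanContext(long traceId, long spanId) {
        this.traceId = traceId;
        this.spanId = spanId;
    }
}

class TextMapCodec {
    // inject: write the context into a text carrier (e.g. HTTP headers).
    static void inject(SpanContext ctx, Map<String, String> carrier) {
        carrier.put("x-trace-id", Long.toHexString(ctx.traceId));
        carrier.put("x-span-id", Long.toHexString(ctx.spanId));
    }

    // extract: rebuild the context on the receiving side; null if absent.
    static SpanContext extract(Map<String, String> carrier) {
        String traceId = carrier.get("x-trace-id");
        String spanId = carrier.get("x-span-id");
        if (traceId == null || spanId == null) {
            return null;
        }
        return new SpanContext(Long.parseLong(traceId, 16),
                               Long.parseLong(spanId, 16));
    }
}
```

Since htrace lacks a trace ID of its own, even a simple codec like this forces a decision about what to put in that first field.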

Anyway, in case it helps, here are a couple bridge projects:

https://github.com/openzipkin/brave-opentracing
https://github.com/DealerDotCom/sleuth-opentracing

> I had two things in mind starting this discussion:
> - check that nobody is working on it yet (to avoid wasting time if someone
> is);
pretty confident you are it.

> - check that such work would be useful for someone else.
this is an important step! There's certainly work to do, and best to
not go at it without at least a user to help q/a. I'd ping their
gitter and/or use twitter to recruit others interested if I were you.

> I haven't checked with IPMC and legal, but opentracing seems to be under
> Apache License v2.0 and has no external deps, so it should be ok as an
> external dependency.
agreed, though bear in mind OT java at least is <1.0, so you'd need to
plan for how to manage version updates (since htrace is more
coarse-grained at >1.0).

Good luck on your journey!


Re: How to deal with htrace conversion of values to base64

2016-12-30 Thread Adrian Cole
OK, well, update your issue here, then:

https://github.com/openzipkin/zipkin/issues/1455

There's a test ScribeSpanConsumerTest which covers some base64 related
concerns. It is unfortunate that base64 is a requirement as indeed
there are edge cases.

https://github.com/openzipkin/zipkin/blob/master/zipkin-collector/scribe/src/test/java/zipkin/collector/scribe/ScribeSpanConsumerTest.java#L152

-A

On Fri, Dec 30, 2016 at 10:37 PM, Raam Rosh-Hai <r...@findhotel.net> wrote:
> Hey Adrian,
>
> After poking around and changing org.apache.htrace.impl.ScribeTransport,
> which didn't seem to solve this issue, I am starting to think it's on the
> zipkin side, seeing that most of the data is displayed correctly other than
> the KV annotations. Furthermore, the annotations themselves are being sent
> to zipkin in a binary format and only the thrift object is base64 encoded.
> I am trying to find the place that encodes the annotations to base64, but
> no luck up until now in htrace.
>
> On 23 December 2016 at 10:21, Adrian Cole <adrian.f.c...@gmail.com> wrote:
>
>> My guess is that this has to do with url encoding. Can you patch
>> org.apache.htrace.impl.ScribeTransport to use
>> encodeBase64URLSafeString instead of encodeBase64String?
>>
>> that might answer it..
>>
>> On Fri, Dec 23, 2016 at 5:04 PM, Raam Rosh-Hai <r...@findhotel.net> wrote:
>> > Hi St.Ack,
>> >
>> > Thank you for your reply. I am using "org.apache.htrace" % "htrace-core4"
>> > % "4.1.0-incubating" and "org.apache.htrace" % "htrace-zipkin" %
>> > "4.1.0-incubating" in scala.
>> > I was looking at an older version of htrace (non-incubating master on
>> > github) and now I see you are no longer doing that.
>> >
>> > What I am getting in the zipkin UI is a malformed base64 string, where
>> > the `/` were converted to `_`. After debugging the zipkin receiver, it
>> > seems like the spans are sent correctly. Maybe you have an idea what can
>> > go wrong?
>> >
>> > On 22 December 2016 at 18:47, Stack <st...@duboce.net> wrote:
>> >
>> >> Show us where in the code this is happening, Raam Rosh-Hai, and tell us
>> >> what version of htrace you are using. Thanks.
>> >> St.Ack
>> >>
>> >> On Thu, Dec 22, 2016 at 3:41 AM, Raam Rosh-Hai <r...@findhotel.net>
>> wrote:
>> >>
>> >> > I am saving a simple string value and it seems like the htrace
>> >> > zipkin connector is converting all values to base64. I then get the
>> >> > base64 values in the zipkin frontend. Any suggestions?
>> >> >
>> >> > Thanks,
>> >> > Raam
>> >> >
>> >>
>>


Re: How to deal with htrace conversion of values to base64

2016-12-23 Thread Adrian Cole
My guess is that this has to do with url encoding. Can you patch
org.apache.htrace.impl.ScribeTransport to use
encodeBase64URLSafeString instead of encodeBase64String?

that might answer it..
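For reference, the difference between the two commons-codec methods is the alphabet: the URL-safe variant substitutes '-' and '_' for '+' and '/' (and, if I recall correctly, also omits padding). The JDK's encoders illustrate the same alphabet difference:

```java
import java.util.Base64;

public class Base64Alphabets {
    public static void main(String[] args) {
        // Bytes chosen so the output exercises the high alphabet indices.
        byte[] data = new byte[] {(byte) 0xfb, (byte) 0xff};

        // Standard alphabet uses '+' and '/'.
        String standard = Base64.getEncoder().encodeToString(data);
        // URL-safe alphabet substitutes '-' and '_' instead.
        String urlSafe = Base64.getUrlEncoder().encodeToString(data);

        System.out.println(standard); // +/8=
        System.out.println(urlSafe);  // -_8=
    }
}
```

So a receiver that decodes with the standard alphabet will choke on URL-safe output, and vice versa; sender and receiver have to agree.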

On Fri, Dec 23, 2016 at 5:04 PM, Raam Rosh-Hai  wrote:
> Hi St.Ack,
>
> Thank you for your reply. I am using "org.apache.htrace" % "htrace-core4" %
> "4.1.0-incubating" and
> "org.apache.htrace" % "htrace-zipkin" % "4.1.0-incubating" in scala.
> I was looking at an older version of htrace (non-incubating master on
> github) and now I see you are no longer doing that.
>
> What I am getting in the zipkin UI is a malformed base64 string, where the `/`
> were converted to `_`. After debugging the zipkin receiver, it seems like the
> spans are sent correctly. Maybe you have an idea what can go wrong?
>
> On 22 December 2016 at 18:47, Stack  wrote:
>
>> Show us where in the code this is happening, Raam Rosh-Hai, and tell us what
>> version of htrace you are using. Thanks.
>> St.Ack
>>
>> On Thu, Dec 22, 2016 at 3:41 AM, Raam Rosh-Hai  wrote:
>>
>> > I am saving a simple string value and it seems like the htrace
>> > zipkin connector is converting all values to base64. I then get the base64
>> > values in the zipkin frontend. Any suggestions?
>> >
>> > Thanks,
>> > Raam
>> >
>>


Re: [DISCUSS] Release for HTrace/Drive Towards Graduation

2015-10-30 Thread Adrian Cole
OK, well, do announce once you've decided how to do a CI pipeline that
includes pre-commit testing and context-sensitive review comments. I look
forward to it.


Re: Build instructions for Htrace

2015-09-02 Thread Adrian Cole
> +1.  Creating a docker build environment for Jenkins has been on our to-do
> list for a while.  It would be a great contribution for someone.  It seems
> like it would also help anyone who wanted to build the project as well.
I haven't used it yet, but DotCi seems to be available for Jenkins folks:
http://groupon.github.io/DotCi/

I suspect a good first step is getting docker-compose working in
general? That would be independent of what sort of CI slave would
invoke it.