Re: [DISCUSS] Attic podling Apache HTrace?

2017-09-21 Thread Colin McCabe
On Wed, Sep 13, 2017, at 09:01, Mike Drob wrote:
> Hi folks,
> 
> Want to pull this conversation back into people's minds to summarize and
> reposit some questions.
> 
> It looks like there is consensus that 1) HTrace is generally useful but 2)
> needs to refocus on more specific use cases rather than trying to be a
> general purpose tool and 3) suffers from being niche on niche.
> 
> There have been lots of folks volunteering to contribute, but I don't see
> any resulting JIRA or mailing list activity. We all can have the best of
> intentions while still not having the necessary time available to follow
> through, myself included.
> 
> Projects can still be successful outside of the ASF; retiring as a podling
> does not preclude others from picking up the works. As one example, I have
> seen many vibrant communities run as github projects with a core set of
> dedicated contributors. I don't believe that living in the ASF makes it any
> easier or harder for other Apache projects such as Hadoop to integrate with
> us, so long as the licensing remains compatible. In the future, if a
> diverse community springs up again and finds the desire to come back, I
> don't think that will be a problem to re-incubate at that point either.

Hi Mike,

I'm on vacation this week, so I apologize for the brevity (and delayed
responses).

I think it would make more sense to fold HTrace into Hadoop if we wanted
to disband the podling.  At a minimum, there is value in allowing Hadoop
to use the trace systems that are out there (Zipkin, OpenTracing, etc.)
in a software-configurable way.
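
As a rough illustration of what "software-configurable" could mean here, the
sketch below picks a span sink by name from a configuration map. Everything in
it is an assumption made for illustration: the `SpanSink` interface, the
`trace.sink` key, and the sink names are hypothetical and are not part of
HTrace, Zipkin, OpenTracing, or Hadoop. A real integration would register
adapters for those systems in place of the in-memory stand-in.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: none of these types belong to HTrace, Zipkin,
// OpenTracing, or Hadoop. It only shows config-driven sink selection.
final class TraceSinkConfigSketch {

    /** Minimal span record; real tracers also carry IDs, parents, and annotations. */
    static final class Span {
        final String description;
        final long startMillis;
        final long stopMillis;

        Span(String description, long startMillis, long stopMillis) {
            this.description = description;
            this.startMillis = startMillis;
            this.stopMillis = stopMillis;
        }
    }

    /** Backend-neutral interface that instrumented code reports spans to. */
    interface SpanSink {
        void receive(Span span);
    }

    /** Discards every span; used when tracing is not configured. */
    static final class NoopSink implements SpanSink {
        @Override
        public void receive(Span span) {
            // intentionally empty
        }
    }

    /** Keeps spans in memory; stands in for an adapter to a real backend. */
    static final class InMemorySink implements SpanSink {
        final List<Span> spans = new ArrayList<>();

        @Override
        public void receive(Span span) {
            spans.add(span);
        }
    }

    /** Hypothetical configuration key naming the sink to use. */
    static final String SINK_KEY = "trace.sink";

    /** Maps a configured sink name to a factory for that sink. */
    private static final Map<String, Supplier<SpanSink>> REGISTRY = new HashMap<>();

    static {
        REGISTRY.put("noop", NoopSink::new);
        REGISTRY.put("memory", InMemorySink::new);
    }

    /** Builds the sink named in the configuration, defaulting to the no-op sink. */
    static SpanSink fromConfig(Map<String, String> conf) {
        String name = conf.getOrDefault(SINK_KEY, "noop");
        Supplier<SpanSink> factory = REGISTRY.get(name);
        if (factory == null) {
            throw new IllegalArgumentException("Unknown trace sink: " + name);
        }
        return factory.get();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(SINK_KEY, "memory");
        SpanSink sink = fromConfig(conf);
        long now = System.currentTimeMillis();
        sink.receive(new Span("readBlock", now, now + 5));
        System.out.println("configured sink: " + sink.getClass().getSimpleName());
    }
}
```

The point of the indirection is that instrumented code depends only on the
sink interface, so switching backends is a configuration change rather than a
code change.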

But I still hope we can find people interested in moving tracing forward
in hadoop. Let's circle back on this in a little bit...

Colin


> 
> Thoughts?
> Mike
> 
> 
> On Thu, Aug 31, 2017 at 3:32 AM, Raam Rosh-Hai wrote:
> 
> > Thank you Colin for your response,
> > I was actually referring to Adrian's comment on making Htrace and Zipkin
> > play more nicely together, I have been using Htrace with zipkin for a while
> > now and I am doing pretty well but I was wondering if there are some
> > glaring pain points that should be attended.
> >
> > On 30 August 2017 at 23:41, Colin McCabe  wrote:
> >
> > > On Mon, Aug 21, 2017, at 02:09, Raam Rosh-Hai wrote:
> > > > I tend to agree that zipkin should be used as a frontend and I am pretty
> > > > sure I will have some time to advance that in the coming weeks if one of
> > > > the experts could create a few tasks.
> > >
> > > Hi Raam,
> > >
> > > If you're interested in this, check out the Zipkin trace sink.  That
> > > allows you to send HTrace spans to Zipkin.
> > >
> > > >
> > > > I think no one touched the obvious thing that tracing lacks any kind of
> > > > hype, it's just not glamorous and in order to want to use such a framework
> > > > you need to be versed in this world and to actually suffer from
> > > > distributed debugging.
> > >
> > > Yeah, I think this is an area where some publicity and outreach would be
> > > a good thing :)
> > >
> > > It would help a lot if we could identify some core use-cases, like Stack
> > > said, and make sure everything works great for those.
> > >
> > > Colin
> > >
> > >
> > > >
> > > > so yes, I think this project deserves one last push before putting it in
> > > > the attic and I would be happy to complete a few issues in order to make
> > > > this happen.
> > > >
> > > >
> > > >
> > > > On 19 August 2017 at 03:55, Adrian Cole wrote:
> > > >
> > > > > > Thanks Adrian for the editorial on the landscape. Helps, especially
> > > > > coming
> > > > > > from yourself.
> > > > > we aim to please
> > > > >
> > > > > > Given current state of the project, a retrofit to come up on OT is
> > > not
> > > > > the
> > > > > > solution to the topic-at-hand (and besides I have a colored opinion
> > > on
> > > > > > taking on the API of another after spending a bunch of time
> > recently
> > > > > > undoing our mistake letting third-party Interfaces and Classes show
> > > > > through
> > > > > > in hbase).
> > > > > sensible for any api highly disconnected from the ecosystem,
> > > > > especially without practice yet.
> > > > >
> > > > > > I appreciate the higher-level point made by Andrew, that it is hard
> > > to
> > > > > > thread a cross-cutting library across the Hadoop landscape whether
> > > > > because
> > > > > > releases happen on the geologic time scale or that there is little
> > > by way
> > > > > > of coordination.
> > > > > I think this is indeed leading a path towards focus, eg the H in
> > > Htrace :)
> > > > >
> > > > > > Can we do a focused 'win' like Colin suggests? E.g. hook up hbase
> > and
> > > > > hdfs
> > > > > > end-to-end with connection to a viewer (zipkin? Or text dumps in a
> > > > > > webpage?). A while back I had a go at the hbase side but it was
> > > burning
> > > > > up
> > > > > > the hours just getting it hooked up w/ tests to 

Re: [DISCUSS] Attic podling Apache HTrace?

2017-09-13 Thread Mike Drob
Hi folks,

Want to pull this conversation back into people's minds to summarize and
reposit some questions.

It looks like there is consensus that 1) HTrace is generally useful but 2)
needs to refocus on more specific use cases rather than trying to be a
general purpose tool and 3) suffers from being niche on niche.

There have been lots of folks volunteering to contribute, but I don't see
any resulting JIRA or mailing list activity. We all can have the best of
intentions while still not having the necessary time available to follow
through, myself included.

Projects can still be successful outside of the ASF; retiring as a podling
does not preclude others from picking up the works. As one example, I have
seen many vibrant communities run as github projects with a core set of
dedicated contributors. I don't believe that living in the ASF makes it any
easier or harder for other Apache projects such as Hadoop to integrate with
us, so long as the licensing remains compatible. In the future, if a
diverse community springs up again and finds the desire to come back, I
don't think that will be a problem to re-incubate at that point either.

Thoughts?
Mike


On Thu, Aug 31, 2017 at 3:32 AM, Raam Rosh-Hai  wrote:

> Thank you Colin for your response,
> I was actually referring to Adrian's comment on making Htrace and Zipkin
> play more nicely together, I have been using Htrace with zipkin for a while
> now and I am doing pretty well but I was wondering if there are some
> glaring pain points that should be attended.
>
> On 30 August 2017 at 23:41, Colin McCabe  wrote:
>
> > On Mon, Aug 21, 2017, at 02:09, Raam Rosh-Hai wrote:
> > > I tend to agree that zipkin should be used as a frontend and I am pretty
> > > sure I will have some time to advance that in the coming weeks if one of
> > > the experts could create a few tasks.
> >
> > Hi Raam,
> >
> > If you're interested in this, check out the Zipkin trace sink.  That
> > allows you to send HTrace spans to Zipkin.
> >
> > >
> > > I think no one touched the obvious thing that tracing lacks any kind of
> > > hype, it's just not glamorous and in order to want to use such a framework
> > > you need to be versed in this world and to actually suffer from
> > > distributed debugging.
> >
> > Yeah, I think this is an area where some publicity and outreach would be
> > a good thing :)
> >
> > It would help a lot if we could identify some core use-cases, like Stack
> > said, and make sure everything works great for those.
> >
> > Colin
> >
> >
> > >
> > > so yes, I think this project deserves one last push before putting it in
> > > the attic and I would be happy to complete a few issues in order to make
> > > this happen.
> > >
> > >
> > >
> > > On 19 August 2017 at 03:55, Adrian Cole wrote:
> > >
> > > > > Thanks Adrian for the editorial on the landscape. Helps, especially
> > > > coming
> > > > > from yourself.
> > > > we aim to please
> > > >
> > > > > Given current state of the project, a retrofit to come up on OT is
> > not
> > > > the
> > > > > solution to the topic-at-hand (and besides I have a colored opinion
> > on
> > > > > taking on the API of another after spending a bunch of time
> recently
> > > > > undoing our mistake letting third-party Interfaces and Classes show
> > > > through
> > > > > in hbase).
> > > > sensible for any api highly disconnected from the ecosystem,
> > > > especially without practice yet.
> > > >
> > > > > I appreciate the higher-level point made by Andrew, that it is hard
> > to
> > > > > thread a cross-cutting library across the Hadoop landscape whether
> > > > because
> > > > > releases happen on the geologic time scale or that there is little
> > by way
> > > > > of coordination.
> > > > I think this is indeed leading a path towards focus, eg the H in
> > Htrace :)
> > > >
> > > > > Can we do a focused 'win' like Colin suggests? E.g. hook up hbase
> and
> > > > hdfs
> > > > > end-to-end with connection to a viewer (zipkin? Or text dumps in a
> > > > > webpage?). A while back I had a go at the hbase side but it was
> > burning
> > > > up
> > > > > the hours just getting it hooked up w/ tests to scream if any spans
> > were
> > > > > broken in a refactor. I had to put it aside.
> > > > Incidentally, I wouldn't necessarily say Zipkin is ready out of box
> > > > because htrace UI and query is more advanced (in some ways due to
> some
> > > > data storage options we have available). So, something like this
> could
> > > > be a move of focus which would require investment on the other side
> to
> > > > avail features needed, or discuss how to upgrade into them (ex if
> > > > using hbase storage, certain queries would work). It is fair to say
> > > > zipkin has  a great devops pipeline, we are good at fixing things. At
> > > > the same time, we are imperfect in impl and inexperienced in hadoop
> > > > ecosystem. Having some way to join together could be 

Re: [DISCUSS] Attic podling Apache HTrace?

2017-08-31 Thread Raam Rosh-Hai
Thank you Colin for your response,
I was actually referring to Adrian's comment on making Htrace and Zipkin
play more nicely together, I have been using Htrace with zipkin for a while
now and I am doing pretty well but I was wondering if there are some
glaring pain points that should be attended.

On 30 August 2017 at 23:41, Colin McCabe  wrote:

> On Mon, Aug 21, 2017, at 02:09, Raam Rosh-Hai wrote:
> > I tend to agree that zipkin should be used as a frontend and I am pretty
> > sure I will have some time to advance that in the coming weeks if one of
> > the experts could create a few tasks.
>
> Hi Raam,
>
> If you're interested in this, check out the Zipkin trace sink.  That
> allows you to send HTrace spans to Zipkin.
>
> >
> > I think no one touched the obvious thing that tracing lacks any kind of
> > hype, it's just not glamorous and in order to want to use such a framework
> > you need to be versed in this world and to actually suffer from
> > distributed debugging.
>
> Yeah, I think this is an area where some publicity and outreach would be
> a good thing :)
>
> It would help a lot if we could identify some core use-cases, like Stack
> said, and make sure everything works great for those.
>
> Colin
>
>
> >
> > so yes, I think this project deserves one last push before putting it in
> > the attic and I would be happy to complete a few issues in order to make
> > this happen.
> >
> >
> >
> > On 19 August 2017 at 03:55, Adrian Cole  wrote:
> >
> > > > Thanks Adrian for the editorial on the landscape. Helps, especially
> > > coming
> > > > from yourself.
> > > we aim to please
> > >
> > > > Given current state of the project, a retrofit to come up on OT is
> not
> > > the
> > > > solution to the topic-at-hand (and besides I have a colored opinion
> on
> > > > taking on the API of another after spending a bunch of time recently
> > > > undoing our mistake letting third-party Interfaces and Classes show
> > > through
> > > > in hbase).
> > > sensible for any api highly disconnected from the ecosystem,
> > > especially without practice yet.
> > >
> > > > I appreciate the higher-level point made by Andrew, that it is hard
> to
> > > > thread a cross-cutting library across the Hadoop landscape whether
> > > because
> > > > releases happen on the geologic time scale or that there is little
> by way
> > > > of coordination.
> > > I think this is indeed leading a path towards focus, eg the H in
> Htrace :)
> > >
> > > > Can we do a focused 'win' like Colin suggests? E.g. hook up hbase and
> > > hdfs
> > > > end-to-end with connection to a viewer (zipkin? Or text dumps in a
> > > > webpage?). A while back I had a go at the hbase side but it was
> burning
> > > up
> > > > the hours just getting it hooked up w/ tests to scream if any spans
> were
> > > > broken in a refactor. I had to put it aside.
> > > Incidentally, I wouldn't necessarily say Zipkin is ready out of box
> > > because htrace UI and query is more advanced (in some ways due to some
> > > data storage options we have available). So, something like this could
> > > be a move of focus which would require investment on the other side to
> > > avail features needed, or discuss how to upgrade into them (ex if
> > > using hbase storage, certain queries would work). It is fair to say
> > > zipkin has  a great devops pipeline, we are good at fixing things. At
> > > the same time, we are imperfect in impl and inexperienced in hadoop
> > > ecosystem. Having some way to join together could be really
> > > beneficial, at the cost of up-front effort (due to model, UI and
> > > storage differences). I would be happy to direct time, though would
> > > need some help because of my irrelevance in the data services space
> > > (something this might correct!)
> > >
> > > > Like the rest of you, my time is a little occupied elsewhere these
> times
> > > so
> > > > I can't revive the project, not at the moment at least.
> > > ack
> > >
>


Re: [DISCUSS] Attic podling Apache HTrace?

2017-08-30 Thread Colin McCabe
On Mon, Aug 21, 2017, at 02:09, Raam Rosh-Hai wrote:
> I tend to agree that zipkin should be used as a frontend and I am pretty
> sure I will have some time to advance that in the coming weeks if one of
> the experts could create a few tasks.

Hi Raam,

If you're interested in this, check out the Zipkin trace sink.  That
allows you to send HTrace spans to Zipkin.

> 
> I think no one touched the obvious thing that tracing lacks any kind of
> hype, it's just not glamorous and in order to want to use such a framework
> you need to be versed in this world and to actually suffer from
> distributed debugging.

Yeah, I think this is an area where some publicity and outreach would be
a good thing :)

It would help a lot if we could identify some core use-cases, like Stack
said, and make sure everything works great for those.

Colin


> 
> so yes, I think this project deserves one last push before putting it in
> the attic and I would be happy to complete a few issues in order to make
> this happen.
> 
> 
> 
> On 19 August 2017 at 03:55, Adrian Cole  wrote:
> 
> > > Thanks Adrian for the editorial on the landscape. Helps, especially
> > coming
> > > from yourself.
> > we aim to please
> >
> > > Given current state of the project, a retrofit to come up on OT is not
> > the
> > > solution to the topic-at-hand (and besides I have a colored opinion on
> > > taking on the API of another after spending a bunch of time recently
> > > undoing our mistake letting third-party Interfaces and Classes show
> > through
> > > in hbase).
> > sensible for any api highly disconnected from the ecosystem,
> > especially without practice yet.
> >
> > > I appreciate the higher-level point made by Andrew, that it is hard to
> > > thread a cross-cutting library across the Hadoop landscape whether
> > because
> > > releases happen on the geologic time scale or that there is little by way
> > > of coordination.
> > I think this is indeed leading a path towards focus, eg the H in Htrace :)
> >
> > > Can we do a focused 'win' like Colin suggests? E.g. hook up hbase and
> > hdfs
> > > end-to-end with connection to a viewer (zipkin? Or text dumps in a
> > > webpage?). A while back I had a go at the hbase side but it was burning
> > up
> > > the hours just getting it hooked up w/ tests to scream if any spans were
> > > broken in a refactor. I had to put it aside.
> > Incidentally, I wouldn't necessarily say Zipkin is ready out of box
> > because htrace UI and query is more advanced (in some ways due to some
> > data storage options we have available). So, something like this could
> > be a move of focus which would require investment on the other side to
> > avail features needed, or discuss how to upgrade into them (ex if
> > using hbase storage, certain queries would work). It is fair to say
> > zipkin has  a great devops pipeline, we are good at fixing things. At
> > the same time, we are imperfect in impl and inexperienced in hadoop
> > ecosystem. Having some way to join together could be really
> > beneficial, at the cost of up-front effort (due to model, UI and
> > storage differences). I would be happy to direct time, though would
> > need some help because of my irrelevance in the data services space
> > (something this might correct!)
> >
> > > Like the rest of you, my time is a little occupied elsewhere these times
> > so
> > > I can't revive the project, not at the moment at least.
> > ack
> >


Re: [DISCUSS] Attic podling Apache HTrace?

2017-08-18 Thread Adrian Cole
> Thanks Adrian for the editorial on the landscape. Helps, especially coming
> from yourself.
we aim to please

> Given current state of the project, a retrofit to come up on OT is not the
> solution to the topic-at-hand (and besides I have a colored opinion on
> taking on the API of another after spending a bunch of time recently
> undoing our mistake letting third-party Interfaces and Classes show through
> in hbase).
sensible for any api highly disconnected from the ecosystem,
especially without practice yet.

> I appreciate the higher-level point made by Andrew, that it is hard to
> thread a cross-cutting library across the Hadoop landscape whether because
> releases happen on the geologic time scale or that there is little by way
> of coordination.
I think this is indeed leading a path towards focus, eg the H in Htrace :)

> Can we do a focused 'win' like Colin suggests? E.g. hook up hbase and hdfs
> end-to-end with connection to a viewer (zipkin? Or text dumps in a
> webpage?). A while back I had a go at the hbase side but it was burning up
> the hours just getting it hooked up w/ tests to scream if any spans were
> broken in a refactor. I had to put it aside.
Incidentally, I wouldn't necessarily say Zipkin is ready out of box
because htrace UI and query is more advanced (in some ways due to some
data storage options we have available). So, something like this could
be a move of focus which would require investment on the other side to
avail features needed, or discuss how to upgrade into them (ex if
using hbase storage, certain queries would work). It is fair to say
zipkin has  a great devops pipeline, we are good at fixing things. At
the same time, we are imperfect in impl and inexperienced in hadoop
ecosystem. Having some way to join together could be really
beneficial, at the cost of up-front effort (due to model, UI and
storage differences). I would be happy to direct time, though would
need some help because of my irrelevance in the data services space
(something this might correct!)

> Like the rest of you, my time is a little occupied elsewhere these times so
> I can't revive the project, not at the moment at least.
ack


Re: [DISCUSS] Attic podling Apache HTrace?

2017-08-17 Thread Adrian Cole
n use both HTrace
> > 3 and HTrace 4.  This was absolutely essential for us because of the
> > version skew issues you mention.
> >
> > > On Thu, Aug 17, 2017 at 11:04 AM, lewis john mcgibbney <
> > lewi...@apache.org> wrote:
> > >
> > > > Hi Mike,
> > > > I think this is a fair question. We've probably all been associated with
> > > > projects which just don't really make it. It would appear that HTrace is
> > > > one of them. This is not to say that there is nothing going on with the
> > > > tracing effort generally (as there is) but it looks like HTrace as a
> > > > project may be headed to the Attic.
> > > > I suppose the response to this thread will determine what happens...
> >
> > Thanks, Lewis.
> >
> > I think maybe we should try to identify the top tracing priorities for
> > HBase and HDFS and see how HTrace / OpenTracing / OpenZipkin could fit
> > into those.  Just start from a nice crisp set of requirements, like
> > Stack suggested, and think about how we could make those a reality.  If
> > we can advance the state of tracing in hadoop, that will be a good thing
> > for our users, even if htrace goes to the attic.  I've been mostly
> > working on Apache Kafka these days but I could drop by to brainstorm.
> >
> > best,
> > Colin
> >
> >
> > > > Lewis
> > > >
> > > >
> > > >
> > > > On Wed, Aug 16, 2017 at 10:01 AM, <
> > > > dev-digest-h...@htrace.incubator.apache.org> wrote:
> > > >
> > > > >
> > > > > From: Mike Drob <md...@apache.org>
> > > > > To: dev@htrace.incubator.apache.org
> > > > > Cc:
> > > > > Bcc:
> > > > > Date: Wed, 16 Aug 2017 12:00:49 -0500
> > > > > Subject: [DISCUSS] Attic podling Apache HTrace?
> > > > > Hi folks,
> > > > >
> > > > > Want to bring up a potentially uncomfortable topic for some. Is it time to
> > > > > retire/attic the project?
> > > > >
> > > > > We've seen a minimal amount of activity in the past year. The last
> > > > release
> > > > > had two bug fixes, and had been pending for several months before
> > > > somebody
> > > > > reminded me to push the artifacts to subversion from the staging
> > > > directory.
> > > > >
> > > > > I'd love to see a renewed set of activity here, but I don't think
> > there
> > > > is
> > > > > a ton of interest going on.
> > > > >
> > > > > HBase is still on version 3. So is Accumulo, I think. Hadoop is on 4.1,
> > > > > which is a good sign, but I haven't heard much from them recently. I
> > > > > definitely do not think we are at the point where a lack of releases and
> > > > > activity is a sign of super advanced maturity and stability.
> > > > >
> > > > > Your thoughts?
> > > > >
> > > > > Mike
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > http://home.apache.org/~lewismc/
> > > > @hectorMcSpector
> > > > http://www.linkedin.com/in/lmcgibbney
> > > >
> > >
> > >
> > >
> > > --
> > > Best regards,
> > > Andrew
> > >
> > > Words like orphans lost among the crosstalk, meaning torn from truth's
> > > decrepit hands
> > >- A23, Crosstalk
> >
>
>
>
> --
> Best regards,
> Andrew
>
> Words like orphans lost among the crosstalk, meaning torn from truth's
> decrepit hands
>- A23, Crosstalk


Re: [DISCUSS] Attic podling Apache HTrace?

2017-08-17 Thread Colin McCabe
lems with the protobuf-java library.  I wish gRPC
> > luck, but I think it's good for people to experiment with different
> > libraries.  It doesn't make sense to try to force everyone to use one
> > thing, even if we could.
> >
> > > The Hadoop ecosystem is always partially at odds with itself, if for no
> > > other reason than there is no shared vision among the projects. There are
> > > no coordinated releases. There isn't even agreement on which version of
> > > shared dependencies to use (hence the recurring pain in various places
> > > with
> > > downstream version changes of protobuf, guava, jackson, etc. etc).
> > > Therefore HTrace is severely constrained on what API changes can be made.
> > > Unfortunately the different major versions of HTrace do not interoperate
> > > at
> > > all. And are not even source compatible. While this is not unreasonable at all
> > > for a project in incubation, when combined with the inability of the
> > > Hadoop
> > > ecosystem to coordinate releases as a cross-cutting dependency ships a
> > > new
> > > version, this has reduced the utility of HTrace to effectively nil for
> > > the
> > > average user. I am sorry to say that. Only a commercial Hadoop vendor or
> > > power user can be expected to patch and build a stack that actually
> > > works.
> >
> > One correction: The different major versions of HTrace are indeed source
> > code compatible.  You can build an application that can use both HTrace
> > 3 and HTrace 4.  This was absolutely essential for us because of the
> > version skew issues you mention.
> >
> > > On Thu, Aug 17, 2017 at 11:04 AM, lewis john mcgibbney <
> > lewi...@apache.org> wrote:
> > >
> > > > Hi Mike,
> > > > I think this is a fair question. We've probably all been associated
> > with
> > > > projects which just don't really make it. It would appear that HTrace
> > is
> > > > one of them. This is not to say that there is nothing going on with the
> > > > tracing effort generally (as there is) but it looks like HTrace as a
> > > > project may be headed to the Attic.
> > > > I suppose the response to this thread will determine what happens...
> >
> > Thanks, Lewis.
> >
> > I think maybe we should try to identify the top tracing priorities for
> > HBase and HDFS and see how HTrace / OpenTracing / OpenZipkin could fit
> > into those.  Just start from a nice crisp set of requirements, like
> > Stack suggested, and think about how we could make those a reality.  If
> > we can advance the state of tracing in hadoop, that will be a good thing
> > for our users, even if htrace goes to the attic.  I've been mostly
> > working on Apache Kafka these days but I could drop by to brainstorm.
> >
> > best,
> > Colin
> >
> >
> > > > Lewis
> > > >
> > > >
> > > >
> > > > On Wed, Aug 16, 2017 at 10:01 AM, <
> > > > dev-digest-h...@htrace.incubator.apache.org> wrote:
> > > >
> > > > >
> > > > > From: Mike Drob <md...@apache.org>
> > > > > To: dev@htrace.incubator.apache.org
> > > > > Cc:
> > > > > Bcc:
> > > > > Date: Wed, 16 Aug 2017 12:00:49 -0500
> > > > > Subject: [DISCUSS] Attic podling Apache HTrace?
> > > > > Hi folks,
> > > > >
> > > > > Want to bring up a potentially uncomfortable topic for some. Is it time to
> > > > > retire/attic the project?
> > > > >
> > > > > We've seen a minimal amount of activity in the past year. The last
> > > > release
> > > > > had two bug fixes, and had been pending for several months before
> > > > somebody
> > > > > reminded me to push the artifacts to subversion from the staging
> > > > directory.
> > > > >
> > > > > I'd love to see a renewed set of activity here, but I don't think
> > there
> > > > is
> > > > > a ton of interest going on.
> > > > >
> > > > > HBase is still on version 3. So is Accumulo, I think. Hadoop is on 4.1,
> > > > > which is a good sign, but I haven't heard much from them recently. I
> > > > > definitely do not think we are at the point where a lack of releases and
> > > > > activity is a sign of super advanced maturity and stability.
> > > > >
> > > > > Your thoughts?
> > > > >
> > > > > Mike
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > http://home.apache.org/~lewismc/
> > > > @hectorMcSpector
> > > > http://www.linkedin.com/in/lmcgibbney
> > > >
> > >
> > >
> > >
> > > --
> > > Best regards,
> > > Andrew
> > >
> > > Words like orphans lost among the crosstalk, meaning torn from truth's
> > > decrepit hands
> > >- A23, Crosstalk
> >
> 
> 
> 
> -- 
> Best regards,
> Andrew
> 
> Words like orphans lost among the crosstalk, meaning torn from truth's
> decrepit hands
>- A23, Crosstalk


Re: [DISCUSS] Attic podling Apache HTrace?

2017-08-17 Thread Andrew Purtell
ommercial Hadoop vendor or
> > power user can be expected to patch and build a stack that actually
> > works.
>
> One correction: The different major versions of HTrace are indeed source
> code compatible.  You can build an application that can use both HTrace
> 3 and HTrace 4.  This was absolutely essential for us because of the
> version skew issues you mention.
>
> > On Thu, Aug 17, 2017 at 11:04 AM, lewis john mcgibbney <
> lewi...@apache.org> wrote:
> >
> > > Hi Mike,
> > > I think this is a fair question. We've probably all been associated
> with
> > > projects which just don't really make it. It would appear that HTrace
> is
> > > one of them. This is not to say that there is nothing going on with the
> > > tracing effort generally (as there is) but it looks like HTrace as a
> > > project may be headed to the Attic.
> > > I suppose the response to this thread will determine what happens...
>
> Thanks, Lewis.
>
> I think maybe we should try to identify the top tracing priorities for
> HBase and HDFS and see how HTrace / OpenTracing / OpenZipkin could fit
> into those.  Just start from a nice crisp set of requirements, like
> Stack suggested, and think about how we could make those a reality.  If
> we can advance the state of tracing in hadoop, that will be a good thing
> for our users, even if htrace goes to the attic.  I've been mostly
> working on Apache Kafka these days but I could drop by to brainstorm.
>
> best,
> Colin
>
>
> > > Lewis
> > >
> > >
> > >
> > > On Wed, Aug 16, 2017 at 10:01 AM, <
> > > dev-digest-h...@htrace.incubator.apache.org> wrote:
> > >
> > > >
> > > > From: Mike Drob <md...@apache.org>
> > > > To: dev@htrace.incubator.apache.org
> > > > Cc:
> > > > Bcc:
> > > > Date: Wed, 16 Aug 2017 12:00:49 -0500
> > > > Subject: [DISCUSS] Attic podling Apache HTrace?
> > > > Hi folks,
> > > >
> > > > Want to bring up a potentially uncomfortable topic for some. Is it time to
> > > > retire/attic the project?
> > > >
> > > > We've seen a minimal amount of activity in the past year. The last
> > > release
> > > > had two bug fixes, and had been pending for several months before
> > > somebody
> > > > reminded me to push the artifacts to subversion from the staging
> > > directory.
> > > >
> > > > I'd love to see a renewed set of activity here, but I don't think
> there
> > > is
> > > > a ton of interest going on.
> > > >
> > > > HBase is still on version 3. So is Accumulo, I think. Hadoop is on 4.1,
> > > > which is a good sign, but I haven't heard much from them recently. I
> > > > definitely do not think we are at the point where a lack of releases and
> > > > activity is a sign of super advanced maturity and stability.
> > > >
> > > > Your thoughts?
> > > >
> > > > Mike
> > > >
> > > >
> > >
> > >
> > > --
> > > http://home.apache.org/~lewismc/
> > > @hectorMcSpector
> > > http://www.linkedin.com/in/lmcgibbney
> > >
> >
> >
> >
> > --
> > Best regards,
> > Andrew
> >
> > Words like orphans lost among the crosstalk, meaning torn from truth's
> > decrepit hands
> >- A23, Crosstalk
>



-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


Re: [DISCUSS] Attic podling Apache HTrace?

2017-08-17 Thread Andrew Purtell
lly make it. It would appear that HTrace
> is
> > > one of them. This is not to say that there is nothing going on with the
> > > tracing effort generally (as there is) but it looks like HTrace as a
> > > project may be headed to the Attic.
> > > I suppose the response to this thread will determine what happens...
>
> Thanks, Lewis.
>
> I think maybe we should try to identify the top tracing priorities for
> HBase and HDFS and see how HTrace / OpenTracing / OpenZipkin could fit
> into those.  Just start from a nice crisp set of requirements, like
> Stack suggested, and think about how we could make those a reality.  If
> we can advance the state of tracing in hadoop, that will be a good thing
> for our users, even if htrace goes to the attic.  I've been mostly
> working on Apache Kafka these days but I could drop by to brainstorm.
>
> best,
> Colin
>
>
> > > Lewis
> > >
> > >
> > >
> > > On Wed, Aug 16, 2017 at 10:01 AM, <
> > > dev-digest-h...@htrace.incubator.apache.org> wrote:
> > >
> > > >
> > > > From: Mike Drob <md...@apache.org>
> > > > To: dev@htrace.incubator.apache.org
> > > > Cc:
> > > > Bcc:
> > > > Date: Wed, 16 Aug 2017 12:00:49 -0500
> > > > Subject: [DISCUSS] Attic podling Apache HTrace?
> > > > Hi folks,
> > > >
> > > > > Want to bring up a potentially uncomfortable topic for some. Is it time to
> > > > > retire/attic the project?
> > > > >
> > > > > We've seen a minimal amount of activity in the past year. The last release
> > > > > had two bug fixes, and had been pending for several months before somebody
> > > > > reminded me to push the artifacts to subversion from the staging directory.
> > > > >
> > > > > I'd love to see a renewed set of activity here, but I don't think there is
> > > > > a ton of interest going on.
> > > > >
> > > > > HBase is still on version 3. So is Accumulo, I think. Hadoop is on 4.1,
> > > > > which is a good sign, but I haven't heard much from them recently. I
> > > > > definitely do not think we are at the point where a lack of releases and
> > > > > activity is a sign of super advanced maturity and stability.
> > > >
> > > > Your thoughts?
> > > >
> > > > Mike
> > > >
> > > >
> > >
> > >
> > > --
> > > http://home.apache.org/~lewismc/
> > > @hectorMcSpector
> > > http://www.linkedin.com/in/lmcgibbney
> > >
> >
> >
> >
> > --
> > Best regards,
> > Andrew
> >
> > Words like orphans lost among the crosstalk, meaning torn from truth's
> > decrepit hands
> >- A23, Crosstalk
>



-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


Re: [DISCUSS] Attic podling Apache HTrace?

2017-08-17 Thread Colin McCabe
lin


> > Lewis
> >
> >
> > On Wed, Aug 16, 2017 at 10:01 AM, <
> > dev-digest-h...@htrace.incubator.apache.org> wrote:
> >
> > >
> > > From: Mike Drob <md...@apache.org>
> > > To: dev@htrace.incubator.apache.org
> > > Cc:
> > > Bcc:
> > > Date: Wed, 16 Aug 2017 12:00:49 -0500
> > > Subject: [DISCUSS] Attic podling Apache HTrace?
> > > Hi folks,
> > >
> > > > Want to bring up a potentially uncomfortable topic for some. Is it time to
> > > > retire/attic the project?
> > > >
> > > > We've seen a minimal amount of activity in the past year. The last release
> > > > had two bug fixes, and had been pending for several months before somebody
> > > > reminded me to push the artifacts to subversion from the staging directory.
> > > >
> > > > I'd love to see a renewed set of activity here, but I don't think there is
> > > > a ton of interest going on.
> > > >
> > > > HBase is still on version 3. So is Accumulo, I think. Hadoop is on 4.1,
> > > > which is a good sign, but I haven't heard much from them recently. I
> > > > definitely do not think we are at the point where a lack of releases and
> > > > activity is a sign of super advanced maturity and stability.
> > >
> > > Your thoughts?
> > >
> > > Mike
> > >
> > >
> >
> >
> > --
> > http://home.apache.org/~lewismc/
> > @hectorMcSpector
> > http://www.linkedin.com/in/lmcgibbney
> >
> 
> 
> 
> -- 
> Best regards,
> Andrew
> 
> Words like orphans lost among the crosstalk, meaning torn from truth's
> decrepit hands
>- A23, Crosstalk


Re: [DISCUSS] Attic podling Apache HTrace?

2017-08-17 Thread Andrew Purtell
What about OpenTracing (http://opentracing.io/)? Is this the successor
project to ZipKin? In particular grpc-opentracing (
https://github.com/grpc-ecosystem/grpc-opentracing) seems to finally
fulfill in open source the tracing architecture described in the Dapper
paper.

If one takes a step back and looks at all of the hand rolled RPC stacks in
the Hadoop ecosystem it's a mess. It is a heavier lift but getting everyone
migrated to a single RPC stack - gRPC - would provide the unified tracing
layer envisioned by HTrace. The tracing integration is then done exactly in
one place. In contrast HTrace requires all of the components to sprinkle
spans throughout the application code.
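
To make the contrast above concrete, here is a minimal Python sketch. It is not HTrace, gRPC, or OpenTracing code: `SPANS`, `span`, `read_block_manual`, `read_block`, and `TracedRpcServer` are hypothetical names invented for illustration, and the "RPC layer" is just an in-process dispatcher. It only aims to show the difference between opening spans inside each handler and opening them once in a shared call path.

```python
import time
from contextlib import contextmanager

# Hypothetical in-memory span sink; a real tracer would ship spans to a collector.
SPANS = []


@contextmanager
def span(name):
    """Record (name, duration in seconds) for the wrapped block."""
    start = time.monotonic()
    try:
        yield
    finally:
        SPANS.append((name, time.monotonic() - start))


# Style 1: every handler opens its own span by hand.
def read_block_manual(block_id):
    with span("read_block"):
        return f"data-{block_id}"


# Style 2: handlers stay untraced; the shared RPC layer wraps every call.
def read_block(block_id):
    return f"data-{block_id}"


class TracedRpcServer:
    """Toy dispatcher standing in for a single shared RPC stack."""

    def __init__(self):
        self._handlers = {}

    def register(self, method, handler):
        self._handlers[method] = handler

    def call(self, method, *args):
        # The only place where tracing is integrated.
        with span(f"rpc:{method}"):
            return self._handlers[method](*args)


server = TracedRpcServer()
server.register("read_block", read_block)
```

In the second style a handler that is moved or refactored keeps its span, because the span belongs to the dispatch path rather than to the handler body.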

The Hadoop ecosystem is always partially at odds with itself, if for no
other reason than there is no shared vision among the projects. There are
no coordinated releases. There isn't even agreement on which version of
shared dependencies to use (hence the recurring pain in various places with
downstream version changes of protobuf, guava, jackson, etc. etc).
Therefore HTrace is severely constrained on what API changes can be made.
Unfortunately the different major versions of HTrace do not interoperate at
all. And are not even source compatible. While this is not unreasonable at all
for a project in incubation, when combined with the inability of the Hadoop
ecosystem to coordinate releases as a cross-cutting dependency ships a new
version, this has reduced the utility of HTrace to effectively nil for the
average user. I am sorry to say that. Only a commercial Hadoop vendor or
power user can be expected to patch and build a stack that actually works.

On Thu, Aug 17, 2017 at 11:04 AM, lewis john mcgibbney <lewi...@apache.org> wrote:

> Hi Mike,
> I think this is a fair question. We've probably all been associated with
> projects which just don't really make it. It would appear that HTrace is
> one of them. This is not to say that there is nothing going on with the
> tracing effort generally (as there is) but it looks like HTrace as a
> project may be headed to the Attic.
> I suppose the response to this thread will determine what happens...
> Lewis
>
>
> On Wed, Aug 16, 2017 at 10:01 AM, <
> dev-digest-h...@htrace.incubator.apache.org> wrote:
>
> >
> > From: Mike Drob <md...@apache.org>
> > To: dev@htrace.incubator.apache.org
> > Cc:
> > Bcc:
> > Date: Wed, 16 Aug 2017 12:00:49 -0500
> > Subject: [DISCUSS] Attic podling Apache HTrace?
> > Hi folks,
> >
> > Want to bring up a potentially uncomfortable topic for some. Is it time to
> > retire/attic the project?
> >
> > We've seen a minimal amount of activity in the past year. The last release
> > had two bug fixes, and had been pending for several months before somebody
> > reminded me to push the artifacts to subversion from the staging directory.
> >
> > I'd love to see a renewed set of activity here, but I don't think there is
> > a ton of interest going on.
> >
> > HBase is still on version 3. So is Accumulo, I think. Hadoop is on 4.1,
> > which is a good sign, but I haven't heard much from them recently. I
> > definitely do not think we are at the point where a lack of releases and
> > activity is a sign of super advanced maturity and stability.
> >
> > Your thoughts?
> >
> > Mike
> >
> >
>
>
> --
> http://home.apache.org/~lewismc/
> @hectorMcSpector
> http://www.linkedin.com/in/lmcgibbney
>



-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


Re: [DISCUSS] Attic podling Apache HTrace?

2017-08-17 Thread lewis john mcgibbney
Hi Mike,
I think this is a fair question. We've probably all been associated with
projects which just don't really make it. It would appear that HTrace is
one of them. This is not to say that there is nothing going on with the
tracing effort generally (as there is) but it looks like HTrace as a
project may be headed to the Attic.
I suppose the response to this thread will determine what happens...
Lewis

On Wed, Aug 16, 2017 at 10:01 AM, <
dev-digest-h...@htrace.incubator.apache.org> wrote:

>
> From: Mike Drob <md...@apache.org>
> To: dev@htrace.incubator.apache.org
> Cc:
> Bcc:
> Date: Wed, 16 Aug 2017 12:00:49 -0500
> Subject: [DISCUSS] Attic podling Apache HTrace?
> Hi folks,
>
> Want to bring up a potentially uncomfortable topic for some. Is it time to
> retire/attic the project?
>
> We've seen a minimal amount of activity in the past year. The last release
> had two bug fixes, and had been pending for several months before somebody
> reminded me to push the artifacts to subversion from the staging directory.
>
> I'd love to see a renewed set of activity here, but I don't think there is
> a ton of interest going on.
>
> HBase is still on version 3. So is Accumulo, I think. Hadoop is on 4.1,
> which is a good sign, but I haven't heard much from them recently. I
> definitely do not think we are at the point where a lack of releases and
> activity is a sign of super advanced maturity and stability.
>
> Your thoughts?
>
> Mike
>
>


-- 
http://home.apache.org/~lewismc/
@hectorMcSpector
http://www.linkedin.com/in/lmcgibbney


Re: [DISCUSS] Attic podling Apache HTrace?

2017-08-17 Thread Colin McCabe
Thanks for bringing this up, Mike.

The original vision for HTrace was trying to unify a bunch of disparate
Hadoop components with a unified tracing layer.  This would allow us to
debug slowness or odd behavior in a much better way.  We started from
that vision and deduced the need to build a frontend API (htrace-core),
backend data store (htrace-hbase, htrace-htraced, etc.), and web UI
(htrace-web).

I still think that vision is valid, but achieving it was a lot harder
than we expected, for a couple of reasons.

First of all, I think building all those components needed someone (or
maybe several someones) to work on it full time.  We tried to do it part
time with a few HDFS and HBase committers.  Ultimately this didn't scale
as much as we needed it to.

Secondly, we were hoping for a lot of buy-in from Hadoop vendors and big
tech companies that used Hadoop.  Unfortunately, we didn't really get
that.  The Hadoop vendors were preoccupied with other things.  Big tech
companies seem to have mostly developed their own internal systems using bits
and pieces of open source.  I think this is another area where we just
needed more budget.  In retrospect, having meetups and reaching out to
potential users is something we needed to do.  There are some other
projects that have been a lot better with this than we have.

I think we should try to refocus on some core use-cases.  Basically,
decide what we want to achieve and find the shortest path to that.  If
that involves using other projects, then that's fine-- as long as they
are open source projects compatible with the ideals of the ASF.

Off the top of my head, I can think of a few core use-cases:

* Why is my HDFS request slow?   Figure out if there are disk issues or
network issues.

* Why is my HBase request slow?  Follow HBase requests into the HDFS
layer.

* Who is making the most requests to HDFS?

* What average speed is Hadoop getting from its S3 requests?  How often
do we hit our local caches, versus going over the network?

best,
Colin


On Thu, Aug 17, 2017, at 10:04, Stack wrote:
> On Wed, Aug 16, 2017 at 10:00 AM, Mike Drob  wrote:
> 
> > Hi folks,
> >
> > Want to bring up a potentially uncomfortable topic for some. Is it time to
> > retire/attic the project?
> >
> > We've seen a minimal amount of activity in the past year. The last release
> > had two bug fixes, and had been pending for several months before somebody
> > reminded me to push the artifacts to subversion from the staging directory.
> >
> > I'd love to see a renewed set of activity here, but I don't think there is
> > a ton of interest going on.
> >
> > HBase is still on version 3. So is Accumulo, I think. Hadoop is on 4.1,
> > which is a good sign, but I haven't heard much from them recently. I
> > definitely do not think we are at the point where a lack of releases and
> > activity is a sign of super advanced maturity and stability.
> >
> > Your thoughts?
> 
> 
> Thanks Mike for starting this thread.
> 
> Activity over the last year is here [1].
> 
> Is there any testimony other than evangelizing presentations on how htrace
> has provided a benefit?
> 
> HTrace needs a bit of work. In order of import:
> 
> 1. A complete viewer (punt and use zipkin instead?)
> 2. Hooked up systems that tell wholesome trace stories: hdfs is incomplete,
> hbase is broke, accumulo/unknown, phoenix/custom-htrace... who else?
> 3. Work needs to be done so an operator can easily enable/disable trace and
> easily obtain views without impinging upon general perf
> 
> It could do w/ an API cleanup (v5.0.0?) and study of the fact that it is
> painstaking manual work adding it into a system (and that it is
> subsequently easily damaged by code movement). It needs a particular type
> of barker to drive it cross-project since the cross-project realm is when
> it starts to come into its own (and each project in its turn will resist
> since the benefit is not immediate), etc.
> 
> None of the above is under active dev.
> 
> St.Ack
> 
> 1. https://github.com/apache/incubator-htrace/graphs/commit-activity
> 
> 
> 
> > Mike
> >


Re: [DISCUSS] Attic podling Apache HTrace?

2017-08-17 Thread Stack
On Wed, Aug 16, 2017 at 10:00 AM, Mike Drob  wrote:

> Hi folks,
>
> Want to bring up a potentially uncomfortable topic for some. Is it time to
> retire/attic the project?
>
> We've seen a minimal amount of activity in the past year. The last release
> had two bug fixes, and had been pending for several months before somebody
> reminded me to push the artifacts to subversion from the staging directory.
>
> I'd love to see a renewed set of activity here, but I don't think there is
> a ton of interest going on.
>
> HBase is still on version 3. So is Accumulo, I think. Hadoop is on 4.1,
> which is a good sign, but I haven't heard much from them recently. I
> definitely do not think we are at the point where a lack of releases and
> activity is a sign of super advanced maturity and stability.
>
> Your thoughts?


Thanks Mike for starting this thread.

Activity over the last year is here [1].

Is there any testimony other than evangelizing presentations on how htrace
has provided a benefit?

HTrace needs a bit of work. In order of import:

1. A complete viewer (punt and use zipkin instead?)
2. Hooked up systems that tell wholesome trace stories: hdfs is incomplete,
hbase is broke, accumulo/unknown, phoenix/custom-htrace... who else?
3. Work needs to be done so an operator can easily enable/disable trace and
easily obtain views without impinging upon general perf

It could do w/ an API cleanup (v5.0.0?) and study of the fact that it is
painstaking manual work adding it into a system (and that it is
subsequently easily damaged by code movement). It needs a particular type
of barker to drive it cross-project since the cross-project realm is when
it starts to come into its own (and each project in its turn will resist
since the benefit is not immediate), etc.

None of the above is under active dev.

St.Ack

1. https://github.com/apache/incubator-htrace/graphs/commit-activity



> Mike
>


Re: [DISCUSS] Attic podling Apache HTrace?

2017-08-17 Thread Adrian Cole
> What are the likely alternatives for downstream projects that want 
> distributed tracing?
Yes, for general purpose or RPC, but I think HTrace is still
positioned well for data services specifically.

> Do we think the field still has a big gap that HTrace can solve?
When at twitter (a couple yrs ago now), I know the data team preferred
htrace even though we had zipkin. Most of the tracing projects out
there do not focus on data services, or only recently do. While HTrace
may not be great at filling gaps in traditional RPC (as others do this
well enough), it probably does still have compelling advantages in
data services. I think the main holdback is getting the word out
and/or showing examples where the model and UI really shine in
HTrace's sweet spot (data services).

my 2p


Re: [DISCUSS] Attic podling Apache HTrace?

2017-08-17 Thread Sean Busbey
What are the likely alternatives for downstream projects that want distributed 
tracing?

Do we think the field still has a big gap that HTrace can solve?

On 2017-08-16 12:00, Mike Drob  wrote: 
> Hi folks,
> 
> Want to bring up a potentially uncomfortable topic for some. Is it time to
> retire/attic the project?
> 
> We've seen a minimal amount of activity in the past year. The last release
> had two bug fixes, and had been pending for several months before somebody
> reminded me to push the artifacts to subversion from the staging directory.
> 
> I'd love to see a renewed set of activity here, but I don't think there is
> a ton of interest going on.
> 
> HBase is still on version 3. So is Accumulo, I think. Hadoop is on 4.1,
> which is a good sign, but I haven't heard much from them recently. I
> definitely do not think we are at the point where a lack of releases and
> activity is a sign of super advanced maturity and stability.
> 
> Your thoughts?
> 
> Mike
> 


Re: [DISCUSS] Attic podling Apache HTrace?

2017-08-17 Thread Masatake Iwasaki

Hi Mike,

Thanks for putting this issue up.

> Want to bring up a potentially uncomfortable topic for some. Is it time to
> retire/attic the project?

I would like to keep the project alive.
While we have been silent for months,
many of the committers are still working on projects
using HTrace (such as Hadoop and HBase) and
we are capable of making a new release if new major issues are found.

> HBase is still on version 3. So is Accumulo, I think. Hadoop is on 4.1,
> which is a good sign, but I haven't heard much from them recently.

I will look into HBASE-14451 again and try to make it move forward.
Since one of the intents of the big change in HTrace 4 was
to improve end-to-end tracing (e.g. from HBase to HDFS),
bumping HTrace in HBase up to 4 would reveal the next task.

Regards,
Masatake Iwasaki

On 8/17/17 02:00, Mike Drob wrote:

Hi folks,

Want to bring up a potentially uncomfortable topic for some. Is it time to
retire/attic the project?

We've seen a minimal amount of activity in the past year. The last release
had two bug fixes, and had been pending for several months before somebody
reminded me to push the artifacts to subversion from the staging directory.

I'd love to see a renewed set of activity here, but I don't think there is
a ton of interest going on.

HBase is still on version 3. So is Accumulo, I think. Hadoop is on 4.1,
which is a good sign, but I haven't heard much from them recently. I
definitely do not think we are at the point where a lack of releases and
activity is a sign of super advanced maturity and stability.

Your thoughts?

Mike





[DISCUSS] Attic podling Apache HTrace?

2017-08-16 Thread Mike Drob
Hi folks,

Want to bring up a potentially uncomfortable topic for some. Is it time to
retire/attic the project?

We've seen a minimal amount of activity in the past year. The last release
had two bug fixes, and had been pending for several months before somebody
reminded me to push the artifacts to subversion from the staging directory.

I'd love to see a renewed set of activity here, but I don't think there is
a ton of interest going on.

HBase is still on version 3. So is Accumulo, I think. Hadoop is on 4.1,
which is a good sign, but I haven't heard much from them recently. I
definitely do not think we are at the point where a lack of releases and
activity is a sign of super advanced maturity and stability.

Your thoughts?

Mike