All,
We have employed Heron on a large streaming data project for our customer, and we have a deep vested interest in its adoption and success. It would be very premature to retire Heron when it has not yet reached a full Apache release. Heron has been solid for us: we've been using it for well over a year against the largest of data throughputs, it's held up great, and it is the most versatile/configurable/industrial streaming engine when using the straight/original topology approach. I'm not a fan of the Streamlets API; why try to compete there? Heron should go for being the best industrial solution!

Whatever is holding this project up and preventing it from moving out of incubation should be prioritized by this developer community. Many companies are not going to use an "incubated" Apache open source product, but once it is fully released/available in Apache, developers will have an easier time convincing their technical and business management that they can move to Heron and trust that it will have staying power. Here is a list of some tasks that I feel need to be tackled to provide an upswell of adoption:

* Get out of "Apache incubation" state to a full release - this has to happen before there is real/broad adoption. Are we looking for 'perfect'? No, because you'll never get there ... perfection can happen over time after more open source adoption!
* Capitalize on the history of this project: that it is derived from the Twitter development team, that Storm projects can easily convert to Heron, and why that is important (better performance, deployment versatility, backpressure features, etc).
* Where are the original founders of this project (Karthik Ramasamy? now at Splunk?) - why do they no longer support this project, and can we get them involved again to help get this over the finish line? We need more high-level advocacy all around to promote Heron.
* The technical barrier to entry with Heron is high; it took me a lot of grunt work years ago to figure out the proper way to make it work. So it is very important to provide strong/deep documentation, getting-started documentation, and out-of-the-box multi-language examples.
* Provide a script/helm-based end-to-end deployment that can be run on a developer's local machine/VM using well-known resource managers like Kubernetes (e.g. a minikube example) - obviously this requires pulling in multiple supporting technologies. Also provide expanded examples for large/true clusters.
* Need a better administrative tool and monitoring ability across/within the topologies. We have many separate topologies working together across the stream, and it's difficult for a new person on the project to see how these link together or how the data flows. We need the ability to easily track a tuple through, and a built-in ability to send 'canary' tuples to ensure throughput.
* Explain the best practices for including Heron in a holistic streaming architecture, such as using Kafka or Pulsar between smaller topologies so the stream has queuing break points during backpressure/restarts, how to use local caches (Redis), or the best approaches for writing out to database end points from bolts, etc.
* Correct the default packing algorithm to the original (maybe this is already fixed?) - there was a release a while back where the default packing algorithm was changed to create a container per bolt/spout, which is not a good approach on limited hardware (not everyone has 1000 nodes). The concepts behind this need better explanation/understanding.
* Update the deployment process such that the Heron Client is part of compilation but not required to deploy to Kubernetes. The result of the Heron Client should be standard container images that can be pushed into a container repository and handled normally like any other image; this will allow it to be more easily mixed in with standard DevOps (DDS) procedures.
* Allow for on-the-fly (per-environment) configuration settings at the point of build/deploy. Currently, in the version we use, it's required to rebuild per environment (I think this may be handled in a newer release, but we've been waiting for a final release that gets past incubation).
* FUTURE: Provide even further configuration for spouts/bolts ... allow for more dynamic CPU/memory allocation, special assignment to CPUs/GPUs, simplify VM settings, etc.
* FUTURE: Expand on functionality such as elastic scaling (Dhalion) but provide this within more ubiquitous resource managers like Kubernetes (my understanding is that this only works in Mesos/Aurora?).

Thanks all - keep it up - let me know if I can help (my time is limited - maybe documentation?)

Ron


------ Original Message ------
From: "Jim Mantheiy" <[email protected]>
To: [email protected]
Cc: "Sree Vaddi" <[email protected]>; [email protected]
Sent: 10/16/2020 8:32:07 AM
Subject: Re: [DISCUSS] Retire Heron

All,

I speak for a few people I work with when I say that Heron has a unique
place in the streaming/analytic space. Having used Storm, Kafka Streams, and
other frameworks, I feel Heron is easily the simplest framework out there,
with the lowest cost of entry.

Personally, if Heron were to expand its k8s capabilities, such as
horizontal pod autoscaler, health checks on bolts, a better dashboard, and
perhaps open trace, then Heron would be a one-stop shop for a highly
efficient, scalable, robust streaming solution.

Basically, how can I help?

Thanks

Jim

On Fri, Oct 16, 2020, 7:33 AM Josh Fischer <[email protected]> wrote:

 Hi All,

 Windham, I agree with everything you said. Most importantly what stood out
 to me is the lack of documentation that covers why or how someone would use
 Heron.  I agree with Dave, we should try to organize and set some goals for
 us to complete within the next few weeks and months.

 I don't want to see Heron go.  It's the first big open source project I've
 worked on and I'd hate to get all teary eyed over a bunch of code that
 was retired at the Apache Foundation.

 How would everyone feel if we used this github project
 https://github.com/apache/incubator-heron/projects/4 to track some tasks?
 This way we could give some visibility to people trying to learn what's
 going on with the community?

 What is left outstanding with our 0.23.0-incubating release?  Let's add
 those tasks to the github project above.

 On Thu, Oct 15, 2020 at 9:59 PM Ning Wang <[email protected]> wrote:

 > Thanks! That's a lot of helpful information!
 >
 > Agreed that documentation and examples can be better to lower the barrier
 > and be more friendly to new users.
 >
 > On Thu, Oct 15, 2020 at 7:42 PM Windham Wong <[email protected]>
 > wrote:
 >
 > > I am new to the Apache Foundation, and I want to point out that when I
 > > started trying Heron, I hit a very big barrier: the documentation is
 > > not good enough for a quick start or for a good understanding of the
 > > structure of Heron. I saw a few people asking how to launch the demo
 > > topology while facing technical issues related to Python version and
 > > configuration.
 > >
 > > From my point of view, we are using Heron in production for a log
 > > parsing system, and we see a great opportunity to increase our usage of
 > > Heron as our business grows. However, recalling my experience when I
 > > started looking into Heron, the learning curve isn't too high but is
 > > still much higher than for other software or systems. I believe the
 > > documentation needs more improvement to let new users understand it
 > > more quickly. Furthermore, from a business perspective, I believe Heron
 > > needs more use case promotion. Many people don't know what to do with a
 > > piece of software and forget about it after some time. Cross-language
 > > support (Java/Python/Lua/C++) is great for people in different fields
 > > to start using it, but they can't find a blog/article/tutorial/YouTube
 > > video to realise what they can do with it. I am thinking that asking
 > > companies to share their experience of using Heron, and individuals to
 > > share what they can do or their ideas, would help the community grow.
 > > Sorry for the long message.
 > > Windham Wong
 > > OSWE, OSCP, GCIA, Specialist in Cybersecurity
 > > Co-Founder, Managing Partner of
 > > Stormeye.io, Hong Kong Managed Security Operation Center Limited
 > > Email // [email protected]
 > > Phone // +852 3590 2212 | +852 9832 0707
 > > Fax // +852 3590 2202
 > >
 > > On Oct 16 2020, at 4:44 AM, Ning Wang <[email protected]> wrote:
 > > > Thanks Dave!
 > > >
 > > > IMO our goal is to have an official release, which has been
 > > > challenging. At the same time, some Kubernetes and Python work is
 > > > going on at least. I remember the issue we found in the latest
 > > > release candidate was Python 3 related.
 > > >
 > > >
 > > > On Thu, Oct 15, 2020 at 12:51 PM Dave Fisher <[email protected]> wrote:
 > > > > It would be helpful to have more discussion about what is happening on
 > > > > this mailing list.
 > > > >
 > > > > I’m your last active Mentor and I joined only when it seemed like the
 > > > > start of incubation was blocked.
 > > > >
 > > > > Please show the activity with some visible direction.
 > > > >
 > > > > > On Oct 15, 2020, at 11:59 AM, Sree Vaddi <[email protected]> wrote:
 > > > > >
 > > > > > Heron will continue to live long.
 > > > > > It has its own place in the stream processing world among other
 > > > > > competing technologies. The ever-increasing data has stretched
 > > > > > the competition to the limits of breaking.
 > > > > >
 > > > > > In addition:
 > > > > > In production at the creating company and others around the world.
 > > > > > Best open source alternative to Google Dataflow, from the recent talks.
 > > > > > Higher freedom to customize makes it attractive for innovation.
 > > > > > 27 continuous monthly meetups.
 > > > > > Slack is active. Mailing lists are active.
 > > > > > 455 meetup members and counting. 40 LinkedIn group members and counting.
 > > > > >
 > > > > > All of these, just by a small bunch of us.
 > > > > >
 > > > > > It is too early for 'retirement' talk, IMHO.
 > > > > > Let's focus on, making it to TLP.
 > > > > > Taking one task or part of it at a time.
 > > > > >
 > > > > >
 > > > > > Thank you./Sree
 > > > > >
 > > > > > On Thursday, October 15, 2020, 11:00:10 AM PDT, H W <[email protected]> wrote:
 > > > > >
 > > > > > The community size and activity look steady rather than dwindling. The
 > > > > > heronstreaming slack is still active. The conversations/meetups/discussions
 > > > > > keep going well.
 > > > > > As for 'retirement' I think that would be premature
 > > > > >
 > > > > > On Thu, Oct 15, 2020 at 10:29 AM Ning Wang <[email protected]> wrote:
 > > > > >
 > > > > >> Hmm.
 > > > > >>
 > > > > >> Community isn't very active, but there is still work going on (Python,
 > > > > >> k8s/helm, etc) and a few users relying on the project. IMO it is too
 > > > > >> early to retire.
 > > > > >>
 > > > > >>
 > > > > >>
 > > > > >> On Wed, Oct 14, 2020 at 5:57 PM Josh Fischer <[email protected]> wrote:
 > > > > >>
 > > > > >>> Hi All,
 > > > > >>>
 > > > > >>> It seems the community is dwindling for Heron. I think it is time to
 > > > > >>> start a discussion on retiring the podling.
 > > > > >>>
 > > > > >>> Thoughts?
 > > > > >>>
 > > > > >>> - Josh
 > > > > >>>
 > > > > >>
 > > > >
 > > > >
 > > >
 > >
 > >
 >
