Re[2]: [DISCUSS] Retire Heron

Ron Wilcom Fri, 16 Oct 2020 07:46:30 -0700

All,

We have employed Heron on a large streaming data project for our
customer and we have a deep vested interest in its adoption/success. It
would be very premature to retire Heron while it has not yet reached a
full Apache release. Heron has been solid for us and we've been using
it for well over a year against the largest of data throughputs - it's
held up great and is the most versatile/configurable/industrial
streaming engine when using the straight/original topology approach --
I'm not a fan of the Streamlets API .. why try to compete there? Heron
should go for being the best industrial solution!

Whatever the hold up is that is preventing this project from moving out
of incubation should be prioritized by this developer community.
Many companies are not going to use an "incubated" Apache open source
product - but once it is fully released/available in Apache, developers will
have an easier time convincing their technical and business management
that they can move to Heron and trust that it will have staying power.
Here is a list of some tasks that I feel need to be tackled to provide
an upswell of adoption:

* Get out of "Apache incubation" state to a full release - this has to
  happen before there is real/broad adoption - are we looking for
  'perfect', no, because you'll never get there ... perfection can happen
  over time after more open source adoption!
* Capitalize on the history of this project: that it is derived from the
  Twitter development team, that Storm projects can easily convert to
  Heron, and why that is important (better performance, deployment
  versatility, backpressure features, etc).
* Where are the original founders of this project (Karthik Ramasamy? now
  at Splunk?) - why do they no longer support this project and can we get
  them involved again to help get this over the finish line? Need more
  high level advocacy all around to promote Heron.
* The technical barrier to entry with Heron is high, it took me a lot of
  grunt work years ago to figure out the proper way to make it work - so
  it is very important to provide strong/deep documentation, getting
  started documentation, and out-of-the-box multi-language examples
* Provide a script/helm based end-to-end deployment that can be run on a
  developer's local machine/VM using well known resource managers like
  Kubernetes (e.g. minikube example) - obviously this requires pulling in
  multiple supporting technologies ..... provide expanded examples for
  large/true clusters.
* Need a better administrative tool and monitoring ability across/within
  the topologies - we have many separated topologies working together
  across the stream - it's difficult for a new person on the project to
  see how these link together or how the data flows - ability to easily
  track a tuple through - built in ability to send 'canary' tuples to
  ensure throughput
* Explain the best practices for including Heron in a holistic
  streaming architecture such as using Kafka or Pulsar between smaller
  topologies so the stream has queuing break points during
  backpressure/restarts - how to use local caches (Redis) or the best
  approaches for writing out to database end points from bolts, etc
* Correct the default packing algorithm to the original (maybe this is
  already fixed?) - there was a release a while back where the default
  packing algorithm was changed to create a container per bolt/spout,
  which is not a good approach on limited hardware (not everyone has
  1000 nodes) - the concepts of this need better explanation/understanding
* Update the deployment process such that the Heron Client is part of
  compilation but not required to deploy to Kubernetes - result of the
  Heron Client should be standard container images that can be pushed into
  a container repository to be pushed normally like any other image ---
  this will allow it to be more easily mixed in with standard DevOps (DDS)
  procedures
* Allow for on-the-fly (per-environment) configuration settings at the
  point of build/deploy - currently in the version we use it's required to
  rebuild per environment (I think this may be handled in a newer release
  but we've been waiting for a final release to occur that gets past
  incubation?)
* FUTURE: Provide even further configuration for spouts/bolts ... allow
  for more dynamic CPU/Memory allocation, special assignment to CPUs/GPUs,
  simplify VM settings, etc
* FUTURE: Expand on functionality such as elastic scaling (Dhalion) but
  provide this within more ubiquitous resource managers like Kubernetes
  (my understanding is that this only works in Mesos/Aurora?).

Thanks all - keep it up - let me know if I can help (my time is limited
- maybe documentation?)


Ron


------ Original Message ------
From: "Jim Mantheiy" <[email protected]>
To: [email protected]

Cc: "Sree Vaddi" <[email protected]>;[email protected]

Sent: 10/16/2020 8:32:07 AM
Subject: Re: [DISCUSS] Retire Heron

All,

I speak for a few people I work with when I say that Heron has a unique
place in the streaming/analytic space. We have used Storm, Kafka Streams, and
other frameworks. I feel Heron is easily the simplest framework out there,
with the lowest cost of entry.

Personally, if Heron were to expand its k8s capabilities such as
horizontal pod autoscaler, health checks on bolts, better dashboard,
perhaps open trace? Then Heron would be a one stop shop for a highly
efficient, scalable, robust streaming solution.

Basically, how can I help?

Thanks

Jim

On Fri, Oct 16, 2020, 7:33 AM Josh Fischer <[email protected]> wrote:

 Hi All,

 Windham, I agree with everything you said. Most importantly what stood out
 to me is the lack of documentation that covers why or how someone would use
 Heron.  I agree with Dave, we should try to organize and set some goals for
 us to complete within the next few weeks and months.

 I don't want to see Heron go.  It's the first big open source project I've
 worked on and I'd hate to get all teary eyed over a bunch of code that
 retired at the Apache Foundation.

 How would everyone feel if we used this github project
 https://github.com/apache/incubator-heron/projects/4 to track some tasks?
 This way we could give some visibility to people trying to learn what's
 going on with the community?

 What is left outstanding with our 0.23.0-incubating release?  Let's add
 those tasks to the github project above.

 On Thu, Oct 15, 2020 at 9:59 PM Ning Wang <[email protected]> wrote:

 > Thanks! That's a lot of helpful information!
 >
 > Agreed that documentation and examples can be better to lower the barrier
 > and be more friendly to new users.
 >
 > On Thu, Oct 15, 2020 at 7:42 PM Windham Wong <[email protected]>
 > wrote:
 >
 > > I am new to the Apache Foundation and I want to point out that when I
 > > started trying Heron, I hit a very big barrier: the documentation is
 > > not good enough for a quick start or for a good understanding of the
 > > structure of Heron. I saw a few people asking how to launch the demo
 > > topology while facing technical issues related to Python version and
 > > configuration.
 > >
 > > From my point of view, we are using Heron in production for a log
 > > parsing system, and we see a great opportunity to increase our usage of
 > > Heron as our business grows. However, recalling my experience when I
 > > started looking into Heron, the learning curve isn't too high but is
 > > still much higher than for other software or systems. I believe the
 > > documentation needs more improvement so that new users can understand it
 > > more quickly. Furthermore, from a business aspect, I believe Heron needs
 > > more use case promotion. Many people don't know what to do with a piece
 > > of software and they forget about it after some time. Cross-language
 > > support (Java/Python/Lua/C++) is great for people in different fields to
 > > start using it, but they can't find a blog/article/tutorial/YouTube
 > > video to realise what they can do with it. I think that asking companies
 > > to share their experience of using Heron, and individuals to share what
 > > they can do or their ideas, would help the community grow.
 > > Sorry for the long words.
 > > Windham Wong
 > > OSWE, OSCP, GCIA, Specialist in Cybersecurity
 > > Co-Founder, Managing Partner of
 > > Stormeye.io, Hong Kong Managed Security Operation Center Limited
 > > Email // [email protected]
 > > Phone // +852_3590_2212_|_+852_9832_0707 (tel:+85235902212)
 > > Fax // +852_3590_2202 (tel:+852_3590_2202)
 > >
 > > On Oct 16 2020, at 4:44 AM, Ning Wang <[email protected]> wrote:
 > > > Thanks Dave!
 > > >
 > > > IMO our goal is to have an official release, which has been
 > > > challenging. At the same time, some kubernetes and python works are
 > > > going on at least. I remember the issue we found in the latest release
 > > > candidate was Python 3 related.
 > > >
 > > >
 > > > On Thu, Oct 15, 2020 at 12:51 PM Dave Fisher <[email protected]> wrote:
 > > > > It would be helpful to have more discussion about what is happening on
 > > > > this mailing list.
 > > > >
 > > > > I’m your last active Mentor and I joined only when it seemed like the
 > > > > start of incubation was blocked.
 > > > >
 > > > > Please show the activity with some visible direction.
 > > > >
 > > > > > On Oct 15, 2020, at 11:59 AM, Sree Vaddi <[email protected]> wrote:
 > > > > >
 > > > > > Heron will continue to live long.
 > > > > > It has its own place in the stream processing world among other
 > > > > > competing technologies. The ever increasing data has stretched
 > > > > > competitions to the limits of breaking.
 > > > > >
 > > > > > In addition:
 > > > > > In production at the creating company and others around the world.
 > > > > > Best open source alternative to Google Dataflow, from the recent talks.
 > > > > > Higher freedom to customizations, makes it attractive for innovation.
 > > > > > 27 continuous monthly meetups.
 > > > > > Slack is active.
 > > > > > Mailing lists are active.
 > > > > > 455 meetup members and counting.
 > > > > > 40 LinkedIn group members and counting.
 > > > > >
 > > > > > All of these, by just a few of us.
 > > > > >
 > > > > > It is too early for 'retirement' talk, IMHO.
 > > > > > Let's focus on making it to TLP.
 > > > > > Taking one task or part of it at a time.
 > > > > >
 > > > > >
 > > > > > Thank you./Sree
 > > > > >
 > > > > > On Thursday, October 15, 2020, 11:00:10 AM PDT, H W <[email protected]> wrote:
 > > > > >
 > > > > > The community size and activity look steady rather than dwindling. The
 > > > > > heronstreaming slack is still active. The conversations/meetups/discussions
 > > > > > keep going well.
 > > > > > As for 'retirement' I think that would be premature
 > > > > >
 > > > > > On Thu, Oct 15, 2020 at 10:29 AM Ning Wang <[email protected]> wrote:
 > > > > >
 > > > > >> Hmm.
 > > > > >>
 > > > > >> Community isn't very active, but there are still works going on (python,
 > > > > >> k8s/helm, etc) and a few users relying on the project. IMO it is too
 > > > > >> early to retire.
 > > > > >>
 > > > > >>
 > > > > >>
 > > > > >> On Wed, Oct 14, 2020 at 5:57 PM Josh Fischer <[email protected]> wrote:
 > > > > >>
 > > > > >>> Hi All,
 > > > > >>>
 > > > > >>> It seems the community is dwindling for Heron. I think it is time to
 > > > > >>> start a discussion on retiring the podling.
 > > > > >>>
 > > > > >>> Thoughts?
 > > > > >>>
 > > > > >>> - Josh
 > > > > >>>
 > > > > >>
 > > > >
 > > > >
 > > >
 > >
 > >
 >
