Thanks Richard for starting the discussion and Sigee + Davide for sharing
your thoughts.

The health of Apache Storm has been a concern for a while now and it is
very much a project in maintenance mode. Richard has done an amazing job
of keeping things going but at times it feels like it is StormCrawler and
its community that are keeping it alive. In typical Apache fashion, the
project boasts a large number of committers but effectively, there is only
a handful of individuals actively involved. Emeritus would probably be a
better status for most people there.

Anyway, I think it wouldn't necessarily take a lot of effort to keep Storm
going. We got rid of the Clojure parts, so it is all in Java which makes it
easier for people to contribute. The work on Storm is mostly about
upgrading dependencies and the odd improvement here and there - none of
which requires a deep understanding of the internal intricacies of the
code. And these days, AI is there to help us understand other people's code
better anyway. As Richard said, all we need is a few of us in the
StormCrawler community to get involved in Storm. Simply validating an RC
would already be a fantastic way to help.

The alternative is essentially to start a new project on a different
platform by borrowing code snippets from SC, similar to how I adapted code
from Apache Nutch to SC. It is quite a lot of work with no guarantee of
success, whereas what we have now is pretty solid and used by various
organisations worldwide. Some of them have downloaded tens of billions of
pages using SC, so overall, SC might have fetched trillions of URLs in
total, who knows?

Web crawling is a niche activity but people are using StormCrawler. The
question is: why don't we get more contributions and potential committers?
If our community grows, so does the Storm one. I used to invest time
finding out who was using SC and convincing them to make that visible and
then to contribute to the project. Perhaps that needs to be done again?

Julien


On Tue, 19 May 2026 at 09:25, Davide Polato <[email protected]> wrote:

> Hi Richard,
>
> The contributor route makes more sense to me than ripping out our cluster
> layer for another engine.
>
> For what it's worth, anyone coming from StormCrawler isn't starting from
> zero on Storm. We work on top of its model every day. The internals are the
> part we don't know, but that's still a smaller jump than picking up a new
> framework from scratch.
>
> It's a direction I'd be interested in, even if not something I can jump on
> immediately.
>
> Best,
> Davide
>
> On Mon, 18 May 2026 at 16:55, Dávid Szigecsán <[email protected]> wrote:
>
> > Hi,
> >
> > To be honest, it was kind of a coincidence, I showed up there. I started to
> > look a bit deeper into Storm in the last few days, but I still don't know
> > much about it. I started to check the project (clone the repository and
> > tried to build) with some errors, because I use windows and Storm is not so
> > windows friendly. :D
> > Anyway, I did not think I could make a huge impact in it (especially as I
> > am unfortunately not the most active member of the community :( ).
> > But I don't want to let Storm die. If I can help, I want to. In the last
> > few days I read lots about Storm's history and it deserves to live.
> > I am going on a long vacation from 24. May to 8. June, but after that I am
> > happy to discuss how I can help.
> >
> > Regards,
> > Sigee
> >
> > On Mon, 18 May 2026 at 16:35, Richard Zowalla <[email protected]> wrote:
> >
> > >
> > > Hi all,
> > >
> > > I wanted to raise something that's been on my mind for a while regarding
> > > the sustainability of Apache Storm itself. From what I've observed, getting
> > > the 3 votes required for releases and decisions has become quite cumbersome
> > > - sometimes really hard - and that makes me worry about how viable Storm is
> > > as a foundation for us going forward.
> > >
> > > On a more positive note, I noticed that Dávid recently showed up in one of
> > > the Storm issues. I think it would be a good idea to try to get a few more
> > > people from the StormCrawler side involved in Storm directly. One thing
> > > that helps here: Storm doesn't differentiate between Committer and PMC -
> > > they vote new people straight into the PMC. So it could be a relatively
> > > clean way to inject some fresh contributors and voting power into the
> > > project.
> > >
> > > If we don't manage to do something along those lines, I'm afraid we'll
> > > have to seriously consider re-inventing or migrating our underlying cluster
> > > technology to another stream processing framework sooner rather than later,
> > > and let Storm die (in the attic).
> > > I'd rather avoid that if we can since the technology has a proven record
> > > in web crawling projects.
> > >
> > > Any thoughts?
> > >
> > > Regards
> > > Richard
> >
>
