+1 on Java 25. The JEP 491 argument is convincing on its own: skipping the
ReentrantLock refactoring across FetchItemQueues is already worth it.

The OkHttp pool concern is real, but I think it is orthogonal. We already
document that the pool does not scale past ~1000 total connections and point
to protocol.instances.num as the way to spread load. Virtual threads would
stress that limit more, but that is a problem to solve separately,
regardless of which Java baseline we pick.
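
For illustration only, one way to keep the number of in-flight fetches under
a fixed ceiling, independent of the thread model, is a plain Semaphore. This
is a minimal sketch: InFlightCap and its methods are hypothetical names, not
existing StormCrawler or OkHttp classes, and the cap value is an assumption.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical helper: bounds concurrent fetches regardless of thread count. */
public final class InFlightCap {
    private final Semaphore permits;

    public InFlightCap(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    /** Blocks until a permit is free, runs the fetch, then always releases the permit. */
    public <T> T call(Supplier<T> fetch) {
        permits.acquireUninterruptibly();
        try {
            return fetch.get();
        } finally {
            permits.release();
        }
    }

    /** Number of fetches that could start right now without blocking. */
    public int available() {
        return permits.availablePermits();
    }
}
```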

Same goes for the timeout executors. They are there because platform
threads can get stuck in native I/O and interrupt is not always reliable.
With virtual threads that improves, and we already have OkHttp call
timeouts and socket timeouts underneath. I think the pattern could be
simplified, though I would keep some bolt-level safety net either way.
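
A minimal sketch of what such a bolt-level safety net could look like,
assuming the fetch is a blocking Callable. FetchDeadline and callWithDeadline
are hypothetical names, not existing StormCrawler code. The executor is passed
in, so on Java 21+ it could be Executors.newVirtualThreadPerTaskExecutor(),
while a platform-thread pool works the same way today.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: runs a blocking fetch with a hard upper bound on wait time. */
public final class FetchDeadline {
    private FetchDeadline() {}

    /**
     * Submits the fetch and waits at most maxWait for its result.
     * Returns fallback if the fetch times out, fails, or the caller is interrupted.
     */
    public static <T> T callWithDeadline(
            ExecutorService executor, Callable<T> fetch, Duration maxWait, T fallback) {
        Future<T> future = executor.submit(fetch);
        try {
            return future.get(maxWait.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Ask the stuck task to stop; the caller moves on either way.
            future.cancel(true);
            return fallback;
        } catch (ExecutionException e) {
            // The fetch itself threw; a real bolt would log and report the error.
            return fallback;
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return fallback;
        }
    }
}
```

The point of the sketch is that the caller is never blocked longer than
maxWait, even if the underlying call ignores the cancellation.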

One small thing: beingFetched in FetcherBolt is a String[] sized to
threadCount, indexed by threadNum, only used for debug logging. Minor, but
a good example of the kind of thing that would need updating in the bolt
when the model changes.
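
As a rough sketch of one possible replacement, the fixed array could become a
concurrent map keyed by a per-fetch id, so it no longer depends on a thread
index. InFlightTracker and its methods are hypothetical names for
illustration, not the actual FetcherBolt code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical debug helper: tracks URLs currently being fetched without a thread index. */
public final class InFlightTracker {
    private final ConcurrentMap<Long, String> beingFetched = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();

    /** Registers a URL as in flight and returns the id needed to clear it later. */
    public long start(String url) {
        long id = nextId.incrementAndGet();
        beingFetched.put(id, url);
        return id;
    }

    /** Removes the entry once the fetch completes, successfully or not. */
    public void finish(long id) {
        beingFetched.remove(id);
    }

    /** Sorted copy of the URLs currently in flight, suitable for debug logging. */
    public List<String> snapshot() {
        List<String> urls = new ArrayList<>(beingFetched.values());
        Collections.sort(urls);
        return urls;
    }
}
```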

On Wed, 3 Jun 2026 at 09:18, Julien Nioche <[email protected]> wrote:

> Thanks Richard for the suggestion and thorough explanation.
> Two things come to my mind:
>
>    - Need to check that it works with OkHttp's connection cache: IIRC
>    having too many connections made things slower because of the
>    implementation of their cache. If virtual threads mean more
>    parallelism, wouldn't that be a bottleneck?
>    - "Dropping the per-fetcher-thread timeout ExecutorServices": this
>    was added recently to avoid threads getting blocked forever by the
>    protocol, which we did see in practice. Wouldn't we need a similar
>    mechanism with virtual threads?
>
> What do you think?
>
> Julien
>
> On Tue, 2 Jun 2026 at 11:55, Richard Zowalla <[email protected]> wrote:
>
> > Hi all,
> >
> >   Storm 3.0 (currently in development upstream) raises its Java baseline
> >   to 21 [1]. Once we move StormCrawler onto Storm 3, we will have to lift
> >   our own baseline (currently Java 17) anyway, so I'd like to discuss
> >   where we should land.
> >
> >   My proposal: go directly to Java 25 (the current LTS) instead of
> >   stopping at Java 21.
> >
> >   Why not just 21?
> >
> >   The main technical argument is virtual threads. Our fetch path is a
> >   textbook use case for them: FetcherBolt today maintains a fixed pool of
> >   platform threads (fetcher.threads.number), each spending most of its
> >   life blocked on DNS / TLS / slow servers / timeouts. With virtual
> >   threads we could move to a thread-per-fetch model where concurrency is
> >   bounded by politeness rules and connection pools rather than by thread
> >   count. For broad multi-host crawls this lifts the per-worker
> >   concurrency ceiling from a few hundred to many thousands of in-flight
> >   fetches, and removes fetcher.threads.number as the tuning knob that
> >   users most often get wrong. (Single-host crawls see no difference -
> >   politeness remains the cap there.)
> >
> >   The catch with Java 21: virtual threads pin their carrier thread inside
> >   synchronized blocks. Adopting them on a 21 baseline would mean
> >   refactoring synchronized usage across the fetch path (FetchItemQueues,
> >   ProtocolFactory, several external modules) to ReentrantLock. JEP 491
> >   (JDK 24) removed this limitation, so on a Java 25 baseline most of that
> >   refactoring simply isn't needed - we could adopt virtual threads in
> >   FetcherBolt with a much smaller and safer change.
> >
> >   Beyond that, 25 is an LTS like 21, with a longer support window.
> >
> >   Practical considerations:
> >
> >   - Users moving to Storm 3 have to upgrade their JVM to 21+ anyway; the
> >     additional step to 25 should be small for most, but it would exclude
> >     anyone whose organisation pins them to 21. Input welcome on whether
> >     this is a real concern for our user base.
> >   - Storm 3 itself is built against 21; running it on a 25 JRE should be
> >     fine, but we'd want to validate this in our CI matrix.
> >   - Dependency ecosystem on 25 needs a quick audit (I don't expect
> >     issues).
> >
> >   What this would enable as follow-up work (separate threads/issues):
> >
> >   - FetcherBolt: virtual-thread-per-fetch, deprecating
> >     fetcher.threads.number in favour of a (much higher) max-in-flight cap
> >   - Dropping the per-fetcher-thread timeout ExecutorServices
> >   - Decoupling HTTP connection pool sizing from thread count
> >   - Pluggable async-capable DNS resolution (JEP 418 SPI)
> >
> >   None of this would affect the current 3.x line - it would target the
> >   release in which we adopt Storm 3.
> >
> >   Looking forward to your thoughts.
> >
> > Regards
> > Richard
> >
> >
>
