Hi all,

A while ago I emailed about how fields (key) grouped routing in Heron is
not consistent across an update. This makes preserving state across an
update very difficult, and makes it difficult or impossible to analyse or
predict tuple flows through a current or proposed topology physical plan.

I suggested adopting Storm's approach of pre-defining a routing key
space for each component (eg 0-999), so that instead of an instance having
a single task ID that gets reset at every update (eg 10) it has a range of
IDs (eg 10-16) that changes depending on the parallelism of the component.
This has the advantage that a key will always hash to the same task ID for
the lifetime of the topology, meaning that recovering state for an instance
after a crash or update is just a case of pulling the state linked to the
keys in its task ID range.
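
To make the idea concrete, here is a rough sketch in Python. It is not
Storm's or Heron's actual code: the names KEY_SPACE, task_id_for_key,
task_ranges and instance_for_task are made up for illustration, and the
SHA-256 based hash and the even contiguous split are my assumptions.

```python
import hashlib
from typing import List

KEY_SPACE = 1000  # hypothetical fixed number of task IDs per component (0-999)


def task_id_for_key(key: str, key_space: int = KEY_SPACE) -> int:
    """Map a key to a task ID that depends only on the key and the key space."""
    # Use a stable hash: Python's built-in hash() for str varies between runs.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % key_space


def task_ranges(parallelism: int, key_space: int = KEY_SPACE) -> List[range]:
    """Split the task ID space into one contiguous range per instance."""
    if not 1 <= parallelism <= key_space:
        # The fixed key space is a hard upper limit on scale out.
        raise ValueError("parallelism must be between 1 and key_space")
    base, extra = divmod(key_space, parallelism)
    ranges, start = [], 0
    for i in range(parallelism):
        size = base + (1 if i < extra else 0)  # spread the remainder evenly
        ranges.append(range(start, start + size))
        start += size
    return ranges


def instance_for_task(task_id: int, ranges: List[range]) -> int:
    """Return the index of the instance whose range contains task_id."""
    for index, task_range in enumerate(ranges):
        if task_id in task_range:
            return index
    raise ValueError("task_id is outside the key space")


if __name__ == "__main__":
    task = task_id_for_key("user-42")
    for parallelism in (4, 6):
        owner = instance_for_task(task, task_ranges(parallelism))
        print(f"parallelism={parallelism}: task {task} -> instance {owner}")
```

The task ID for a key never changes when the parallelism changes; only the
instance that owns that task ID can change, so state can be stored and
recovered per task ID.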

I know the above proposal has issues, not least of all placing a hard upper
limit on the scale out of a component, and that some alternative ideas are
being floated for solving the stateful update issue. However, I just wanted
to throw some more weight behind the Storm approach. There was a recent
paper about high-performance network load balancing
<https://blog.acolyer.org/2018/05/03/stateless-datacenter-load-balancing-with-beamer/>
that describes an approach using a fixed key space similar to Storm's (see
the section called Stable Hashing: they assign a range 100x the expected
connection pool size, which we could do with Heron to prevent ever hitting
the upper scaling limit). Also, this new load balancer, Beamer, claims to
be twice as fast as Google's Maglev
<https://blog.acolyer.org/2016/03/21/maglev-a-fast-and-reliable-software-network-load-balancer/>
which again uses a pre-defined keyspace and ID ranges to create look-up
tables deterministically.
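
As a rough illustration of the sizing idea, the sketch below picks a key
space 100x the expected maximum parallelism and builds a bucket-to-instance
look-up table deterministically. This is not the algorithm from the Beamer
or Maglev papers; the function names and the contiguous assignment are my
own assumptions.

```python
from typing import List

OVERPROVISION = 100  # the 100x factor mentioned for Beamer's stable hashing


def key_space_size(expected_max_parallelism: int,
                   factor: int = OVERPROVISION) -> int:
    """Size the fixed key space well above the expected scale-out limit."""
    return expected_max_parallelism * factor


def build_lookup_table(key_space: int, parallelism: int) -> List[int]:
    """Return a table where entry b is the instance that owns bucket b.

    The table depends only on (key_space, parallelism), so every node that
    builds it gets the same result without any coordination.
    """
    if not 1 <= parallelism <= key_space:
        raise ValueError("parallelism must be between 1 and key_space")
    # Contiguous assignment: buckets are split into near-equal ranges.
    return [bucket * parallelism // key_space for bucket in range(key_space)]


if __name__ == "__main__":
    space = key_space_size(20)            # expect at most ~20 instances
    table = build_lookup_table(space, 5)  # currently running 5 instances
    print(f"key space: {space}, bucket 1234 -> instance {table[1234]}")
```

With a component expected to reach 20 instances this gives 2000 buckets, so
the hard limit would only be hit at 100x the expected scale.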

I know a load balancer is a different beast to a stream grouping, but there
are some interesting ideas in those papers (the links point to summary blog
posts so you don't have to read the whole paper).

Anyway, I just thought I would throw those papers out there and see what
people think.

Tom Cooper
W: www.tomcooper.org.uk  | Twitter: @tomncooper
<https://twitter.com/tomncooper>
