Hi all,

A while ago I emailed about how fields (key) grouped routing in Heron is not consistent across an update. This makes preserving state across an update very difficult, and it also makes it difficult or impossible to analyse or predict tuple flows through a current or proposed topology physical plan.
I suggested adopting Storm's approach of pre-defining a routing key space for each component (eg 0-999). Instead of an instance having a single task ID that gets reset at every update (eg 10), it would have a range of IDs (eg 10-16) that changes depending on the parallelism of the component. This has the advantage that a key will always hash to the same task ID for the lifetime of the topology, meaning that recovering state for an instance after a crash or update is just a case of pulling the state linked to the keys in its task ID range.

I know the above proposal has issues, not least that it places a hard upper limit on the scale out of a component, and that some alternative ideas are being floated for solving the stateful update issue. However, I just wanted to throw some more weight behind the Storm approach.

There was a recent paper about high-performance network load balancing <https://blog.acolyer.org/2018/05/03/stateless-datacenter-load-balancing-with-beamer/> that describes an approach using a fixed key space similar to Storm's. See the section called Stable Hashing: they assign a range 100x the expected connection pool size, which we could do with Heron to prevent ever hitting the upper scaling limit. Also, this new load balancer, Beamer, claims to be twice as fast as Google's Maglev <https://blog.acolyer.org/2016/03/21/maglev-a-fast-and-reliable-software-network-load-balancer/>, which again uses a pre-defined key space and ID ranges to create look-up tables deterministically.

I know a load balancer is a different beast to a stream grouping, but there are some interesting ideas in those papers. (The links point to summary blog posts, so you don't have to read the whole paper.)

Anyway, I just thought I would put those papers out there and see what people think.

Tom Cooper
W: www.tomcooper.org.uk | Twitter: @tomncooper <https://twitter.com/tomncooper>
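
P.S. In case it helps make the idea concrete, here is a rough Python sketch of the fixed key space routing described above. It is only an illustration under my own assumptions, not Storm's or Heron's actual code: the names KEY_SPACE, task_id_for, task_ranges and instance_for are hypothetical, the key space of 1000 just mirrors the 0-999 example, and CRC32 stands in for whatever stable hash would really be used.

```python
import zlib
from typing import List

# Hypothetical fixed routing key space per component (task IDs 0-999).
KEY_SPACE = 1000


def task_id_for(key: str, key_space: int = KEY_SPACE) -> int:
    """Map a key to a task ID that does not depend on parallelism."""
    # CRC32 is an assumption: it is used here only because it is stable
    # across processes, unlike Python's built-in hash() for strings.
    return zlib.crc32(key.encode("utf-8")) % key_space


def task_ranges(parallelism: int, key_space: int = KEY_SPACE) -> List[range]:
    """Split the key space into contiguous, near-equal ranges, one per instance."""
    if not 1 <= parallelism <= key_space:
        # The fixed key space is a hard upper limit on scale out.
        raise ValueError("parallelism must be between 1 and the key space size")
    base, extra = divmod(key_space, parallelism)
    ranges = []
    start = 0
    for index in range(parallelism):
        size = base + (1 if index < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def instance_for(task_id: int, parallelism: int, key_space: int = KEY_SPACE) -> int:
    """Return the index of the instance whose range contains task_id."""
    for index, task_range in enumerate(task_ranges(parallelism, key_space)):
        if task_id in task_range:
            return index
    raise ValueError("task_id is outside the key space")


if __name__ == "__main__":
    key = "user-42"
    task_id = task_id_for(key)
    # The task ID stays fixed; only the owning instance changes on an update.
    for parallelism in (3, 5):
        owner = instance_for(task_id, parallelism)
        print(f"parallelism={parallelism}: key {key!r} -> task {task_id} -> instance {owner}")
```

The point of the sketch is that task_id_for never looks at the parallelism, so after an update an instance could recover its state by pulling whatever is stored against the task IDs in its new range.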