Re: Stateful updating and deterministic routing

Ning Wang Fri, 04 May 2018 14:44:38 -0700

Interesting. Thanks for sharing~

On Fri, May 4, 2018 at 2:31 PM, Thomas Cooper <[email protected]>
wrote:


> Hi all,
>
> A while ago I emailed about the issue of how fields (key) grouped routing
> in Heron was not consistent across an update and how this makes preserving
> state across an update very difficult and also makes it
> difficult/impossible to analyse or predict tuple flows through a
> current/proposed topology physical plan.
>
> I suggested adopting Storms approach of pre-defining a routing key
> space for each component (eg 0-999), so that instead of an instance having
> a single task id that gets reset at every update (eg 10) it has a range of
> id's (eg 10-16) that changes depending on the parallelism of the component.
> This has the advantage that a key will always hash to the same task ID for
> the lifetime of the topology. Meaning recovering state for an instance
> after a crash or update is just a case of pulling the state linked to the
> keys in its task ID range.
>
> I know the above proposal has issues, not least of all placing a hard upper
> limit on the scale out of a component, and that some alternative ideas are
> being floated for solving the stateful update issue. However, I just wanted
> to throw some more weight behind the Storm approach. There was a recent
> paper about high-performance network load balancing
> <https://blog.acolyer.org/2018/05/03/stateless-datacenter-load-balancing-
> with-beamer/>that
> describes an approach using a fixed key space similar to Storm's (see the
> section called Stable Hashing - they assign a range 100x the expected
> connection pool size - which we could do with heron to prevent ever hitting
> the upper scaling limit). Also, this new load balancer, Beamer, claims to
> be twice as fast as Google's Maglev
> <https://blog.acolyer.org/2016/03/21/maglev-a-fast-and-
> reliable-software-network-load-balancer/>
> which again uses a pre-defined keyspace and ID ranges to create look-up
> tables deterministically.
>
> I know a load balancer is a different beast to a stream grouping but there
> are some interesting ideas in those papers (The links point to summary blog
> posts so you don't have to read the whole paper).
>
> Anyway, I just thought I would those papers out there and see what people
> think.
>
> Tom Cooper
> W: www.tomcooper.org.uk  | Twitter: @tomncooper
> <https://twitter.com/tomncooper>
>

Re: Stateful updating and deterministic routing

Reply via email to