Thanks Karthik. The doc is not exactly the same, but close enough. It seems my doc is an internal one, so let's use your doc as the reference. I will see if there are any major differences and comment.
On Sun, May 6, 2018 at 1:22 PM, Karthik Ramasamy <kart...@streaml.io> wrote:

Here it is:
https://docs.google.com/document/d/1YDFNvLTX6Sg3WDrNFKiWLaJvuEtK4eyxEaA0w9cVlG4/edit#heading=h.d6uy2uxfs2xq

cheers
/karthik

On Sun, May 6, 2018 at 8:20 AM, Bill Graham <billgra...@gmail.com> wrote:

Can you share the doc please?

On Sat, May 5, 2018 at 4:18 PM, Ning Wang <wangnin...@gmail.com> wrote:

Thanks. Yeah, I have read the design doc. It has a section on scaling and covers some designs, but I am afraid it does not reach this level of detail.

On Sat, May 5, 2018 at 9:45 AM, Bill Graham <billgra...@gmail.com> wrote:

The stateful processing design included a large section on scaling, which was intended to be done as a future phase. It's very similar to what's being described. Sanjeev and I worked on it about 1.5 years ago with Maosong, and it was in a Google doc. Sanjeev, do you have that design doc? I can't seem to locate it.

On Sat, May 5, 2018 at 12:03 AM, Ning Wang <wangnin...@gmail.com> wrote:

If we go this way, we need a key -> state map for each component so that the state data can be repartitioned.

On Fri, May 4, 2018 at 11:44 PM, Karthik Ramasamy <kart...@streaml.io> wrote:

Instead, if it references

    topology name + component name + key range

would that be better?

cheers
/karthik

On Fri, May 4, 2018 at 11:23 PM, Ning Wang <wangnin...@gmail.com> wrote:

Currently, I think each Instance serializes the state object into a byte array, and the checkpoint manager saves the byte array into a file. The file is referenced by topology name + component name + instance id.
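[Editor's note: to make the key-range suggestion concrete, here is a minimal sketch of how a checkpoint reference could be built from topology name + component name + key range instead of instance id. All names (`checkpoint_path`, the topology/component strings) are hypothetical illustrations, not Heron's actual checkpoint-manager API.]

```python
# Sketch: reference checkpoint state by (topology, component, key range)
# instead of (topology, component, instance id), so state survives a
# change in the number of instances. Names are hypothetical.

def checkpoint_path(topology: str, component: str, key_range: range) -> str:
    """Build a storage path for the state covering one routing-key range."""
    return f"{topology}/{component}/keys_{key_range.start}_{key_range.stop - 1}"

# An instance currently owning routing keys 10-16 would write its state here:
path = checkpoint_path("word-count-topo", "counter-bolt", range(10, 17))
print(path)  # word-count-topo/counter-bolt/keys_10_16
```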
On Fri, May 4, 2018 at 11:10 PM, Karthik Ramasamy <kart...@streaml.io> wrote:

I am not sure I understand why the state is tied to an instance?

cheers
/karthik

On Fri, May 4, 2018 at 4:36 PM, Thomas Cooper <tom.n.coo...@gmail.com> wrote:

Yeah, state recovery is a bit more difficult with Heron's architecture. In Storm, the task IDs are not just values used for routing; they actually equate to a task instance within the executor. An executor which currently processes the keys 4-8 actually contains 5 task instances of the same component. So for each task, Storm just saves its state attached to the single task ID and reassembles executors with the new task instances.

We don't want or have to do that with Heron instances, but we would need some way to tie a state change to the task (or to the routing key, if we go with the key range idea). For something like a word count, you might save counts using a nested map like { routing key : { word : count } }. The routing key could be included in the Tuple instance. Whether this pattern would work for more generic state cases, however, I don't know.
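[Editor's note: a minimal sketch of the nested-map idea above, assuming a hypothetical word-count component. `process` and `extract_state` are illustrative names; the point is that keying state by routing key lets one instance's share be pulled out by key range after an update.]

```python
# Sketch: word-count state kept as a nested map {routing_key: {word: count}},
# so it can be repartitioned by routing key rather than by instance.
from collections import defaultdict

state = defaultdict(lambda: defaultdict(int))

def process(routing_key: int, word: str) -> None:
    """Update the count for a word under its routing key."""
    state[routing_key][word] += 1

def extract_state(key_range: range) -> dict:
    """Pull out the state for one instance's routing-key range,
    e.g. when reassembling instances after an update."""
    return {k: dict(words) for k, words in state.items() if k in key_range}

process(4, "heron")
process(4, "heron")
process(7, "storm")
process(12, "heron")

# Only keys 4-8 belong to this (hypothetical) instance:
print(extract_state(range(4, 9)))  # {4: {'heron': 2}, 7: {'storm': 1}}
```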
Tom Cooper
W: www.tomcooper.org.uk | Twitter: @tomncooper <https://twitter.com/tomncooper>

On Fri, 4 May 2018 at 15:54, Neng Lu <freen...@gmail.com> wrote:

+1 for this idea. As long as the predefined key space is large enough, it should work for most cases.

Based on my experience with topologies, I have never seen a component with more than 1000 instances in a topology.

For recovering state across an update, there will be some problems, though. Since the states stored in Heron are strongly tied to each instance, we either need some resolver that does the state repartitioning, or we need to store states keyed by the routing key instead of by instance.

On Fri, May 4, 2018 at 3:01 PM, Karthik Ramasamy <kramas...@gmail.com> wrote:

Thanks for sharing. I like the Storm approach:

- it keeps the implementation simpler
- state is deterministic across restarts
- it makes the system easy to reason about and debug

The hard limit is not a problem at all, since most topologies will never be that big. If you can handle Twitter topologies cleanly, it is more than sufficient, I believe.
cheers
/karthik

On May 4, 2018, at 2:31 PM, Thomas Cooper <tom.n.coo...@gmail.com> wrote:

Hi all,

A while ago I emailed about the issue that fields (key) grouped routing in Heron is not consistent across an update. This makes preserving state across an update very difficult, and also makes it difficult or impossible to analyse or predict tuple flows through a current or proposed topology physical plan.

I suggested adopting Storm's approach of pre-defining a routing key space for each component (e.g. 0-999), so that instead of an instance having a single task ID that gets reset at every update (e.g. 10), it has a range of IDs (e.g. 10-16) that changes depending on the parallelism of the component. This has the advantage that a key will always hash to the same task ID for the lifetime of the topology, meaning that recovering state for an instance after a crash or update is just a case of pulling the state linked to the keys in its task ID range.
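[Editor's note: a sketch of the fixed-key-space scheme described above, under stated assumptions: a predefined key space of 1000 task IDs per component, split into contiguous ranges across instances. `task_id` and `ranges_for` are illustrative names, not Heron code. The key point is that a key's task ID never changes; only which instance owns the containing range changes with parallelism.]

```python
# Sketch: a component has a fixed key space of 1000 task IDs. A key always
# hashes to the same task ID for the lifetime of the topology; an update
# only reassigns contiguous task-ID ranges to the new set of instances.
import hashlib

KEY_SPACE = 1000  # predefined per component, never changes

def task_id(key: str) -> int:
    # Deterministic hash for illustration; Python's built-in hash() is
    # salted per process, so use a stable digest instead.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % KEY_SPACE

def ranges_for(parallelism: int) -> list:
    """Split the key space into one contiguous task-ID range per instance."""
    step = KEY_SPACE // parallelism
    return [range(i * step,
                  KEY_SPACE if i == parallelism - 1 else (i + 1) * step)
            for i in range(parallelism)]

# task_id("heron") is identical before and after an update...
tid = task_id("heron")

# ...only the instance that owns that task ID changes with parallelism,
# so recovery is just: pull the state for the keys in your new range.
owner_at_4 = next(i for i, r in enumerate(ranges_for(4)) if tid in r)
owner_at_7 = next(i for i, r in enumerate(ranges_for(7)) if tid in r)
```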
I know the above proposal has issues, not least of all placing a hard upper limit on the scale-out of a component, and that some alternative ideas are being floated for solving the stateful update issue. However, I just wanted to throw some more weight behind the Storm approach. There was a recent paper about high-performance network load balancing <https://blog.acolyer.org/2018/05/03/stateless-datacenter-load-balancing-with-beamer/> that describes an approach using a fixed key space similar to Storm's (see the section called Stable Hashing: they assign a range 100x the expected connection pool size, which we could do with Heron to prevent ever hitting the upper scaling limit). Also, this new load balancer, Beamer, claims to be twice as fast as Google's Maglev <https://blog.acolyer.org/2016/03/21/maglev-a-fast-and-reliable-software-network-load-balancer/>, which again uses a pre-defined key space and ID ranges to create look-up tables deterministically.
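[Editor's note: a purely illustrative calculation of the Beamer-style sizing rule mentioned above (key space = ~100x the expected maximum parallelism); nothing in Heron implements this today, and the function name is hypothetical.]

```python
# Illustrative only: size the predefined key space at 100x the largest
# parallelism ever expected for a component, as the Beamer paper does for
# its connection pool, so the hard scaling limit is never hit in practice.
def key_space_size(expected_max_parallelism: int, factor: int = 100) -> int:
    return expected_max_parallelism * factor

# Even a very large component (say 1000 instances, the biggest seen in
# practice in this thread) leaves ample headroom:
print(key_space_size(1000))  # 100000
```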
I know a load balancer is a different beast from a stream grouping, but there are some interesting ideas in those papers (the links point to summary blog posts, so you don't have to read the whole papers).

Anyway, I just thought I would throw those papers out there and see what people think.

Tom Cooper
W: www.tomcooper.org.uk | Twitter: @tomncooper <https://twitter.com/tomncooper>