Re: Stateful updating and deterministic routing

2018-05-07 Thread Ning Wang
I see. Then the doc I was reading might not be it either. I will ask Maosong then. Thanks for the info! On Mon, May 7, 2018 at 9:23 AM, Bill Graham wrote: > Yeah, that's not it. The stateful scaling part of that doc got lengthy > enough that we broke it into a doc of

Re: Stateful updating and deterministic routing

2018-05-07 Thread Bill Graham
Yeah, that's not it. The stateful scaling part of that doc got lengthy enough that we broke it into a doc of its own, per Sanjeev's suggestion IIRC. The fact that I can't locate it makes me think it was a Twitter doc of mine (although it was not Twitter-specific), which I'm sure was shared with

Re: Stateful updating and deterministic routing

2018-05-07 Thread Ning Wang
Thanks Karthik. The doc is not exactly the same but close enough. It seems my doc is an internal one, so let's use your doc as reference. I will see if there are any major differences and comment. On Sun, May 6, 2018 at 1:22 PM, Karthik Ramasamy wrote: > Here it is > >

Re: Stateful updating and deterministic routing

2018-05-06 Thread Karthik Ramasamy
Here it is https://docs.google.com/document/d/1YDFNvLTX6Sg3WDrNFKiWLaJvuEtK4eyxEaA0w9cVlG4/edit#heading=h.d6uy2uxfs2xq cheers /karthik On Sun, May 6, 2018 at 8:20 AM, Bill Graham wrote: > Can you share the doc please? > > On Sat, May 5, 2018 at 4:18 PM Ning Wang

Re: Stateful updating and deterministic routing

2018-05-06 Thread Bill Graham
Can you share the doc please? On Sat, May 5, 2018 at 4:18 PM Ning Wang wrote: > Thanks. > > Yeah, I have read the design doc. It has a section for scaling and covers > some designs but does not reach this level of detail, I am afraid. > > On Sat, May 5, 2018 at 9:45 AM, Bill

Re: Stateful updating and deterministic routing

2018-05-05 Thread Ning Wang
Thanks. Yeah, I have read the design doc. It has a section for scaling and covers some designs but does not reach this level of detail, I am afraid. On Sat, May 5, 2018 at 9:45 AM, Bill Graham wrote: > The stateful processing design included a large section on scaling, which

Re: Stateful updating and deterministic routing

2018-05-05 Thread Bill Graham
The stateful processing design included a large section on scaling, which was intended to be done as a future phase. It's very similar to what's being described. Sanjeev and I worked on it about 1.5 years ago with Maosong and it was in a Google doc. Sanjeev, do you have that design doc? I can't

Re: Stateful updating and deterministic routing

2018-05-05 Thread Ning Wang
If we go this way, we need a key -> state map for each component so that the state data can be repartitioned. On Fri, May 4, 2018 at 11:44 PM, Karthik Ramasamy wrote: > Instead - if it references > > topology name + component name + key range > > will it be better? > > cheers
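
A minimal sketch of why a per-component key -> state map makes repartitioning possible, assuming keys have already been mapped to integers in a fixed key space and that each instance owns one contiguous key range. The names `key_ranges` and `repartition` and the even-split policy are hypothetical illustrations, not part of Heron.

```python
def key_ranges(key_space: int, num_instances: int) -> list:
    """Split [0, key_space) into contiguous ranges, one per instance (assumed policy)."""
    base, extra = divmod(key_space, num_instances)
    ranges, start = [], 0
    for i in range(num_instances):
        # The first `extra` instances take one additional key each.
        size = base + (1 if i < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def repartition(states: list, key_space: int, new_count: int) -> list:
    """Redistribute per-instance key -> state maps over `new_count` instances."""
    merged = {}
    for instance_state in states:
        merged.update(instance_state)  # each key is owned by exactly one instance
    return [
        {k: v for k, v in merged.items() if k in key_range}
        for key_range in key_ranges(key_space, new_count)
    ]
```

For example, with a key space of 10, scaling from two instances to three moves each key's state to whichever new instance owns that key's range. An opaque per-instance byte array would not allow this, because the state could not be split by key.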

Re: Stateful updating and deterministic routing

2018-05-05 Thread Karthik Ramasamy
Instead - if it references topology name + component name + key range will it be better? cheers /karthik On Fri, May 4, 2018 at 11:23 PM, Ning Wang wrote: > Currently I think each Instance serializes the state object into a byte > array and the checkpoint manager saves the

Re: Stateful updating and deterministic routing

2018-05-05 Thread Ning Wang
Currently I think each Instance serializes the state object into a byte array and the checkpoint manager saves the byte array into a file. The file is referenced by topology name + component name + instance id. On Fri, May 4, 2018 at 11:10 PM, Karthik Ramasamy wrote: > I am not
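
A minimal sketch of the referencing scheme described in this message, assuming an in-memory dict in place of files and `pickle` in place of whatever serializer an instance really uses. The function names and the `/`-joined reference layout are hypothetical, not Heron's checkpoint manager API.

```python
import pickle


def checkpoint_ref(topology: str, component: str, instance_id: int) -> str:
    """Hypothetical reference: topology name + component name + instance id."""
    return f"{topology}/{component}/{instance_id}"


def save_checkpoint(store: dict, topology: str, component: str,
                    instance_id: int, state: dict) -> str:
    """Serialize the whole state object to bytes and store it under its reference."""
    ref = checkpoint_ref(topology, component, instance_id)
    store[ref] = pickle.dumps(state)
    return ref


def load_checkpoint(store: dict, topology: str, component: str,
                    instance_id: int) -> dict:
    """Restore the state object that was saved for exactly this instance id."""
    return pickle.loads(store[checkpoint_ref(topology, component, instance_id)])
```

Because the reference ends in the instance id and the payload is one opaque byte array, a checkpoint can only be handed back to an instance with the same id. That is the sense in which the state is tied to an instance.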

Re: Stateful updating and deterministic routing

2018-05-05 Thread Karthik Ramasamy
I am not sure I understand why the state is tied to an instance? cheers /karthik On Fri, May 4, 2018 at 4:36 PM, Thomas Cooper wrote: > Yeah, state recovery is a bit more difficult with Heron's architecture. In > Storm, the task IDs are not just values used for routing

Re: Stateful updating and deterministic routing

2018-05-04 Thread Thomas Cooper
Yeah, state recovery is a bit more difficult with Heron's architecture. In Storm, the task IDs are not just values used for routing; they actually equate to a task instance within the executor. An executor which currently processes the keys 4-8 actually contains 5 task instances of the same
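
A rough sketch of the Storm-style arrangement described in this message, assuming a fixed number of tasks, a CRC32 hash for key routing, and contiguous assignment of tasks to executors. The constant `NUM_TASKS`, both function names, and the hash choice are illustrative assumptions, not Storm's actual implementation.

```python
import zlib

NUM_TASKS = 12  # assumed to stay fixed for the lifetime of the topology


def task_for_key(key: str, num_tasks: int = NUM_TASKS) -> int:
    """Route a key to a task id; depends only on the key and the task count."""
    return zlib.crc32(key.encode("utf-8")) % num_tasks


def executor_for_task(task_id: int, num_tasks: int, num_executors: int) -> int:
    """Assign contiguous blocks of task ids to executors (assumed policy)."""
    return task_id * num_executors // num_tasks
```

Under these assumptions, changing the executor count only changes which executor hosts a task. The key -> task mapping is untouched, so state kept per task can follow the task to its new executor.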

Re: Stateful updating and deterministic routing

2018-05-04 Thread Neng Lu
+1 for this idea. As long as the predefined key space is large enough, it should work for most cases. Based on my experience with topologies, I have never seen a component with more than 1000 instances in a topology. For recovering states from an update, there will be some problems though.

Re: Stateful updating and deterministic routing

2018-05-04 Thread Ning Wang
Interesting. Thanks for sharing~ On Fri, May 4, 2018 at 2:31 PM, Thomas Cooper wrote: > Hi all, > > A while ago I emailed about the issue of how fields (key) grouped routing > in Heron was not consistent across an update and how this makes preserving > state across an

Stateful updating and deterministic routing

2018-05-04 Thread Thomas Cooper
Hi all, A while ago I emailed about the issue of how fields (key) grouped routing in Heron was not consistent across an update and how this makes preserving state across an update very difficult and also makes it difficult/impossible to analyse or predict tuple flows through a current/proposed