On 18/04/17 08:44 AM, Julien Danjou wrote:
> Live upgrade never has been supported in Gnocchi, so I don't see how
> that's a problem. It'd be cool to support it for sure, but we're far
> from having been able to implement it at any point in time in the past.
> So it's not a new issue or anything like that. I really don't see
> a problem with loading the number of sacks at startup.

it's a problem if you don't do a full shut down of every single gnocchi
service. my main concern is that this is not a 'lose data' situation like
if you screw up any of the options. this will corrupt your storage.

i'll ignore discussion of live upgrade for now to not get sidetracked.

> I think it's worth it only if you use replicas – and I don't think 2 is
> enough, I'd try 3 at least, and make it configurable. It'll reduce a lot
> lock-contention (e.g. by 17x time in my previous example).

i could get the same reduction in lock contention if i added basic
partitioning :P

> As far as I'm concerned, since the number of replicas is configurable,
> you can add a knob that would set replicas=number_of_metricd_worker that
> would implement the current behaviour you implemented – every worker
> tries to grab every sack.

do we want it configurable? tbh, would anyone configure it or know how to
configure it? even for us, we're just guessing somewhat. lol

i'm going to leave it static for now.

> We're not leveraging the re-balancing aspect of hashring, that's true.
> We could probably use any dumber system to spread sacks across workers,
> We could stick to the good ol' "len(sacks) / len(workers in the group)".
>
> But I think there's a couple of things down the road that may help us:
> Using the hashring makes sure worker X does not jump from sacks [A, B,
> C], to [W, X, Y, Z] but just to [A, B] or [A, B, C, X]. That should
> minimize lock contention when bringing up/down new workers. I admit it's
> a very marginal win, but… it comes free with it.
> Also, I envision a push based approach in the future (to replace the
> metricd_processing_delay) which will require worker to register to
> sacks. Making sure the rebalancing does not shake everything but is
> rather smooth will also reduce workload around that. Again, it comes
> free.

this is not really free. choosing hashring means we will have idle
workers and the added complexity of figuring out what each of the other
agents in the group looks like. it's a trade-off (especially considering
how few keys to nodes we have), which is why i brought up the question.

i'll be honest, i'll probably still switch back to hashring... but i want
to make sure we're not just thinking hashring only.

--
gord
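
P.S. A rough sketch of the trade-off discussed above, for illustration only. It compares the "len(sacks) / len(workers in the group)" split with a toy consistent-hash ring and counts how many sacks change owner when one worker joins. This is not Gnocchi's or tooz's code: `chunk_assign`, `HashRing`, `ring_assign`, `moved`, the worker names and the 128-sack count are all made up for the example.

```python
import bisect
import hashlib


def _hash(key):
    # Stable hash so results do not depend on PYTHONHASHSEED.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def chunk_assign(num_sacks, workers):
    """Naive split: contiguous chunks of about len(sacks) / len(workers)."""
    return {sack: workers[sack * len(workers) // num_sacks]
            for sack in range(num_sacks)}


class HashRing(object):
    """Toy consistent-hash ring (hypothetical, not the tooz class)."""

    def __init__(self, workers, points_per_worker=64):
        # Each worker owns several points on the ring to smooth the spread.
        self._ring = sorted(
            (_hash("%s-%d" % (w, i)), w)
            for w in workers for i in range(points_per_worker))
        self._keys = [h for h, _ in self._ring]

    def owner(self, sack):
        # First point clockwise from the sack's hash, wrapping around.
        idx = bisect.bisect(self._keys, _hash("sack-%d" % sack))
        return self._ring[idx % len(self._ring)][1]


def ring_assign(num_sacks, workers):
    ring = HashRing(workers)
    return {sack: ring.owner(sack) for sack in range(num_sacks)}


def moved(before, after):
    """Count sacks whose owner changed between two assignments."""
    return sum(1 for sack in before if before[sack] != after[sack])


if __name__ == "__main__":
    sacks = 128  # assumed sack count, only for the demo
    old = ["w%d" % i for i in range(4)]
    new = old + ["w4"]
    print("chunked :", moved(chunk_assign(sacks, old),
                             chunk_assign(sacks, new)))
    print("hashring:", moved(ring_assign(sacks, old),
                             ring_assign(sacks, new)))
```

With the ring, a sack either keeps its owner or moves to the new worker, so existing workers never trade sacks among themselves. The cost is that the ring does not guarantee an even split: with few sacks relative to workers, some workers can end up with none.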
