On Tue, 15 Jul 2014, Sandy Walsh wrote:
> This looks like a particular schema for one event-type (let's say "foo.sample"). It's hard to extrapolate this one schema to a generic set of common metadata applicable to all events. Really the only common stuff we can agree on is the stuff already there: tenant, user, server, message_id, request_id, timestamp, event_type, etc.
This is pretty much what I'm trying to figure out. We can relatively easily agree on a small set of stuff (like what you mention). Presumably there are three more sets:

* special keys that could be changed to something more generally meaningful if we tried hard enough
* special keys that really are special and must be saved as such
* special keys that nobody cares about and can be tossed

Everybody thinks their own stuff is special[1], but it is often the case that it's not. In your other message you linked to http://paste.openstack.org/show/54140/ which shows some very complicated payloads (but only gets through the first six events). Is there related data (even speculative) on how many of those keys are actually used?

And just looking at the paste (and the problem) generally: does it make sense for the accessors in the dictionaries (the keys) to be terms which are specific to the producer? Obviously that will increase the appearance of disjunction between different events. A different representation might not be as problematic. Or maybe I'm completely wrong; just thinking out loud.
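To sketch the kind of partitioning I mean (the exact key names here are illustrative, loosely based on the common fields listed above):

```python
# Sketch: split one notification payload into the agreed-upon common
# metadata and the producer-specific remainder. Key names are
# assumptions for illustration, not a settled schema.
COMMON_KEYS = {'tenant_id', 'user_id', 'server', 'message_id',
               'request_id', 'timestamp', 'event_type'}

def partition_payload(payload):
    """Return (common, special) dicts for one notification payload."""
    common = {k: v for k, v in payload.items() if k in COMMON_KEYS}
    special = {k: v for k, v in payload.items() if k not in COMMON_KEYS}
    return common, special

sample = {
    'message_id': 'abc-123',
    'event_type': 'foo.sample',
    'timestamp': '2014-07-15T12:00:00Z',
    'disk_gb': 20,                 # producer-specific, maybe generalizable
    'weird_internal_flag': True,   # producer-specific, maybe tossable
}
common, special = partition_payload(sample)
```

The open question is then just how big the `special` half has to stay once we try hard on the first of the three sets above.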
> This way, we can keep important notifications in a priority queue and handle them accordingly (since they hold important data), but let the samples get routed across less-reliable transports (like UDP) via the RoutingNotifier.
Presumably a more robust, uh, contract for notifications will allow them to be dispatched (and re-dispatched) more effectively.
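For instance, if the contract guaranteed a priority field, a dispatcher could choose a transport per message. This is purely illustrative, not the actual RoutingNotifier API:

```python
# Sketch: route notifications by priority. The transport names and the
# 'priority' field are assumptions for illustration, not the real
# oslo.messaging configuration.
RELIABLE = 'rabbit'   # durable queue for important notifications
LOSSY = 'udp'         # fire-and-forget for high-volume samples

def pick_transport(notification):
    """Return a transport name based on the notification's priority."""
    if notification.get('priority', 'info') in ('audit', 'error', 'critical'):
        return RELIABLE
    return LOSSY
```

With something like that, re-dispatch is just a matter of re-running the same decision over a stored message.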
> Also, send the samples one-at-a-time and let them either a) drop on the floor (udp) or b) let the aggregator roll them up into something smaller (sliding window, etc). Making these large notifications contain a list of samples means we have to store state somewhere on the server until transmission time. Ideally that's something we wouldn't want to rely on.
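To make sure I'm reading this right, here's a minimal sketch of the receiver-side roll-up being described (all names hypothetical):

```python
from collections import deque

class SlidingWindowAggregator:
    """Roll individual samples up into a small summary on the receiving
    side, so producers can fire one sample at a time and keep no state."""

    def __init__(self, window_size=10):
        # deque with maxlen silently discards the oldest sample
        self.window = deque(maxlen=window_size)

    def add(self, sample):
        self.window.append(sample)

    def summary(self):
        """Collapse the current window into one small record."""
        values = [s['value'] for s in self.window]
        return {
            'count': len(values),
            'min': min(values),
            'max': max(values),
            'avg': sum(values) / len(values),
        }

agg = SlidingWindowAggregator(window_size=3)
for v in (1.0, 2.0, 3.0, 4.0):   # oldest sample falls out of the window
    agg.add({'value': v})
```

The nice property is that all the state lives with the consumer, which is exactly where loss (UDP) or compaction (the window) is cheapest to handle.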
I've wondered about this too. Is there history for why some of the notifications which include samples are rolled-up lists instead of being fired off one at a time? It seems like that will hurt parallelism opportunities.

[1] There's vernacular here that I'd prefer to use, but this is a family mailing list.

--
Chris Dent tw:@anticdent freenode:cdent https://tank.peermore.com/tanks/cdent

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev