On 09/23/2014 09:29 PM, Fox, Kevin M wrote:
> Flavio wrote:
>> "The reasoning, as explained in another email, is that from a use-case
>> perspective, strict ordering won't hurt you if you don't need it whereas
>> having to implement it in the client side because the service doesn't
>> provide it can be a PITA."
>
> The reasoning is flawed though. If performance is a concern, having strict
> ordering costs you when you may not care!
>
> For example, is it better to implement a video streaming service on TCP or
> UDP if firewalls aren't a concern? The latter. Why? Because ordering is a
> problem for these systems! If you have frames 1, 2, and 3, and frame 2
> gets lost on the first transmit and needs resending, but 3 gets there, the
> system has to wait to display frame 3 until frame 2 arrives. But by the
> time frame 2 gets there, frame 3 doesn't matter because the system needs
> to move on to frame 5 now. The human eye doesn't care to wait for
> retransmits of frames; it only cares about the now. So because of the
> ordering, the eye sees 3 dropped frames instead of just one, making the
> system worse, not better.
>
> Yeah, I know it's a bit of a silly example. No one would implement video
> streaming on top of messaging like that. But it does present the point
> that something that seemingly only provides good things (order is always
> better than disorder, right?) sometimes has unintended and negative side
> effects. In lossless systems, it can show up as unnecessary latency or
> higher CPU loads.
>
> I think your option 1 will make Zaqar much more palatable to those that
> don't need the strict ordering requirement.
>
> I'm glad you want to make hard things like guaranteed ordering available
> so that users don't have to deal with it themselves if they don't want
> to. It's a great feature. But it is also an anti-feature in some cases.
> The ramifications of its requirements are higher than you think, and a
> feature to just disable it shouldn't be very costly to implement.
>
> Part of the controversy right now, I think, has been not understanding
> the use case here, and by insisting that FIFO is only ever positive, it
> makes others that know its negatives question what other assumptions were
> made in Zaqar and makes them a little gun-shy.
>
> Please do reconsider this stance.
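The head-of-line blocking Kevin describes can be sketched in a few lines of Python; note that the arrival times and deadlines below are made-up illustrative numbers, not anything from the thread:

```python
def frames_missed(arrival, deadline, ordered):
    """Count frames that miss their display deadline.

    arrival[i]  -- time frame i actually arrives (a retransmitted
                   frame arrives late)
    deadline[i] -- latest time frame i is still worth displaying
    ordered     -- if True, frame i can't be shown until every earlier
                   frame has arrived, so one late frame stalls its
                   successors too (head-of-line blocking)
    """
    missed = 0
    latest = 0.0  # arrival time of the slowest frame seen so far
    for i, t in enumerate(arrival):
        latest = max(latest, t)
        show_at = latest if ordered else t
        if show_at > deadline[i]:
            missed += 1
    return missed

# Frame 2 (index 1) is lost and retransmitted, arriving at t=4.0.
arrival  = [1.0, 4.0, 2.0, 3.0]
deadline = [1.5, 2.5, 3.5, 3.9]

print(frames_missed(arrival, deadline, ordered=False))  # 1: only frame 2
print(frames_missed(arrival, deadline, ordered=True))   # 3: frames 2-4 stall
```

With unordered delivery only the late frame is lost; with ordered delivery its two successors stall past their deadlines as well, which is the "3 dropped frames instead of just one" effect.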
Hey Kevin,

FWIW, I explicitly said "from a use-case perspective", which in the context
of the emails I was replying to referred to the need (or not) for FIFO and
not to the impact it has on other areas like performance. In no way did I
try to insist that FIFO is only ever positive, and I've explicitly said in
several other emails that it *does* have an impact on performance.

That said, I agree that if FIFO's reality in Zaqar changes, it'll likely be
towards option (1).

Thanks for your feedback,
Flavio

> Thanks,
> Kevin
>
> ________________________________________
> From: Flavio Percoco [[email protected]]
> Sent: Tuesday, September 23, 2014 5:58 AM
> To: [email protected]
> Subject: Re: [openstack-dev] [Zaqar] Zaqar and SQS Properties of
> Distributed Queues
>
> On 09/23/2014 10:58 AM, Gordon Sim wrote:
>> On 09/22/2014 05:58 PM, Zane Bitter wrote:
>>> On 22/09/14 10:11, Gordon Sim wrote:
>>>> As I understand it, pools don't help scaling a given queue since all
>>>> the messages for that queue must be in the same pool. At present,
>>>> traffic through different Zaqar queues is essentially entirely
>>>> orthogonal streams. Pooling can help scale the number of such
>>>> orthogonal streams, but to be honest, that's the easier part of the
>>>> problem.
>>>
>>> But I think it's also the important part of the problem. When I talk
>>> about scaling, I mean 1 million clients sending 10 messages per second
>>> each, not 10 clients sending 1 million messages per second each.
>>
>> I wasn't really talking about high throughput per producer (which I
>> agree is not going to be a good fit), but about e.g. a large number of
>> subscribers for the same set of messages, e.g. publishing one message
>> per second to 10,000 subscribers.
>>
>> Even at much smaller scale, expanding from 10 subscribers to say 100
>> seems relatively modest, but the subscriber-related load would increase
>> by a factor of 10.
>> I think handling these sorts of changes is also an important part of
>> the problem (though perhaps not a part that Zaqar is focused on).
>>
>>> When a user gets to the point that individual queues have massive
>>> throughput, it's unlikely that a one-size-fits-all cloud offering like
>>> Zaqar or SQS is _ever_ going to meet their needs. Those users will
>>> want to spin up and configure their own messaging systems on Nova
>>> servers, and at that kind of size they'll be able to afford to. (In
>>> fact, they may not be able to afford _not_ to, assuming
>>> per-message-based pricing.)
>>
>> [...]
>>>> If scaling the number of communicants on a given communication
>>>> channel is a goal however, then strict ordering may hamper that. If
>>>> it does, it seems to me that this is not just a policy tweak on the
>>>> underlying datastore to choose the desired balance between ordering
>>>> and scale, but a more fundamental question on the internal structure
>>>> of the queue implementation built on top of the datastore.
>>>
>>> I agree with your analysis, but I don't think this should be a goal.
>>
>> I think it's worth clarifying that alongside the goals, since scaling
>> can mean different things to different people. The implication then is
>> that there is some limit in the number of producers and/or consumers on
>> a queue beyond which the service won't scale and applications need to
>> design around that.
>
> Agreed. The above is not part of Zaqar's goals. That is to say that each
> store knows best how to distribute reads and writes itself. Nonetheless,
> drivers can be very smart about this and be implemented in ways that get
> the most out of the backend.
>
>>> Note that the user can still implement this themselves using
>>> application-level sharding - if you know that in-order delivery is not
>>> important to you, then randomly assign clients to a queue and then
>>> poll all of the queues in round-robin. This yields _exactly_ the same
>>> semantics as SQS.
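The sharding pattern Zane describes might look roughly like the following sketch, with plain Python lists standing in for queues; `ShardedClient` and its method names are hypothetical, not any real Zaqar or SQS API:

```python
import random

class ShardedClient:
    """Sketch of application-level sharding: producers write to a
    randomly chosen shard (each shard individually FIFO), and consumers
    poll shards round-robin. Global ordering is given up, as with SQS.
    """

    def __init__(self, shards):
        self.shards = shards  # per-shard FIFO queues (here: plain lists)
        self._cursor = 0      # round-robin position for polling

    def post(self, message):
        # Random shard assignment destroys total order across producers.
        random.choice(self.shards).append(message)

    def claim(self):
        # Poll each shard once, round-robin; return the first message found.
        for _ in range(len(self.shards)):
            shard = self.shards[self._cursor]
            self._cursor = (self._cursor + 1) % len(self.shards)
            if shard:
                return shard.pop(0)
        return None  # all shards empty
```

Each shard is still FIFO on its own, so a producer that sticks to one shard keeps its own ordering; only the interleaving across shards is unordered.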
>>
>> You can certainly leave the problem of scaling in this dimension to the
>> application itself by having them split the traffic into orthogonal
>> streams or hooking up orthogonal streams to provide an aggregated
>> stream.
>>
>> A true distributed queue isn't entirely trivial, but it may well be
>> that most applications can get by with a much simpler approximation.
>>
>> Distributed (pub-sub) topic semantics are easier to implement, but if
>> the application is responsible for keeping the partitions connected,
>> then it also takes on part of the burden for availability and
>> redundancy.
>>
>>> The reverse is true of SQS - if you want FIFO, then you have to
>>> implement re-ordering by sequence number in your application. (I'm
>>> not certain, but it also sounds very much like this situation is ripe
>>> for losing messages when your client dies.)
>>>
>>> So the question is: in which use case do we want to push additional
>>> complexity into the application? The case where there are truly
>>> massive volumes of messages flowing to a single point? Or the case
>>> where the application wants the messages in order?
>>
>> I think the first case is more generally about increasing the number
>> of communicating parties (publishers or subscribers or both).
>>
>> For competing consumers, ordering isn't usually a concern since you
>> are processing in parallel anyway (if it is important, you need some
>> notion of message grouping within which order is preserved and some
>> stickiness between group and consumer).
>>
>> For multiple non-competing consumers, the choice needn't be as simple
>> as total ordering or no ordering at all. Many systems quite naturally
>> only define partial ordering, which can be guaranteed more scalably.
>>
>> That's not to deny that there are indeed cases where total ordering
>> may be required, however.
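For comparison, the client-side re-ordering an SQS-style consumer would need might look like this minimal sketch (the class name is illustrative; it deliberately ignores the crash/ack hazard Zane points out):

```python
class SequenceReorderer:
    """Buffer out-of-order messages and release them in sequence order.

    Minimal sketch only: a real consumer would also need timeouts and
    re-delivery so that messages buffered here aren't lost when the
    client dies, which is exactly the hazard noted in the thread.
    """

    def __init__(self, first_seq=0):
        self.expected = first_seq  # next sequence number to release
        self.pending = {}          # seq -> message, held until its turn

    def push(self, seq, message):
        """Accept one message; return whatever is now deliverable in order."""
        self.pending[seq] = message
        ready = []
        while self.expected in self.pending:
            ready.append(self.pending.pop(self.expected))
            self.expected += 1
        return ready
```

Note the buffering cost: a single missing sequence number holds back every later message, which is the same head-of-line trade-off discussed above, just paid in the application instead of the service.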
>>
>>> I'd suggest both that the former applications are better able to
>>> handle that extra complexity and that the latter applications are
>>> probably more common. So it seems that the Zaqar team made a good
>>> decision.
>>
>> If that was a deliberate decision, it would be worth clarifying in the
>> goals. It seems to be a different conclusion from that reached by SQS,
>> and as such is part of the answer to the question that began the
>> thread.
>>
>>> (Aside: it follows that Zaqar probably should have a maximum
>>> throughput quota for each queue; or that it should report usage
>>> information in such a way that the operator could sometimes bill more
>>> for a single queue than they would for the same amount of usage
>>> spread across multiple queues; or both.)
>>>
>>>> I also get the impression, perhaps wrongly, that providing the
>>>> strict ordering guarantee wasn't necessarily an explicit
>>>> requirement, but was simply a property of the underlying
>>>> implementation(?).
>
> The team decided to add FIFO based on the feedback from a group of SQS
> users. FIFO is one of the things the team decided to work on to
> differentiate Zaqar from SQS. The reasoning, as explained in another
> email, is that from a use-case perspective, strict ordering won't hurt
> you if you don't need it, whereas having to implement it on the client
> side because the service doesn't provide it can be a PITA.
>
> The first feedback the team got about FIFO was back at the Portland
> summit. The users attending the un-conference provided such feedback,
> which the team then brought up on the list[0]. The thread is very
> specific to mongodb's case, though, and it does not represent the
> current implementation, but it does show the team's intention to gather
> feedback on this feature back then.
>
> I believe the guarantee is still useful and it currently does not
> represent an issue for the service or the user. Two things could happen
> to FIFO in the future:
>
> 1. It's made optional and we allow users to opt in on a per-flavor
>    basis. (I personally don't like this one because it makes
>    interoperability even harder.)
> 2. It's removed completely. (Again, I personally don't like this one
>    because I don't think we have strong enough cases to require this
>    to happen.)
>
> That said, there's just one thing I think will happen for now: it'll be
> kept as-is unless there are strong cases that'd require (1) or (2). All
> this should be considered in the discussion of the API v2, whenever
> that happens.
>
> [0]
> http://lists.openstack.org/pipermail/openstack-dev/2013-April/007650.html
>
> Cheers,
> Flavio
>
> P.S.: again, sorry for all the late and mixed replies; an email
> disaster happened and I'm working on recovering it.
>
> --
> @flaper87
> Flavio Percoco
>
> _______________________________________________
> OpenStack-dev mailing list
> [email protected]
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

--
@flaper87
Flavio Percoco

_______________________________________________
OpenStack-dev mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
