So is this horse dead now, 'cuz I wanna take a turn hitting it... First of all, this thread brings up two separate messaging concepts:
1) at-least-once delivery
2) message acknowledgement

For #1 - oslo.messaging cannot guarantee that messages will not be duplicated, specifically in the case of multiple consumers on the same topic. In that case, oslo.messaging can only dedup on a per-consumer basis because a consumer is unaware of what its peers have received. Therefore if a re-transmit is sent to a different consumer than the original transmit (think lost Ack), both consumers will regard the message as non-duplicate and process it.

For #2, I'll go on the record and say that ack-before-process is inherently broken. The acknowledgment is used to inform the messaging subsystem (note I didn't say 'sender') that the receiver of the message has assumed ownership of the message. It's a transfer of control thing. The acknowledgment should only be sent when the consuming application has completed processing the message. Can oslo.messaging assume that on behalf of the consumer? I don't think it should. Acking a message that hasn't been fully processed will negatively affect the message window maintained by the message bus, possibly leading to over-delivery.

Having said that, a proper acking mechanism would allow for asynchronous acking - sending the ack at a later time or from another thread completely. As Mehdi pointed out, this would require some significant changes to the oslo.messaging codebase.

My arm is tired - this is one big horse.

Thanks

On Tue, Jun 7, 2016 at 2:48 AM, Renat Akhmerov <renat.akhme...@gmail.com> wrote:
>
> On 04 Jun 2016, at 04:16, Doug Hellmann <d...@doughellmann.com> wrote:
>
> Excerpts from Joshua Harlow's message of 2016-06-03 09:14:05 -0700:
>
> Deja, Dawid wrote:
>
> On Thu, 2016-05-05 at 11:08 +0700, Renat Akhmerov wrote:
>
> On 05 May 2016, at 01:49, Mehdi Abaakouk <sil...@sileht.net> wrote:
>
> On 2016-05-04 10:04, Renat Akhmerov wrote:
>
> No problem. Let’s not call it RPC (btw, I completely agree with that).
> But it’s one of the messaging patterns and hence should be under
> oslo.messaging I guess, no?
>
> Yes and no, we currently have two APIs (rpc and notification). And
> personally I regret to have the notification part in oslo.messaging.
>
> RPC and Notification are different beasts, and both are today limited
> in terms of feature because they share the same driver implementation.
>
> Our RPC errors handling is really poor, for example Nova just put
> instance in ERROR when something bad occurs in oslo.messaging layer.
> This enforces deployer/user to fix the issue manually.
>
> Our Notification system doesn't allow fine grain routing of message,
> everything goes into one configured topic/queue.
>
> And now we want to add a new one... I'm not against this idea,
> but I'm not a huge fan.
>
> Thoughts from folks (mistral and oslo)?
>
> Also, I was not at the Summit, should I conclude the Tooz+taskflow
> approach (that ensure the idempotent of the application within the
> library API) have not been accepted by mistral folks ?
>
> Speaking about idempotency, IMO it’s not a central question that we
> should be discussing here. Mistral users should have a choice: if they
> manage to make their actions idempotent it’s excellent, in many cases
> idempotency is certainly possible, btw. If no, then they know about
> potential consequences.
>
> You shouldn't mix the idempotency of the user task and the idempotency
> of a Mistral action (that will at the end run the user task).
> You can have your Mistral task runner implementation idempotent and just
> make the workflow to use configurable in case the user task is
> interrupted or badly finished even if the user task is idempotent or not.
> This makes the thing very predictable.
> You will know for example:
> * if the user task has started or not,
> * if the error is due to a node power cut when the user task runs,
> * if you can safely retry a not idempotent user task on an other node,
> * you will not be impacted by rabbitmq restart or TCP connection issues,
> * ...
>
> With the oslo.messaging approach, everything will just end up in a
> generic MessageTimeout error.
>
> The RPC API already have this kind of issue. Applications have unfortunately
> dealt with that (and I think they want something better now).
> I'm just not convinced we should add a new "working queue" API in
> oslo.messaging for tasks scheduling that have the same issue we already
> have with RPC.
>
> Anyway, that's your choice, if you want rely on this poor structure, I will
> not be against, I'm not involved in Mistral. I just want everybody is aware
> of this.
>
> And even in this case there’s usually a number
> of measures that can be taken to mitigate those consequences (reruning
> workflows from certain points after manually fixing problems, rollback
> scenarios etc.).
>
> taskflow allows to describe and automate this kind of workflow really
> easily.
>
> What I’m saying is: let’s not make that crucial decision now about
> what a messaging framework should support or not, let’s make it more
> flexible to account for variety of different usage scenarios.
>
> I think the confusion is in the "messaging" keyword, currently oslo.messaging
> is a "RPC" framework and a "Notification" framework on top of 'messaging'
> frameworks.
>
> Messaging framework we uses are 'kombu', 'pika', 'zmq' and 'pingus'.
>
> It’s normal for frameworks to give more rather than less.
>
> I disagree, here we mix different concepts into one library, all concepts
> have to be implemented by different 'messaging framework',
> So we fortunately give less to make thing just works in the same way with all
> drivers for all APIs.
>
> One more thing, at the summit we were discussing the possibility to
> define at-most-once/at-least-once individually for Mistral tasks. This
> is demanded because there cases where we need to do it, advanced users
> may choose one or another depending on a task/action semantics.
> However, it won’t be possible to implement w/o changes in the
> underlying messaging framework.
>
> If we goes that way, oslo.messaging users and Mistral users have to be aware
> that their job/task/action/whatever will perhaps not be called (at-most-once)
> or perhaps called twice (at-least-once).
>
> The oslo.messaging/Mistral API and docs must be clear about this behavior to
> not having bugs open against oslo.messaging because script written via Mistral
> API is not executed as expected "sometimes".
> "sometimes" == when deployers have trouble with its rabbitmq (or whatever)
> broker and even just when a deployer restart a broker node or when a TCP
> issue occurs. At this end the backtrace in theses cases always trows only
> oslo.messaging trace (the well known MessageTimeout...).
>
> Also oslo.messaging is already a fragile brick used by everybody that
> a very small subset of people maintain (thanks to them).
>
> I'm afraid that adding such new API will increase the needed
> maintenance for this lib while currently not many people care about
> (the whole lib not the new API).
>
> I also wonder if other project have the same needs (that always help
> to design a new API).
>
> Mehdi,
>
> What are you proposing? Can you confirm that we should be just dealing
> with this problem on our own in Mistral? If so, that works well for
> us. Initially we didn’t want to switch to oslo.messaging from direct
> access to RabbitMQ for this and also other reasons. But we got a
> strong feedback from the community that said “you guys need to reuse
> technologies from the community and hence switch to oslo.messaging”.
> So we did, assuming that we would fix all needed issues in
> oslo.messaging relatively soon. Now it’s been ~2 years since then and
> we keep struggling with all that stuff.
>
> When I see these discussions again and again where people try to
> convince that at-least-one delivery is a bad thing I can’t participate
> in them anymore. We spent a lot of time thinking about it and
> experimenting with it and know all pros and cons.
>
> Renat Akhmerov
> @Nokia
>
> Maybe this could be resolved in oslo.messaging by following one of
> Python slogans /we are all responsible users here/ [1].
>
> What I'm proposing is to let the consumer of the message decide when to
> send ACK, because it knows best when to do so. I can think of scenarios
> when it is required to send ACK in a middle of message process e.g.
> after receiving message I want to store it in the DB before sending an
> ACK and send it when message is safely stored. Having that we could
> implement whatever delivery model we want in mistral on top of
> oslo.messaging.
>
> From my understanding (and some of the oslo.messaging folks can correct
> me if I am wrong); but they (the oslo.messaging maintainers) don't feel
> comfortable allowing such a option to be made possible because of how
> doing such a thing alters the principles of oslo.messaging and increases
> the complexity of the code-base (and subsequent testing, bug reports,
> feature support that come along with enabling such a thing).
>
> Thus why I think the preference was to have this model (which isn't
> really the `rpc` kind of model that oslo.messaging has been targeting at
> that point, but is more like a work-queue) be in another library with a
> clear API that explicitly is targeted at this kind of model.
> Thus instead of having a multi-personality codebase with hidden features like
> this (say in oslo.messaging) instead it gets its own codebase and API
> that is 'just right' (or more close to being 'right') for it's concept
> (vs trying to stuff it into oslo.messaging).
>
> What happened to the idea of adding new functions at the level of the
> call & cast functions we have now, that work with at-least-once instead
> of at-most-once semantics? Yes this is a different sort of use case, but
> it's still "messaging".
>
> The idea I think is dead. Joshua essentially told the reasons in the
> previous message.
>
> Renat Akhmerov
> @Nokia
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

--
Ken Giusti (kgiu...@gmail.com)
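
A minimal sketch of point #1 from the top of this message, under stated assumptions: the `Consumer` class, its `deliver` method, and the message id `msg-1` are hypothetical names made up for illustration and are not part of the oslo.messaging API. It only shows why dedup state that is local to each consumer cannot catch a re-transmit that gets routed to a peer.

```python
class Consumer:
    """Hypothetical consumer that dedups only against ids it has seen itself."""

    def __init__(self, name):
        self.name = name
        self.seen_ids = set()   # dedup state is local; peers cannot see it
        self.processed = []     # payloads this consumer actually acted on

    def deliver(self, msg_id, payload):
        """Process the message unless this consumer has already seen its id.

        Returns True if the message was processed, False if it was dropped
        as a duplicate.
        """
        if msg_id in self.seen_ids:
            return False
        self.seen_ids.add(msg_id)
        self.processed.append(payload)
        return True


def demo():
    a = Consumer("a")
    b = Consumer("b")

    # Original transmit lands on consumer a; assume its ack is then lost.
    first = a.deliver("msg-1", "do-work")

    # A re-transmit to the same consumer is recognised and dropped.
    same_consumer = a.deliver("msg-1", "do-work")

    # A re-transmit routed to peer b looks brand new to b, so it runs again.
    other_consumer = b.deliver("msg-1", "do-work")

    return first, same_consumer, other_consumer
```

Calling `demo()` shows the work being done once on `a`, rejected on the repeat to `a`, and done a second time on `b`. Avoiding that would need dedup state shared across consumers or an idempotent handler, neither of which this sketch attempts.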