Yeah, custom written RabbitMQ spout. What we're seeing is messages that definitely have been pulled into the topology. We tested with one message, verified that the spout was pulling the message off the queue, and then verified that the collector.ack() call had been made. However, we never saw the spout collector ack method being called -- this is where the logic for doing the RabbitMQ ack'ing is called and why the message is remaining on the queue.
As max spout pending is per task (spout instance), our solution at the moment is to increase parallelism hints on both the spouts and bolts and set max spout pending to a very low number (at the moment 1, but we'll probably increment this slightly higher). Thanks for the input, glad to see this issue isn't unique to our configuration. On Mon, Feb 9, 2015 at 4:59 PM, Michael Rose <[email protected]> wrote: > Is this your own custom RabbitMQ spout? > > Prefetch means there'll be more messages hanging around within the > RabbitMQ client that haven't been injected into Storm yet. We've had this > too -- the messages return to visible once the client is killed. Are you > seeing messages that have definitely been pulled into Storm not being acked? > > I'd guess you weren't seeing acks because without max spout pending you > were flooding your topology however it shouldn't be necessary to set it to > receive acks as long as everything is under capacity. Having max spout > pending is always a good idea to set, otherwise you have no backpressure > mechanism. Something to keep in mind, max spout pending is per spout > instance. > > *Michael Rose* > Senior Platform Engineer > *Full*Contact | fullcontact.com > <https://www.fullcontact.com/?utm_source=FullContact%20-%20Email%20Signatures&utm_medium=email&utm_content=Signature%20Link&utm_campaign=FullContact%20-%20Email%20Signatures> > m: +1.720.837.1357 | t: @xorlev > > > All Your Contacts, Updated and In One Place. > Try FullContact for Free > <https://www.fullcontact.com/?utm_source=FullContact%20-%20Email%20Signatures&utm_medium=email&utm_content=Signature%20Link&utm_campaign=FullContact%20-%20Email%20Signatures> > > On Mon, Feb 9, 2015 at 2:00 PM, Shepherd Kendall < > [email protected]> wrote: > >> Greetings -- >> >> We've been using Storm to pull messages off a RabbitMQ queue for >> processing in our topology. When we added the ack'ing to the spout we're >> using to pull messages off the queue, we see some interesting behavior. >> >> We have verified the following: >> - Sending an anchor on emit from the spout (ala emit(new >> Values("message"), new Ack(deliveryTag)) >> - Sending an ack at the end of the execute(Tuple) method in our bolt >> (only one bolt type in the topology) >> - Set max spout pending to a variety of settings from 1 to 200 >> - Max prefetch on our queue is set to 200 >> - Sending an acknowledgement to RabbitMQ in our ack method on the spout >> >> Initially, we were seeing none of our messages ack (before setting the >> max spout pending). Once we set the max spout pending, we started seeing >> acks, but the spout still left messages unacknowledged on the queue (i.e. a >> final "cleanup" ack wasn't being sent). >> >> So, the questions I have are: does max spout pending need to be set for >> the emit/ack/fail framework to work correctly? Is the behavior of the >> spout to block until the number of tuples ack'ed reaches max spout pending? >> >> Thanks for any help in advance. >> > > -- /shep
