Re: Not abortable slow consumers / stopped processing of messages in a queue

Tim Bain Wed, 29 Oct 2014 08:47:57 -0700

I'm not clear on what behavior you're seeing, because the descriptions you
give (as I understand them) seem contradictory.  You say that the consumer
won't abort, but that you've got a 30-minute client-side abort timeout.
You say that after the intended abort, you know it didn't work because the
consumer didn't resume processing messages, but then you say that there
weren't any messages to process.  Maybe you're describing multiple
independent scenarios with different behavior and I'm just not catching the
difference between them, but I'm not at all clear on what you're seeing.
Can you give us a from-the-top summary?  No need to give the overview or
any config files or log files, just tell us at each step what you expect to
happen and what's actually happening (and how you know).


Also, your first message was all about aborting slow consumers, while your
reply sounds like it's concerned entirely with aborting idle consumers.
Which one's the problem here?  Also, how do you know that a particular idle
consumer isn't being aborted?  The logs tell you the abort is happening;
what's telling you it's not?

1.  If I've understood correctly, you say your business logic will abort
after 30 minutes, independently of any ActiveMQ-initiated abort request.
Is that actually happening?  The logs you've posted don't give any
indication either way (and you say "the same idle consumer can’t be aborted
in a span of 18 hours"), and the behavior you're describing would be more
consistent with your clients not aborting than with them aborting but not
pulling the next message, though of course both are possible.  So make sure
your client's really doing what you think it is.

3.  Can you confirm (via JConsole in the MBeans tab or some other JMX
viewer) that your consumer is still connected to the broker after the
abort?  Also, when your client aborts, how is ActiveMQ being told about the
failure?  (And what ack mode are you using?)
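
If you'd rather script that check than click through JConsole, here is a minimal sketch using only the standard javax.management API. The class and method names (ConsumerCheck, consumersOn) are made up for illustration, and the ObjectName pattern assumes the MBean naming scheme used by ActiveMQ 5.8 and later; older brokers name their MBeans differently, so adjust the pattern to whatever JConsole shows you.

```java
import java.util.Set;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

public class ConsumerCheck {
    /**
     * Lists the consumer MBeans registered for one queue.
     * Assumption: the broker uses the ActiveMQ 5.8+ ObjectName layout.
     * A queue name with special characters would need ObjectName.quote().
     */
    static Set<ObjectName> consumersOn(MBeanServerConnection conn, String queue)
            throws Exception {
        ObjectName pattern = new ObjectName(
                "org.apache.activemq:type=Broker,brokerName=*,"
                + "destinationType=Queue,destinationName=" + queue
                + ",endpoint=Consumer,*");
        // A null query means "no extra filter": return every name matching the pattern.
        return conn.queryNames(pattern, null);
    }
}
```

To run it against a remote broker you would obtain the MBeanServerConnection from JMXConnectorFactory.connect(...) with your broker's JMX service URL. An empty result for generateReportQueue after the abort would mean the broker no longer sees the consumer at all.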

5 & 6.  For you to use the approach I suggested, you'd either have to be OK
losing messages when failures occur or you'd have to persist the message to
a datastore to retry in the case of a failure.  It sounds like neither of
those is appealing, so this may not be an option.
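
For reference, the approach from point 6 (hand the work to a background thread and return after a bounded wait) could look roughly like this. It's a minimal sketch using only java.util.concurrent; the names BoundedWaitHandler and handle are hypothetical, and in a real listener the Runnable would wrap your report generation.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedWaitHandler {
    // Background threads that keep running after the listener has returned.
    private final ExecutorService workers = Executors.newCachedThreadPool();

    /**
     * Runs the work in the background and waits at most the given timeout.
     * Returns true if the work finished in time, false if it is still running.
     */
    public boolean handle(Runnable work, long timeout, TimeUnit unit) throws Exception {
        Future<?> result = workers.submit(work);
        try {
            result.get(timeout, unit);
            return true;
        } catch (TimeoutException e) {
            // Deliberately not cancelled: the work continues in the background
            // while the caller returns and the consumer takes the next message.
            return false;
        }
    }

    /** Stops accepting new work; tasks already running are allowed to finish. */
    public void shutdown() {
        workers.shutdown();
    }
}
```

The trade-off is the one described above: once handle returns and the message is acked, any work still running in the background is lost if the application restarts, unless you've persisted the message somewhere first.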



On Tue, Oct 28, 2014 at 3:42 AM, Marek Dominiak <isdomi...@gmail.com> wrote:

> Hi Tim,
>
> Thank you for your input and sharing your experience and knowledge.
>
>
> tbain98 wrote
> > 1.  In my limited experience with slow consumer abort strategies (using
> > the AbortSlowConsumerStrategy, not the AbortSlowAckConsumerStrategy),
> > I've observed that a client will continue processing the current message
> > even when aborted; the abort seems to allow the broker to get on with its
> > life but doesn't seem to stop the client from finishing what it's doing.
> > If that's what you mean by "AbortSlowAckConsumerStrategy couldn't abort
> > the consumer", then that's in line with what I've observed.  Maybe someone
> > who knows the ActiveMQ client code more intimately will know of a way to
> > interrupt the processing that the client is doing, but if not, you might
> > need to build a max processing time into your client's message-handling
> > logic, to allow your client to stop if it takes too long.
>
>
> First I need to clarify some things:
> - We have defined a transactionTimeout (30 minutes) on the database, so if a
> listener can’t consume a message in less than 30 minutes, an exception is
> thrown and the listener applies the redelivery policy rules. (In this case
> the message is sent back to the broker and the broker schedules one more
> redelivery after 100 seconds; if the message can’t be processed then
> either, the broker sends it to the DLQ.) The whole processing can take
> about 30-35 minutes max.
>
> - About the logs: I posted the logs somewhat selectively - I wanted to
> show that the same idle consumer can’t be aborted in a span of 18 hours.
> Lines like this one:
>
>
> 2014-10-25 00:00:11,455 [host] Scheduler] INFO  AbortSlowConsumerStrategy
> - aborting slow consumer:
> ID:min-p-app02.osl.basefarm.net-36433-1414153506788-1:1:17:7 for
> destination:queue://generateReportQueue
>
> appear every 6 minutes (or so), always with the same consumer id. I cut
> most of the logs to keep it short. There were no messages to consume in
> the queue at that point. The real issue is that
> AbortSlowAckConsumerStrategy couldn’t abort the consumer which was idle. My
> experience with the abort strategy (when it’s working correctly) is the
> same as yours: it doesn’t abort the consumer but politely asks it to abort
> once it has finished processing the current message. But in this case the
> consumer didn’t have anything to process (maybe it only had some acks to
> send back - as I was using JMX to move messages to a DLQ by myself).
>
>
>
>
> tbain98 wrote
> > 2.  Your config seems reasonable for your use-case, though slow consumer
> > abort strategies are generally intended for when a consumer unexpectedly
> > takes a long time, whereas your use case seems like your consumers
> > expectedly but unpredictably take a long time.  But certainly you're
> > using the more appropriate of the two strategies if you're going to use
> > one.
>
> As I mentioned earlier, the max processing time of one message is about 30
> minutes. I agree with you that with so few messages in the queue it’s
> probably better to check acks and not rely on the prefetch buffer.
>
>
> tbain98 wrote
> > 3.  How does queue processing "stop"?  Do you just mean that once both
> > consumers start working on large messages, they're not available to work
> > on small messages?
>
> I mean that when a consumer “stops processing”, none of the messages in the
> queue are consumed at all (both small and bigger ones) - they stay in the
> queue indefinitely (until the whole application is restarted). It happens
> for the consumers on both nodes (1 consumer per node).
>
>
> tbain98 wrote
> > 4.  I'm concerned that by allowing one redelivery of each message, you're
> > setting up a situation where you could tie up both of your consumers (one
> > processing the first delivery, one processing the second for the same
> > message); is message re-delivery something you have to have?
>
> That could be the case. I can try to verify whether I really need
> redelivery here, but from what I remember, with two attempts the reports
> are generated in most cases; with only one attempt the percentage is
> smaller, which requires more manual attention ...
>
>
>
> tbain98 wrote
> > 6.  One thing you might consider is having your client spin off the work
> > of processing a message into a separate thread, and then returning
> > (successfully) after either the thread finishes or some timeout elapses,
> > whichever happens first.  Then when a large message comes in, it will run
> > in the background till it finishes, but it won't prevent the consumer
> > from continuing on without it and it won't cause the broker to redeliver
> > the message to the other consumer and tie up processing.  Obviously your
> > processing algorithm will need to be thread-safe for this to work, but it
> > might give you options without even needing to worry about the
> > AbortSlowConsumerStrategy...  Also, if you've got an algorithm that
> > usually takes under 10 minutes and sometimes takes 18 hours (based on your
> > logs from before you restarted Tomcat), you might want to improve your
> > algorithm, to either speed up the work you're currently doing or find a
> > way to get your answer with less processing (e.g. by only sampling some
> > of your data).  This is obviously very specific to whatever domain you're
> > working in and might not be easy to do, but 18 hours to process a message
> > definitely makes my Spidey senses tingle...
>
> If I understand you correctly I think we can’t use this approach (did I?).
> The whole point of employing JMS for us was to have async processing with
> guarantees. In our system we can have many bugfix releases throughout the
> day, and if one happened before the report was generated, the restart of
> the application would lose the message. I am trying to find a config which
> works automatically in most cases and requires manual developer attention
> only for certain problems.
>
> Once again, thank you for your input.
>
> Regards
> Marek
>
>
>
>
> --
> View this message in context:
> http://activemq.2283324.n4.nabble.com/Not-abortable-slow-consumers-stopped-processing-of-messages-in-a-queue-tp4686721p4686741.html
> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
>
