José Micó created APLO-206:
------------------------------
Summary: Load balance of job queues
Key: APLO-206
URL: https://issues.apache.org/jira/browse/APLO-206
Project: ActiveMQ Apollo
Issue Type: Improvement
Reporter: José Micó
I would like to have load-balanced job queues, like in ActiveMQ (copied and pasted):
"A queue implements load balancer semantics. A single message will be received
by exactly one consumer. If there are no consumers available at the time the
message is sent it will be kept until a consumer is available that can process
the message. If a consumer receives a message and does not acknowledge it
before closing then the message will be redelivered to another consumer. A
queue can have many consumers with messages load balanced across the available
consumers."
For example, suppose that I send three jobs (j1, j2, j3) to a queue with 2
consumers (c1, c2).
The first job takes 10 seconds to complete; jobs 2 and 3 take only 1 second each.
Consumers are using 'client' ack with credit:1,0 in order to receive only one
job at a time.
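A minimal sketch of the subscription described above, written as a raw STOMP SUBSCRIBE frame. The destination "/queue/jobs" and the subscription id "c1" are made-up names for illustration, and build_subscribe_frame is a hypothetical helper, not part of any client library. The 'ack' and 'credit' header values are the ones mentioned in this report.

```python
def build_subscribe_frame(destination, sub_id):
    """Build a STOMP SUBSCRIBE frame asking for one message at a time.

    'ack:client' means the broker waits for an explicit ACK frame.
    'credit:1,0' is the per-subscription window used in this report,
    so that only one unacknowledged job is outstanding.
    """
    headers = [
        ("id", sub_id),
        ("destination", destination),
        ("ack", "client"),
        ("credit", "1,0"),
    ]
    lines = ["SUBSCRIBE"] + [f"{name}:{value}" for name, value in headers]
    # A STOMP frame is: command, headers, blank line, body, NUL byte.
    return "\n".join(lines) + "\n\n\x00"


if __name__ == "__main__":
    # Hypothetical queue name and subscription id.
    print(repr(build_subscribe_frame("/queue/jobs", "c1")))
```

Each consumer (c1 and c2) would send one such frame with its own id and then ACK each job only after finishing it.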
The desired behaviour of consumers is:
[ 21:30:00 ][ c1 ] Got job 1
[ 21:30:00 ][ c2 ] Got job 2
[ 21:30:01 ][ c2 ] Ack job 2, now idle
[ 21:30:01 ][ c2 ] Got job 3
[ 21:30:02 ][ c2 ] Ack job 3, now idle
[ 21:30:10 ][ c1 ] Ack job 1, now idle
But currently, Apollo does:
[ 21:30:00 ][ c1 ] Got job 1
[ 21:30:00 ][ c2 ] Got job 2
[ 21:30:01 ][ c2 ] Ack job 2, now idle (!) c2 is idle but does not get job 3
[ 21:30:10 ][ c1 ] Ack job 1, now idle
[ 21:30:10 ][ c1 ] Got job 3
[ 21:30:11 ][ c1 ] Ack job 3, now idle
It seems that jobs are assigned to consumers in round-robin fashion at the
moment the broker receives them, not when a consumer becomes idle.
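The two traces above can be reproduced with a small simulation. This is only a sketch of the two dispatch policies as I understand them from the outside, not Apollo's actual code; the function names are made up, and all jobs are assumed to be queued at time 0.

```python
import heapq


def round_robin_at_enqueue(durations, n_consumers):
    """Pin job i to consumer (i mod n) when it arrives at the broker.

    Returns a list of (job, consumer, start, end) tuples.
    """
    free_at = [0] * n_consumers  # time at which each consumer is idle again
    schedule = []
    for job, duration in enumerate(durations, start=1):
        c = (job - 1) % n_consumers
        start = free_at[c]
        free_at[c] = start + duration
        schedule.append((job, c + 1, start, free_at[c]))
    return schedule


def dispatch_on_demand(durations, n_consumers):
    """Hand the next queued job to whichever consumer becomes idle first.

    Returns a list of (job, consumer, start, end) tuples.
    """
    idle = [(0, c) for c in range(1, n_consumers + 1)]  # (idle time, consumer)
    heapq.heapify(idle)
    schedule = []
    for job, duration in enumerate(durations, start=1):
        start, c = heapq.heappop(idle)
        heapq.heappush(idle, (start + duration, c))
        schedule.append((job, c, start, start + duration))
    return schedule


if __name__ == "__main__":
    jobs = [10, 1, 1]  # seconds, as in the example above
    for job, c, start, end in round_robin_at_enqueue(jobs, 2):
        print(f"round-robin: job {job} on c{c} from t={start} to t={end}")
    for job, c, start, end in dispatch_on_demand(jobs, 2):
        print(f"on-demand:   job {job} on c{c} from t={start} to t={end}")
```

With jobs of 10, 1 and 1 seconds and two consumers, the first policy runs job 3 on c1 only after job 1 ends, while the second runs it on c2 from t=1 to t=2.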
If in this example I send 9 jobs of 1 second (instead of 2), for 10 jobs in
total, consumer #1 gets 5 (the slow one plus 4 fast ones) and consumer #2 gets
the remaining 5, when the optimum would be to send the 9 fast jobs to consumer
#2 while consumer #1 is processing the slow one. I know that using 'client' ack
with credit:1,0 is suboptimal from the broker's perspective, but it is the best
way to balance jobs between workers.
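To make the numbers for the 10-job case concrete, here is a rough sketch that counts how many jobs each consumer handles and when it finishes under both policies. The function name and the on_demand flag are made up for illustration, and all jobs are assumed to be in the queue at time 0.

```python
from collections import Counter


def jobs_per_consumer(durations, n_consumers, on_demand):
    """Return ({consumer: job count}, {consumer: finish time})."""
    free_at = {c: 0 for c in range(1, n_consumers + 1)}
    counts = Counter()
    for i, duration in enumerate(durations):
        if on_demand:
            # Next job goes to the consumer that is idle first (ties: lowest id).
            c = min(free_at, key=lambda k: (free_at[k], k))
        else:
            # Job is pinned to a consumer in arrival order, regardless of load.
            c = i % n_consumers + 1
        free_at[c] += duration
        counts[c] += 1
    return dict(counts), free_at


if __name__ == "__main__":
    jobs = [10] + [1] * 9  # one slow job followed by nine fast ones
    print("round-robin:", jobs_per_consumer(jobs, 2, on_demand=False))
    print("on-demand:  ", jobs_per_consumer(jobs, 2, on_demand=True))
```

Under round-robin each consumer gets 5 jobs and consumer #1 finishes at t=14; under on-demand dispatch consumer #2 takes all 9 fast jobs and everything is done by t=10.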
Besides the underutilization of resources, the main problem is that if a
consumer takes too much time to process a job (say, due to a DB lock) it may
block the processing of a bunch of jobs already assigned to it.
...
BTW, impressive piece of work!!! I am really impressed by the completeness of
features, and by how well Apollo behaved when I overloaded it with stress tests :)