Hi Helen,

I think you and Gordon might be talking about different uses of selectors: yours being the case where you try to pick off particular messages from an existing backlog at a specific time, and Gordon's suggestion being more about using selectors on all your long-lived consumers to consume the messages as they arrive, removing the need to pick out specific messages later.
Picking individual messages off a queue with new consumers using a selector is never likely to be that fast, because as you say it might have to evaluate every message first to find a match (assuming there is one). On the other hand, when using [a number of] long-lived consumers aimed at [collectively] consuming all the messages on the queue, the selectors are simply evaluated during the regular process of attempting to deliver each message to the available consumers. The overhead then moves from checking every message specifically until one matches, which may be dependent on the queue depth, to the regular competition between particular subscriptions for accepting the messages as they are processed, which in some ways isn't. I also presume that, as discussed on the other thread, you could currently be seeing a performance hit due to using 'shared groups' with unique'ish keys; more on this later.

Particular distributions of selectors could be used to ensure that at most K consumers could ever process particular messages like X, and thus that at most K do so at any one time. However, it would also mean that if those particular consumers were busy processing messages like Y, which they can also consume, then other consumers could sit idle because their selectors indicate they can't process messages like X. It would need to be considered as a balance, governed by whatever it is you are looking to achieve by limiting the maximum number of concurrent consumers for a given type of message.

The same approach could technically be used across multiple queues, which might also use groups (which is a little weird to write, since groups are usually used to prevent concurrent delivery), but doing so would add an additional element to the 'idle consumers' balancing problem, wherein at a given time some sources might have messages of interest and others might not.
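To make the 'at most K consumers' trade-off a bit more concrete, here is a rough sketch in plain Python. It is only a toy model and not Qpid or JMS code: the function names (make_selectors, deliver), the message types "X" and "Y", and the 'first idle matching consumer wins' delivery rule are all assumptions made for illustration, and a real broker's delivery order may differ.

```python
def make_selectors(num_consumers, k, limited_type="X"):
    """Build one selector (a predicate on message type) per consumer.

    Only the first k consumers accept `limited_type`; every consumer
    accepts all other types. (Hypothetical helper, not a Qpid API.)
    """
    def accepts_all(msg_type):
        return True

    def rejects_limited(msg_type):
        return msg_type != limited_type

    return [accepts_all if i < k else rejects_limited
            for i in range(num_consumers)]


def deliver(queue, selectors, busy=frozenset()):
    """Offer each queued message, in order, to the first idle consumer
    whose selector matches it.

    Returns (assignments, remaining): assignments maps consumer index to
    the message type it received; remaining lists messages nobody could take.
    """
    assignments = {}
    remaining = []
    for msg_type in queue:
        for i, selector in enumerate(selectors):
            idle = i not in busy and i not in assignments
            if idle and selector(msg_type):
                assignments[i] = msg_type
                break
        else:
            # No idle consumer had a matching selector.
            remaining.append(msg_type)
    return assignments, remaining


selectors = make_selectors(num_consumers=4, k=2)

# Three X messages and one Y: only consumers 0 and 1 may take X.
assigned, left = deliver(["X", "X", "X", "Y"], selectors)
print(assigned, left)

# Consumers 0 and 1 are busy (e.g. with Y messages), so consumers 2 and 3
# sit idle while the X messages wait.
assigned_busy, left_busy = deliver(["X", "X"], selectors, busy={0, 1})
print(assigned_busy, left_busy)
```

In the first call one X message is left waiting even though consumer 3 is free, and in the second call nothing is delivered at all, which is the 'idle consumers' side of the balance.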
The above, and any other specific ideas people have, might all depend on what your messages, groups, and queues are actually like. E.g. are some higher volume than others, are some more important than others, do you know what the groups are ahead of time, etc.

If I am reading them correctly, it seems like all of the original options in this thread allow for a message to go round in circles between the various queues, potentially forever. There also seems to be scope for interesting ordering effects from re-enqueuing messages. Given that ordering is another key reason for using message grouping, what are your actual ordering requirements?

On a related-but-not note, see the other thread for further discussion around improvements for queues with 'shared groups'.

Robbie

On 17 January 2014 21:38, Helen Kwong <[email protected]> wrote:

> Hi Gordon,
>
> In the tests that we've run, the time it takes to dequeue messages using
> selectors seems to increase with the depth of the queue. Since the number
> of unprocessed messages can sometimes be quite high (e.g., >200000), if
> they are all on the same queue and we use selectors, the dequeue time will
> increase by a lot (e.g., 3-4 seconds if we're selecting the 200000th
> message), and the performance hit is probably too much for us. Is there a
> way to dequeue using selectors quickly from a high-depth queue?
>
> Helen
>
>
> On Fri, Jan 17, 2014 at 2:40 AM, Gordon Sim <[email protected]> wrote:
>
> > On 01/16/2014 07:20 PM, Helen Kwong wrote:
> >
> >> Hi Qpid users / experts,
> >>
> >> I need to limit the number of consumers concurrently processing messages
> >> considered to be in the same group, across multiple queues, and was
> >> wondering if anyone has ideas about how to do it. We’re using the Java
> >> broker and client, and have multiple queues, each with multiple listeners,
> >> each listener’s session listening to multiple queues. Some messages are
> >> associated with groups, and for a given group we want at most K listeners
> >> processing messages from the group at any given time. The messages are
> >> enqueued to multiple queues, and it’s possible for messages from the same
> >> group to be in different queues.
> >>
> >> If messages in the same group can go into only one queue, then the message
> >> groups feature will give us what we need (it’d work directly with K = 1 and
> >> with K > 1 we can tweak the grouping value, e.g., hash it to one of 1 to K
> >> and append the number to the grouping value). But since messages considered
> >> to be in the same group can be in different queues, the feature is not
> >> enough for our case.
> >>
> >
> > Instead of multiple queues, could you have one queue with different
> > selectors pulling subsets of the messages?
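
For reference, the grouping-value tweak described in the original message quoted above (hash to one of 1 to K and append the number to the grouping value) might look roughly like the sketch below. This is plain Python rather than Qpid client code; the function name sub_group, the "-" separator, and the choice to hash a per-message key with CRC32 are all assumptions, since the original message does not say what is hashed or how.

```python
import zlib


def sub_group(group_id: str, message_key: str, k: int) -> str:
    """Map a message to one of k sub-groups of its group.

    Each sub-group is delivered to at most one consumer at a time by the
    message groups feature, so a group split into k sub-groups has at most
    k consumers working on it concurrently (within a single queue).
    """
    # CRC32 is used only because it is stable across runs; any
    # deterministic hash of the key would do here.
    bucket = zlib.crc32(message_key.encode("utf-8")) % k + 1
    return f"{group_id}-{bucket}"


# Hypothetical usage: compute the grouping value before sending a message.
print(sub_group("orders", "msg-42", 3))
```

Note that this only bounds concurrency within one queue, which is the limitation the original message points out.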
