Hi,

On Fri, Aug 1, 2014 at 11:58 AM, Carsten Ziegeler <cziege...@apache.org> wrote:
> in my case I have several producers and a single consumer, so your second
> suggestion should do the trick.

Actually, the getChildNodeNames() feature is only available at Oak
level, so you'll need to use JCR's getNodes() call to get something
like this:

    // consumes all entries currently in the queue, call again later to continue
    void consume(Node queue) {
        // collect and sort all current entries in the queue
        Map<String, Node> entries = new TreeMap<String, Node>();
        for (Node node : JcrUtils.getNodes(queue)) {
            entries.put(node.getName(), node);
        }

        // process all current entries in correct order
        for (Node entry : entries.values()) {
            processEntry(entry);
            entry.remove();
        }

        queue.getSession().save();
    }

> Can you make any comments about the performance of such a solution? I
> assume Node#getChildNodeNames() is pretty cheap, what about the addNode,
> getNode, removeNode methods?

The exact performance depends on the storage backend you're using, but
generally I'd expect the performance of the above code to be governed
by the processEntry() call if it needs to do any non-trivial work on
the entries in the queue.

The getChildNodeNames/getNodes call (including iteration over the
results) is probably the most expensive individual call if the queue
is long, but since its performance is O(n) with all Oak backends, and
the constant factor is pretty low, I would expect that part to only
take a fraction of the time spent in the above call.

The addNode/getNode/removeNode calls are O(log n) operations or
faster, so the overall performance of the above method would follow
O(n log n), including the explicit sorting. The more expensive the
processEntry() call is, the closer the performance will be to O(n),
which is the best case limit for n processEntry() calls regardless of
which queue structure is used.

Note that good performance here is achieved by making the consumer
process entries in bulk. If you instead have a need to process entries
only one by one in separate method calls (i.e. you can't keep the
sorted list of entries in memory), you'd end up with an O(n^2)
algorithm to consume all entries in the queue, as each consume call
would need to traverse the list of remaining entries to find the one
that should be processed first.

BR,

Jukka Zitting

Reply via email to