Ard Schrijvers wrote:

Since my "stuff//[EMAIL PROTECTED]" gives me 1.200.000 hits, I think it
makes perfect sense to users that, even with our patches and a working
cache, retrieving them all would be slow. But if I set the limit to 1 or
10, I would expect good performance (certainly when you have not
implemented any AccessManager).

But if I set the limit to 1, why would we have to check all 1.200.000 parents to see whether the path is correct?
I'm not quite sure if this is a valid/common use case. I can't imagine doing a query like this without using an "order by" clause. Because without an "order by" you will just get a random node. But if you use an "order by" you need to get all nodes first anyway.

This is not my point. Whether you have an order by or not, Lucene will
compute the score of all hits anyway. So no order by does not mean
that Lucene does not order: it orders on score (but of course you already
know that :-) )
So my point holds with and without order by.

Ok, if you use jcr:contains() it certainly makes sense to use Lucene's default ordering by score. As soon as you order by specific properties like modification date, you won't gain anything. I just wanted to express that this solution only works for a limited number of use cases.
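
To make the trade-off above concrete, here is a minimal sketch of the lazy check, not Jackrabbit code. It assumes the hits already arrive ordered by score and that the hierarchy constraint can be tested per hit; the names `lazy_descendants`, `hits` and `counter` are made up for illustration.

```python
from itertools import islice

def lazy_descendants(hits, ancestor, counter):
    """Yield only hits below `ancestor`, checking a path when it is requested."""
    prefix = ancestor.rstrip("/") + "/"
    for score, path in hits:
        counter["checked"] += 1  # one hierarchy check per hit actually examined
        if path.startswith(prefix):
            yield score, path

# Hypothetical hits, already sorted by descending score as the index would return them.
hits = [(1.0 - i / 1000, f"/{'stuff' if i % 2 == 0 else 'other'}/node{i}")
        for i in range(1000)]

counter = {"checked": 0}
# A limit of 10 stops the generator as soon as ten matching hits were produced.
first_ten = list(islice(lazy_descendants(hits, "/stuff", counter), 10))
```

With a limit of 10, only the hits up to the tenth match are checked. With an "order by" on a property, the full set has to be collected and sorted before the limit can be applied, so the early exit is lost.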

1) The total result size will be very inaccurate until you have fetched the whole result set. Even now it might be inaccurate because of AccessManager checks, but doing the parent-child relation check lazily will make it almost unusable.

You might warn that fetching a total result size is slow. Without having
to know the total, it should not have to be slow.

Ok. That might be unexpected behaviour but is a valid solution.
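
A rough sketch of what that behaviour could look like, under the assumption that the result reports an unknown size (-1 here) until it has been read completely. `LazyResult` and its `accept` callback are hypothetical and not part of any existing API.

```python
class LazyResult:
    """Iterates over hits, filtering lazily; the total is unknown until exhausted."""

    def __init__(self, hits, accept):
        self._hits = iter(hits)
        self._accept = accept      # hypothetical per-hit check (path, access, ...)
        self._fetched = 0
        self._exhausted = False

    def __iter__(self):
        return self

    def __next__(self):
        for hit in self._hits:
            if self._accept(hit):
                self._fetched += 1
                return hit
        self._exhausted = True
        raise StopIteration

    def size(self):
        # -1 signals "not known yet"; the exact count needs a full pass.
        return self._fetched if self._exhausted else -1


result = LazyResult(["/stuff/a", "/other/b", "/stuff/c"],
                    lambda path: path.startswith("/stuff/"))
first = next(result)
size_before = result.size()   # still unknown after one hit
rest = list(result)
size_after = result.size()    # exact once everything was fetched
```

The caller who only wants the first few hits never pays for the total; the caller who asks for the total forces the full pass.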

2) DescendantSelfAxisQueries and ChildAxisQueries are not only used as a final selector but can also be used inside a query like this:

        stuff//[EMAIL PROTECTED]'text' and @foo/count]

You probably can't calculate @foo/count lazily.

@foo/count should probably be foo/@count, shouldn't it? I haven't yet used
DescendantSelfAxisQueries and ChildAxisQueries in this kind of query,
but I see your point.

Yes, I meant foo/@count of course. ;)

I know what you are talking about. That's why I don't use any hierarchical queries at all. My queries all look like:

        //element(*, nt:specific-node-type)[EMAIL PROTECTED]

So I'm distinguishing my nodes only by node type or sometimes mixins instead of by paths.

I already understood that (aware) people are using it like this (but
what about the unaware people?). But suppose I have articles in
different languages, with different initial paths, and I want the 10
last modified from some language. It doesn't make sense that I need to
give the articles for every language a different node type because of
DescendantSelfAxisQueries and ChildAxisQueries.

My current solution is definitely not the way to go for the future!

I also have the idea that it will at least be extremely hard, *but* I
also wanted to emphasize that if we just look at the problem from a
bird's-eye view, we must agree that checking all parent paths doesn't
really make sense in some cases (certainly when the number of hits is
very large).

Agreed ;)

Anyway, perhaps we just have to think a little harder. Not everything
has to be simple :-)

I'll try to think a little harder ;)

Cheers,
Christoph
