Re: DescendantSelfAxisWeight ChildAxisQuery performance

Christoph Kiehl Fri, 30 Nov 2007 04:00:22 -0800

Ard Schrijvers wrote:

Since my "stuff//[EMAIL PROTECTED]" gives me 1.200.000 hits, it makes
perfect sense to users, I think, that even with our patches and a
working cache, retrieving them all would be slow. But if I set the
limit to 1 or 10, I would expect good performance (certainly when you
have not implemented any AccessManager).
But, if I set the limit to 1, why would we have to check all 1.200.000
parents to see whether the path is correct?

I'm not quite sure if this is a valid/common use case. I can't imagine
doing a query like this without using an "order by" clause, because
without an "order by" you will just get a random node. But if you use
an "order by" you need to get all nodes first anyway.

This is not my point. Whether you have an order by or not, Lucene will
compute the score of all hits anyway. So, no order by does not mean
that Lucene does not order: it orders on score (but of course you
already know that :-) )
So my point holds with and without order by.
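
The point about a limit of 1 or 10 can be illustrated with a small
sketch. It is written in Python rather than against the Jackrabbit
code, and every name in it (PathChecker, lazy_hits, the toy path
table) is hypothetical. It assumes the hits already arrive ordered by
score and checks the ancestor path of a hit only when that hit is
actually consumed:

```python
from itertools import islice


class PathChecker:
    """Hypothetical stand-in for the parent/ancestor check; counts its calls."""

    def __init__(self, paths):
        self.paths = paths  # node id -> absolute path (toy data)
        self.checks = 0

    def is_descendant(self, node_id, ancestor):
        self.checks += 1
        return self.paths[node_id].startswith(ancestor.rstrip("/") + "/")


def lazy_hits(hits_by_score, checker, ancestor):
    """Yield hits below `ancestor`, checking each path only on demand."""
    for node_id in hits_by_score:
        if checker.is_descendant(node_id, ancestor):
            yield node_id


# Toy index: odd ids live under /stuff, even ids under /other.
N = 100_000
paths = {i: ("/stuff/n%d" % i if i % 2 else "/other/n%d" % i) for i in range(N)}
hits_by_score = list(range(N))  # assumed to be sorted by score already

checker = PathChecker(paths)
first = list(islice(lazy_hits(hits_by_score, checker, "/stuff"), 10))

print(first)           # the ten best-scoring hits below /stuff
print(checker.checks)  # number of path checks actually performed
```

With a limit of 10, only 20 path checks run here instead of one per
hit. This only helps when the score order is the order you want; an
"order by" on a property still needs every matching node.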

Ok, if you use jcr:contains() it certainly makes sense to use Lucene's
default ordering by score. As soon as you are ordering by specific
properties like modification date you won't win anything. I just wanted
to express that this solution only works for a limited number of use
cases.

1) The total result size will be very inaccurate until you have fetched
the whole result set. Even now it might be inaccurate because of
AccessManager checks, but doing a lazy parent-child relation check will
make it almost unusable.
You might warn that fetching a total result size is slow. Without having
to know the total, it should not have to be slow.
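
The trade-off can be sketched as follows, again in Python with purely
hypothetical names (LazyResult, accept). JCR's RangeIterator.getSize()
is allowed to return -1 when the size is unknown, and the sketch
mimics that: the size is reported as unknown until the caller has
fetched everything.

```python
class LazyResult:
    """Hypothetical lazily filtered result; the size is known only at the end."""

    def __init__(self, candidates, accept):
        self._candidates = iter(candidates)
        self._accept = accept  # e.g. the path or access check
        self._returned = 0
        self._exhausted = False

    def __iter__(self):
        return self

    def __next__(self):
        for candidate in self._candidates:
            if self._accept(candidate):
                self._returned += 1
                return candidate
        self._exhausted = True
        raise StopIteration

    def size(self):
        # -1 means "unknown" until every candidate has been checked.
        return self._returned if self._exhausted else -1


result = LazyResult(range(10), lambda n: n % 2 == 0)
size_before = result.size()   # unknown, nothing has been fetched yet
first = next(result)          # cheap: stops at the first accepted hit
rest = list(result)           # fetching everything makes the size exact
size_after = result.size()
```

A caller who asks for the total up front forces the full check; a
caller who only iterates the first few hits never pays for it.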


Ok. That might be unexpected behaviour but is a valid solution.

2) DescendantSelfAxisQueries and ChildAxisQueries are not only used as
a final selector but can also be used inside a query like this:
        stuff//[EMAIL PROTECTED]'text' and @foo/count]

You probably can't calculate @foo/count lazily.

@foo/count should probably be foo/@count, shouldn't it? I haven't yet
used DescendantSelfAxisQueries and ChildAxisQueries in this kind of
query, but I see your point.


Yes, I meant foo/@count of course. ;)

I know what you are talking about. That's why I don't use any
hierarchical queries at all. My queries all look like:
        //element(*, nt:specific-node-type)[EMAIL PROTECTED]
So I'm distinguishing my nodes only by node type or sometimes mixins
instead of by paths.


I already understood that (aware) people are using it like this (but
what about the unaware people?). But suppose I have articles in
different languages, with different initial paths, and I want the 10
last modified from some language. It doesn't make sense that I need to
give the articles for every language a different node type because of
DescendantSelfAxisQueries and ChildAxisQueries.


My current solution is definitely not the way to go for the future!

I also have the idea that it will at least be extremely hard, *but* I
also wanted to emphasize that if we just look at the problem from a
bird's-eye view, we must agree that checking all parent paths doesn't
really make sense in some cases (certainly when the number of hits is
very large).


Agreed ;)

Anyway, perhaps we just have to think a little harder. Not everything
has to be simple :-)


I'll try to think a little harder ;)

Cheers,
Christoph
