Hello Ard,

We rewrote a part of our virtual path handling, and now store both the virtual path itself, and the lower-case equivalent (we really need the not-lowercased path). All queries are now done on the lowercased virtual path and indeed (!) everything stays fast, even after a million virtual paths. We'll try to keep away from the lower-case function and similar functions.
Thanks very much for all your help!

Dennis

Ard Schrijvers wrote:
> On Mon, Jan 11, 2010 at 1:07 PM, Dennis van der Laan
> <[email protected]> wrote:
>
>> Hello Ard,
>>
>> We rewrote a part of our virtual path handling, and now store both the
>> virtual path itself, and the lower-case equivalent (we really need the
>> not-lowercased path). All queries are now done on the lowercased virtual
>> path and indeed (!) everything stays fast, even after a million virtual
>> paths. We'll try to keep away from the lower-case function and similar
>> functions.
>>
>
> As long as it is a single term lookup in Lucene, it is always fast,
> almost regardless of the number of terms there are.
>
>> Thanks very much for all your help!
>>
>
> You're welcome,
>
> Ard
>
>> Dennis
>>
>> Ard Schrijvers wrote:
>>
>>> On Thu, Dec 17, 2009 at 10:59 PM, Dennis van der Laan
>>> <[email protected]> wrote:
>>>
>>>> Dennis van der Laan wrote:
>>>>
>>>> See the increase of time spent on the execution: 400+ ms instead of 7 ms.
>>>> And this is not a single incident, I see this increase on all queries
>>>> like the above.
>>>>
>>>> The memory of the JVM should not be a problem, it's set to 2 GB and only
>>>> 800 MB is used at the moment the queries are slow. Restarting the
>>>> application does not help either.
>>>>
>>> No, this seems logical to me. The memory is consumed by internal
>>> Lucene term enums. I am quite sure what your issue is, but did not
>>> test it, nor ever tried it myself. But, I have always wondered *how*
>>> the fn:lower-case could have been implemented efficiently in
>>> Jackrabbit. It doesn't fit into my understanding of how inverted
>>> indexes work, which is what Lucene is in the end. So, I am happy that my
>>> understanding was correct, and unhappy that fn:lower-case does (again,
>>> from the top of my head and looking at code only) not scale too well.
>>>
>>> I think in your setup a lot of time is spent in the CaseTermQuery,
>>> which first traverses all your 1 million virtualpaths and lowercases
>>> them. This cannot scale (neither in CPU nor in memory).
>>>
>>> So, would you like to give me an indication about the query execution
>>> time without the fn:lower-case? I think it will drop to < 1 ms.
>>>
>>> I think you should try to get away without using the fn:lower-case if
>>> this works for you. Just make sure that you always store the virtualpath
>>> property as lower-case: then, you are fine.
>>>
>>>> Again, any help will be appreciated.
>>>>
>>> Let me know if this helped,
>>>
>>> Regards Ard
>>>
>>>> Dennis
>>>>
>>>>>> Furthermore, of course, index size matters as well
>>>>>>
>> --
>> Dennis van der Laan
>>

--
Dennis van der Laan
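
The difference discussed in this thread can be sketched with a toy model. This is not Jackrabbit or Lucene code: the class ToyIndex and its method names are hypothetical, and a plain Python dict stands in for an inverted index's term dictionary. It only illustrates why lowercasing every stored term at query time grows with the number of virtual paths, while a lookup on a second, pre-lowercased property stays a single term lookup.

```python
class ToyIndex:
    """Toy stand-in for an inverted index: term -> set of document ids.

    Hypothetical illustration only; real Lucene term dictionaries are
    sorted on-disk structures, not Python dicts.
    """

    def __init__(self):
        self.exact = {}  # terms of the original-case virtual path property
        self.lower = {}  # terms of the extra, pre-lowercased property

    def add(self, doc_id, virtual_path):
        # Store both forms at write time, as described in the thread.
        self.exact.setdefault(virtual_path, set()).add(doc_id)
        self.lower.setdefault(virtual_path.lower(), set()).add(doc_id)

    def find_by_scan(self, wanted):
        # Query-time lowercasing: every stored term is visited and
        # lowercased, so the cost grows with the number of terms.
        wanted = wanted.lower()
        hits = set()
        for term, docs in self.exact.items():
            if term.lower() == wanted:
                hits |= docs
        return hits

    def find_by_term(self, wanted):
        # Single term lookup on the pre-lowercased property.
        return set(self.lower.get(wanted.lower(), ()))


index = ToyIndex()
index.add(1, "/Docs/Report.PDF")
index.add(2, "/docs/other.txt")
index.add(3, "/DOCS/report.pdf")

# Both strategies return the same documents; only the work done differs.
print(index.find_by_scan("/docs/report.pdf") == index.find_by_term("/docs/report.pdf"))
```

The trade-off is the one accepted above: one extra stored property per node in exchange for queries that never have to touch terms they do not match.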
