Dennis van der Laan wrote:
> Hi Ard,
>
>> Hello Dennis,
>>
>> On Fri, Dec 11, 2009 at 11:24 AM, Dennis van der Laan
>> <[email protected]> wrote:
>>
>>> Hi Ard,
>>>
>>> Thanks! The performance went up by a factor of 10. Still not what I hoped
>>> for, but I'm not sure the query itself is still a problem.
>>
>> So now it is 100 ms? That is still not too fast. What is your query?

Some logging:

2009-12-17 15:51:42,102 DEBUG (208340) [jcr.JcrFileSystem] - created vpath query string: //element(*,nt:unstructured)[fn:lower-case(@cms:virtualPath) = '/_definition/shared/schemas/include/banner.xsd']
2009-12-17 15:51:42,102 DEBUG (208340) [jcr.JcrFileSystem] - vpath query object created
2009-12-17 15:51:42,109 DEBUG (208340) [jcr.JcrFileSystem] - vpath query executed
2009-12-17 15:51:42,109 DEBUG (208340) [jcr.JcrFileSystem] - vpath node iterator created
2009-12-17 15:51:42,109 DEBUG (208340) [jcr.JcrFileSystem] - vpath query done

Then, several hours later:

2009-12-17 22:49:44,533 DEBUG ( ) [jcr.JcrFileSystem] - created vpath query string: //element(*,nt:unstructured)[fn:lower-case(@cms:virtualPath) = '/fwn/onderwijs/roosters/2007/wi/overzicht/overzicht_4.xml']
2009-12-17 22:49:44,534 DEBUG ( ) [jcr.JcrFileSystem] - vpath query object created
2009-12-17 22:49:44,977 DEBUG ( ) [jcr.JcrFileSystem] - vpath query executed
2009-12-17 22:49:44,977 DEBUG ( ) [jcr.JcrFileSystem] - vpath node iterator created
2009-12-17 22:49:44,977 DEBUG ( ) [jcr.JcrFileSystem] - vpath query done

Note the increase in time spent on execution: 400+ ms instead of 7 ms.
This is not a single incident; I see the same increase on all queries
like the one above. JVM memory should not be the problem: the heap is set
to 2 GB and only 800 MB is in use at the moment the queries are slow.
Restarting the application does not help either.

Again, any help will be appreciated.

Dennis

>> Furthermore, of course, index size matters as well
>>
>
> Triggered by your remark on index size, I created a new repository and
> started filling it up with nodes which have a virtual path property
> (cms:virtualPath). At a certain point, I see a significant degradation
> of the performance.
> I made a thread dump to see what the VM was doing
> and found this stack trace:
>
>    java.lang.Thread.State: RUNNABLE
>     at java.io.RandomAccessFile.readBytes(Native Method)
>     at java.io.RandomAccessFile.read(RandomAccessFile.java:322)
>     at org.apache.lucene.store.FSDirectory$FSIndexInput.readInternal(FSDirectory.java:596)
>     - locked <0x85523040> (a org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor)
>     at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:136)
>     at org.apache.lucene.index.CompoundFileReader$CSIndexInput.readInternal(CompoundFileReader.java:247)
>     at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:157)
>     at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:116)
>     at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:92)
>     at org.apache.lucene.index.TermBuffer.read(TermBuffer.java:82)
>     at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:127)
>     at org.apache.lucene.index.SegmentMergeInfo.next(SegmentMergeInfo.java:65)
>     at org.apache.lucene.index.MultiSegmentReader$MultiTermEnum.next(MultiSegmentReader.java:494)
>     at org.apache.lucene.search.FilteredTermEnum.next(FilteredTermEnum.java:67)
>     at org.apache.jackrabbit.core.query.lucene.CaseTermQuery$CaseTermEnum.<init>(CaseTermQuery.java:146)
>     at org.apache.jackrabbit.core.query.lucene.CaseTermQuery.getEnum(CaseTermQuery.java:53)
>     at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:55)
>     at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:383)
>     at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:383)
>     at org.apache.jackrabbit.core.query.lucene.JackrabbitIndexSearcher.evaluate(JackrabbitIndexSearcher.java:99)
>     at org.apache.jackrabbit.core.query.lucene.JackrabbitIndexSearcher.execute(JackrabbitIndexSearcher.java:84)
>     at org.apache.jackrabbit.core.query.lucene.SearchIndex.executeQuery(SearchIndex.java:760)
>     at org.apache.jackrabbit.core.query.lucene.SingleColumnQueryResult.executeQuery(SingleColumnQueryResult.java:66)
>     at org.apache.jackrabbit.core.query.lucene.QueryResultImpl.getResults(QueryResultImpl.java:298)
>     at org.apache.jackrabbit.core.query.lucene.SingleColumnQueryResult.<init>(SingleColumnQueryResult.java:58)
>     at org.apache.jackrabbit.core.query.lucene.QueryImpl.execute(QueryImpl.java:131)
>     at org.apache.jackrabbit.core.query.QueryImpl.execute(QueryImpl.java:177)
>
> Could this mean that there is not enough memory for the Lucene indexes
> and that the indexes are read from disk all the time?
> Any idea how large the indexes will become? I have no idea what the
> internals of Lucene look like. The virtual paths have an average string
> length of about 50 characters and we end up having about 1 million of
> these properties.
>
> Thanks for any help!
>
> Dennis
>
>>> A related question: could it be that when a query returns no results,
>>> this is slower than when it does return a result? Might it have
>>> something to do with Lucene not having an index for that particular
>>> property value?
>>
>> No, an inverted index structure does not suffer from this
>>
>> Regards Ard
>>
>>>> Hello Dennis,
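
For reference, the execution times quoted in the logging above can be read directly off the timestamps of the "vpath query object created" and "vpath query executed" lines. The following is only a minimal sketch of that arithmetic; the class name VpathQueryTiming and the method elapsedMillis are made up for illustration and are not part of Jackrabbit or of the application that produced the log.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Jackrabbit: computes the time between
// two log timestamps in the layout used above ("2009-12-17 15:51:42,102").
public class VpathQueryTiming {

    // Date, time, then a comma followed by milliseconds.
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss,SSS");

    /** Returns the number of milliseconds from start to end. */
    static long elapsedMillis(String start, String end) {
        LocalDateTime from = LocalDateTime.parse(start, LOG_TIME);
        LocalDateTime to = LocalDateTime.parse(end, LOG_TIME);
        return Duration.between(from, to).toMillis();
    }

    public static void main(String[] args) {
        // Timestamps copied from the two log excerpts:
        // "vpath query object created" -> "vpath query executed".
        long fast = elapsedMillis("2009-12-17 15:51:42,102",
                                  "2009-12-17 15:51:42,109");
        long slow = elapsedMillis("2009-12-17 22:49:44,534",
                                  "2009-12-17 22:49:44,977");
        System.out.println("early query: " + fast + " ms"); // 7 ms
        System.out.println("later query: " + slow + " ms"); // 443 ms
    }
}
```

This only measures the gap between two log lines, so it includes whatever else happens between creating the query object and the execute call returning; it says nothing about where inside Lucene the time goes.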
