Hi, thanks for the response.

I am using 1.5.3 and I have respectDocumentOrder set to false. I did a bit of trawling the forum once I realised it was the NodeIterator, and I found that although respectDocumentOrder=false seemed to have no effect on its own, if I tagged a default order-by clause onto the end of the query I got a LazyNodeIterator rather than a document-ordered one, and this helped a lot with speed. The final query went down from over 16 seconds to return 3300 nodes to 0.8 seconds. So that seems to be the solution: ensure you have an "order by" clause on all queries (or alternatively respectDocumentOrder=false works).

regards,
Dave

daveg0 wrote:
>
> Hi,
>
> I am getting some really poor search performance out of our repository and
> need some help to try to understand what I am doing wrong. I will try to
> give as much detail as possible:
>
> 1) We are trying to implement an Atom repository (see the cnd file):
> http://www.nabble.com/file/p22865321/atom_node_types.cnd
> atom_node_types.cnd
>
> 2) Search performance is VERY slow using the test below. It appears that
> the problem is not the searches themselves but the loading of the nodes in
> the returned NodeIterator, which is extremely slow. Does this not lazily
> load?
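The order-by workaround described in the reply above can be sketched as follows. This is illustrative, not code from the thread: the class name is made up, and `@jcr:score` is just one possible ordering property; any explicit "order by" clause should have the same effect of avoiding the document-ordered iterator.

```java
// Appending an explicit ordering clause to an XPath query string so that
// Jackrabbit hands back a lazy iterator rather than a document-ordered one.
public class OrderByWorkaround {

    // Tag a default order-by clause onto the end of the query.
    static String withDefaultOrder(String xpath) {
        return xpath + " order by @jcr:score descending";
    }

    public static void main(String[] args) {
        String q = "//element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')]";
        // The resulting string is what would be passed to
        // QueryManager.createQuery(..., Query.XPATH).
        System.out.println(withDefaultOrder(q));
    }
}
```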
>
> @Test
> public final void testSearchPerformance() throws Exception {
>     // Test jcr:like
>     SpringJCRNodeDAO nodeQuery = (SpringJCRNodeDAO) daoFactory.getDAO(Node.class);
>     StopWatch watch = new StopWatch();
>     String pathPrefix = "//portal/portal/test-collection//element(*,atom:Entry)";
>     String searchPrefix = "//element(*,atom:Service)[@atom:title='portal']/element(*,atom:Workspace)[@atom:title='portal']/element(*,atom:Collection)[atom:titletext='test-collection']//element(*,atom:Entry)";
>     String rootPrefix = "//element(*,atom:Entry)";
>
>     Limits limits = new Limits();
>     limits.setMaxRows(10000);
>     String query = null;
>     QueryResultWrapper<List<Node>> results = null;
>
>     query = pathPrefix + "[jcr:like(fn:lower-case(@atom:titletext),'title_1%')]";
>     watch.start();
>     results = nodeQuery.get(query, limits.getMaxRows(), limits.getOffset());
>     displayTime(query, watch);
>     displayResults(query, results);
> }
>
> results:
>
> 09:43:06,257 [main] INFO SearchPerformanceTest : Query: //portal/portal/test-collection//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] time: 4598
> 09:43:06,257 [main] INFO SearchPerformanceTest : Results for: //portal/portal/test-collection//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] size: 1110
> 09:43:07,972 [main] INFO SearchPerformanceTest : Query: //portal/portal/test-collection//element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')] time: 1715
> 09:43:07,972 [main] INFO SearchPerformanceTest : Results for: //portal/portal/test-collection//element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')] size: 1110
> 09:43:09,639 [main] INFO SearchPerformanceTest : Query: //portal/portal/test-collection//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] time: 1667
> 09:43:09,639 [main] INFO SearchPerformanceTest : Results for: //portal/portal/test-collection//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] size: 1110
> 09:43:11,260 [main] INFO SearchPerformanceTest : Query: //element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')] time: 1605
> 09:43:11,260 [main] INFO SearchPerformanceTest : Results for: //element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')] size: 1110
> 09:43:12,881 [main] INFO SearchPerformanceTest : Query: //element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] time: 1621
> 09:43:12,881 [main] INFO SearchPerformanceTest : Results for: //element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] size: 1110
> 09:43:14,518 [main] INFO SearchPerformanceTest : Query: //element(*,atom:Service)[@atom:title='portal']/element(*,atom:Workspace)[@atom:title='portal']/element(*,atom:Collection)[atom:titletext='test-collection']//element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')] time: 1637
> 09:43:14,518 [main] INFO SearchPerformanceTest : Results for: //element(*,atom:Service)[@atom:title='portal']/element(*,atom:Workspace)[@atom:title='portal']/element(*,atom:Collection)[atom:titletext='test-collection']//element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')] size: 1110
> 09:43:16,186 [main] INFO SearchPerformanceTest : Query: //element(*,atom:Service)[@atom:title='portal']/element(*,atom:Workspace)[@atom:title='portal']/element(*,atom:Collection)[atom:titletext='test-collection']//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] time: 1668
> 09:43:16,186 [main] INFO SearchPerformanceTest : Results for: //element(*,atom:Service)[@atom:title='portal']/element(*,atom:Workspace)[@atom:title='portal']/element(*,atom:Collection)[atom:titletext='test-collection']//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] size: 1110
>
> The node query does something like:
>
> public
> QueryResultWrapper<List<T>> get(String queryString, long limit,
>         long offset, String userId) throws DAOException {
>
>     // check user privs code etc removed
>     try {
>         QueryManager queryManager = session.getWorkspace().getQueryManager();
>
>         // This code is tied to Jackrabbit as it allows limits
>         // and offsets on the queries. The commented-out lines are JCR
>         // implementation agnostic.
>         Query query = queryManager.createQuery(queryString, Query.XPATH);
>         // QueryResult queryresult = query.execute();
>         // ========= Jackrabbit-specific code
>         QueryImpl jackRabbitQuery = (QueryImpl) query;
>         jackRabbitQuery.setLimit(limit);
>         jackRabbitQuery.setOffset(offset);
>         jackrabbitQueryResult = (QueryResultImpl) jackRabbitQuery.execute();
>         // QueryResult queryresult = jackRabbitQuery.execute();
>         // ===== End of Jackrabbit-specific code
>
>         // NodeIterator nodes = queryresult.getNodes();
>         NodeIterator nodes = jackrabbitQueryResult.getNodes();
>         while (nodes.hasNext()) {
>             returnList.add(nodes.nextNode());
>         }
>     } catch (Exception e) {
>         LOG.error(e.getMessage(), e);
>         throw new DAOException(e.getMessage());
>     }
> }
>
> The equivalent Lucene search through all of the index subdirectories in
> workspaces/default/index is:
>
> @Test
> public void testTitle() throws Exception {
>     File[] indexes = new File(indexDir).listFiles(new FileFilter() {
>         @Override
>         public boolean accept(File f) {
>             return f.isDirectory();
>         }
>     });
>
>     StopWatch watch = new StopWatch();
>     for (File file : indexes) {
>         Directory directory = FSDirectory.getDirectory(file);
>         IndexSearcher searcher = new IndexSearcher(directory);
>         try {
>             System.out.println("Directory: " + directory);
>             Term t = new Term("12:FULL:titletext", "*");
>             Query query = new WildcardQuery(t);
>             watch.start();
>             Hits hits = searcher.search(query);
>             //showDocs(hits);
>             watch.stop();
>             System.out.println("Hits: " + hits.length() + " query: " + query + " time: " + watch.getTime());
>             watch.reset();
>         } finally {
>             searcher.close();
>             directory.close();
>         }
>     }
> }
>
> private void showDocs(Hits hits) throws CorruptIndexException, IOException {
>     Document doc;
>     for (int i = 0; i < hits.length(); i++) {
>         doc = hits.doc(i);
>         System.out.println("doc: " + doc.getField("_:UUID").stringValue());
>     }
> }
>
> This returns very quickly:
>
> Directory: org.apache.lucene.store.FSDirectory@C:\repository\workspaces\default\index\_5
> Hits: 601 query: 12:FULL:titletext:* time: 47
> Directory: org.apache.lucene.store.FSDirectory@C:\repository\workspaces\default\index\_b
> Hits: 699 query: 12:FULL:titletext:* time: 16
> Directory: org.apache.lucene.store.FSDirectory@C:\repository\workspaces\default\index\_h
> Hits: 811 query: 12:FULL:titletext:* time: 47
> Directory: org.apache.lucene.store.FSDirectory@C:\repository\workspaces\default\index\_y
> Hits: 199 query: 12:FULL:titletext:* time: 15
> Directory: org.apache.lucene.store.FSDirectory@C:\repository\workspaces\default\index\_z
> Hits: 1000 query: 12:FULL:titletext:* time: 31
>
> I have inserted some debug statements into the code and it appears that
> the NodeIterator is the problem: the first call to NodeIterator.hasNext()
> seems to take seconds with only 1000 nodes. Is there something that can be
> done about this? The searches are quick but the loading of the nodes is
> VERY slow.
>
> regards,
>
> Dave

--
View this message in context: http://www.nabble.com/Help-with-performace-again-tp22865321p22868029.html
Sent from the Jackrabbit - Users mailing list archive at Nabble.com.
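A toy model of why the first hasNext() call can take seconds, as described in the thread above. This is not Jackrabbit code; all names are made up for illustration. The point is that an iterator which respects document order has to load and sort every hit before it can answer the first call, whereas a lazy iterator only touches a node when it is actually consumed:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Contrast between an eager, document-ordered iterator and a lazy one.
public class IteratorCost {
    static int loads = 0; // counts simulated node loads

    static Integer loadNode(int id) {
        loads++;
        return id;
    }

    // Eager: every hit is loaded and sorted up front, which is roughly
    // what document ordering forces on the whole result set.
    static Iterator<Integer> documentOrdered(List<Integer> ids) {
        List<Integer> all = new ArrayList<>();
        for (int id : ids) {
            all.add(loadNode(id));
        }
        Collections.sort(all);
        return all.iterator();
    }

    // Lazy: a node is loaded only when next() is actually called.
    static Iterator<Integer> lazy(List<Integer> ids) {
        Iterator<Integer> it = ids.iterator();
        return new Iterator<Integer>() {
            public boolean hasNext() { return it.hasNext(); }
            public Integer next() { return loadNode(it.next()); }
        };
    }

    public static void main(String[] args) {
        List<Integer> hits = Arrays.asList(3, 1, 2);

        loads = 0;
        documentOrdered(hits);
        System.out.println("eager loads before iteration: " + loads); // 3

        loads = 0;
        Iterator<Integer> lz = lazy(hits);
        System.out.println("lazy loads before iteration: " + loads); // 0
        lz.next();
        System.out.println("lazy loads after one next(): " + loads); // 1
    }
}
```

With a real repository each "load" is a persistent-store fetch per node, which is where the seconds go when the result set has thousands of hits.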
