Hi, I am getting some really poor search performance out of our repository and need some help understanding what I am doing wrong.
I will try to give as much detail as possible:

1) We are trying to implement an Atom repository (see the CND file):
   http://www.nabble.com/file/p22865321/atom_node_types.cnd atom_node_types.cnd

2) Search performance is VERY slow using the test below. It appears that the problem is not the searches themselves but the loading of the nodes in the returned NodeIterator, which is extremely slow. Does this not load lazily?

    @Test
    public final void testSearchPerformance() throws Exception {
        // Test jcr:like
        SpringJCRNodeDAO nodeQuery = (SpringJCRNodeDAO) daoFactory.getDAO(Node.class);
        StopWatch watch = new StopWatch();
        String pathPrefix = "//portal/portal/test-collection//element(*,atom:Entry)";
        String searchPrefix = "//element(*,atom:Service)[...@atom:title='portal']/element(*,atom:Workspace)[...@atom:title='portal']/element(*,atom:Collection)[atom:titletext='test-collection']//element(*,atom:Entry)";
        String rootPrefix = "//element(*,atom:Entry)";
        Limits limits = new Limits();
        limits.setMaxRows(10000);
        String query = null;
        QueryResultWrapper<List<Node>> results = null;

        query = pathPrefix + "[jcr:like(fn:lower-case(@atom:titletext),'title_1%')]";
        watch.start();
        results = nodeQuery.get(query, limits.getMaxRows(), limits.getOffset());
        displayTime(query, watch);
        displayResults(query, results);
    }

Results:

    09:43:06,257 [main] INFO SearchPerformanceTest : Query: //portal/portal/test-collection//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] time: 4598
    09:43:06,257 [main] INFO SearchPerformanceTest : Results for: //portal/portal/test-collection//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] size: 1110
    09:43:07,972 [main] INFO SearchPerformanceTest : Query: //portal/portal/test-collection//element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')] time: 1715
    09:43:07,972 [main] INFO SearchPerformanceTest : Results for: //portal/portal/test-collection//element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')] size: 1110
    09:43:09,639 [main] INFO SearchPerformanceTest : Query: //portal/portal/test-collection//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] time: 1667
    09:43:09,639 [main] INFO SearchPerformanceTest : Results for: //portal/portal/test-collection//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] size: 1110
    09:43:11,260 [main] INFO SearchPerformanceTest : Query: //element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')] time: 1605
    09:43:11,260 [main] INFO SearchPerformanceTest : Results for: //element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')] size: 1110
    09:43:12,881 [main] INFO SearchPerformanceTest : Query: //element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] time: 1621
    09:43:12,881 [main] INFO SearchPerformanceTest : Results for: //element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] size: 1110
    09:43:14,518 [main] INFO SearchPerformanceTest : Query: //element(*,atom:Service)[...@atom:title='portal']/element(*,atom:Workspace)[...@atom:title='portal']/element(*,atom:Collection)[atom:titletext='test-collection']//element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')] time: 1637
    09:43:14,518 [main] INFO SearchPerformanceTest : Results for: //element(*,atom:Service)[...@atom:title='portal']/element(*,atom:Workspace)[...@atom:title='portal']/element(*,atom:Collection)[atom:titletext='test-collection']//element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')] size: 1110
    09:43:16,186 [main] INFO SearchPerformanceTest : Query: //element(*,atom:Service)[...@atom:title='portal']/element(*,atom:Workspace)[...@atom:title='portal']/element(*,atom:Collection)[atom:titletext='test-collection']//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] time: 1668
    09:43:16,186 [main] INFO SearchPerformanceTest : Results for: //element(*,atom:Service)[...@atom:title='portal']/element(*,atom:Workspace)[...@atom:title='portal']/element(*,atom:Collection)[atom:titletext='test-collection']//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] size: 1110

The node query does something like:

    public QueryResultWrapper<List<T>> get(String queryString, long limit, long offset, String userId) throws DAOException {
        // check user privs code etc removed
        try {
            QueryManager queryManager = session.getWorkspace().getQueryManager();
            // This code is tied to Jackrabbit as it allows limits
            // and offsets on the queries. The uncommented line is JCR
            // implementation agnostic
            Query query = queryManager.createQuery(queryString, Query.XPATH);
            // QueryResult queryresult = query.execute();
            // ========= Jackrabbit-specific code
            QueryImpl jackRabbitQuery = (QueryImpl) query;
            jackRabbitQuery.setLimit(limit);
            jackRabbitQuery.setOffset(offset);
            jackrabbitQueryResult = (QueryResultImpl) jackRabbitQuery.execute();
            // QueryResult queryresult = jackRabbitQuery.execute();
            // ===== End of Jackrabbit-specific code
            // NodeIterator nodes = queryresult.getNodes();
            NodeIterator nodes = jackrabbitQueryResult.getNodes();
            while (nodes.hasNext()) {
                returnList.add(nodes.nextNode());
            }
        } catch (Exception e) {
            LOG.error(e.getMessage(), e);
            throw new DAOException(e.getMessage());
        }
    }

The equivalent Lucene search through all of the index subdirectories in, say, workspaces/default/index is:

    @Test
    public void testTitle() throws Exception {
        File[] indexes = new File(indexDir).listFiles(new FileFilter() {
            @Override
            public boolean accept(File f) {
                return f.isDirectory();
            }
        });
        StopWatch watch = new StopWatch();
        for (File file : indexes) {
            Directory directory = FSDirectory.getDirectory(file);
            IndexSearcher searcher = new IndexSearcher(directory);
            try {
                System.out.println("Directory: " + directory);
                Term t = new Term("12:FULL:titletext", "*");
                Query query = new WildcardQuery(t);
                watch.start();
                Hits hits = searcher.search(query);
                //showDocs(hits);
                watch.stop();
                System.out.println("Hits: " + hits.length() + " query: " + query + " time: " + watch.getTime());
                watch.reset();
            } finally {
                searcher.close();
                directory.close();
            }
        }
    }

    private void showDocs(Hits hits) throws CorruptIndexException, IOException {
        Document doc;
        for (int i = 0; i < hits.length(); i++) {
            doc = hits.doc(i);
            System.out.println("doc: " + doc.getField("_:UUID").stringValue());
        }
    }

This returns very quickly:

    Directory: org.apache.lucene.store.fsdirect...@c:\repository\workspaces\default\index\_5
    Hits: 601 query: 12:FULL:titletext:* time: 47
    Directory: org.apache.lucene.store.fsdirect...@c:\repository\workspaces\default\index\_b
    Hits: 699 query: 12:FULL:titletext:* time: 16
    Directory: org.apache.lucene.store.fsdirect...@c:\repository\workspaces\default\index\_h
    Hits: 811 query: 12:FULL:titletext:* time: 47
    Directory: org.apache.lucene.store.fsdirect...@c:\repository\workspaces\default\index\_y
    Hits: 199 query: 12:FULL:titletext:* time: 15
    Directory: org.apache.lucene.store.fsdirect...@c:\repository\workspaces\default\index\_z
    Hits: 1000 query: 12:FULL:titletext:* time: 31

I have inserted some debug statements into the code, and it appears that the NodeIterator is the problem: the first call to NodeIterator.hasNext() seems to take seconds with only 1000 nodes. Is there something that can be done about this? The searches are quick, but the loading of the nodes is VERY slow.

Regards,
Dave

--
View this message in context: http://www.nabble.com/Help-with-performace-again-tp22865321p22865321.html
Sent from the Jackrabbit - Users mailing list archive at Nabble.com.
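
P.S. In case it helps anyone reproduce the measurement: below is a minimal sketch of how the first hasNext() call could be timed separately from the rest of the iteration. It is only an illustration, not the debug code from our DAO. IterationTimer, Timing and drain are hypothetical names, and the helper uses nothing but java.util so it can be run without a repository.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical helper (not part of JCR or Jackrabbit): times iteration phases separately. */
final class IterationTimer {

    /** Elapsed milliseconds for each phase, plus the number of items consumed. */
    static final class Timing {
        final long firstHasNextMillis;
        final long remainingMillis;
        final int count;

        Timing(long firstHasNextMillis, long remainingMillis, int count) {
            this.firstHasNextMillis = firstHasNextMillis;
            this.remainingMillis = remainingMillis;
            this.count = count;
        }
    }

    /** Drains the iterator into sink, timing the first hasNext() apart from the rest. */
    static <T> Timing drain(Iterator<T> it, List<? super T> sink) {
        long t0 = System.nanoTime();
        boolean more = it.hasNext();   // any up-front work by the iterator shows up here
        long t1 = System.nanoTime();

        int count = 0;
        while (more) {
            sink.add(it.next());
            count++;
            more = it.hasNext();
        }
        long t2 = System.nanoTime();

        return new Timing((t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, count);
    }

    public static void main(String[] args) {
        // Stand-in data; with a real repository the iterator would come from the query result.
        List<String> sink = new ArrayList<>();
        Timing t = drain(Arrays.asList("a", "b", "c").iterator(), sink);
        System.out.println("first hasNext(): " + t.firstHasNextMillis + " ms, rest: "
                + t.remainingMillis + " ms, items: " + t.count);
    }
}
```

Since NodeIterator extends java.util.Iterator (via RangeIterator), the iterator returned by getNodes() could be passed to drain() directly, with an unchecked warning because it is a raw Iterator. If nearly all of the time lands in the first hasNext() call, that would point at up-front work in the iterator rather than at the per-node loading.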
