Hi, thanks for the response.

I am using 1.5.3 and I have respectDocumentOrder set to false. I did a bit of trawling the forum once I realised it was the NodeIterator, and I found that although respectDocumentOrder=false seemed to have no effect on its own, if I tagged a default order-by clause onto the end of the query I got a LazyNodeIterator rather than a document-ordered one, and this helped a lot with speed. The final query went down from over 16 seconds to return 3300 nodes to 0.8 seconds. So that seems to be the solution: ensure you have an "order by" clause on all queries (or alternatively respectDocumentOrder=false works).

regards,
Dave

daveg0 wrote:
>
> Hi,
>
> I am getting some really poor search performance out of our repository and
> need some help to try to understand what I am doing wrong. I will try to
> give as much detail as possible:
>
> 1) We are trying to implement an Atom repository (see the cnd file):
> http://www.nabble.com/file/p22865321/atom_node_types.cnd
> atom_node_types.cnd
>
> 2) Search performance is VERY slow using the test below. It appears that
> the problem is not the searches themselves but the loading of the nodes in
> the returned NodeIterator, which is extremely slow. Does this not lazily
> load?
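The order-by workaround described in the reply above can be sketched as follows. This is illustrative, not code from the thread: the class name is made up, and `@jcr:score` is just one possible ordering property; any explicit "order by" clause should have the same effect of avoiding the document-ordered iterator.

```java
// Appending an explicit ordering clause to an XPath query string so that
// Jackrabbit hands back a lazy iterator rather than a document-ordered one.
public class OrderByWorkaround {

    // Tag a default order-by clause onto the end of the query.
    static String withDefaultOrder(String xpath) {
        return xpath + " order by @jcr:score descending";
    }

    public static void main(String[] args) {
        String q = "//element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')]";
        // The resulting string is what would be passed to
        // QueryManager.createQuery(..., Query.XPATH).
        System.out.println(withDefaultOrder(q));
    }
}
```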
>
> @Test
> public final void testSearchPerformance() throws Exception {
>     // Test jcr:like
>     SpringJCRNodeDAO nodeQuery = (SpringJCRNodeDAO) daoFactory.getDAO(Node.class);
>     StopWatch watch = new StopWatch();
>     String pathPrefix = "//portal/portal/test-collection//element(*,atom:Entry)";
>     String searchPrefix = "//element(*,atom:Service)[@atom:title='portal']/element(*,atom:Workspace)[@atom:title='portal']/element(*,atom:Collection)[atom:titletext='test-collection']//element(*,atom:Entry)";
>     String rootPrefix = "//element(*,atom:Entry)";
>
>     Limits limits = new Limits();
>     limits.setMaxRows(10000);
>     String query = null;
>     QueryResultWrapper<List<Node>> results = null;
>
>     query = pathPrefix + "[jcr:like(fn:lower-case(@atom:titletext),'title_1%')]";
>     watch.start();
>     results = nodeQuery.get(query, limits.getMaxRows(), limits.getOffset());
>     displayTime(query, watch);
>     displayResults(query, results);
> }
>
> results:
>
> 09:43:06,257 [main] INFO SearchPerformanceTest : Query: //portal/portal/test-collection//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] time: 4598
> 09:43:06,257 [main] INFO SearchPerformanceTest : Results for: //portal/portal/test-collection//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] size: 1110
> 09:43:07,972 [main] INFO SearchPerformanceTest : Query: //portal/portal/test-collection//element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')] time: 1715
> 09:43:07,972 [main] INFO SearchPerformanceTest : Results for: //portal/portal/test-collection//element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')] size: 1110
> 09:43:09,639 [main] INFO SearchPerformanceTest : Query: //portal/portal/test-collection//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] time: 1667
> 09:43:09,639 [main] INFO SearchPerformanceTest : Results for: //portal/portal/test-collection//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] size: 1110
> 09:43:11,260 [main] INFO SearchPerformanceTest : Query: //element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')] time: 1605
> 09:43:11,260 [main] INFO SearchPerformanceTest : Results for: //element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')] size: 1110
> 09:43:12,881 [main] INFO SearchPerformanceTest : Query: //element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] time: 1621
> 09:43:12,881 [main] INFO SearchPerformanceTest : Results for: //element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] size: 1110
> 09:43:14,518 [main] INFO SearchPerformanceTest : Query: //element(*,atom:Service)[@atom:title='portal']/element(*,atom:Workspace)[@atom:title='portal']/element(*,atom:Collection)[atom:titletext='test-collection']//element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')] time: 1637
> 09:43:14,518 [main] INFO SearchPerformanceTest : Results for: //element(*,atom:Service)[@atom:title='portal']/element(*,atom:Workspace)[@atom:title='portal']/element(*,atom:Collection)[atom:titletext='test-collection']//element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')] size: 1110
> 09:43:16,186 [main] INFO SearchPerformanceTest : Query: //element(*,atom:Service)[@atom:title='portal']/element(*,atom:Workspace)[@atom:title='portal']/element(*,atom:Collection)[atom:titletext='test-collection']//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] time: 1668
> 09:43:16,186 [main] INFO SearchPerformanceTest : Results for: //element(*,atom:Service)[@atom:title='portal']/element(*,atom:Workspace)[@atom:title='portal']/element(*,atom:Collection)[atom:titletext='test-collection']//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')] size: 1110
>
> The node query does something like:
>
> public
> QueryResultWrapper<List<T>> get(String queryString, long limit,
>         long offset, String userId) throws DAOException {
>
>     // check user privs code etc removed
>     try {
>         QueryManager queryManager = session.getWorkspace().getQueryManager();
>
>         // This code is tied to Jackrabbit as it allows limits
>         // and offsets on the queries. The commented-out lines are JCR
>         // implementation agnostic.
>         Query query = queryManager.createQuery(queryString, Query.XPATH);
>         // QueryResult queryresult = query.execute();
>         // ========= Jackrabbit-specific code
>         QueryImpl jackRabbitQuery = (QueryImpl) query;
>         jackRabbitQuery.setLimit(limit);
>         jackRabbitQuery.setOffset(offset);
>         jackrabbitQueryResult = (QueryResultImpl) jackRabbitQuery.execute();
>         // QueryResult queryresult = jackRabbitQuery.execute();
>         // ===== End of Jackrabbit-specific code
>
>         // NodeIterator nodes = queryresult.getNodes();
>         NodeIterator nodes = jackrabbitQueryResult.getNodes();
>         while (nodes.hasNext()) {
>             returnList.add(nodes.nextNode());
>         }
>     } catch (Exception e) {
>         LOG.error(e.getMessage(), e);
>         throw new DAOException(e.getMessage());
>     }
> }
>
> The equivalent Lucene search through all of the index subdirectories in
> workspaces/default/index is:
>
> @Test
> public void testTitle() throws Exception {
>     File[] indexes = new File(indexDir).listFiles(new FileFilter() {
>         @Override
>         public boolean accept(File f) {
>             return f.isDirectory();
>         }
>     });
>
>     StopWatch watch = new StopWatch();
>     for (File file : indexes) {
>         Directory directory = FSDirectory.getDirectory(file);
>         IndexSearcher searcher = new IndexSearcher(directory);
>         try {
>             System.out.println("Directory: " + directory);
>             Term t = new Term("12:FULL:titletext", "*");
>             Query query = new WildcardQuery(t);
>             watch.start();
>             Hits hits = searcher.search(query);
>             //showDocs(hits);
>             watch.stop();
>             System.out.println("Hits: " + hits.length() + " query: " + query + " time: " + watch.getTime());
>             watch.reset();
>         } finally {
>             searcher.close();
>             directory.close();
>         }
>     }
> }
>
> private void showDocs(Hits hits) throws CorruptIndexException, IOException {
>     Document doc;
>     for (int i = 0; i < hits.length(); i++) {
>         doc = hits.doc(i);
>         System.out.println("doc: " + doc.getField("_:UUID").stringValue());
>     }
> }
>
> This returns very quickly:
>
> Directory: org.apache.lucene.store.FSDirectory@C:\repository\workspaces\default\index\_5
> Hits: 601 query: 12:FULL:titletext:* time: 47
> Directory: org.apache.lucene.store.FSDirectory@C:\repository\workspaces\default\index\_b
> Hits: 699 query: 12:FULL:titletext:* time: 16
> Directory: org.apache.lucene.store.FSDirectory@C:\repository\workspaces\default\index\_h
> Hits: 811 query: 12:FULL:titletext:* time: 47
> Directory: org.apache.lucene.store.FSDirectory@C:\repository\workspaces\default\index\_y
> Hits: 199 query: 12:FULL:titletext:* time: 15
> Directory: org.apache.lucene.store.FSDirectory@C:\repository\workspaces\default\index\_z
> Hits: 1000 query: 12:FULL:titletext:* time: 31
>
> I have inserted some debug statements into the code and it appears that
> the NodeIterator is the problem: the first call to NodeIterator.hasNext()
> seems to take seconds with only 1000 nodes. Is there something that can be
> done about this? The searches are quick but the loading of the nodes is
> VERY slow.
>
> regards,
>
> Dave

--
View this message in context: http://www.nabble.com/Help-with-performace-again-tp22865321p22868029.html
Sent from the Jackrabbit - Users mailing list archive at Nabble.com.
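A toy model of why the first hasNext() call can take seconds, as described in the thread above. This is not Jackrabbit code; all names are made up for illustration. The point is that an iterator which respects document order has to load and sort every hit before it can answer the first call, whereas a lazy iterator only touches a node when it is actually consumed:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Contrast between an eager, document-ordered iterator and a lazy one.
public class IteratorCost {
    static int loads = 0; // counts simulated node loads

    static Integer loadNode(int id) {
        loads++;
        return id;
    }

    // Eager: every hit is loaded and sorted up front, which is roughly
    // what document ordering forces on the whole result set.
    static Iterator<Integer> documentOrdered(List<Integer> ids) {
        List<Integer> all = new ArrayList<>();
        for (int id : ids) {
            all.add(loadNode(id));
        }
        Collections.sort(all);
        return all.iterator();
    }

    // Lazy: a node is loaded only when next() is actually called.
    static Iterator<Integer> lazy(List<Integer> ids) {
        Iterator<Integer> it = ids.iterator();
        return new Iterator<Integer>() {
            public boolean hasNext() { return it.hasNext(); }
            public Integer next() { return loadNode(it.next()); }
        };
    }

    public static void main(String[] args) {
        List<Integer> hits = Arrays.asList(3, 1, 2);

        loads = 0;
        documentOrdered(hits);
        System.out.println("eager loads before iteration: " + loads); // 3

        loads = 0;
        Iterator<Integer> lz = lazy(hits);
        System.out.println("lazy loads before iteration: " + loads); // 0
        lz.next();
        System.out.println("lazy loads after one next(): " + loads); // 1
    }
}
```

With a real repository each "load" is a persistent-store fetch per node, which is where the seconds go when the result set has thousands of hits.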
