On Thursday, October 24, 2013 05:31:06 AM huey...@aol.com wrote:
> I have another question about performance: Using jena-text with a Lucene
> index is expected to be faster than a query with a regex filter, correct?

That will depend on details, but typically, yes.

> I ran two queries, returning the (almost) same data, one using jena-text,
> the other regex filter. I measured the execution times from
> QueryFactory.create until after qe.execSelect(). And from there to
> after CSVOutput.out(rs) (queries are attached). The results I get are:
>
> jena-text:
> FINISH 1 - 359.32ms
> FINISH 2 - 130.28ms
> OVERALL - 489.61ms
>
> regex filter:
> FINISH 1 - 46.27ms
> FINISH 2 - 2540.39ms
> OVERALL - 2586.66ms
>
> So it seems to confirm the assumption that jena-text is faster. I was just
> wondering where the difference in FINISH 1 and FINISH 2 time is coming from?
> Is it executing the query or just preparing it in FINISH 1 and executing it
> once the ResultSet is being iterated over in FINISH 2?

Yes, that's more-or-less it. The query result is streamed as it is computed,
not computed in advance.

> The FINISH 2 time kind of suggests that since both are printing out the same
> list, but regex takes much longer to "print" which seems unlikely if it was
> just printing the same list to console.

Streaming, plus the FILTER is being applied to lots more bindings than the
jena-text index generates.

> And btw, sometimes the very first query using jena-text takes much longer
> than subsequent queries. I am assuming that it is doing some sort of caching
> in the first one?!

There's caching and there's also class loading.

Chris
--
"The wizard seemed quite willing when I talked to him."  /Howl's Moving Castle/

Epimorphics Ltd, http://www.epimorphics.com
Registered address: Court Lodge, 105 High Street, Portishead, Bristol BS20 6PT
Epimorphics Ltd. is a limited company registered in England (number 7016688)
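
To illustrate the streaming point, here is a minimal plain-Java sketch. It is
not Jena code and does not claim to show how ARQ is implemented; the class
LazyFilterTimingSketch, the methods lazyRegexFilter and sampleData, and the
generated test data are all made up for the example. It only shows the general
pattern: creating a lazy, filtered iterator costs almost nothing (compare
FINISH 1), and the regex is applied to every candidate only while the results
are being consumed (compare FINISH 2).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.regex.Pattern;

// Hypothetical illustration only: a lazily evaluated regex filter over an iterator.
class LazyFilterTimingSketch {

    // Counts how many candidates the filter has actually tested so far.
    static int inspected = 0;

    // Returns a lazy iterator: no candidate is tested until a result is requested.
    static Iterator<String> lazyRegexFilter(Iterator<String> source, Pattern pattern) {
        return new Iterator<String>() {
            private String buffered;   // next matching element, if already found
            private boolean ready;     // true when 'buffered' holds an unconsumed match

            // Scan forward until a match is buffered or the source is exhausted.
            private void advance() {
                while (!ready && source.hasNext()) {
                    String candidate = source.next();
                    inspected++;
                    if (pattern.matcher(candidate).find()) {
                        buffered = candidate;
                        ready = true;
                    }
                }
            }

            @Override
            public boolean hasNext() {
                advance();
                return ready;
            }

            @Override
            public String next() {
                advance();
                if (!ready) {
                    throw new NoSuchElementException();
                }
                ready = false;
                return buffered;
            }
        };
    }

    // Made-up data: every 1000th entry starts with "match", the rest do not.
    static List<String> sampleData(int n) {
        List<String> data = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            data.add(i % 1000 == 0 ? "match-" + i : "other-" + i);
        }
        return data;
    }

    public static void main(String[] args) {
        List<String> data = sampleData(1_000_000);

        long t0 = System.nanoTime();
        Iterator<String> results = lazyRegexFilter(data.iterator(), Pattern.compile("^match"));
        long t1 = System.nanoTime();   // roughly analogous to FINISH 1: nothing filtered yet

        int count = 0;
        while (results.hasNext()) {    // the filtering work happens here
            results.next();
            count++;
        }
        long t2 = System.nanoTime();   // roughly analogous to FINISH 2

        System.out.printf("setup: %.2fms, iterate: %.2fms, matches: %d, inspected: %d%n",
                (t1 - t0) / 1e6, (t2 - t1) / 1e6, count, inspected);
    }
}
```

An index lookup, by contrast, hands the consumer only the candidates that
already match, so far fewer elements pass through the loop. To time the whole
query in one place rather than split across two measurements, one option is to
consume or copy the full result set before stopping the clock.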