On Thursday, October 24, 2013 05:31:06 AM huey...@aol.com wrote:
> I have another question about performance: Using jena-text with a Lucene
> index is expected to be faster than a query with a regex filter, correct?

That will depend on details, but typically, yes.

> I ran two queries, returning the (almost) same data, one using jena-text,
> the other regex filter. I measured the execution times from 
> QueryFactory.create until after qe.execSelect(). And from there to
> after CSVOutput.out(rs) (queries are attached). The results I get are:
> 
> jena-text:
> FINISH 1 - 359.32ms
> FINISH 2 - 130.28ms
> OVERALL - 489.61ms
> 
> regex filter:
> FINISH 1 - 46.27ms
> FINISH 2 - 2540.39ms
> OVERALL - 2586.66ms
> 
> So it seems to confirm the assumption that jena-text is faster. I was just 
> wondering where the difference in FINISH 1 and FINISH 2 time is coming from? 
> Is it executing the query or just preparing it in FINISH 1 and executing it 
> once the ResultSet is being iterated over in FINISH 2?

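Yes, that's more-or-less it. execSelect() sets the query up and returns
a ResultSet; the results are streamed as they are computed, not computed
in advance, so most of the work happens while the ResultSet is iterated.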
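
To make the streaming point concrete, here is a minimal sketch in plain
Java with no Jena involved. The names select() and examined are
hypothetical stand-ins, not Jena API: select() plays the role of
execSelect() and hands back a lazy iterator, and the filtering only
happens once that iterator is consumed.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

public class LazyResultsDemo {
    // Counts how many candidate rows the filter has examined so far.
    static int examined = 0;

    // Hypothetical stand-in for execSelect(): builds a lazy iterator
    // and returns immediately without testing any row.
    static Iterator<String> select(List<String> rows, Predicate<String> filter) {
        Iterator<String> source = rows.iterator();
        return new Iterator<String>() {
            private String pending;
            private boolean ready = false;

            @Override
            public boolean hasNext() {
                // Pull rows from the source until one passes the filter.
                while (!ready && source.hasNext()) {
                    String candidate = source.next();
                    examined++;
                    if (filter.test(candidate)) {
                        pending = candidate;
                        ready = true;
                    }
                }
                return ready;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                ready = false;
                return pending;
            }
        };
    }

    public static void main(String[] args) {
        List<String> rows = List.of("apple", "banana", "apricot", "cherry");
        Iterator<String> results = select(rows, s -> s.startsWith("a"));
        // Nothing has been filtered yet: prints 0.
        System.out.println("examined after select: " + examined);
        while (results.hasNext()) {
            System.out.println(results.next());
        }
        // All four rows were tested during iteration: prints 4.
        System.out.println("examined after iteration: " + examined);
    }
}
```

Timing only the call to select() here would measure almost nothing,
which is the same effect as a small FINISH 1 followed by a large
FINISH 2.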

> The FINISH 2 time kind of suggests that since both are printing out the
> same list, but regex takes much longer to "print" which seems unlikely
> if it was just printing the same list to console.

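Streaming again, plus the FILTER is being applied to lots more bindings
than the jena-text index generates: the regex has to be tested against
every candidate binding, while the index hands back only the matches.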
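
As a rough illustration of that difference in work, the sketch below
compares a regex scan over every label with a lookup in a small
word-to-labels map. It is plain Java; the map is only a toy stand-in for
a text index, not how Lucene or jena-text is implemented, and all names
(regexScan, buildIndex, indexLookup, the counters) are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class ScanVersusIndex {
    // How many labels each approach had to touch at query time.
    static int scanned = 0;
    static int fetched = 0;

    // Regex-filter style: every label is tested against the pattern.
    static List<String> regexScan(List<String> labels, String regex) {
        Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        List<String> hits = new ArrayList<>();
        for (String label : labels) {
            scanned++;
            if (pattern.matcher(label).find()) {
                hits.add(label);
            }
        }
        return hits;
    }

    // Index style: build a word -> labels map once, ahead of any query.
    static Map<String, List<String>> buildIndex(List<String> labels) {
        Map<String, List<String>> index = new HashMap<>();
        for (String label : labels) {
            for (String word : label.toLowerCase().split("\\s+")) {
                index.computeIfAbsent(word, k -> new ArrayList<>()).add(label);
            }
        }
        return index;
    }

    // At query time only the labels stored under the word are touched.
    static List<String> indexLookup(Map<String, List<String>> index, String word) {
        List<String> hits = index.getOrDefault(word.toLowerCase(), List.of());
        fetched += hits.size();
        return hits;
    }

    public static void main(String[] args) {
        List<String> labels = List.of(
            "red wine", "white wine", "green tea", "black tea", "wine vinegar");
        Map<String, List<String>> index = buildIndex(labels);

        System.out.println(regexScan(labels, "wine"));   // tests all 5 labels
        System.out.println(indexLookup(index, "wine"));  // touches 3 labels
        System.out.println("scanned=" + scanned + " fetched=" + fetched);
    }
}
```

Note that a substring regex and a word-level index need not return
exactly the same rows (the regex would also match "swine"), which fits
with the two queries returning "(almost)" the same data.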

> And btw, sometimes the very first query using jena-text takes much longer 
> than subsequent queries. I am assuming that it is doing some sort of caching 
> in the first one?!

There's caching and there's also class loading: the first query in a
fresh JVM pays for loading classes that later queries simply reuse.
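
If you want to keep that one-off cost out of your numbers, one common
approach is to run the query several times and look at the first run
separately. A minimal sketch in plain Java follows; timeRuns is a
hypothetical helper, and the Runnable in main is only a placeholder
where your own query-and-iterate code would go.

```java
import java.util.regex.Pattern;

public class WarmupTiming {
    // Runs the task `runs` times and returns each run's elapsed time
    // in milliseconds, in execution order.
    static double[] timeRuns(Runnable task, int runs) {
        double[] millis = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            millis[i] = (System.nanoTime() - start) / 1_000_000.0;
        }
        return millis;
    }

    public static void main(String[] args) {
        // Placeholder workload; substitute the code that builds the
        // query, executes it, and iterates over the whole ResultSet.
        Runnable task = () -> Pattern.compile("a+b").matcher("aaab").matches();

        double[] millis = timeRuns(task, 5);
        for (int i = 0; i < millis.length; i++) {
            String note = (i == 0) ? " (includes one-off start-up cost)" : "";
            System.out.printf("run %d: %.3f ms%s%n", i + 1, millis[i], note);
        }
    }
}
```

The first line printed will usually be the largest; the later runs are
the ones worth comparing between jena-text and the regex filter.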

Chris

-- 
"The wizard seemed quite willing when I talked to him."  /Howl's Moving Castle/

Epimorphics Ltd, http://www.epimorphics.com
Registered address: Court Lodge, 105 High Street, Portishead, Bristol BS20 6PT
Epimorphics Ltd. is a limited company registered in England (number 7016688)
