On Fri, Nov 13, 2009 at 6:02 PM, Rickard Öberg <[email protected]> wrote:

> My GUESS is that the way we create the queries is not optimal. Right now the
> query creates a "v1 <property> v2" tuple and then "FILTER v2='Lead9999'" (or
> something like that). It might be that Sesame can't optimize that properly,
> and we should instead try to query for "v1 <property> 'Lead9999'" directly.

Confirmed!
I made the change suggested above and suddenly things are 'faster'. I
wouldn't say "fast", because we are still in the 200-300ms range for
the first 'find', which then drops to ~20-40ms on my machine.
Removing the DISTINCT halves the response times as well.
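
To make the difference concrete, here is a rough sketch of the two query
shapes, written as plain Java string builders. The class name, the method
names, the ?entity variable and the urn:example:name property URI are all
made up for illustration; this is not the actual Qi4j query-building code,
and the values are not escaped.

```java
class QueryShapes {

    // Shape 1 (assumed from the description above): bind the property
    // value to ?v1, then compare it in a FILTER.
    static String filterForm(String propertyUri, String value) {
        return "SELECT ?entity WHERE { ?entity <" + propertyUri + "> ?v1 . "
             + "FILTER ( ?v1 = '" + value + "' ) }";
    }

    // Shape 2: put the literal directly in the object position of the
    // triple pattern, so no FILTER is needed.
    static String directForm(String propertyUri, String value) {
        return "SELECT ?entity WHERE { ?entity <" + propertyUri + "> '" + value + "' }";
    }

    public static void main(String[] args) {
        String property = "urn:example:name"; // hypothetical property URI
        System.out.println(filterForm(property, "Lead9999"));
        System.out.println(directForm(property, "Lead9999"));
    }
}
```

Both strings ask for the same entities; only the second one hands the
literal to the store as part of the triple pattern.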

But of course, we will have some problems with this, since the
filter could just as well have been:

    FILTER ( ?v1 = 'Lead1' || ?v1 = 'Lead99999' )
or
    FILTER ( ( ?v1 = 'Lead1' && ?v2 = 'Available' ) || ?v1 = 'Lead9812' )

and so forth. IMHO, query optimization is the responsibility of
Sesame and not us, and this is indeed a worrying situation,
something that I have brought up in the past but not pushed as
hard as I should have.


What are our choices?

1. Ignore it.

2. Try to optimize common cases: recognize certain patterns and
create the query accordingly.

3. Engage some SPARQL expert.

4. Try to use one or more other SPARQL implementations.

5. Use Sesame with another query language that may have better
optimization support.

6. Make a SQL implementation.

7. Make a Lucene implementation.

8. Make a Neo4j implementation.

9. Something else.

10. All of the above.
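
As a rough idea of what option 2 could look like for the simplest pattern
(one property compared against several alternative values, as in the first
FILTER example above), the disjunction can be written as a UNION of direct
triple patterns. This is only a sketch: EqualityRewriter, anyOf and the
?entity variable are hypothetical names, values are not escaped, the mixed
&& / || case is not handled, and whether Sesame actually evaluates this
form faster has not been measured.

```java
import java.util.List;
import java.util.stream.Collectors;

class EqualityRewriter {

    // Builds "property equals any of these values" as a UNION of direct
    // triple patterns instead of a FILTER with || branches.
    static String anyOf(String propertyUri, List<String> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("at least one value is required");
        }
        String branches = values.stream()
            .map(v -> "{ ?entity <" + propertyUri + "> '" + v + "' }")
            .collect(Collectors.joining(" UNION "));
        return "SELECT ?entity WHERE { " + branches + " }";
    }

    public static void main(String[] args) {
        // Hypothetical property URI, values taken from the FILTER example.
        System.out.println(anyOf("urn:example:name", List.of("Lead1", "Lead99999")));
    }
}
```

Anything that does not match a recognized pattern would still have to fall
back to the generic FILTER form.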


Cheers
-- 
Niclas Hedhman, Software Developer
http://www.qi4j.org - New Energy for Java

I  live here; http://tinyurl.com/2qq9er
I  work here; http://tinyurl.com/2ymelc
I relax here; http://tinyurl.com/2cgsug

_______________________________________________
qi4j-dev mailing list
[email protected]
http://lists.ops4j.org/mailman/listinfo/qi4j-dev
