[MarkLogic Dev General] Query Plan in MarkLogic Semantic

Siow Boon Lin Eugene Thu, 12 Dec 2013 21:43:09 -0800

Hi,

I searched Stack Overflow and the documentation but couldn't find any 
related information.
I'd like to optimize my SPARQL queries by looking at their query plans.


I tried something like (in the query console):

xdmp:query-trace(true()),
sem:sparql($my_sparql_query),
xdmp:query-trace(false()),
xdmp:log(xdmp:query-meters())

The error log shows the query-meters results but nothing from query-trace. 
Is this because query-trace only works for XQuery?

The SPARQL query I am running looks something like this:

PREFIX cts: <http://marklogic.com/cts#>
PREFIX test: <http://test#>

SELECT * WHERE {
  {
    SELECT DISTINCT ?fusedentity ?p {
      ?s ?p ?o.
      filter cts:contains(?o, cts:word-query("test*"))
      ?s a test:Entity1.
      ?fusedentity test:hasFusedReference ?s.
    } LIMIT 5 OFFSET 0
  } UNION
  {
    SELECT DISTINCT ?fusedentity ?p {
      ?s ?p ?o.
      filter cts:contains(?o, cts:word-query("test*"))
      ?s a test:Entity2.
      ?fusedentity test:hasFusedReference ?s.
    } LIMIT 5 OFFSET 0
  }
  ?fusedentity test:hasFusedReference ?entity.
  ?entity a ?class;
          ?p ?o.
}

The purpose of this query is to do a full-text search on the triple store and 
return up to 5 matches for each class (Entity1, Entity2).

The query runs quite slowly on a small set of data (<100k triples). The index 
option I've enabled is the trailing wildcard option. The cts:word-query on 
its own returns in a decent time (~1s), but after adding the other SPARQL 
patterns the query can take up to 10s. So I wanted to check the query plan to 
see what I was doing wrong.

I've increased the size of the triple cache and the triple-value cache. Even 
when there are only triple and triple-value cache hits and no misses, the 
query still takes ~3-5s to return. I've also increased the triple index memory 
to 200 (it can't go higher than that; the admin UI rejects larger values as 
not valid).

I also have another question: as far as I've tested, GROUP BY and aggregates 
such as SAMPLE/MAX/MIN are not supported in this version. Is there any other 
way to do this?

To summarise:

- Is there a way to see the SPARQL query plan?

- Is there a way to speed up the query I'm running?

- Is there a way to do a GROUP BY?

Thanks and Regards,
Eugene

_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
