Hi Maxime,

as mentioned in Jira: by default, the KiWi triple store uses the Sesame in-memory implementation of SPARQL, i.e. in your case it will list all triples three times and evaluate the join in memory (i.e. 60,000^3). This is obviously quite inefficient. If you want SPARQL queries to be translated directly to SQL, you have to use the kiwi-sparql module in addition to kiwi-triplestore and wrap the KiWiStore in a KiWiSparqlSail as follows:

    KiWiSparqlSail sail = new KiWiSparqlSail(store);

This will add a native SPARQL implementation to KiWi, so that many queries will be translated directly to SQL. As described in [1], this translation is not a complete SPARQL-SQL translation. Only those parts will be translated that are efficient to evaluate in a relational database and the current KiWi data model.

What you can expect is that any triple pattern (without OPTIONAL) and most FILTER conditions are directly translated into SQL. Please note, however, that so far native SPARQL support followed "correctness over performance", so many advanced constructs are not optimized at the moment:

- projection parts are never optimized; after the WHERE part has been evaluated, the internal result is a collection of node database IDs, which are then resolved in a separate query step (simple primary key based queries)
- as a consequence, aggregation constructs are not optimized, so a count(...) could take much longer than expected
- OPTIONAL is not optimized, because its semantics are a bit different to a normal SQL LEFT JOIN (we are still working on that); however, all patterns outside the OPTIONAL will still be optimized, so if you have good filter criteria outside it should not be a problem
- SPARQL 1.1 path queries are not optimized, because their expressiveness exceeds the expressiveness of SQL
- DISTINCT, ORDER BY, GROUP BY are not optimized, because they work on the projection results; DISTINCT in particular should be avoided if not necessary (but this also holds for other systems)

In case of your query, the whole WHERE part will be translated into a single database JOIN, so it should be as efficient as it can get. The variable projection part will then, however, result in additional queries to the database, so the more results you expect the longer it will take. A very good way to dramatically improve performance is to add a LIMIT to the SPARQL query.
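
To make the difference between pattern-by-pattern evaluation and a single join a bit more tangible, here is a rough plain-Java sketch. It is only an illustration and not Marmotta or Sesame code: the class JoinCostSketch, the Triple type and both methods are made up for this example, and the in-memory list merely stands in for the triples table. The first method issues one "subject = ? AND object = ?" lookup per keyword pair, roughly like the queries you observed; the second finds the same matches in one pass, which is closer to what a single database JOIN does.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java illustration only; none of these types come from Marmotta or Sesame.
final class JoinCostSketch {

    /** Minimal stand-in for a stored triple (hypothetical type). */
    static final class Triple {
        final String subject, predicate, object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /**
     * Pattern-at-a-time evaluation: for every (?kw, ?kw2) pair, issue one
     * "subject = ?kw2 AND object = ?kw" lookup against the store.
     * Returns {number of results, number of lookups issued}.
     */
    static long[] nestedLoop(List<Triple> store, List<String> keywords) {
        long results = 0, lookups = 0;
        for (String kw : keywords) {
            for (String kw2 : keywords) {
                lookups++; // stands in for one SELECT ... WHERE subject = ? AND object = ?
                for (Triple t : store) {
                    if (t.subject.equals(kw2) && t.object.equals(kw)) {
                        results++;
                    }
                }
            }
        }
        return new long[] {results, lookups};
    }

    /** Join-style evaluation: one pass over the triples, checking both ends against the keyword set. */
    static long singleJoin(List<Triple> store, List<String> keywords) {
        Set<String> keywordSet = new HashSet<>(keywords);
        long results = 0;
        for (Triple t : store) {
            if (keywordSet.contains(t.subject) && keywordSet.contains(t.object)) {
                results++;
            }
        }
        return results;
    }

    public static void main(String[] args) {
        int n = 100; // small stand-in for the 60 000 keywords in the report
        List<String> keywords = new ArrayList<>();
        List<Triple> store = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            keywords.add("kw" + i);
            if (i > 0) {
                // chain the keywords: kw0 -> kw1 -> ... -> kw99
                store.add(new Triple("kw" + (i - 1), "related", "kw" + i));
            }
        }
        long[] nested = nestedLoop(store, keywords);
        System.out.println("nested loop: " + nested[0] + " results, " + nested[1] + " lookups");
        System.out.println("single join: " + singleJoin(store, keywords) + " results");
    }
}
```

With 100 keywords the nested variant already issues 100 * 100 = 10 000 lookups to find the same 99 results that the single pass finds, and the number of lookups grows quadratically with the number of keywords.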

That said, we are constantly working on improving SPARQL support. In particular, one of the next things to come is probably support for OPTIONAL, and then for ORDER BY and GROUP BY.

Greetings,

Sebastian

[1] http://marmotta.apache.org/kiwi/sparql.html

2014-04-21 17:29 GMT+02:00 Maxime Poitevineau-Millin <[email protected]>:

> Hi all,
>
> We are currently implementing marmotta in our system but!
>
> When trying this query :
>
> SELECT * FROM <sesame:nil> {
>
> ?kw rdf:type Keyword .
> ?kw2 rdf:type Keyword .
> ?kw2 ?rel ?kw .
>
> }
>
> on a database containing 60 000 keywords, marmotta is creating 60 000 *
> 60 000 queries similar to this one :
>
> SELECT
> id,subject,predicate,object,context,deleted,inferred,creator,createdAt,deletedAt
> FROM triples WHERE deleted = false AND subject = '458172920399523841' AND
> object = '458172732532453376' AND context = '458172711103754240'
>
> Which obviously takes a long time.
>
> This query takes 2 sec on Virtuoso or Owlim, is there any configuration
> issue or something we should change?
>
> Regards,
>
> ___
>
> [image: dipia]
>
> Maxime Poitevineau-Millin - *M* : 06 33 17 64 28
> *T* : 03 80 40 33 46 - *F* : 04 84 25 03 63
