Hi Essam,
Query performance mostly depends on the complexity of the query and the amount of data you are querying.

The KiWi backend translates a SPARQL query into SQL, with all the advantages and disadvantages that brings (SQL is richly expressive and supports features like sorting and grouping). It is meant for medium amounts of data and more complex queries. Please see http://marmotta.apache.org/kiwi/sparql.html for information, in particular the section "performance considerations".

The new Ostrich backend is designed for large amounts of data, but mostly simple queries. It won't offer you performance benefits if you are using e.g. ORDER BY or GROUP BY on large sets of candidates (the complexity is in the nature of the problem and not easily addressed). It will give you blazing speed when you have simple queries (i.e. simple join-style queries) and strict limits.

Ignite sounds interesting and promises very good performance. I still have doubts about it being much faster for evaluating complex SPARQL queries, though. It's worth trying to implement it as a backend, but don't expect wonders. Ostrich uses highly efficient indexing and key range queries that won't benefit much from an in-memory store, especially not if it is distributed and therefore incurs network overhead. Ostrich is so fast for certain queries because LevelDB isn't a pure key/value store: it also lets you query by partial key and run key range queries. As far as I can see, Ignite is a key/value store only, and I imagine it is hard to implement range queries on top of it.

The best way to improve SPARQL performance is probably to use Ostrich and work on implementing/improving its SPARQL query planner, which at the moment is very simple. One example would be to reorder patterns in the WHERE part so the most selective pattern is applied first. This would require tracking triple statistics.
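
To make the reordering idea a bit more concrete, here is a rough sketch in Python. It is not taken from Ostrich's code; the statistics (a triple count per predicate), the fixed selectivity factor for bound subjects/objects, and the example predicates are all assumptions made up for illustration. It picks the pattern with the smallest estimated result first and then prefers patterns that share a variable with what has already been joined.

```python
from typing import Dict, List, Tuple

# A triple pattern is (subject, predicate, object); variables start with "?".
Pattern = Tuple[str, str, str]


def is_var(term: str) -> bool:
    return term.startswith("?")


def estimate(pattern: Pattern, predicate_counts: Dict[str, int], total: int) -> float:
    """Very rough cardinality estimate for a single pattern (hypothetical heuristic)."""
    s, p, o = pattern
    # Start from the number of triples with this predicate, or from all
    # triples if the predicate itself is a variable.
    size = float(total if is_var(p) else predicate_counts.get(p, 0))
    # Assumption: every bound subject or object cuts the candidates by a
    # fixed factor. Real statistics would replace this constant.
    for term in (s, o):
        if not is_var(term):
            size /= 100.0
    return size


def reorder(patterns: List[Pattern], predicate_counts: Dict[str, int], total: int) -> List[Pattern]:
    """Greedy ordering: most selective pattern first, then patterns that
    join with already-bound variables, cheapest estimate first."""
    remaining = list(patterns)
    ordered: List[Pattern] = []
    bound = set()  # variables bound by the patterns chosen so far

    while remaining:
        def cost(pat: Pattern):
            joins = any(is_var(t) and t in bound for t in pat)
            # After the first pattern, disconnected patterns sort last to
            # avoid building a cross product early.
            disconnected = bool(ordered) and not joins
            return (disconnected, estimate(pat, predicate_counts, total))

        best = min(remaining, key=cost)
        remaining.remove(best)
        ordered.append(best)
        bound.update(t for t in best if is_var(t))
    return ordered


if __name__ == "__main__":
    # Made-up statistics and predicates, purely for illustration.
    stats = {"rdf:type": 1_000_000, "rdfs:label": 200_000, "ex:inchiKey": 50_000}
    where = [
        ("?mol", "rdfs:label", "?label"),
        ("?mol", "rdf:type", "ex:Molecule"),
        ("?mol", "ex:inchiKey", '"ABC"'),
    ]
    for pat in reorder(where, stats, total=5_000_000):
        print(pat)
```

With real per-predicate (or per-prefix) counts kept up to date by the store, the same greedy loop could run before evaluating the WHERE part; the constant factor above is only a placeholder for such statistics.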
Another example would be to drop DISTINCT in cases where it is not needed, or to use the index for ORDER BY. I'll see what I can do in this area in the next weeks, but no promises. :)

Cheers,
Sebastian

2016-02-03 6:02 GMT+01:00 Elsherif, Essam (ELS-NYC) <[email protected]>:

> Hi Sebastian,
>
> Currently I am working on a Semantic Chemical Structure Search Engine
> where I am trying to use Marmotta as triple store and linked data platform.
> The SPARQL search queries run very slowly. I was reading on the forum that
> you added LevelDB integration, which runs much faster. I am going to test
> that soon. However, I have been exploring for some time Apache Ignite
> <https://ignite.apache.org/>, which is essentially a distributed
> in-memory data fabric. I wonder if you would like to collaborate on using
> Apache Ignite as a backend store for Marmotta, just like H2, KiWi, and
> Ostrich. I think this would give us the best performance gain. Please let
> me know, and we can have a discussion about it.
>
> Thanks,
>
> *Essam Elsherif*
> Solutions Architect – Elsevier Life Science Solutions
> 240 W. 37th Street, 2nd Floor
> New York, NY 10018
> [email protected] │ email
> +1 646 380 3759 │ office
> +1 646 873 0421 │ mobile
