Hi Dave! This was exactly the answer I was looking for - thank you very much! I'll dive into JenaRules and see how accurately the current SPARQL queries would translate.
I only saw your mail after sending mine; sorry for the extra traffic on the list.

Mikko

On 28. Oct 2011, at 1:14 PM, Dave Reynolds wrote:

> Hi,
>
> On Fri, 2011-10-28 at 08:16 +0000, Rinne Mikko wrote:
>> Hi!
>>
>> New to Jena and the list, please bear with me if this has been explained
>> over and over again. I had so far no luck with the documentation, mailing
>> list archives or googling, so here we go:
>>
>> Can ARQ be used to execute multiple parallel SPARQL queries?
>>
>> I would like to configure e.g. 100 or 1000 queries and then run them against
>> a single file of triples. I wrote a piece of code to run the queries in
>> sequence and got surprisingly good performance with brute force, but I would
>> expect going through the dataset only once to perform much better.
>>
>> If ARQ doesn't support this, is it the Jena forward-chaining RETE
>> engine <http://jena.sourceforge.net/inference/> I should be looking at, and
>> translate the SPARQL queries manually?
>
> Like Paolo I'm not quite sure what you are trying to do, but based on
> that question let me take a guess ...
>
> It sounds like you have your data, maybe in a memory model, and want to
> run a *lot* of queries over that single data set. You suspect that
> instead of each query starting over again, maybe you could stream the
> data once through some sort of query sieve to do all the queries at once
> in one pass. Is that about right?
>
> If so, then there is no specific parallel-SPARQL-query support in Jena,
> but as you say it might be possible to use the RETE engine, depending on
> the specifics of what you are doing.
>
> As an aside, note it is possible to issue SPARQL queries in parallel (most
> of the Jena stores are Multiple Reader Single Writer), so on a
> multi-core machine you might get extra speed from the brute-force query
> approach by spreading the queries across a small number of threads.
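Dave's aside about spreading the brute-force queries across a small thread pool can be sketched with plain `java.util.concurrent`; the `runQuery` stub below is a hypothetical placeholder for a real ARQ call such as `QueryExecutionFactory.create(query, model).execSelect()`:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelQueries {

    // Stand-in for one SPARQL query execution; with Jena this would be
    // something like QueryExecutionFactory.create(query, model).execSelect().
    static String runQuery(int i) {
        return "result-" + i;
    }

    // Run nQueries read-only queries across a small fixed thread pool.
    // Jena's in-memory models are Multiple Reader Single Writer, so
    // concurrent readers are safe as long as nothing is writing.
    static List<String> runAll(int nQueries, int nThreads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        List<Future<String>> futures = new ArrayList<>();
        for (int i = 0; i < nQueries; i++) {
            final int idx = i;
            futures.add(pool.submit(() -> runQuery(idx)));
        }
        List<String> results = new ArrayList<>();
        for (Future<String> f : futures) {
            results.add(f.get());          // blocks; preserves submission order
        }
        pool.shutdown();
        return results;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll(100, 4).size());  // prints 100
    }
}
```

A pool of roughly one thread per core is usually enough; more threads than cores just adds scheduling overhead for CPU-bound query evaluation.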
> The RETE engine works by keeping tables of partially matched triple
> patterns, so that each new triple is matched against the rules
> incrementally. That does seem related to what you want.
>
> The problem is that JenaRules is not SPARQL - there are no equivalents of
> SPARQL constructs like UNION, ORDER BY, DISTINCT etc., and the set of
> built-in predicates for filtering is different. Furthermore, all you can
> do as the result of a rule matching is assert a set of triples
> (or call some Java code like Print) - you don't have access to a stream
> of binding results in the way you do with SPARQL.
>
> However, if your queries are primarily just basic graph patterns, and if
> the results from your queries can be expressed as new triples, then you
> could indeed use the RETE engine.
>
> Whether it will gain you any benefit depends on the specifics. If there
> are a lot of shared patterns between your rules, then it might. If not,
> the overheads of the rule machinery may outweigh the gain from reuse of
> partial matches.
>
> I would suggest you try a small experiment first to measure the
> costs/gains before committing to it.
>
>> Ultimately I would like to track the processing of each new triple from the
>> dataset, in case it matches a query.
>
> That is the way the RETE engine works. So long as you are only adding
> and not removing triples (and so long as you don't have any nasty
> non-monotonic operators in your rules), each triple added to the
> model is filtered through the RETE network to see if it triggers more
> rules.
>
>> Any proposals on good documentation?
>
> The primary documentation for the rules engine is:
>
> http://incubator.apache.org/jena/documentation/inference/index.html#rules
>
> Dave
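For illustration, here is roughly how a pure basic-graph-pattern query could be rewritten as a JenaRules forward rule, with the bindings materialised as new triples instead of being returned as a result stream (the `ex:` vocabulary, the rule name, and the query itself are made up for this sketch):

```
// SPARQL:  SELECT ?p ?city
//          WHERE { ?p ex:worksFor ?org . ?org ex:locatedIn ?city }
@prefix ex: <http://example.org/ns#>.

[basedIn: (?p ex:worksFor ?org) (?org ex:locatedIn ?city)
          -> (?p ex:basedIn ?city)]
```

After the inference model has been built over such rules, the "answers" would be read back by listing the asserted `ex:basedIn` statements on the model rather than by iterating a SPARQL result set.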