Re: Parallel SPARQL queries with ARQ?

Paolo Castagna Fri, 28 Oct 2011 01:41:37 -0700

Hi Mikko,
I am not giving you answers to all of your questions, just a few.


Rinne Mikko wrote:

Hi!

New to Jena and the list, please bear with me if this has been explained over 
and over again. I had so far no luck with the documentation, mailing list 
archives or googling, so here we go:

Can ARQ be used to execute multiple parallel SPARQL queries?


I am not sure I follow what you are doing exactly.
But, are you reading a file with your data each time you execute a query?

A better approach would be to use something like TDB to load (and index)
your data once and then run your queries against the data stored in TDB
(using ARQ as you are doing now). See: http://openjena.org/wiki/TDB

Another option for you might be to use a SPARQL endpoint (and issue queries
in parallel via HTTP clients), see: http://openjena.org/wiki/Fuseki

I would like to configure e.g. 100 or 1000 queries and then run them against a 
single file of triples. I wrote a piece of code to run the queries in sequence 
and got surprisingly good performance with brute force, but I would expect going 
through the dataset only once to perform much better.


This is what made me think that you are parsing and reading in your data
from a file and doing this for each of your queries.

If so, that is wasteful: you pay the parsing cost again for every query.

Read the data once and keep it in memory if it's small.
Then run all your queries against that Jena Model.
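
A minimal, library-free sketch of that "load once, query many" pattern,
with the queries run in parallel on a thread pool. Everything here is a
hypothetical stand-in: the Triple and Pattern classes and the "s p o" line
format are invented for illustration and are not Jena or ARQ APIs. With
Jena, the shared structure would be a Model and each task would execute an
ARQ query; check Jena's concurrency notes before sharing a Model across
threads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Plain Java stands in for a Jena Model and ARQ queries (assumption).
final class LoadOnceQueryMany {

    /** One subject/predicate/object statement (hypothetical stand-in for an RDF triple). */
    static final class Triple {
        final String s, p, o;
        Triple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
    }

    /** A single triple pattern; null in any position acts as a wildcard. */
    static final class Pattern {
        final String s, p, o;
        Pattern(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }

        boolean matches(Triple t) {
            return (s == null || s.equals(t.s))
                && (p == null || p.equals(t.p))
                && (o == null || o.equals(t.o));
        }
    }

    /** Parses whitespace-separated "s p o" lines once; the list is only read afterwards. */
    static List<Triple> load(List<String> lines) {
        List<Triple> data = new ArrayList<>();
        for (String line : lines) {
            String[] f = line.trim().split("\\s+", 3);
            if (f.length == 3) data.add(new Triple(f[0], f[1], f[2]));
        }
        return data;
    }

    /** Runs each pattern as its own task over the shared data; returns match counts in pattern order. */
    static List<Integer> countMatches(List<Triple> data, List<Pattern> patterns, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (Pattern q : patterns) {
                tasks.add(() -> {
                    int n = 0;
                    for (Triple t : data) if (q.matches(t)) n++;
                    return n;
                });
            }
            List<Integer> counts = new ArrayList<>();
            // invokeAll returns futures in the same order as the submitted tasks.
            for (Future<Integer> f : pool.invokeAll(tasks)) counts.add(f.get());
            return counts;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Triple> data = load(Arrays.asList(
            "alice knows bob", "bob knows carol", "alice age 30"));
        List<Pattern> queries = Arrays.asList(
            new Pattern(null, "knows", null),
            new Pattern("alice", null, null));
        System.out.println(countMatches(data, queries, 4)); // prints [2, 2]
    }
}
```

The data is parsed exactly once and never modified afterwards, so the
worker threads only read it. That is what makes sharing it without locks
reasonable in this sketch.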

If ARQ doesn't support this, is it the Jena forward-chaining RETE engine 
<http://jena.sourceforge.net/inference/> I should be looking at, and 
translate the SPARQL queries manually?


Now you've got me confused.

I don't know your use case, so I do not understand what you are trying
to achieve.

(However, it seems interesting... it looks to me like you are doing some
"inference" via SPARQL and would then like to keep the data up-to-date as
you add/remove RDF data to/from your system).

Ultimately I would like to track the processing of each new triple from the 
dataset, in case it matches a query. Any proposals on good documentation? Which 
level should I try to interface on?


Interesting question... I'd love to have a clear and simple answer to
your question and a good example to point you at. But, I don't.
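
For what it is worth, the idea of checking each new triple against a set
of registered queries can be sketched in plain Java. This is only an
illustration under strong assumptions: TripleMatchTracker, register and
onTriple are hypothetical names, not Jena or ARQ APIs, and a "query" here
is a single triple pattern with null as a wildcard, not a full SPARQL
query with joins or filters.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical tracker: reports which registered patterns each incoming triple matches.
final class TripleMatchTracker {

    /** A single triple pattern; null in any position acts as a wildcard. */
    private static final class Pattern {
        final String s, p, o;
        Pattern(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }

        boolean matches(String ts, String tp, String to) {
            return (s == null || s.equals(ts))
                && (p == null || p.equals(tp))
                && (o == null || o.equals(to));
        }
    }

    // LinkedHashMap keeps registration order, so results are deterministic.
    private final Map<String, Pattern> queries = new LinkedHashMap<>();

    /** Registers a named pattern; a later registration with the same name replaces it. */
    void register(String name, String s, String p, String o) {
        queries.put(name, new Pattern(s, p, o));
    }

    /** Call once per incoming triple; returns the names of all matching patterns. */
    List<String> onTriple(String s, String p, String o) {
        List<String> matched = new ArrayList<>();
        for (Map.Entry<String, Pattern> e : queries.entrySet()) {
            if (e.getValue().matches(s, p, o)) matched.add(e.getKey());
        }
        return matched;
    }

    public static void main(String[] args) {
        TripleMatchTracker tracker = new TripleMatchTracker();
        tracker.register("who-knows", null, "knows", null);
        tracker.register("about-alice", "alice", null, null);
        System.out.println(tracker.onTriple("alice", "knows", "bob"));
        System.out.println(tracker.onTriple("carol", "age", "41"));
    }
}
```

Each triple is read once and compared against every registered pattern, so
the number of comparisons is the same as running the patterns one after
another over in-memory data. What the single pass saves is re-reading the
dataset.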

A related (and recent) thread from jena-users:
http://markmail.org/thread/l4ymug3ujoqifnty

You might find this example interesting and useful:
https://github.com/castagna/Apache-Jena-Examples/blob/master/src/main/java/org/apache/jena/examples/ExampleTDB_04.java
This is from GitHub, therefore it's not "official".
Some people learn best from examples.

When I need to learn how to use new software and its APIs, I prefer
Java code and many examples (with a few comments in them).

(If others want to contribute more small examples, go ahead: fork it
and send pull requests! ;-))

Mikko, maybe you can "pay back" with a small example of how you could
use SPARQL CONSTRUCT queries to do small "inference" over RDF data.

Paolo


Thanks!

Mikko
