Osma,

I noticed the same problem this past summer. To maintain two Fuseki
datasources with independent indexes, we had to set up two separate Fuseki
instances, each with its own config file and running on its own port. You
should be able to do the same. One caveat: we were using the "fuseki" startup
script, and we had to modify it so that each instance was started by a
different user. Otherwise the "fuseki stop" or "fuseki status" command could
refer to the wrong instance or report incorrect status information.
(Hopefully one day this startup approach will be improved - I've been watching
JENA-201, which is about creating a deployable .war for Fuseki...)

By the way, +1 on fixing JENA-164 someday - I could REALLY use that feature! As
you can imagine, it is quite disruptive to our project that after every data
ingestion we need to take down Fuseki and rerun the indexer. If we don't, the
queries that rely on the index stop working, and we often forget why! Aren't
there others out there using LARQ with Fuseki, set up following your
instructions, who rely on an up-to-date index?

-Elli

________________________________
 From: Osma Suominen <[email protected]>
To: [email protected] 
Sent: Wednesday, March 13, 2013 4:39 AM
Subject: LARQ multiple indexes not working?
 
Hi!

I'm running into problems trying to configure multiple LARQ indexes in the
same Fuseki instance. It appears that only one of the indexes is actually used.

This is the simplest test case I could think of:

1. Install Fuseki with LARQ. I used my own instructions [1]. I used yesterday's 
svn HEAD versions of both LARQ and Fuseki. I had to fix the versions of ARQ and 
TDB used by LARQ and compile it myself.

2. Start Fuseki with the attached configuration that defines two services 
(serviceA and serviceB), each with a dataset (datasetA and datasetB) having 
different TDB and Lucene index locations.

3. Load test data:

A.nt contains this:
<http://example.org/A> <http://www.w3.org/2000/01/rdf-schema#label> "dataset A" .

B.nt contains this:
<http://example.org/B> <http://www.w3.org/2000/01/rdf-schema#label> "dataset B" .

I load these using s-put, like this:

./s-put http://localhost:3030/dsA/data default A.nt
./s-put http://localhost:3030/dsB/data default B.nt

4. Now make sure the Lucene indexes are updated (I hope JENA-164 gets fixed
some day):

- shut down Fuseki
- rm -r /tmp/lucene*
- start up Fuseki again

5. Verify that the proper data is loaded by executing this query for both dsA 
and dsB:

SELECT ?label
WHERE { ?s ?p ?label }

I get the expected result, i.e. "dataset A" for dsA and "dataset B" for dsB.

6. Try a query that uses the LARQ index for both dsA and dsB:

PREFIX pf: <http://jena.hpl.hp.com/ARQ/property#>
SELECT ?label
WHERE { ?label pf:textMatch 'dataset' }

Now I get the same result "dataset B" for both dsA and dsB. It appears that 
both queries go to the same index!
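
To make the comparison in step 6 repeatable, a small script can send the same
query to both services and print the labels each one returns. This is only a
sketch under assumptions: it supposes the query endpoints are /dsA/query and
/dsB/query (the real service names depend on the Fuseki configuration) and
that the server returns SPARQL JSON results when asked for them. The helper
names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# The LARQ query from step 6.
TEXT_QUERY = """\
PREFIX pf: <http://jena.hpl.hp.com/ARQ/property#>
SELECT ?label
WHERE { ?label pf:textMatch 'dataset' }
"""


def query_url(base, dataset, query):
    """Build a SPARQL protocol GET URL.

    The '/query' service name is an assumption; adjust it to match the
    service names in your Fuseki configuration.
    """
    params = urllib.parse.urlencode({"query": query})
    return "%s/%s/query?%s" % (base.rstrip("/"), dataset, params)


def labels(results):
    """Return the sorted ?label values from a parsed SPARQL JSON result."""
    bindings = results["results"]["bindings"]
    return sorted(b["label"]["value"] for b in bindings if "label" in b)


def run(base, dataset, query=TEXT_QUERY):
    """Send the query to one dataset and return the labels it matched."""
    request = urllib.request.Request(
        query_url(base, dataset, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        return labels(json.load(response))


if __name__ == "__main__":
    base = "http://localhost:3030"
    for dataset in ("dsA", "dsB"):
        print(dataset, run(base, dataset))
```

If each service used its own index, dsA should report only "dataset A" and dsB
only "dataset B"; the same label from both reproduces the problem described
above.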

The contents of /tmp/luceneA and /tmp/luceneB appear to be identical w.r.t.
file sizes, so I believe the indexes were written into the proper places, but
the wrong index is used at query time. I haven't double-checked this though.
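
Since "dataset A" and "dataset B" have the same length, matching file sizes
alone cannot tell the two indexes apart. One possible way to double-check is
to compare the files by content hash as well. The sketch below assumes both
directories contain files with the same relative names (likely for two freshly
built single-document indexes, but not guaranteed); the function names are
illustrative only.

```python
import hashlib
import os


def snapshot(root):
    """Map each file path (relative to root) to (size, SHA-256 hex digest)."""
    result = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            result[os.path.relpath(path, root)] = (os.path.getsize(path), digest)
    return result


def compare(dir_a, dir_b):
    """Return (same_sizes, same_content) for two directory trees.

    Both values are False when the sets of file names differ.
    """
    a, b = snapshot(dir_a), snapshot(dir_b)
    same_names = a.keys() == b.keys()
    same_sizes = same_names and all(a[k][0] == b[k][0] for k in a)
    same_content = same_names and all(a[k][1] == b[k][1] for k in a)
    return same_sizes, same_content


if __name__ == "__main__":
    sizes, content = compare("/tmp/luceneA", "/tmp/luceneB")
    print("same sizes:", sizes)
    print("same content:", content)
```

Byte-identical files would suggest that the same data was written to both
locations. Differing content with equal sizes would be consistent with two
separate indexes, which points at the lookup at query time, though it is not
proof on its own.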

-Osma

[1] http://code.google.com/p/onki-light/wiki/InstallFusekiLARQ

-- Osma Suominen | [email protected] | +358 40 5255 882
Aalto University, Department of Media Technology, Semantic Computing Research 
Group
Room 2541, Otaniementie 17, Espoo, Finland; P.O. Box 15500, FI-00076 Aalto, 
Finland
