Re: Fuseki + Larq : Lucene indexing

Paolo Castagna Mon, 12 Sep 2011 07:17:56 -0700

Jérôme wrote:

On 12/09/11 15:18, Paolo Castagna wrote:
Jérôme wrote:
On 12/09/11 12:24, Paolo Castagna wrote:
Hi Jérôme,
you are lucky: I have exactly the same need as you, and I've done something about it recently.
Unfortunately, the new LARQ (as a separate module) has not yet made it into Fuseki on trunk.
We have an open JIRA for it which you can watch|vote|contribute to:
https://issues.apache.org/jira/browse/JENA-63
In the meantime, if you want to use LARQ with Fuseki, this is what you need to do:
cd /tmp
svn co https://svn.apache.org/repos/asf/incubator/jena/Jena2/Fuseki/trunk/fuseki
cd /tmp/fuseki
wget https://issues.apache.org/jira/secure/attachment/12482758/JENA-63_Fuseki_r1136050.patch
patch -p0 < JENA-63_Fuseki_r1136050.patch
mvn package

Now, you can simply use the Fuseki config.ttl file as explained here:
http://openjena.org/wiki/Fuseki#Fuseki_Configuration_File
and use the ja:textIndex property on a dataset to specify a non-existent directory.
Is it possible to have a Fuseki configuration example with a ja:textIndex property? I am trying to
add it to the book service (books.ttl), with no results...
Use tdbloader to load some RDF data into /tmp/tdb, then change <#dataset>
in the example config.ttl file you have in Fuseki:
http://svn.apache.org/repos/asf/incubator/jena/Jena2/Fuseki/trunk/config.ttl
I've never used the TDB loader. How does it work? Is there any online documentation?


Fortunately, TDB is included in the Fuseki uber jar (it bundles the Fuseki
binaries together with all the jar dependencies, including TDB). So, in this
case, it's quite useful for end users.

Here is what I do:

cd /tmp/fuseki
java -cp target/fuseki-0.2.1-SNAPSHOT-sys.jar tdb.tdbloader --loc=/tmp/tdb books.ttl

This will load the data in books.ttl and build the TDB indexes in /tmp/tdb.
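
As a quick sanity check after loading, something like the following could confirm that the target directory actually received files. This is only a sketch: the function name tdb_dir_populated is made up for illustration (it is not part of TDB or Fuseki), and it only checks that the directory is non-empty, not that the TDB indexes are valid.

```shell
#!/bin/sh
# Hypothetical helper (not part of TDB or Fuseki): succeeds if the given
# directory exists and contains at least one entry.
tdb_dir_populated() {
    dir="$1"
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Example: report on the location passed to tdbloader via --loc.
if tdb_dir_populated /tmp/tdb; then
    echo "/tmp/tdb contains files"
else
    echo "/tmp/tdb is missing or empty"
fi
```

If the check reports an empty directory, the tdbloader output is the first place to look.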

You can also use the -h option for help:

java -cp target/fuseki-0.2.1-SNAPSHOT-sys.jar tdb.tdbloader -h
tdbloader [--desc DATASET | -loc DIR] FILE ...
  Location
      --loc=DIR              Location (a directory)
      --tdb=                 Assembler description file
  Symbol definition
      --set                  Set a configuration symbol to a value

--strict Operate in strict SPARQL mode (no extensions ofany kind)

      --graph=IRI            Act on a named graph
      --desc=                Assembler description file
  General
      -v   --verbose         Verbose
      -q   --quiet           Run with minimal output
      --debug                Output information for debugging
      --help
      --version              Version information


Paolo

Thanks
[...]

<#dataset> rdf:type      tdb:DatasetTDB ;
    tdb:location "/tmp/tdb" ;
    ja:textIndex "/tmp/lucene" ;
    .
If the /tmp/lucene directory does not exist, LARQ will index what you have in
/tmp/tdb, creating the appropriate Lucene indexes.
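
Before starting Fuseki, a small check along these lines could tell you whether indexing is going to be triggered. It is a sketch under assumptions: larq_index_action is a hypothetical helper name, and the "existing" branch assumes that a directory which is already there is left alone, which the description above implies but does not state.

```shell
#!/bin/sh
# Hypothetical helper (not part of LARQ or Fuseki).
# Prints "build" when the path does not exist, i.e. the case in which LARQ
# is described as creating the Lucene index on startup.
# Prints "existing" when the path is already there (assumed to be left as-is).
larq_index_action() {
    if [ -e "$1" ]; then
        echo "existing"
    else
        echo "build"
    fi
}

# Example with the directory named in ja:textIndex.
echo "Lucene directory /tmp/lucene: $(larq_index_action /tmp/lucene)"
```

With several datasets configured, the same check can be repeated for each ja:textIndex directory.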


Paolo
Thanks
When you point LARQ at a non-existent directory, it will perform the indexing for you.
This is particularly useful when you have multiple datasets configured in Fuseki.
WARNING: it might take a while to index large datasets, so be patient.

See also: http://markmail.org/thread/tmptip55ru5wxrrj

LARQ snapshots are here:
https://repository.apache.org/content/repositories/snapshots/org/apache/jena/larq/0.2.2-incubating-SNAPSHOT/
and I can quickly fix/improve things if you have problems or good suggestions.
I hope this helps, let me know how it goes.

Paolo

Jérôme wrote:
Hi,

I'm trying to use LARQ with my Fuseki server.
I would like to index documents programmatically (with Lucene) when the
server starts.

Something like this:

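// Build a LARQ (Lucene) index while the model is being loaded.
Model model = ModelFactory.createDefaultModel();
IndexBuilderString larqBuilder = new IndexBuilderString();
model.register(larqBuilder);      // index statements as they are added
FileManager.get().readModel(model, "Data/books.ttl");
larqBuilder.closeWriter();        // flush and close the Lucene writer
model.unregister(larqBuilder);
IndexLARQ index = larqBuilder.getIndex();
LARQ.setDefaultIndex(index);      // make it the default index for queries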

Is it possible? In which class would it be best to do this?

Thanks

Jerome
