Hi Andy,

indexing via Fuseki (#31) and via jena.textindexer both work for me now - thanks for your help.

In a production setting, I'd prefer the latter, because a) the Fuseki
datastore should preferably be read-only, and b) on large datasets, loading
and index building may take some hours, and this will be easier to control
in a "local" script.

From the script, I reference a temporary config file which holds definitions
for only one dataset (whereas the Fuseki config may hold many), in order to
(re-)build only one index.

Thanks again - Joachim

-----Original Message-----
From: Andy Seaborne [mailto:a...@apache.org]
Sent: Friday, 21 June 2013 15:30
To: users@jena.apache.org
Subject: Re: AW: Empty index with Jena Text and Fuseki

On 21/06/13 11:45, Neubert Joachim wrote:
> Hi Andy,
>
> thanks for the quick response, which makes quite clear what was wrong: As
> before for Joseki, I used a pre-built read-only tdb database.
>
> Well, so I have to use Fuseki for tdb building as well. I'll check and report
> back.

There is a command line tool jena.textindexer to take a dataset and produce
an index.

java -cp fuseki-server.jar jena.textindexer YourJosekiConfigFile

But it was broken in the way it handled the command line args - I've just
fixed it and used it to index a store that wasn't loaded with text indexing
enabled:

tdb.tdbloader -loc=DIR
jena.textindexer ....

and it worked for me.

You'll need the latest development build (# 31) which I just kicked off for
a full rebuild.

https://repository.apache.org/content/repositories/snapshots/org/apache/jena/jena-fuseki/0.2.8-SNAPSHOT/jena-fuseki-0.2.8-20130621.132913-31-distribution.zip

Andy

>
> Cheers, Joachim
>
> -----Original Message-----
> From: Andy Seaborne [mailto:a...@apache.org]
> Sent: Friday, 21 June 2013 12:24
> To: users@jena.apache.org
> Subject: Re: Empty index with Jena Text and Fuseki
>
> On 21/06/13 09:33, Neubert Joachim wrote:
>> When I got it right, Fuseki is supposed to build the text index when it
>> starts up. However, this did not work for me.
>
> Joachim,
>
> Fuseki indexes the data as it's loaded, it does not index existing data on
> startup. I see what you see in the Lucene directory before data is loaded.
>
> How is the data being loaded into the store?
>
> Have you tried the config-tdb-text.ttl example? I have just checked using
> that, and also modified to add something more like the entity map you have
> and it works for me.
>
> I've tried s-put and the web UI (SPARQL update) to load data into the current
> snapshot build and text queries returned something.
>
> If you have a complete, minimal example of load-query lifecycle that would be
> most useful.
>
> Andy
>
>
>>
>> Starting fuseki (jena-fuseki-0.2.8-20130618.075236-28-server.jar) with an
>> empty index directory, for a very short time, it looks like this:
>>
>> -rw-r--r--. 1 root root 45 Jun 20 13:46 segments_1
>> -rw-r--r--. 1 root root 0 Jun 20 13:46 write.lock
>>
>> and then it stays like this:
>>
>> -rw-r--r--. 1 root root 45 Jun 20 13:46 segments_1
>> -rw-r--r--. 1 root root 20 Jun 20 13:46 segments.gen
>>
>> Text queries yield an empty result, while standard sparql queries work.
>>
>> I can't figure out what could be wrong with my config:
>>
>> ## ---------------------------------------------------------------
>> ## Read-only TDB dataset (only read services enabled).
>>
>> <#service_stw_combined> rdf:type fuseki:Service ;
>>     rdfs:label "STW combined TDB Service (R)" ;
>>     fuseki:name "stw_combined" ;
>>     fuseki:serviceQuery "query" ;
>>     fuseki:serviceQuery "sparql" ;
>>     ##fuseki:serviceUpdate "update" ;
>>     fuseki:serviceReadGraphStore "data" ;
>>     fuseki:serviceReadGraphStore "get" ;
>>     fuseki:dataset :stw_combined ;
>>     .
>>
>> :stw_combined rdf:type text:TextDataset ;
>>     text:dataset <#stw> ;
>>     text:index <#stwIndex> ;
>>     .
>>
>> <#stw> rdf:type tdb:DatasetTDB ;
>>     tdb:location "/opt/thes/var/stw/latest/tdb" ;
>>     ##tdb:unionDefaultGraph true ;
>>     .
>>
>> <#stwIndex> a text:TextIndexLucene ;
>>     text:directory <file:/opt/thes/var/stw/latest/tdb_lucene> ;
>>     text:entityMap <#entMap> ;
>>     .
>>
>> <#entMap> a text:EntityMap ;
>>     text:entityField "uri" ;
>>     text:defaultField "text" ; ## Must be defined in the text:map
>>     text:map (
>>         # skos:prefLabel
>>         [ text:field "text" ; text:predicate skos:prefLabel ]
>>         # skos:altLabel
>>         [ text:field "text" ; text:predicate skos:altLabel ]
>>         # skos:hiddenLabel
>>         [ text:field "text" ; text:predicate skos:hiddenLabel ]
>>     ) .
>>
>> Help would be much appreciated.
>>
>> Cheers, Joachim
>>
>
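
For anyone following the thread: the "local script" approach described at the top (load with tdb.tdbloader, then index with jena.textindexer against a config that describes only one dataset) could be sketched roughly as below. This is only a sketch under assumptions, not the script actually used. The argument forms are copied from the commands quoted in the thread; running both tools via `java -cp fuseki-server.jar`, passing the data files as trailing arguments, and the names `stw-only.ttl` and `stw.rdf` are assumptions for illustration. The script only prints the commands unless `dry_run` is set to False.

```python
import shlex
import subprocess


def build_commands(fuseki_jar, tdb_dir, config_file, data_files):
    """Return the load and index commands as argument lists.

    The flag spellings mirror the thread ("-loc=DIR", config file as a
    plain argument); check them against your own Jena build.
    """
    java = ["java", "-cp", str(fuseki_jar)]
    # Step 1: bulk-load the RDF files into the TDB directory.
    load = java + ["tdb.tdbloader", f"-loc={tdb_dir}"] + [str(f) for f in data_files]
    # Step 2: build the text index from the already loaded store,
    # using a config file that describes only this one dataset.
    index = java + ["jena.textindexer", str(config_file)]
    return [load, index]


def rebuild(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the rebuild if loading or indexing fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    commands = build_commands(
        fuseki_jar="fuseki-server.jar",
        tdb_dir="/opt/thes/var/stw/latest/tdb",  # tdb:location from the config above
        config_file="stw-only.ttl",              # hypothetical single-dataset config
        data_files=["stw.rdf"],                  # hypothetical input file
    )
    rebuild(commands, dry_run=True)
```

Keeping the commands as argument lists avoids shell quoting problems with paths, and the dry-run default makes it easy to check the command lines before starting a load that may take hours.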