Dear all,

I am sorry for my incompetence, but I have a problem with indexing labels using Apache Jena. I have already checked several posts about this topic but can’t find my error. Is there anyone who could help me, please?

My text-config.ttl file looks like this:

> @prefix : <http://localhost/jena_example/#> .
> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> @prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
> @prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
> @prefix text: <http://jena.apache.org/text#> .
>
> # TDB
> [] ja:loadClass "org.apache.jena.tdb.TDB" .
> tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
> tdb:GraphTDB rdfs:subClassOf ja:Model .
>
> # Text
> [] ja:loadClass "org.apache.jena.query.text.TextQuery" .
> text:TextDataset rdfs:subClassOf ja:RDFDataset .
> text:TextIndexLucene rdfs:subClassOf text:TextIndex .
>
> ## ---------------------------------------------------------------
> ## This URI must be fixed - it's used to assemble the text dataset.
>
> :text_dataset rdf:type text:TextDataset ;
>     text:dataset <#dataset> ;
>     text:index <#indexLucene> ;
>     .
>
> <#dataset> rdf:type tdb:DatasetTDB ;
>     tdb:location "storage" ;
>     ## In the example, this would hide the real default graph.
>     # tdb:unionDefaultGraph true ;
>     .
>
> <#indexLucene> a text:TextIndexLucene ;
>     #text:directory <file:Lucene> ;
>     text:directory <file:storage> ;
>     text:entityMap <#entMap> ;
>     .
>
> <#entMap> a text:EntityMap ;
>     text:entityField "uri" ;
>     text:defaultField "text" ; ## Must be defined in the text:maps
>     text:map (
>         # rdfs:label
>         [ text:field "text" ; text:predicate rdfs:label ]
>     ) .

I have a dataset file from DBpedia with some labels, which I load and index with these commands:

> java -cp fuseki-server.jar tdb.tdbloader --tdb=config.ttl infobox_property_definitions_en.ttl
> java -cp fuseki-server.jar jena.textindexer --desc=config.ttl

To test for the indexed labels, I tried the following query:

> PREFIX text: <http://jena.apache.org/text#>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> SELECT *
> { ?s text:query (rdfs:label "Stage") ;
>      rdfs:label ?label
> }
> LIMIT 10

I ran it with this command:

> java -cp fuseki-server.jar tdb.tdbquery --time --tdb=config.ttl --query=query.txt

However, I just get the following results:

> WARN Failed to find the text index : tried context and as a text-enabled dataset
> WARN No text index - no text search performed
> ----------------------------------------------------------
> | s                                      | label         |
> ==========================================================
> | <http://dbpedia.org/property/colwidth> | "colwidth"@en |
> | <http://dbpedia.org/property/voy>      | "voy"@en      |
> | <http://dbpedia.org/property/n>        | "n"@en        |
> | <http://dbpedia.org/property/v>        | "v"@en        |
> | <http://dbpedia.org/property/b>        | "b"@en        |
> | <http://dbpedia.org/property/s>        | "s"@en        |
> | <http://dbpedia.org/property/d>        | "d"@en        |
> | <http://dbpedia.org/property/name>     | "Name"@en     |
> | <http://dbpedia.org/property/alt>      | "Alt"@en      |
> | <http://dbpedia.org/property/caption>  | "Caption"@en  |
> ----------------------------------------------------------
> Time: 0,065 sec

Obviously, these are not the desired results: none of the labels match "Stage", and the warnings show that the query could not find the text index.

Just for clarification: I decided to use a TDB-backed solution because my aim is to reproduce some DBpedia data locally without loading it into memory. I do not actually intend to use the Fuseki server, but it seems to be the only way to use text indexing.

Thank you very much for your help.

Best regards
Philipp
