Re: Failed to find the text index

Andy Seaborne Tue, 25 Apr 2017 10:24:33 -0700

Philipp - sorry for the delay.

On 20/04/17 13:01, Philipp Poschmann wrote:

Dear Andy,


thank you very much for your advice. Indeed, my purpose is to build a Java 
application which queries the data stored by tdbloader. But I am not sure how 
to use AssemblerUtils.build in my application, that is, how to choose the 
right dataset.

What I have done now is start the Fuseki server. The administration panel 
shows that there is one dataset "ds". When I run my query via the browser, it 
works, so querying my data through Fuseki would be a solution. However, I would 
like to avoid running the server in parallel with my application. Is it possible 
to run my query against the text index without using the server at all?

Currently, my code gives me the same error ("Failed to find the text index") 
and looks like this:
Dataset dataset = TDBFactory.assembleDataset("text-config.ttl");


You'll need to build a text dataset - this wraps the TDB one

    Dataset dataset = TextDatasetFactory.create("text-config.ttl");

It'll look for exactly one resource with type text:TextDataset, which your config does have.

It's useful to attach the Jena source code and drill down into what the operations do.


    Andy
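
A minimal sketch of the suggestion above, assuming the Jena text module is on the classpath and that text-config.ttl, the TDB directory and the Lucene index are reachable from the working directory. The class name and the buildQueryString helper are made up for illustration. It runs the query against the dataset rather than the default model, on the assumption that the text index is attached to the dataset and is not visible through a bare Model.

```java
import org.apache.jena.query.Dataset;
import org.apache.jena.query.Query;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QueryFactory;
import org.apache.jena.query.ReadWrite;
import org.apache.jena.query.ResultSet;
import org.apache.jena.query.ResultSetFormatter;
import org.apache.jena.query.text.TextDatasetFactory;

// Hypothetical class name; not part of Jena.
class TextDatasetQuery {

    // Hypothetical helper: builds the label search used in this thread.
    // The term is inserted without escaping, so it must not contain quotes.
    static String buildQueryString(String term, int limit) {
        return String.join("\n",
            "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
            "PREFIX text: <http://jena.apache.org/text#>",
            "SELECT *",
            "WHERE {",
            "  ?entity text:query (rdfs:label \"" + term + "\") ;",
            "          rdfs:label ?label .",
            "}",
            "LIMIT " + limit);
    }

    public static void main(String[] args) {
        // Assemble the text dataset (TDB plus Lucene index) described in the config file.
        Dataset dataset = TextDatasetFactory.create("text-config.ttl");
        dataset.begin(ReadWrite.READ);
        try {
            Query query = QueryFactory.create(buildQueryString("Stage", 10));
            // Assumption: query the dataset itself, not dataset.getDefaultModel().
            try (QueryExecution qexec = QueryExecutionFactory.create(query, dataset)) {
                ResultSet rs = qexec.execSelect();
                ResultSetFormatter.out(rs);
            }
        } finally {
            // Always end the read transaction, even if the query fails.
            dataset.end();
        }
    }
}
```

If the "Failed to find the text index" warning still appears with this shape, stepping into the dataset construction with the Jena sources attached is probably the quickest way to see which resource was assembled.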

dataset.begin(ReadWrite.READ);
Model model = dataset.getDefaultModel();

Query query = QueryFactory.create("PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#> "
                                + "PREFIX text: <http://jena.apache.org/text#> "
                                + "PREFIX dbo:<http://dbpedia.org/ontology/> "
                                + "SELECT * \n"
                                + "WHERE { \n"
                                + "?entity text:query (rdfs:label \"Stage\") ; \n"
                                + "rdfs:label ?label . \n"
                                + "} LIMIT " + LIMIT);

try {
        QueryExecution qexec = QueryExecutionFactory.create(query, model);      
        
        ResultSet rs = qexec.execSelect();
                        
        ResultSetFormatter.out(rs);
}
finally {
        close();
}

Again, thank you very much for your help.

Philipp

On 20.04.2017 at 11:46, Andy Seaborne wrote:

Philipp,

I'm not completely sure what is going on but this:

java -cp fuseki-server.jar tdb.tdbquery --time --tdb=config.ttl --query=query.txt


will not work because tdbquery looks for a TDB dataset, so it will find 
<#dataset> rather than the text dataset.

What I don't know is how to use the general command line tools to pick out the 
right dataset from a configuration where there are two. Someone else may have 
a trick to do this.

Fuseki will pick the right one when you point the service to the text dataset.

A small Java program can use
 AssemblerUtils.build(String assemblerFile, Resource type)

to pick the right dataset.
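
A rough sketch of that call, assuming the Jena jars are on the classpath and that the object assembled for text:TextDataset can be cast to Dataset. The class name, the constant and the open helper are invented for illustration.

```java
import org.apache.jena.query.Dataset;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.rdf.model.ResourceFactory;
import org.apache.jena.sparql.core.assembler.AssemblerUtils;

// Hypothetical class name; not part of Jena.
class PickTextDataset {

    // rdf:type of the resource to assemble, as declared in the config file.
    static final String TEXT_DATASET_TYPE = "http://jena.apache.org/text#TextDataset";

    // Hypothetical helper: assembles the resource of the given type from the file.
    static Dataset open(String assemblerFile) {
        Resource type = ResourceFactory.createResource(TEXT_DATASET_TYPE);
        // build(...) returns Object; the cast is an assumption about what the
        // text dataset assembler produces.
        return (Dataset) AssemblerUtils.build(assemblerFile, type);
    }

    public static void main(String[] args) {
        Dataset dataset = open("text-config.ttl");
        System.out.println("Built dataset: " + dataset.getClass().getName());
    }
}
```

Selecting by type rather than by position means the <#dataset> resource in the same file is not picked up by accident.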

   Andy

On 19/04/17 19:57, Philipp Poschmann wrote:

Dear all,

I am sorry for my incompetence but I have a problem with indexing labels using 
Apache Jena. I have already checked several posts about this topic but can’t 
find my error. Is there anyone who could help me please?

My text-config.ttl file looks like this:

@prefix :        <http://localhost/jena_example/#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix text:    <http://jena.apache.org/text#> .

# TDB
[] ja:loadClass "org.apache.jena.tdb.TDB" .
tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
tdb:GraphTDB    rdfs:subClassOf  ja:Model .

# Text
[] ja:loadClass "org.apache.jena.query.text.TextQuery" .
text:TextDataset      rdfs:subClassOf   ja:RDFDataset .
text:TextIndexLucene  rdfs:subClassOf   text:TextIndex .

## ---------------------------------------------------------------
## This URI must be fixed - it's used to assemble the text dataset.

:text_dataset rdf:type     text:TextDataset ;
   text:dataset   <#dataset> ;
   text:index     <#indexLucene> ;
   .

<#dataset> rdf:type      tdb:DatasetTDB ;
   tdb:location "storage" ;
   ## In the example, this would hide the real default graph.
   # tdb:unionDefaultGraph true ;
   .

<#indexLucene> a text:TextIndexLucene ;
   #text:directory <file:Lucene> ;
   text:directory <file:storage> ;
   text:entityMap <#entMap> ;
   .

<#entMap> a text:EntityMap ;
   text:entityField      "uri" ;
   text:defaultField     "text" ; ## Must be defined in the text:maps
   text:map (
        # rdfs:label
        [ text:field "text" ; text:predicate rdfs:label ]
        ) .


I have a dataset file from dbpedia with some labels that I load and index with 
these commands:

java -cp fuseki-server.jar tdb.tdbloader --tdb=config.ttl infobox_property_definitions_en.ttl
java -cp fuseki-server.jar jena.textindexer --desc=config.ttl


To test for the indexed labels I tried the following query:

PREFIX text: <http://jena.apache.org/text#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT *
{ ?s text:query (rdfs:label "Stage") ;
   rdfs:label ?label
}
LIMIT 10


With this command:

java -cp fuseki-server.jar tdb.tdbquery --time --tdb=config.ttl --query=query.txt


However, I just get the following results:

WARN  Failed to find the text index : tried context and as a text-enabled dataset
WARN  No text index - no text search performed
----------------------------------------------------------
| s                                      | label         |
==========================================================
| <http://dbpedia.org/property/colwidth> | "colwidth"@en |
| <http://dbpedia.org/property/voy>      | "voy"@en      |
| <http://dbpedia.org/property/n>        | "n"@en        |
| <http://dbpedia.org/property/v>        | "v"@en        |
| <http://dbpedia.org/property/b>        | "b"@en        |
| <http://dbpedia.org/property/s>        | "s"@en        |
| <http://dbpedia.org/property/d>        | "d"@en        |
| <http://dbpedia.org/property/name>     | "Name"@en     |
| <http://dbpedia.org/property/alt>      | "Alt"@en      |
| <http://dbpedia.org/property/caption>  | "Caption"@en  |
----------------------------------------------------------
Time: 0,065 sec


Obviously, these are not the desired results, and the command has a problem 
finding the text index. Just for clarification: I have decided to use a 
TDB-backed solution because my aim is to reproduce some DBpedia data locally 
without loading it all into memory. I would prefer not to use the Fuseki server, 
but it seems to be the only way to use text indexing.

Thank you very much for your help.

Best regards
Philipp
