Re: Failed to find the text index

Philipp Poschmann Thu, 20 Apr 2017 05:02:40 -0700

Dear Andy,

thank you very much for your advice. Indeed, my purpose is to build a Java 
application that queries the data loaded by tdbloader. But I am not sure how 
to use AssemblerUtils.build in my application, that is, how to choose the 
right dataset there.


What I have done so far is start the Fuseki server. The administration panel 
shows that there is one dataset, "ds". When I run my query via the browser, it 
works, so querying my data through Fuseki would be a solution. However, I would 
like to avoid running the server in parallel with my application. Is it possible 
to run my query against the text index without using the server at all?

Currently, my code gives me the same error ("Failed to find the text index") 
and looks like this:
Dataset dataset = TDBFactory.assembleDataset("text-config.ttl");
dataset.begin(ReadWrite.READ);
Model model = dataset.getDefaultModel();

Query query = QueryFactory.create("PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#> "
                                + "PREFIX text: <http://jena.apache.org/text#> "
                                + "PREFIX dbo:<http://dbpedia.org/ontology/> "
                                + "SELECT * \n"
                                + "WHERE { \n"
                                + "?entity text:query (rdfs:label \"Stage\") ; \n"
                                + "rdfs:label ?label . \n"
                                + "} LIMIT " + LIMIT);

try {
        QueryExecution qexec = QueryExecutionFactory.create(query, model);
        ResultSet rs = qexec.execSelect();
        ResultSetFormatter.out(rs);
}
finally {
        close();
}
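
As a side note on the query string above: long concatenations like this are easy to break when a line wraps. Below is a minimal sketch of building the same text:query SELECT from a search term and a limit. The class name TextQueryBuilder and both method names are hypothetical and not part of Jena. The dbo: prefix is left out because the query does not use it. The helper only escapes the term for a SPARQL string literal; characters that are special in Lucene query syntax are not handled here.

```java
// Hypothetical helper, not part of Jena: builds the text:query SELECT shown above.
final class TextQueryBuilder {

    // Escape backslashes and double quotes so the term can sit inside a SPARQL "..." literal.
    public static String escapeLiteral(String term) {
        return term.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Assemble the query line by line; String.join avoids stray "+" and "\n" mistakes.
    public static String buildQuery(String term, int limit) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be positive");
        }
        return String.join("\n",
            "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
            "PREFIX text: <http://jena.apache.org/text#>",
            "SELECT *",
            "WHERE {",
            "  ?entity text:query (rdfs:label \"" + escapeLiteral(term) + "\") ;",
            "          rdfs:label ?label .",
            "}",
            "LIMIT " + limit);
    }

    public static void main(String[] args) {
        // Prints the query for the search term used in this thread.
        System.out.println(buildQuery("Stage", 10));
    }
}
```

The resulting string can be passed to QueryFactory.create in place of the concatenation above.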

Again, thank you very much for your help.

Philipp

> On 20.04.2017 at 11:46, Andy Seaborne <[email protected]> wrote:
> 
> Philipp,
> 
> I'm not completely sure what is going on but this:
> 
> >> java -cp fuseki-server.jar tdb.tdbquery --time --tdb=config.ttl --query=query.txt
> 
> will not work because tdbquery looks for a TDB dataset so it will find 
> <#dataset>.
> 
> What I don't know is how to use general command line tools to pick out the 
> right dataset from the configuration where there are two.  Someone else may 
> have a trick to do this.
> 
> Fuseki will pick the right one when you point the service to the text dataset.
> 
> A small Java program can use
>  AssemblerUtils.build(String assemblerFile, Resource type)
> 
> to pick the right dataset.
> 
>    Andy
> 
> On 19/04/17 19:57, Philipp Poschmann wrote:
>> Dear all,
>> 
>> I am sorry for my incompetence but I have a problem with indexing labels 
>> using Apache Jena. I have already checked several posts about this topic but 
>> can’t find my error. Is there anyone who could help me please?
>> 
>> My text-config.ttl file looks like this:
>>> @prefix :        <http://localhost/jena_example/#> .
>>> @prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>>> @prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
>>> @prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
>>> @prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
>>> @prefix text:    <http://jena.apache.org/text#> .
>>> 
>>> # TDB
>>> [] ja:loadClass "org.apache.jena.tdb.TDB" .
>>> tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
>>> tdb:GraphTDB    rdfs:subClassOf  ja:Model .
>>> 
>>> # Text
>>> [] ja:loadClass "org.apache.jena.query.text.TextQuery" .
>>> text:TextDataset      rdfs:subClassOf   ja:RDFDataset .
>>> text:TextIndexLucene  rdfs:subClassOf   text:TextIndex .
>>> 
>>> ## ---------------------------------------------------------------
>>> ## This URI must be fixed - it's used to assemble the text dataset.
>>> 
>>> :text_dataset rdf:type     text:TextDataset ;
>>>    text:dataset   <#dataset> ;
>>>    text:index     <#indexLucene> ;
>>>    .
>>> 
>>> <#dataset> rdf:type      tdb:DatasetTDB ;
>>>    tdb:location "storage" ;
>>>    ## In the example, this would hide the real default graph.
>>>    # tdb:unionDefaultGraph true ;
>>>    .
>>> 
>>> <#indexLucene> a text:TextIndexLucene ;
>>>    #text:directory <file:Lucene> ;
>>>    text:directory <file:storage> ;
>>>    text:entityMap <#entMap> ;
>>>    .
>>> 
>>> <#entMap> a text:EntityMap ;
>>>    text:entityField      "uri" ;
>>>    text:defaultField     "text" ; ## Must be defined in the text:maps
>>>    text:map (
>>>         # rdfs:label
>>>         [ text:field "text" ; text:predicate rdfs:label ]
>>>         ) .
>> 
>> I have a dataset file from dbpedia with some labels that I load and index 
>> with these commands:
>>> java -cp fuseki-server.jar tdb.tdbloader --tdb=config.ttl infobox_property_definitions_en.ttl
>>> java -cp fuseki-server.jar jena.textindexer --desc=config.ttl
>> 
>> To test for the indexed labels I tried the following query:
>>> PREFIX text: <http://jena.apache.org/text#>
>>> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>>> SELECT *
>>> { ?s text:query (rdfs:label "Stage") ;
>>>    rdfs:label ?label
>>> }
>>> LIMIT 10
>> 
>> With this command:
>>> java -cp fuseki-server.jar tdb.tdbquery --time --tdb=config.ttl --query=query.txt
>> 
>> However, I just get the following results:
>>> WARN  Failed to find the text index : tried context and as a text-enabled dataset
>>> WARN  No text index - no text search performed
>>> ----------------------------------------------------------
>>> | s                                      | label         |
>>> ==========================================================
>>> | <http://dbpedia.org/property/colwidth> | "colwidth"@en |
>>> | <http://dbpedia.org/property/voy>      | "voy"@en      |
>>> | <http://dbpedia.org/property/n>        | "n"@en        |
>>> | <http://dbpedia.org/property/v>        | "v"@en        |
>>> | <http://dbpedia.org/property/b>        | "b"@en        |
>>> | <http://dbpedia.org/property/s>        | "s"@en        |
>>> | <http://dbpedia.org/property/d>        | "d"@en        |
>>> | <http://dbpedia.org/property/name>     | "Name"@en     |
>>> | <http://dbpedia.org/property/alt>      | "Alt"@en      |
>>> | <http://dbpedia.org/property/caption>  | "Caption"@en  |
>>> ----------------------------------------------------------
>>> Time: 0,065 sec
>> 
>> Obviously, these are not the desired results and the script has a problem 
>> finding the text index. Just for clarification: I have decided to use a TDB 
>> backed solution because my aim is to reproduce some dbpedia data locally but 
>> without in-memory load. Actually, my intention is not to use the Fuseki 
>> Server but it seems that it is the only solution to use text indexing.
>> 
>> Thank you very much for your help.
>> 
>> Best regards
>> Philipp
>>
