Dear Andy,
thank you very much for your advice. My goal is indeed to build a Java
application that queries the data loaded by tdbloader. But I am not sure how
to use AssemblerUtils.build in my application, that is, how to select the
right dataset.
What I have done so far is start the Fuseki server. The administration panel
shows that there is one dataset, "ds". When I run my query via the browser, it
works, so querying my data through Fuseki would be a solution. However, I would
like to avoid running the server in parallel with my application. Is it
possible to run my query against the text index without using the server at all?
Currently, my code gives me the same error ("Failed to find the text index")
and looks like this:
    Dataset dataset = TDBFactory.assembleDataset("text-config.ttl");
    dataset.begin(ReadWrite.READ);
    Model model = dataset.getDefaultModel();
    Query query = QueryFactory.create(
        "PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#> "
        + "PREFIX text: <http://jena.apache.org/text#> "
        + "PREFIX dbo:<http://dbpedia.org/ontology/> "
        + "SELECT * \n"
        + "WHERE { \n"
        + "?entity text:query (rdfs:label \"Stage\") ; \n"
        + "rdfs:label ?label . \n"
        + "} LIMIT " + LIMIT);
    try {
        QueryExecution qexec = QueryExecutionFactory.create(query, model);
        ResultSet rs = qexec.execSelect();
        ResultSetFormatter.out(rs);
    }
    finally {
        close();
    }
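For comparison, here is a minimal sketch of a server-free variant along the lines of Andy's hint quoted below. It rests on several assumptions: that jena-text and jena-tdb are on the classpath, that text-config.ttl is in the working directory, and that the text dataset URI expands to http://localhost/jena_example/#text_dataset (from the ":" prefix in the configuration). The class name TextQueryExample and the helper buildQuery are made up for illustration. The two differences from the code above are that the text dataset is assembled by its URI instead of the TDB dataset, and that the query runs against the dataset itself rather than its default model, which as far as I understand is what lets text:query find the index.

```java
import org.apache.jena.query.Dataset;
import org.apache.jena.query.DatasetFactory;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.ReadWrite;
import org.apache.jena.query.ResultSetFormatter;

public class TextQueryExample {
    // Assumption: URI of :text_dataset as declared in text-config.ttl.
    public static final String TEXT_DATASET_URI =
            "http://localhost/jena_example/#text_dataset";

    // Hypothetical helper: builds the SPARQL text query with the given limit.
    public static String buildQuery(int limit) {
        return String.join("\n",
            "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
            "PREFIX text: <http://jena.apache.org/text#>",
            "SELECT * WHERE {",
            "  ?entity text:query (rdfs:label \"Stage\") ;",
            "          rdfs:label ?label .",
            "} LIMIT " + limit);
    }

    public static void main(String[] args) {
        // Assemble the text-enabled dataset (not the plain TDB dataset)
        // by naming its resource URI explicitly.
        Dataset dataset =
                DatasetFactory.assemble("text-config.ttl", TEXT_DATASET_URI);

        dataset.begin(ReadWrite.READ);
        // Query the dataset directly, not dataset.getDefaultModel().
        try (QueryExecution qexec =
                     QueryExecutionFactory.create(buildQuery(10), dataset)) {
            ResultSetFormatter.out(qexec.execSelect());
        } finally {
            dataset.end(); // finish the read transaction
        }
    }
}
```

If this sketch is right, the "Failed to find the text index" warning should no longer appear, provided the Lucene index was built beforehand with jena.textindexer.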
Again, thank you very much for your help.
Philipp
> On 20.04.2017 at 11:46, Andy Seaborne wrote:
>
> Philipp,
>
> I'm not completely sure what is going on but this:
>
> >> java -cp fuseki-server.jar tdb.tdbquery --time --tdb=config.ttl
> >> --query=query.txt
>
> will not work because tdbquery looks for a TDB dataset so it will find
> <#dataset>.
>
> What I don't know is how to use the general command line tools to pick out the
> right dataset from a configuration that contains two. Someone else may
> have a trick to do this.
>
> Fuseki will pick the right one when you point the service to the text dataset.
>
> A small Java program can use
> AssemblerUtils.build(String assemblerFile, Resource type)
>
> to pick the right dataset.
>
> Andy
>
> On 19/04/17 19:57, Philipp Poschmann wrote:
>> Dear all,
>>
>> I am sorry for my incompetence but I have a problem with indexing labels
>> using Apache Jena. I have already checked several posts about this topic but
>> can’t find my error. Is there anyone who could help me please?
>>
>> My text-config.ttl file looks like this:
>>> @prefix : <http://localhost/jena_example/#> .
>>> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>>> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
>>> @prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
>>> @prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
>>> @prefix text: <http://jena.apache.org/text#> .
>>>
>>> # TDB
>>> [] ja:loadClass "org.apache.jena.tdb.TDB" .
>>> tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
>>> tdb:GraphTDB rdfs:subClassOf ja:Model .
>>>
>>> # Text
>>> [] ja:loadClass "org.apache.jena.query.text.TextQuery" .
>>> text:TextDataset rdfs:subClassOf ja:RDFDataset .
>>> text:TextIndexLucene rdfs:subClassOf text:TextIndex .
>>>
>>> ## ---------------------------------------------------------------
>>> ## This URI must be fixed - it's used to assemble the text dataset.
>>>
>>> :text_dataset rdf:type text:TextDataset ;
>>> text:dataset <#dataset> ;
>>> text:index <#indexLucene> ;
>>> .
>>>
>>> <#dataset> rdf:type tdb:DatasetTDB ;
>>> tdb:location "storage" ;
>>> ## In the example, this would hide the real default graph.
>>> # tdb:unionDefaultGraph true ;
>>> .
>>>
>>> <#indexLucene> a text:TextIndexLucene ;
>>> #text:directory <file:Lucene> ;
>>> text:directory <file:storage> ;
>>> text:entityMap <#entMap> ;
>>> .
>>>
>>> <#entMap> a text:EntityMap ;
>>> text:entityField "uri" ;
>>> text:defaultField "text" ; ## Must be defined in the text:maps
>>> text:map (
>>> # rdfs:label
>>> [ text:field "text" ; text:predicate rdfs:label ]
>>> ) .
>>
>> I have a dataset file from dbpedia with some labels that I load and index
>> with these commands:
>>> java -cp fuseki-server.jar tdb.tdbloader --tdb=config.ttl
>>> infobox_property_definitions_en.ttl
>>> java -cp fuseki-server.jar jena.textindexer --desc=config.ttl
>>
>> To test for the indexed labels I tried the following query:
>>> PREFIX text: <http://jena.apache.org/text#>
>>> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>>> SELECT *
>>> { ?s text:query (rdfs:label "Stage") ;
>>> rdfs:label ?label
>>> }
>>> LIMIT 10
>>
>> With this command:
>>> java -cp fuseki-server.jar tdb.tdbquery --time --tdb=config.ttl
>>> --query=query.txt
>>
>> However, I just get the following results:
>>> WARN Failed to find the text index : tried context and as a text-enabled
>>> dataset
>>> WARN No text index - no text search performed
>>> ----------------------------------------------------------
>>> | s | label |
>>> ==========================================================
>>> | <http://dbpedia.org/property/colwidth> | "colwidth"@en |
>>> | <http://dbpedia.org/property/voy> | "voy"@en |
>>> | <http://dbpedia.org/property/n> | "n"@en |
>>> | <http://dbpedia.org/property/v> | "v"@en |
>>> | <http://dbpedia.org/property/b> | "b"@en |
>>> | <http://dbpedia.org/property/s> | "s"@en |
>>> | <http://dbpedia.org/property/d> | "d"@en |
>>> | <http://dbpedia.org/property/name> | "Name"@en |
>>> | <http://dbpedia.org/property/alt> | "Alt"@en |
>>> | <http://dbpedia.org/property/caption> | "Caption"@en |
>>> ----------------------------------------------------------
>>> Time: 0,065 sec
>>
>> Obviously, these are not the desired results: the command fails to find
>> the text index. Just for clarification: I chose a TDB-backed solution
>> because my aim is to reproduce some DBpedia data locally without
>> loading it all into memory. I would prefer not to use the Fuseki
>> server, but it seems to be the only way to use text indexing.
>>
>> Thank you very much for your help.
>>
>> Best regards
>> Philipp
>>