Hi Andy,

>Did you load the data before attaching the text index?

How do I do it (or not do it, wasn't sure from your post)?

Thanks,
Zhenya



On Sun, Mar 22, 2020, at 9:18 AM, Andy Seaborne wrote:
> Just checking one point:
> 
> Did you load the data before attaching the text index?
> 
> The text index is calculated as data is added, so if you first load the 
> dataset and then set up a text index, the index will miss that data.
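> 
> A minimal sketch of that ordering, assuming the update is sent to the
> update endpoint of the service backed by the text dataset ("ds" in the
> configuration quoted below) and that the indexed values are literals.
> The example.org subject and the label value are made up for illustration:
> 
> ```sparql
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> 
> # Sent through the text dataset, so the literal is indexed as it is added.
> # The predicate is one of the two listed in the entity map of the config.
> INSERT DATA {
>   <http://example.org/Biology> rdf:label "Biology" .
> }
> ```
> 
> A text query against the same service should then be able to find it:
> 
> ```sparql
> PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX text: <http://jena.apache.org/text#>
> 
> # text:query takes (property "query string"); the property must be in the entity map.
> SELECT ?s ?label WHERE {
>   ?s text:query (rdf:label "biol*") ;
>      rdf:label ?label .
> }
> ```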
> 
>  Andy
> 
> On 21/03/2020 07:55, Lorenz Buehmann wrote:
> > Hi,
> > 
> > welcome to Semantic Web and Apache Jena.
> > 
> > Comments inline:
> > 
> > On 20.03.20 15:36, Zhenya Antić wrote:
> >> Hello,
> >>
> >> I am a beginner with Fuseki, knowledge graphs and SPARQL, so please 
> >> forgive me if the questions seem obvious; the learning curve for this 
> >> turned out to be quite steep.
> > No problem, nothing is simple in the beginning.
> >>
> >> I am trying to get text indexing to work with my Fuseki knowledge graph.
> > Which DBpedia dataset did you load? I mean, which files?
> >>
> >> For starters, I tried using a regular expression, but that didn't work:
> >>
> >> Just a plain query like this:
> >> SELECT DISTINCT * WHERE {
> >> ?s ?p ?o
> >> }
> >> gives 98 results such as:
> >>
> >> 1
> >> <http://dbpedia.org/ontology/wikiPageID:9127632>
> >> <http://www.w3.org/1999/02/22-rdf-syntax-ns#label>
> >> <http://dbpedia.org/resource/Biology>
> >> 2
> >> <http://dbpedia.org/ontology/wikiPageID:9127632>
> >> <http://www.w3.org/1999/02/22-rdf-syntax-ns#label>
> >> <http://dbpedia.org/resource/Biology#Branches>
> >> 3
> >> <http://dbpedia.org/ontology/wikiPageID:9127632>
> >> <http://www.w3.org/1999/02/22-rdf-syntax-ns#synonym>
> >> <http://www.w3.org/1999/02/22-rdf-syntax-ns#branches_of_biology>
> >> 4
> >> <http://dbpedia.org/ontology/wikiPageID:18393>
> >> <http://www.w3.org/1999/02/22-rdf-syntax-ns#label>
> >> <http://dbpedia.org/resource/Life>
> > That can't be the correct output of this query. rdfs:label should have
> > literals as objects (?o), or you loaded some really weird data.
> >>
> >> But a query with a regular expression:
> >> SELECT DISTINCT * WHERE {
> >> ?s ?p ?o
> >> FILTER regex(?o, "Biol", "i")
> >> }
> > 
> > 1. you should help the query engine and use rdfs:label as property
> > 
> > 2. you should use str() function on the ?o values:
> > 
> > SELECT DISTINCT * WHERE {
> > ?s rdfs:label ?o
> > FILTER regex(str(?o), "Biol", "i")
> > }
> > 
> >> gives 0 results, although there are clearly results that contain "Biol".
> > 
> > 
> > I'll have to try your config, or maybe others will spot the issue in the meantime.
> > 
> >>
> >> I also tried setting up indexing with a .ttl file, however the result was 
> >> "INFO 0 (0 per second) properties indexed". .ttl file below:
> >>
> >> @prefix : <http://base/#> .
> >> @prefix tdb2: <http://jena.apache.org/2016/tdb#> .
> >> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> >> @prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
> >> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> >> @prefix fuseki: <http://jena.apache.org/fuseki#> .
> >> @prefix text: <http://jena.apache.org/text#> .
> >>
> >> <http://jena.apache.org/2016/tdb#DatasetTDB>
> >> rdfs:subClassOf ja:RDFDataset .
> >>
> >> ja:DatasetTxnMem rdfs:subClassOf ja:RDFDataset .
> >>
> >> tdb2:DatasetTDB2 rdfs:subClassOf ja:RDFDataset .
> >>
> >> tdb2:GraphTDB2 rdfs:subClassOf ja:Model .
> >>
> >> <http://jena.apache.org/2016/tdb#GraphTDB2>
> >> rdfs:subClassOf ja:Model .
> >>
> >> ja:MemoryDataset rdfs:subClassOf ja:RDFDataset .
> >>
> >> ja:RDFDatasetZero rdfs:subClassOf ja:RDFDataset .
> 
> The rdfs:subClassOf declarations should not be necessary in recent versions of Fuseki.
> 
> If any are, let us know so it can be fixed.
> 
> >>
> >> <http://jena.apache.org/text#TextDataset>
> >> rdfs:subClassOf ja:RDFDataset .
> >>
> >> :service_tdb_all a fuseki:Service ;
> >> rdfs:label "TDB biology" ;
> >> fuseki:dataset :tdb_dataset_readwrite ;
> >> fuseki:name "biology" ;
> >> fuseki:serviceQuery "query" , "" , "sparql" ;
> >> fuseki:serviceReadGraphStore "get" ;
> >> fuseki:serviceReadQuads "" ;
> >> fuseki:serviceReadWriteGraphStore
> >> "data" ;
> >> fuseki:serviceReadWriteQuads "" ;
> >> fuseki:serviceUpdate "" , "update" ;
> >> fuseki:serviceUpload "upload" .
> >>
> >> :tdb_dataset_readwrite
> >> a tdb2:DatasetTDB2 ;
> >> tdb2:location "db" .
> >>
> >> <http://jena.apache.org/2016/tdb#GraphTDB>
> >> rdfs:subClassOf ja:Model .
> >>
> >> ja:RDFDatasetOne rdfs:subClassOf ja:RDFDataset .
> >>
> >> ja:RDFDatasetSink rdfs:subClassOf ja:RDFDataset .
> >>
> >> <http://jena.apache.org/2016/tdb#DatasetTDB2>
> >> rdfs:subClassOf ja:RDFDataset .
> >>
> >> <#dataset> rdf:type tdb2:DatasetTDB2 ;
> >> tdb2:location "db" ; #path to TDB;
> >> .
> >>
> >> # Text index description
> >> :text_dataset rdf:type text:TextDataset ;
> >> text:dataset <#dataset> ; # <-- replace `:my_dataset` with the desired URI
> >> text:index <#indexLucene> ;
> >> .
> >>
> >> <#indexLucene> a text:TextIndexLucene ;
> >> text:directory <file:data/luceneIndexing> ;
> >> text:entityMap <#entMap> ;
> >> .
> >>
> >> <#entMap> a text:EntityMap ;
> >> text:defaultField "text" ;
> >> text:entityField "uri" ;
> >> text:map (
> >> #RDF label abstracts
> >> [ text:field "text" ;
> >> text:predicate <http://www.w3.org/1999/02/22-rdf-syntax-ns#label> ;
> >> text:analyzer [
> >> a text:StandardAnalyzer
> >> ]
> >> ]
> >> [ text:field "text" ;
> >> text:predicate <http://www.w3.org/1999/02/22-rdf-syntax-ns#synonym> ;
> >> text:analyzer [
> >> a text:StandardAnalyzer
> >> ]
> >> ]
> >> ) .
> >>
> >>
> >>
> >> <#service_text_tdb> rdf:type fuseki:Service ;
> >> fuseki:name "ds" ;
> >> fuseki:serviceQuery "query" ;
> >> fuseki:serviceQuery "sparql" ;
> >> fuseki:serviceUpdate "update" ;
> >> fuseki:serviceUpload "upload" ;
> >> fuseki:serviceReadGraphStore "get" ;
> >> fuseki:serviceReadWriteGraphStore "data" ;
> >> fuseki:dataset :text_dataset ;
> >> .
> >>
> >> Thank you so much in advance,
> >>
> >> __________________________
> >> Zhenya Antić, PhD
> >> Natural Language Processing
> >> https://www.linkedin.com/in/zhenya-antic/
> >>
> >> Practical Linguistics Inc
> >> http://www.practicallinguistics.com
> >>
> >>
> >>
> > 
> 
