Hi Andy,

> Did you load the data before attaching the text index?

How do I do it (or not do it, wasn't sure from your post)?

Thanks,
Zhenya

On Sun, Mar 22, 2020, at 9:18 AM, Andy Seaborne wrote:
> Just checking one point:
>
> Did you load the data before attaching the text index?
>
> The text index is calculated as data is added so if you first load the
> dataset then setup a text index, it will miss indexing the data.
>
> Andy
>
> On 21/03/2020 07:55, Lorenz Buehmann wrote:
> > Hi,
> >
> > welcome to Semantic Web and Apache Jena.
> >
> > Comments inline:
> >
> > On 20.03.20 15:36, Zhenya Antić wrote:
> >> Hello,
> >>
> >> I am a beginner with Fuseki, knowledge graphs and SPARQL, so please
> >> forgive me if the questions seem obvious, the learning curve for this
> >> turned out to be quite steep.
> > No problem, nothing is simple in the beginning,
> >>
> >> I am trying to get text indexing to work with my Fuseki knowledge graph.
> > Which DBpedia dataset did you load? I mean, which files?
> >>
> >> For starters, I tried using a regular expression, but that didn't work:
> >>
> >> Just a plain query like this:
> >> SELECT DISTINCT * WHERE {
> >>   ?s ?p ?o
> >> }
> >> gives 98 results such as:
> >>
> >> 1
> >> <http://dbpedia.org/ontology/wikiPageID:9127632>
> >> <http://www.w3.org/1999/02/22-rdf-syntax-ns#label>
> >> <http://dbpedia.org/resource/Biology>
> >> 2
> >> <http://dbpedia.org/ontology/wikiPageID:9127632>
> >> <http://www.w3.org/1999/02/22-rdf-syntax-ns#label>
> >> <http://dbpedia.org/resource/Biology#Branches>
> >> 3
> >> <http://dbpedia.org/ontology/wikiPageID:9127632>
> >> <http://www.w3.org/1999/02/22-rdf-syntax-ns#synonym>
> >> <http://www.w3.org/1999/02/22-rdf-syntax-ns#branches_of_biology>
> >> 4
> >> <http://dbpedia.org/ontology/wikiPageID:18393>
> >> <http://www.w3.org/1999/02/22-rdf-syntax-ns#label>
> >> <http://dbpedia.org/resource/Life>
> > That can't be the correct output of this query.
> > rdfs:label should return literals as object (?o) - or you loaded some
> > really weird data
> >>
> >> But a query with a regular expression:
> >> SELECT DISTINCT * WHERE {
> >>   ?s ?p ?o
> >>   FILTER regex(?o, "Biol", "i")
> >> }
> >
> > 1. you should help the query engine and use rdfs:label as property
> >
> > 2. you should use str() function on the ?o values:
> >
> > SELECT DISTINCT * WHERE {
> >   ?s rdfs:label ?o
> >   FILTER regex(str(?o), "Biol", "i")
> > }
> >
> >> gives 0 results, although there are clearly results that contain "Biol".
> >
> >
> > I've to try your config or maybe others will spot the issue in the meantime.
> >
> >>
> >> I also tried setting up indexing with a .ttl file, however the result was
> >> "INFO 0 (0 per second) properties indexed". .ttl file below:
> >>
> >> @prefix : <http://base/#> .
> >> @prefix tdb2: <http://jena.apache.org/2016/tdb#> .
> >> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> >> @prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
> >> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> >> @prefix fuseki: <http://jena.apache.org/fuseki#> .
> >> @prefix text: <http://jena.apache.org/text#> .
> >>
> >> <http://jena.apache.org/2016/tdb#DatasetTDB>
> >>     rdfs:subClassOf ja:RDFDataset .
> >>
> >> ja:DatasetTxnMem rdfs:subClassOf ja:RDFDataset .
> >>
> >> tdb2:DatasetTDB2 rdfs:subClassOf ja:RDFDataset .
> >>
> >> tdb2:GraphTDB2 rdfs:subClassOf ja:Model .
> >>
> >> <http://jena.apache.org/2016/tdb#GraphTDB2>
> >>     rdfs:subClassOf ja:Model .
> >>
> >> ja:MemoryDataset rdfs:subClassOf ja:RDFDataset .
> >>
> >> ja:RDFDatasetZero rdfs:subClassOf ja:RDFDataset .
> > The rdfs:subClassOf should not be necessary (recent versions of Fuseki).
> > If any are, let us know so it can be fixed.
> >>
> >> <http://jena.apache.org/text#TextDataset>
> >>     rdfs:subClassOf ja:RDFDataset .
> >>
> >> :service_tdb_all a fuseki:Service ;
> >>     rdfs:label "TDB biology" ;
> >>     fuseki:dataset :tdb_dataset_readwrite ;
> >>     fuseki:name "biology" ;
> >>     fuseki:serviceQuery "query" , "" , "sparql" ;
> >>     fuseki:serviceReadGraphStore "get" ;
> >>     fuseki:serviceReadQuads "" ;
> >>     fuseki:serviceReadWriteGraphStore "data" ;
> >>     fuseki:serviceReadWriteQuads "" ;
> >>     fuseki:serviceUpdate "" , "update" ;
> >>     fuseki:serviceUpload "upload" .
> >>
> >> :tdb_dataset_readwrite
> >>     a tdb2:DatasetTDB2 ;
> >>     tdb2:location "db" .
> >>
> >> <http://jena.apache.org/2016/tdb#GraphTDB>
> >>     rdfs:subClassOf ja:Model .
> >>
> >> ja:RDFDatasetOne rdfs:subClassOf ja:RDFDataset .
> >>
> >> ja:RDFDatasetSink rdfs:subClassOf ja:RDFDataset .
> >>
> >> <http://jena.apache.org/2016/tdb#DatasetTDB2>
> >>     rdfs:subClassOf ja:RDFDataset .
> >>
> >> <#dataset> rdf:type tdb2:DatasetTDB2 ;
> >>     tdb2:location "db" ; #path to TDB;
> >>     .
> >>
> >> # Text index description
> >> :text_dataset rdf:type text:TextDataset ;
> >>     text:dataset <#dataset> ; # <-- replace `:my_dataset` with the desired URI
> >>     text:index <#indexLucene> ;
> >>     .
> >>
> >> <#indexLucene> a text:TextIndexLucene ;
> >>     text:directory <file:data/luceneIndexing> ;
> >>     text:entityMap <#entMap> ;
> >>     .
> >>
> >> <#entMap> a text:EntityMap ;
> >>     text:defaultField "text" ;
> >>     text:entityField "uri" ;
> >>     text:map (
> >>         #RDF label abstracts
> >>         [ text:field "text" ;
> >>           text:predicate <http://www.w3.org/1999/02/22-rdf-syntax-ns#label> ;
> >>           text:analyzer [
> >>               a text:StandardAnalyzer
> >>           ]
> >>         ]
> >>         [ text:field "text" ;
> >>           text:predicate <http://www.w3.org/1999/02/22-rdf-syntax-ns#synonym> ;
> >>           text:analyzer [
> >>               a text:StandardAnalyzer
> >>           ]
> >>         ]
> >>     ) .
> >>
> >>
> >>
> >> <#service_text_tdb> rdf:type fuseki:Service ;
> >>     fuseki:name "ds" ;
> >>     fuseki:serviceQuery "query" ;
> >>     fuseki:serviceQuery "sparql" ;
> >>     fuseki:serviceUpdate "update" ;
> >>     fuseki:serviceUpload "upload" ;
> >>     fuseki:serviceReadGraphStore "get" ;
> >>     fuseki:serviceReadWriteGraphStore "data" ;
> >>     fuseki:dataset :text_dataset ;
> >>     .
> >>
> >> Thank you so much in advance,
> >>
> >> __________________________
> >> Zhenya Antić, PhD
> >> Natural Language Processing
> >> https://www.linkedin.com/in/zhenya-antic/
> >>
> >> Practical Linguistics Inc
> >> http://www.practicallinguistics.com
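
For reference, a text query against the configuration quoted above might look like the sketch below. It rests on several assumptions: that the index has actually been populated, that the objects of the mapped predicate are literals (as far as I know, the text index only covers literal values), and that the query is sent to the "ds" service, since that is the service bound to :text_dataset in the quoted file. The search string "biol*" is only an illustration.

```sparql
PREFIX text: <http://jena.apache.org/text#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

# Assumption: rdf:label is the predicate listed in the entity map of the
# quoted config, and its objects are literals.
# "biol*" is an illustrative Lucene query string, not taken from the thread.
SELECT ?s ?label WHERE {
  ?s text:query (rdf:label "biol*") ;   # look up matching subjects in the Lucene index
     rdf:label ?label .                 # then fetch the label from the dataset
}
```

If this returns nothing while the plain `?s ?p ?o` query still shows data, that would suggest the index is empty or the objects are IRIs, not literals.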