Hi, welcome to Semantic Web and Apache Jena.
Comments inline:

On 20.03.20 15:36, Zhenya Antić wrote:
> Hello,
>
> I am a beginner with Fuseki, knowledge graphs and SPARQL, so please forgive
> me if the questions seem obvious, the learning curve for this turned out to
> be quite steep.

No problem, nothing is simple in the beginning.

>
> I am trying to get text indexing to work with my Fuseki knowledge graph.

Which DBpedia dataset did you load? I mean, which files?

>
> For starters, I tried using a regular expression, but that didn't work:
>
> Just a plain query like this:
> SELECT DISTINCT * WHERE {
>   ?s ?p ?o
> }
> gives 98 results such as:
>
> 1
> <http://dbpedia.org/ontology/wikiPageID:9127632>
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#label>
> <http://dbpedia.org/resource/Biology>
> 2
> <http://dbpedia.org/ontology/wikiPageID:9127632>
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#label>
> <http://dbpedia.org/resource/Biology#Branches>
> 3
> <http://dbpedia.org/ontology/wikiPageID:9127632>
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#synonym>
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#branches_of_biology>
> 4
> <http://dbpedia.org/ontology/wikiPageID:18393>
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#label>
> <http://dbpedia.org/resource/Life>

That can't be the correct output of this query: rdfs:label should have
literals as objects (?o), or else you loaded some really weird data.

>
> But a query with a regular expression:
> SELECT DISTINCT * WHERE {
>   ?s ?p ?o
>   FILTER regex(?o, "Biol", "i")
> }

1. You should help the query engine and use rdfs:label as the property.
2. You should apply the str() function to the ?o values:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT * WHERE {
  ?s rdfs:label ?o
  FILTER regex(str(?o), "Biol", "i")
}

> gives 0 results, although there are clearly results that contain "Biol".

I'll have to try your config, or maybe others will spot the issue in the
meantime.

>
> I also tried setting up indexing with a .ttl file, however the result was
> "INFO 0 (0 per second) properties indexed".
> .ttl file below:
>
> @prefix : <http://base/#> .
> @prefix tdb2: <http://jena.apache.org/2016/tdb#> .
> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> @prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> @prefix fuseki: <http://jena.apache.org/fuseki#> .
> @prefix text: <http://jena.apache.org/text#> .
>
> <http://jena.apache.org/2016/tdb#DatasetTDB>
>         rdfs:subClassOf ja:RDFDataset .
>
> ja:DatasetTxnMem rdfs:subClassOf ja:RDFDataset .
>
> tdb2:DatasetTDB2 rdfs:subClassOf ja:RDFDataset .
>
> tdb2:GraphTDB2 rdfs:subClassOf ja:Model .
>
> <http://jena.apache.org/2016/tdb#GraphTDB2>
>         rdfs:subClassOf ja:Model .
>
> ja:MemoryDataset rdfs:subClassOf ja:RDFDataset .
>
> ja:RDFDatasetZero rdfs:subClassOf ja:RDFDataset .
>
> <http://jena.apache.org/text#TextDataset>
>         rdfs:subClassOf ja:RDFDataset .
>
> :service_tdb_all a fuseki:Service ;
>         rdfs:label "TDB biology" ;
>         fuseki:dataset :tdb_dataset_readwrite ;
>         fuseki:name "biology" ;
>         fuseki:serviceQuery "query" , "" , "sparql" ;
>         fuseki:serviceReadGraphStore "get" ;
>         fuseki:serviceReadQuads "" ;
>         fuseki:serviceReadWriteGraphStore
>                 "data" ;
>         fuseki:serviceReadWriteQuads "" ;
>         fuseki:serviceUpdate "" , "update" ;
>         fuseki:serviceUpload "upload" .
>
> :tdb_dataset_readwrite
>         a tdb2:DatasetTDB2 ;
>         tdb2:location "db" .
>
> <http://jena.apache.org/2016/tdb#GraphTDB>
>         rdfs:subClassOf ja:Model .
>
> ja:RDFDatasetOne rdfs:subClassOf ja:RDFDataset .
>
> ja:RDFDatasetSink rdfs:subClassOf ja:RDFDataset .
>
> <http://jena.apache.org/2016/tdb#DatasetTDB2>
>         rdfs:subClassOf ja:RDFDataset .
>
> <#dataset> rdf:type tdb2:DatasetTDB2 ;
>     tdb2:location "db" ; #path to TDB;
>     .
>
> # Text index description
> :text_dataset rdf:type text:TextDataset ;
>     text:dataset <#dataset> ; # <-- replace `:my_dataset` with the desired URI
>     text:index <#indexLucene> ;
>     .
>
> <#indexLucene> a text:TextIndexLucene ;
>     text:directory <file:data/luceneIndexing> ;
>     text:entityMap <#entMap> ;
>     .
>
> <#entMap> a text:EntityMap ;
>     text:defaultField "text" ;
>     text:entityField "uri" ;
>     text:map (
>         #RDF label abstracts
>         [ text:field "text" ;
>           text:predicate <http://www.w3.org/1999/02/22-rdf-syntax-ns#label> ;
>           text:analyzer [
>               a text:StandardAnalyzer
>           ]
>         ]
>         [ text:field "text" ;
>           text:predicate <http://www.w3.org/1999/02/22-rdf-syntax-ns#synonym> ;
>           text:analyzer [
>               a text:StandardAnalyzer
>           ]
>         ]
>     ) .
>
>
>
> <#service_text_tdb> rdf:type fuseki:Service ;
>     fuseki:name "ds" ;
>     fuseki:serviceQuery "query" ;
>     fuseki:serviceQuery "sparql" ;
>     fuseki:serviceUpdate "update" ;
>     fuseki:serviceUpload "upload" ;
>     fuseki:serviceReadGraphStore "get" ;
>     fuseki:serviceReadWriteGraphStore "data" ;
>     fuseki:dataset :text_dataset ;
>     .
>
> Thank you so much in advance,
>
> __________________________
> Zhenya Antić, PhD
> Natural Language Processing
> https://www.linkedin.com/in/zhenya-antic/
>
> Practical Linguistics Inc
> http://www.practicallinguistics.com
>
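
One guess about the "0 properties indexed" message, untested on my side: as far as I know the text index only picks up literal objects, and the results you pasted show only IRIs as objects. Once the labels are loaded as literals, a text-index query could look roughly like the sketch below. It assumes three things that are not true of the config as posted: the labels are stored as literals under rdfs:label, the entity map lists rdfs:label as a text:predicate, and the query is sent to the "ds" service (the one backed by :text_dataset) rather than to "biology".

```sparql
PREFIX text: <http://jena.apache.org/text#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Assumption: rdfs:label is mapped in the text:EntityMap and the labels are literals.
SELECT ?s ?label WHERE {
  # Lucene query syntax; "biol*" is a prefix match against the indexed label text.
  ?s text:query (rdfs:label "biol*") .
  # Fetch the label itself from the RDF data.
  ?s rdfs:label ?label
}
```

The search term is written in lower case because the StandardAnalyzer lower-cases the indexed text.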