Hi,

welcome to the Semantic Web and Apache Jena.

Comments inline:

On 20.03.20 15:36, Zhenya Antić wrote:
> Hello,
>
> I am a beginner with Fuseki, knowledge graphs and SPARQL, so please forgive 
> me if the questions seem obvious, the learning curve for this turned out to 
> be quite steep.
No problem, nothing is simple in the beginning.
>
> I am trying to get text indexing to work with my Fuseki knowledge graph.
Which DBpedia dataset did you load? I mean, which files?
>
> For starters, I tried using a regular expression, but that didn't work:
>
> Just a plain query like this:
> SELECT DISTINCT * WHERE {
>  ?s ?p ?o
> } 
> gives 98 results such as:
>
> 1
> <http://dbpedia.org/ontology/wikiPageID:9127632>
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#label>
> <http://dbpedia.org/resource/Biology>
> 2
> <http://dbpedia.org/ontology/wikiPageID:9127632>
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#label>
> <http://dbpedia.org/resource/Biology#Branches>
> 3
> <http://dbpedia.org/ontology/wikiPageID:9127632>
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#synonym>
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#branches_of_biology>
> 4
> <http://dbpedia.org/ontology/wikiPageID:18393>
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#label>
> <http://dbpedia.org/resource/Life>
That can't be the correct output of this query. rdfs:label should have
literals as its object (?o), unless you loaded some really unusual data.
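
To see what kind of objects each predicate actually has in your store,
something like the following could help. It is only a sketch in standard
SPARQL 1.1; the variable names are mine and I have not run it against your
data:

```sparql
SELECT ?p
       (COUNT(*) AS ?triples)
       (SUM(IF(isLiteral(?o), 1, 0)) AS ?literalObjects)
WHERE {
  ?s ?p ?o
}
GROUP BY ?p
ORDER BY DESC(?triples)
```

If ?literalObjects is 0 for the label predicate, the labels were loaded as
IRIs rather than as literals.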
>
> But a query with a regular expression:
> SELECT DISTINCT * WHERE {
>  ?s ?p ?o
>  FILTER regex(?o, "Biol", "i")
> }

1. Help the query engine by using rdfs:label as the property.

2. Apply the str() function to the ?o values:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT * WHERE {
 ?s rdfs:label ?o
 FILTER regex(str(?o), "Biol", "i")
}
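
One caveat: in the results you pasted, the predicate is
<http://www.w3.org/1999/02/22-rdf-syntax-ns#label>, which is in the rdf:
namespace, not rdfs: (http://www.w3.org/2000/01/rdf-schema#). If the data
really looks like that, a pattern on rdfs:label will not match anything. A
sketch using the IRI exactly as it appears in your output (untested against
your data):

```sparql
SELECT DISTINCT ?s ?o WHERE {
  ?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#label> ?o
  # str() turns an IRI or a literal into a plain string; regex() on a bare
  # IRI raises a type error, and the FILTER then drops that row
  FILTER regex(str(?o), "Biol", "i")
}
```

That type error would also explain why your original regex query returned
0 rows when the objects are IRIs.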

> gives 0 results, although there are clearly results that contain "Biol".


I'll have to try your config; maybe others will spot the issue in the meantime.

>
> I also tried setting up indexing with a .ttl file, however the result was 
> "INFO 0 (0 per second) properties indexed". .ttl file below:
>
> @prefix : <http://base/#> .
> @prefix tdb2: <http://jena.apache.org/2016/tdb#> .
> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> @prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> @prefix fuseki: <http://jena.apache.org/fuseki#> .
> @prefix text: <http://jena.apache.org/text#> .
>
> <http://jena.apache.org/2016/tdb#DatasetTDB>
>  rdfs:subClassOf ja:RDFDataset .
>
> ja:DatasetTxnMem rdfs:subClassOf ja:RDFDataset .
>
> tdb2:DatasetTDB2 rdfs:subClassOf ja:RDFDataset .
>
> tdb2:GraphTDB2 rdfs:subClassOf ja:Model .
>
> <http://jena.apache.org/2016/tdb#GraphTDB2>
>  rdfs:subClassOf ja:Model .
>
> ja:MemoryDataset rdfs:subClassOf ja:RDFDataset .
>
> ja:RDFDatasetZero rdfs:subClassOf ja:RDFDataset .
>
> <http://jena.apache.org/text#TextDataset>
>  rdfs:subClassOf ja:RDFDataset .
>
> :service_tdb_all a fuseki:Service ;
>  rdfs:label "TDB biology" ;
>  fuseki:dataset :tdb_dataset_readwrite ;
>  fuseki:name "biology" ;
>  fuseki:serviceQuery "query" , "" , "sparql" ;
>  fuseki:serviceReadGraphStore "get" ;
>  fuseki:serviceReadQuads "" ;
>  fuseki:serviceReadWriteGraphStore
>  "data" ;
>  fuseki:serviceReadWriteQuads "" ;
>  fuseki:serviceUpdate "" , "update" ;
>  fuseki:serviceUpload "upload" .
>
> :tdb_dataset_readwrite
>  a tdb2:DatasetTDB2 ;
>  tdb2:location "db" .
>
> <http://jena.apache.org/2016/tdb#GraphTDB>
>  rdfs:subClassOf ja:Model .
>
> ja:RDFDatasetOne rdfs:subClassOf ja:RDFDataset .
>
> ja:RDFDatasetSink rdfs:subClassOf ja:RDFDataset .
>
> <http://jena.apache.org/2016/tdb#DatasetTDB2>
>  rdfs:subClassOf ja:RDFDataset .
>
> <#dataset> rdf:type tdb2:DatasetTDB2 ;
> tdb2:location "db" ; #path to TDB;
> .
>
> # Text index description
> :text_dataset rdf:type text:TextDataset ;
>  text:dataset <#dataset> ; # <-- replace `:my_dataset` with the desired URI
>  text:index <#indexLucene> ;
> .
>
> <#indexLucene> a text:TextIndexLucene ;
>  text:directory <file:data/luceneIndexing> ;
>  text:entityMap <#entMap> ;
>  .
>
> <#entMap> a text:EntityMap ;
>  text:defaultField "text" ;
>  text:entityField "uri" ;
>  text:map (
>  #RDF label abstracts
>  [ text:field "text" ;
>  text:predicate <http://www.w3.org/1999/02/22-rdf-syntax-ns#label> ;
>  text:analyzer [
>  a text:StandardAnalyzer
>  ] 
>  ]
>  [ text:field "text" ;
>  text:predicate <http://www.w3.org/1999/02/22-rdf-syntax-ns#synonym> ;
>  text:analyzer [
>  a text:StandardAnalyzer
>  ] 
>  ]
>  ) .
>
>
>
> <#service_text_tdb> rdf:type fuseki:Service ;
>  fuseki:name "ds" ;
>  fuseki:serviceQuery "query" ;
>  fuseki:serviceQuery "sparql" ;
>  fuseki:serviceUpdate "update" ;
>  fuseki:serviceUpload "upload" ;
>  fuseki:serviceReadGraphStore "get" ;
>  fuseki:serviceReadWriteGraphStore "data" ;
>  fuseki:dataset :text_dataset ;
>  .
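
For once the index is populated: a text search would look roughly like the
sketch below. Assumptions on my side: you send it to the "ds" service (the
"biology" service in this file points at the plain TDB2 dataset, not at
:text_dataset), and the predicate is the one listed in your entity map. As
far as I know, jena-text only indexes literal objects, so if all your label
objects are IRIs as in the output above, an empty index would not be
surprising.

```sparql
PREFIX text: <http://jena.apache.org/text#>

SELECT ?s ?label WHERE {
  # Lucene query against the field mapped to this predicate in <#entMap>
  ?s text:query (<http://www.w3.org/1999/02/22-rdf-syntax-ns#label> "biol*") .
  ?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#label> ?label .
}
```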
>
> Thank you so much in advance,
>
> __________________________
> Zhenya Antić, PhD
> Natural Language Processing
> https://www.linkedin.com/in/zhenya-antic/
>
> Practical Linguistics Inc
> http://www.practicallinguistics.com
>
>
>
