Osma,

That makes sense, and the first tests are not bad.
I'm surprised, though, that "par*" does not return dbpedia:Paris in the first 10 results, whereas "pari*" does return dbpedia:Paris in first position:

"count" "s"
"3090"^^http://www.w3.org/2001/XMLSchema#integer http://dbpedia.org/resource/Paris
"2676"^^http://www.w3.org/2001/XMLSchema#integer http://dbpedia.org/resource/London
"72"^^http://www.w3.org/2001/XMLSchema#integer http://dbpedia.org/resource/Émile_Durkheim
"68"^^http://www.w3.org/2001/XMLSchema#integer http://dbpedia.org/resource/Henri_Bergson
"66"^^http://www.w3.org/2001/XMLSchema#integer http://dbpedia.org/resource/20th_arrondissement_of_Paris
"64"^^http://www.w3.org/2001/XMLSchema#integer http://dbpedia.org/resource/Cornelius_Castoriadis
"64"^^http://www.w3.org/2001/XMLSchema#integer http://dbpedia.org/resource/Jacques_Derrida
"63"^^http://www.w3.org/2001/XMLSchema#integer http://dbpedia.org/resource/Michel_Foucault
"62"^^http://www.w3.org/2001/XMLSchema#integer http://dbpedia.org/resource/Louis,_Grand_Condé
"60"^^http://www.w3.org/2001/XMLSchema#integer http://dbpedia.org/resource/Jean-Jacques_Rousseau

I'll add that SPARQL query to my sandbox as a replacement for the DBpedia lookup service, and tell you how it goes.
I foresee, however, that using the Lucene implementation once the weights are added will be more efficient. But that demands more work...

2016-11-03 14:30 GMT+01:00 Osma Suominen <osma.suomi...@helsinki.fi>:

> Hi Jean-Marc!
>
>> AFAIK using the weights to order results is intimately linked to the text
>> index querying.
>> If I want the top 10 results, the search must have the weights beforehand
>> otherwise I must get all the results to filter later.
>> This is the reason for using AnalyzingInfixSuggester.
>> Lucene 4_9_1
>> https://lucene.apache.org/core/4_9_1/suggest/org/apache/lucene/search/suggest/analyzing/AnalyzingInfixSuggester.html
>> Lucene 6_2_1
>> https://lucene.apache.org/core/6_2_1/suggest/org/apache/lucene/search/suggest/analyzing/AnalyzingInfixSuggester.html
>>
>> I guess this is what you call "performance reasons" .
>>
>
> I don't see why you couldn't, in principle, do something like this:
>
> SELECT ?s (COUNT(*) as ?count)
> WHERE {
>   ?s text:query "édu*" .
>   ?s ?p ?o .
> }
> GROUP BY ?s
> ORDER BY DESC(?count)
> LIMIT 10
>
> (note: untested query)
>
> I'm sure it will get slow if the number of hits from the text index is
> more than a few dozen. But for a small number of results at a time, it
> might work.
>
>> As I wrote in the original post, "I'll have to implement also the callback
>> for updates
>> like class TextDocProducerTriples in Jena-text." .
>> http://jena.apache.org/documentation/javadoc/text/org/apache/jena/query/text/TextDocProducerTriples.html
>>
>
> Isn't that called only when the indexed triple changes (e.g. the one with
> rdfs:label or skos:prefLabel or whatever property you are indexing), but
> not when other data related to the same subject changes? So if new triples
> are added for the same subject, but its label is unchanged, then the text
> index won't see the update and thus the count of references/triples won't
> be updated either.
>
> I may be wrong here, I'm not sure how the update tracking works.
>
> -Osma
>
> --
> Osma Suominen
> D.Sc. (Tech), Information Systems Specialist
> National Library of Finland
> P.O. Box 26 (Kaikukatu 4)
> 00014 HELSINGIN YLIOPISTO
> Tel. +358 50 3199529
> osma.suomi...@helsinki.fi
> http://www.nationallibrary.fi
>

--
Jean-Marc Vanel
Profil: http://163.172.179.125:9111/display?displayuri=http%3A%2F%2Fjmvanel.free.fr%2Fjmv.rdf%23me
Déductions SARL - Consulting, services, training,
Rule-based programming, Semantic Web
+33 (0)6 89 16 29 52
Twitter: @jmvanel , @jmvanel_fr ; chat: irc://irc.freenode.net#eulergui