It's too bad that the * wildcard feature, and other details of the SPARQL to Lucene query translation, are not documented on the Jena text search page.

Anyway, it works for my use case: I now have on my laptop a (kind of) replacement for the dbPedia lookup service.

To experiment with the original dbPedia lookup service, you can go to the semantic_forms sandbox:
http://163.172.179.125:9111/create?uri=&uri=http%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2FPerson
and type a few letters in the dct:subject field.

I don't need the full original literal value, because the URI results of the query are labelled in the application: a foaf:Person is labelled by given and family names, etc.

BUT, there is a "but": the results of the dbPedia lookup service are appropriately ordered by "notoriety". Instead, with
http://localhost:9000/lookup?q=*Pari*
on my TDB that mirrors dbPedia, I currently get:

<ArrayOfResult>
  <Result>
    <Label>Université Pierre-et-Marie-Curie</Label>
    <URI>http://dbpedia.org/resource/Pierre_and_Marie_Curie_University</URI>
  </Result>
  <Result>
    <Label>Guillaume Le Gentil</Label>
    <URI>http://dbpedia.org/resource/Guillaume_Le_Gentil</URI>
  </Result>
  <Result>
    <Label>1 E1 m</Label>
    <URI>http://dbpedia.org/resource/1_decametre</URI>
  </Result>
  <Result>
    <Label>1 E4 m</Label>
    <URI>http://dbpedia.org/resource/1_myriametre</URI>
  </Result>
  <Result>
    <Label>Nadia Boulanger</Label>
    <URI>http://dbpedia.org/resource/Nadia_Boulanger</URI>
  </Result>
  <Result>
    <Label>Luis Mariano</Label>
    <URI>http://dbpedia.org/resource/Luis_Mariano</URI>
  </Result>
  <Result>
    <Label>Paul Chemetov</Label>
    <URI>http://dbpedia.org/resource/Paul_Chemetov</URI>
  </Result>
  <Result>
    <Label>Marc Boegner</Label>
    <URI>http://dbpedia.org/resource/Marc_Boegner</URI>
  </Result>
  <Result>
    <Label>Cassandre (graphiste)</Label>
    <URI>http://dbpedia.org/resource/Cassandre_(artist)</URI>
  </Result>
  <Result>
    <Label>La Norville</Label>
    <URI>http://dbpedia.org/resource/La_Norville</URI>
  </Result>
</ArrayOfResult>

My understanding is that I need to set a weight on URIs in Lucene to reflect their "notoriety". I see 2 ways:

1. easy to implement: just count the triples from and to the URI
2. also take into account the URIs consulted by users in my application (but currently I don't record that information); there is also the issue of combining weights 1) and 2)

Google search does both weightings.

So, in the short term I have to figure out how to add weights to the Lucene - Jena index. Then I have to read what dbPedia lookup does, and other background material.

2016-10-31 16:42 GMT+01:00 Osma Suominen <[email protected]>:

> Hi Jean-Marc,
>
> Depending on what exactly you want from such a service, this may be
> already possible with jena-text.
>
> I'm assuming that you want to perform a prefix search such as "édu*" and
> get possible completions for that, such as "éducation".
>
> You can of course already do a prefix search with jena-text. What you will
> get back will be the RDF resources which have labels that contain this
> prefix. If the text index is configured to store literal values, you can
> ask for the actual values as well.
>
> E.g. with this data:
>
> ex:cse rdfs:label "Conseil supérieur de l'éducation"@fr .
>
> and a suitably configured jena-text index, you can perform this query:
>
> (?s ?score ?literal) text:query (rdfs:label "édu*") .
>
> and get back these bindings:
>
> ?s=ex:cse ?literal="Conseil supérieur de l'éducation"@fr
>
> However, you will get the full original literal value, not just the
> individual word that matched ("éducation"). If you want just the matched
> word, you will need special support that jena-text doesn't currently have.
>
> -Osma
>
> On 17/10/16 11:37, Jean-Marc Vanel wrote:
>
>> Hi
>>
>> I'm implementing an equivalent of dbPedia lookup service [1] in
>> semantic_forms, leveraging on Lucene integration in TDB, and dbPedia
>> mirror with TDB [2] .
>>
>> The dbPedia lookup service is really nice but:
>>
>> - the hosted service is often down
>> - completion is in english only
>>
>> A lookup service with TDB and Lucene would overcome these 2 problems.
>>
>> So I would need completion with Lucene from SPARQL.
>> According to Jena doc., this does not seems to be implemented:
>> https://jena.apache.org/documentation/query/text-query.html#query-with-sparql
>>
>> There are plenty of pages when searching for
>> lucene completion
>>
>> From these pages there is a code snippet here
>> http://stackoverflow.com/questions/120180/how-to-do-query-auto-completion-suggestions-in-lucene
>> but a regular Lucene API may exist.
>>
>> [1] https://github.com/dbpedia/lookup
>> [2]
>> https://github.com/jmvanel/semantic_forms/blob/master/doc/en/administration.md#populating-with-dbpedia-mirroring-dbpedia
>>
>
> --
> Osma Suominen
> D.Sc. (Tech), Information Systems Specialist
> National Library of Finland
> P.O. Box 26 (Kaikukatu 4)
> 00014 HELSINGIN YLIOPISTO
> Tel. +358 50 3199529
> [email protected]
> http://www.nationallibrary.fi
>

--
Jean-Marc Vanel
Profile: http://163.172.179.125:9111/display?displayuri=http%3A%2F%2Fjmvanel.free.fr%2Fjmv.rdf%23me
Déductions SARL - Consulting, services, training,
Rule-based programming, Semantic Web
+33 (0)6 89 16 29 52
Twitter: @jmvanel , @jmvanel_fr ; chat: irc://irc.freenode.net#eulergui
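
PS: to make the first weighting idea (counting the triples from and to each URI) concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions, not what semantic_forms or dbPedia lookup actually do. The function names, the example.org URIs and the convention that any string starting with http:// or https:// is a URI are all hypothetical. It also does not touch Lucene or jena-text at all: it just re-orders a list of hit URIs after the text query.

```python
from collections import Counter

def is_uri(term):
    # Assumption for this sketch: URIs are plain strings starting with
    # http:// or https://; anything else is treated as a literal.
    return isinstance(term, str) and term.startswith(("http://", "https://"))

def notoriety_scores(triples):
    """Count, for each URI, the triples in which it is subject or URI object."""
    scores = Counter()
    for subject, _predicate, obj in triples:
        scores[subject] += 1          # outgoing triple
        if is_uri(obj):
            scores[obj] += 1          # incoming triple (literals are skipped)
    return scores

def rank_hits(hits, scores):
    """Order hit URIs by decreasing score; unknown URIs count as 0."""
    return sorted(hits, key=lambda uri: scores.get(uri, 0), reverse=True)

# Hypothetical toy data, not taken from dbPedia.
EX = "http://example.org/"
EXAMPLE_TRIPLES = [
    (EX + "paris", EX + "label", "Paris"),
    (EX + "upmc", EX + "locatedIn", EX + "paris"),
    (EX + "boulanger", EX + "birthPlace", EX + "paris"),
    (EX + "upmc", EX + "label", "Université Pierre-et-Marie-Curie"),
]

scores = notoriety_scores(EXAMPLE_TRIPLES)
ranked = rank_hits([EX + "boulanger", EX + "upmc", EX + "paris"], scores)
print(ranked)
```

On a full dbPedia mirror the counts would more likely be computed once, offline, and stored next to the index, rather than recomputed per query; combining them with usage-based weights (idea 2) is left open here.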
