Hi

I've spent 2 days checking lots of things.

I have this scenario in semantic_forms:

    - on fresh TDB and LUCENE directories
    - load rdfs: (the ontology)
    - create instance of class bli:bli (sic !)
    - enter rdfs:comment bli
    - search bli => NOTHING !!! :(

I wrote a self-contained test, with no semantic_forms or Banana
dependency, that reproduces the same scenario through the API:
https://github.com/jmvanel/semantic_forms/blob/master/scala/forms/src/main/scala/deductions/runtime/jena/lucene/TestTextIndex2.scala

But it succeeds!!!

So I wrote another test that runs on the TDB that was prepared in the above
scenario in semantic_forms:
https://github.com/jmvanel/semantic_forms/blob/master/scala/forms/src/main/scala/deductions/runtime/jena/lucene/QueryTextIndex.scala

The indexing looks normal on the Lucene + Jena side, but the SPARQL search
with text:query does NOT return anything.

runMain deductions.runtime.jena.lucene.QueryTextIndex bli TDB
...
[info] search with Lucene: bli
[info] Doc: 30
[info]   1 stored,indexed,tokenized,indexOptions=DOCS<uri:http://localhost:9000/ldp/1501237821055-8217451390491>
[info]   uri = http://localhost:9000/ldp/1501237821055-8217451390491
[info]   2 stored,indexed,tokenized,omitNorms,indexOptions=DOCS<lang:fr>
[info]   lang = fr
[info]   3 stored,indexed,tokenized,omitNorms,indexOptions=DOCS<uid:f1e70540a1cd751b78e29b31b4ae57c5520b71a728f8e1c7b24c698e8cd85e83>
[info]   uid = f1e70540a1cd751b78e29b31b4ae57c5520b71a728f8e1c7b24c698e8cd85e83
[info]   4 stored,indexed,tokenized,omitNorms,indexOptions=DOCS<lang:fr>
[info]   lang = fr
[info]   5 stored,indexed,tokenized,omitNorms,indexOptions=DOCS<uid:435b1578a796765c441ad43a9147e1952abbc44facfa5aebab3d6cb67e98f844>
[info]   uid = 435b1578a796765c441ad43a9147e1952abbc44facfa5aebab3d6cb67e98f844
[info] query
[info]     PREFIX text: <http://jena.apache.org/text#>
[info]     PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
[info]     SELECT * WHERE {
[info]     graph ?g {
[info]     # ?thing text:query (rdfs:label  "bli" ) .
[info]     ?thing text:query 'bli' .
[info]     ?thing ?p ?o .
[info]   }
[info] } LIMIT 22
[info]
[info] ---------------------
[info] | thing | p | o | g |
[info] =====================
[info] ---------------------

The URI in the Lucene dump is correct. I'm surprised that the field "lang"
appears twice, and that "graph" does not appear at all.

I've looked at the Jena code, and the member "fields" in EntityDefinition
https://github.com/apache/jena/blob/master/jena-text/src/main/java/org/apache/jena/query/text/EntityDefinition.java#L39
looks as if it is not always kept up to date.
"fields" is initialized once from fieldToPredicate, and I'm not sure that
fieldToPredicate is initialized before it;
moreover, fieldToPredicate is modified later by the method
void set(String field, Node predicate)
https://github.com/apache/jena/blob/master/jena-text/src/main/java/org/apache/jena/query/text/EntityDefinition.java#L126
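
To make the question about "fields" concrete, here is a minimal sketch in plain Java. It is NOT the Jena code: the class FieldViewSketch and its members liveFields and snapshotFields are hypothetical names I made up for illustration, and predicates are plain strings instead of Node. It only shows the two general cases: a collection built as a read-only view of the map's key set follows later calls to set(), while a copy taken at construction time stays stale. Java runs field initializers in declaration order, so the map must be declared before anything derived from it. Which case applies to EntityDefinition has to be checked against the linked source.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only, NOT org.apache.jena.query.text.EntityDefinition.
public class FieldViewSketch {
    // Declared first: field initializers run in declaration order,
    // so the map exists before the two members below are built from it.
    private final Map<String, String> fieldToPredicate = new LinkedHashMap<>();

    // Case 1: read-only VIEW of the key set; it reflects later put() calls.
    private final Collection<String> liveFields =
            Collections.unmodifiableCollection(fieldToPredicate.keySet());

    // Case 2: COPY taken once at construction; it never sees later put() calls.
    private final List<String> snapshotFields =
            new ArrayList<>(fieldToPredicate.keySet());

    // Same shape as set(String field, Node predicate), with a String predicate (assumption).
    public void set(String field, String predicate) {
        fieldToPredicate.put(field, predicate);
    }

    public Collection<String> liveFields() {
        return liveFields;
    }

    public List<String> snapshotFields() {
        return snapshotFields;
    }

    public static void main(String[] args) {
        FieldViewSketch def = new FieldViewSketch();
        def.set("text", "rdfs:comment");
        System.out.println("view: " + def.liveFields());
        System.out.println("copy: " + def.snapshotFields());
    }
}
```

If "fields" in EntityDefinition is built like case 1, then set() cannot leave it out of date, and the cause of the empty text:query result would have to be looked for elsewhere.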

-- 
Jean-Marc Vanel
http://www.semantic-forms.cc:9111/display?displayuri=http://jmvanel.free.fr/jmv.rdf%23me
Déductions SARL - Consulting, services, training,
Rule-based programming, Semantic Web
+33 (0)6 89 16 29 52
Twitter: @jmvanel , @jmvanel_fr ; chat: irc://irc.freenode.net#eulergui
