I modified the test according to your request:
https://github.com/jmvanel/semantic_forms/blob/master/scala/forms/src/main/scala/deductions/runtime/jena/lucene/TestTextIndex2.scala
and here is the result:

[info] Doc: 0
[info] 1 stored,indexed,tokenized,indexOptions=DOCS<uri:test:/test1>
[info] uri = test:/test1
[info] 2 stored,indexed,tokenized,omitNorms,indexOptions=DOCS<uid:19d3a93327bdf0b91b03170dceb6e012423dece6a8a9e0ec48f098e8f742a5f6>
[info] uid = 19d3a93327bdf0b91b03170dceb6e012423dece6a8a9e0ec48f098e8f742a5f6
[info] 3 stored,indexed,tokenized,omitNorms,indexOptions=DOCS<uid:59074d3e3a183c6fd25f3ee84b2603dcbd9de496fbaca72d4f42093bca3ad169>
[info] uid = 59074d3e3a183c6fd25f3ee84b2603dcbd9de496fbaca72d4f42093bca3ad169
[info] search "test1"
[info] Doc: 0
[info] 1 stored,indexed,tokenized,indexOptions=DOCS<uri:test:/test1>
[info] uri = test:/test1
[info] 2 stored,indexed,tokenized,omitNorms,indexOptions=DOCS<uid:19d3a93327bdf0b91b03170dceb6e012423dece6a8a9e0ec48f098e8f742a5f6>
[info] uid = 19d3a93327bdf0b91b03170dceb6e012423dece6a8a9e0ec48f098e8f742a5f6
[info] 3 stored,indexed,tokenized,omitNorms,indexOptions=DOCS<uid:59074d3e3a183c6fd25f3ee84b2603dcbd9de496fbaca72d4f42093bca3ad169>
[info] uid = 59074d3e3a183c6fd25f3ee84b2603dcbd9de496fbaca72d4f42093bca3ad169
[info] sparql Query
[info] PREFIX text: <http://jena.apache.org/text#>
[info] SELECT * WHERE {
[info]   graph ?g {
[info]     ?thing ?p ?o .
[info]   }
[info] }
[info]
[info] --------------------------------------------------------------------------------------------------------------
[info] | thing         | p                                            | o                 | g                       |
[info] ==============================================================================================================
[info] | <test:/test1> | <http://www.w3.org/2000/01/rdf-schema#label> | "test-extra-data" | <test:/test-extra-data> |
[info] | <test:/test1> | <http://www.w3.org/2000/01/rdf-schema#label> | "test1"           | <test:/test1>           |
[info] | <test:/test1> | <http://xmlns.com/foaf/0.1/givenName>        | "test1"           | <test:/test1>           |
[info] --------------------------------------------------------------------------------------------------------------
[info] sparql Query
[info] PREFIX text: <http://jena.apache.org/text#>
[info] SELECT * WHERE {
[info]   graph ?g {
[info]     ?thing text:query 'test1' .
[info]     ?thing ?p ?o .
[info]   }
[info] }
[info]
[info] ---------------------
[info] | thing | p | o | g |
[info] =====================
[info] ---------------------
[info] tdb.tdbdump (after dataset.close() )
[info] <test:/test1> <http://www.w3.org/2000/01/rdf-schema#label> "test-extra-data" <test:/test-extra-data> .
[info] <test:/test1> <http://www.w3.org/2000/01/rdf-schema#label> "test1" <test:/test1> .
[info] <test:/test1> <http://xmlns.com/foaf/0.1/givenName> "test1" <test:/test1> .
[success] Total time: 4 s, completed 30 juil. 2017 10:39:11

I can help you compile and run the test in Scala, or even translate it into Java, or provide any other help :)

2017-07-29 19:04 GMT+02:00 Andy Seaborne <a...@apache.org>:

>
> On 29/07/17 09:54, Jean-Marc Vanel wrote:
>
>> The self-contained test with no semantic_forms nor Banana dependency, that
>> reproduces the scenario by the API:
>> https://github.com/jmvanel/semantic_forms/blob/master/scala/forms/src/main/scala/deductions/runtime/jena/lucene/TestTextIndex2.scala
>>
>> now FAILS!
>>
>> Jena problem:
>> when adding first a named graph with no relevant data,
>> and second the named graph with relevant data,
>> the SPARQL query with text:query FAILS.
>>
>> It looks as if only the first named graph is used in SPARQL processing.
>>
>
> Could you please try a query that focuses on that point:
>
> SELECT * { GRAPH ?g { ?s ?p ?o } }
>
> Andy
>
>> It is a regression after the migration to a recent Lucene version.
>> It looks as if nobody tested Jena + Lucene with several named graphs ...
>>
>> 2017-07-28 14:50 GMT+02:00 Jean-Marc Vanel <jeanmarc.va...@gmail.com>:
>>
>>> Forgot to say that I'm using Jena 3.3.0 on Ubuntu 17.04, and
>>> java -version
>>> java version "1.8.0_121"
>>> Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
>>> Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
>>>
>>> The semantic_forms sandbox is up-to-date with the source code and the
>>> scenario above:
>>> http://semantic-forms.cc:9111/
>>>
>>> 2017-07-28 13:14 GMT+02:00 Jean-Marc Vanel <jeanmarc.va...@gmail.com>:
>>>
>>>> Hi
>>>>
>>>> I've checked lots of things for 2 days.
>>>>
>>>> I have this scenario in semantic_forms:
>>>>
>>>> - on fresh TDB and LUCENE directories
>>>> - load rdfs: (the ontology)
>>>> - create instance of class bli:bli (sic !)
>>>> - enter rdfs:comment bli
>>>> - search bli => NOTHING !!! :(
>>>>
>>>> I wrote a self-contained test with no semantic_forms nor Banana
>>>> dependency, that reproduces the same scenario by the API:
>>>> https://github.com/jmvanel/semantic_forms/blob/master/scala/forms/src/main/scala/deductions/runtime/jena/lucene/TestTextIndex2.scala
>>>>
>>>> But it succeeds !!!
>>>>
>>>> So I wrote another test that runs on the TDB that was prepared in the
>>>> above scenario in semantic_forms:
>>>> https://github.com/jmvanel/semantic_forms/blob/master/scala/forms/src/main/scala/deductions/runtime/jena/lucene/QueryTextIndex.scala
>>>>
>>>> The indexing seems normal on Lucene + Jena side, but NOT the SPARQL
>>>> search with text:query .
>>>>
>>>> runMain deductions.runtime.jena.lucene.QueryTextIndex bli TDB
>>>> ...
>>>> [info] search with Lucene: bli
>>>> [info] Doc: 30
>>>> [info] 1 stored,indexed,tokenized,indexOptions=DOCS<uri:http://localhost:9000/ldp/1501237821055-8217451390491>
>>>> [info] uri = http://localhost:9000/ldp/1501237821055-8217451390491
>>>> [info] 2 stored,indexed,tokenized,omitNorms,indexOptions=DOCS<lang:fr>
>>>> [info] lang = fr
>>>> [info] 3 stored,indexed,tokenized,omitNorms,indexOptions=DOCS<uid:f1e70540a1cd751b78e29b31b4ae57c5520b71a728f8e1c7b24c698e8cd85e83>
>>>> [info] uid = f1e70540a1cd751b78e29b31b4ae57c5520b71a728f8e1c7b24c698e8cd85e83
>>>> [info] 4 stored,indexed,tokenized,omitNorms,indexOptions=DOCS<lang:fr>
>>>> [info] lang = fr
>>>> [info] 5 stored,indexed,tokenized,omitNorms,indexOptions=DOCS<uid:435b1578a796765c441ad43a9147e1952abbc44facfa5aebab3d6cb67e98f844>
>>>> [info] uid = 435b1578a796765c441ad43a9147e1952abbc44facfa5aebab3d6cb67e98f844
>>>> [info] query
>>>> [info] PREFIX text: <http://jena.apache.org/text#>
>>>> [info] PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>>>> [info] SELECT * WHERE {
>>>> [info]   graph ?g {
>>>> [info]     # ?thing text:query (rdfs:label "bli" ) .
>>>> [info]     ?thing text:query 'bli' .
>>>> [info]     ?thing ?p ?o .
>>>> [info]   }
>>>> [info] } LIMIT 22
>>>> [info]
>>>> [info] ---------------------
>>>> [info] | thing | p | o | g |
>>>> [info] =====================
>>>> [info] ---------------------
>>>>
>>>> The URI in Lucene dump is correct. I'm surprised that field "lang"
>>>> appears 2 times, and "graph" not at all .
>>>>
>>>> I've looked in the Jena code, and the member "fields" in EntityDefinition
>>>> https://github.com/apache/jena/blob/master/jena-text/src/main/java/org/apache/jena/query/text/EntityDefinition.java#L39
>>>> looks as if it is not always updated.
>>>> "fields" is initialized once from fieldToPredicate, and I'm not sure that
>>>> fieldToPredicate is initialized before;
>>>> moreover it is modified by method
>>>> void set(String field, Node predicate)
>>>> https://github.com/apache/jena/blob/master/jena-text/src/main/java/org/apache/jena/query/text/EntityDefinition.java#L126
>>>>
>>>> --
>>>> Jean-Marc Vanel
>>>> http://www.semantic-forms.cc:9111/display?displayuri=http://jmvanel.free.fr/jmv.rdf%23me
>>>> Déductions SARL - Consulting, services, training,
>>>> Rule-based programming, Semantic Web
>>>> +33 (0)6 89 16 29 52
>>>> Twitter: @jmvanel , @jmvanel_fr ; chat: irc://irc.freenode.net#eulergui
>>>>
>>>
>>

--
Jean-Marc Vanel
http://www.semantic-forms.cc:9111/display?displayuri=http://jmvanel.free.fr/jmv.rdf%23me
Déductions SARL - Consulting, services, training,
Rule-based programming, Semantic Web
+33 (0)6 89 16 29 52
Twitter: @jmvanel , @jmvanel_fr ; chat: irc://irc.freenode.net#eulergui