Osma Suominen created JENA-1093:
-----------------------------------

             Summary: jena-text query doesn't return all matching literals
                 Key: JENA-1093
                 URL: https://issues.apache.org/jira/browse/JENA-1093
             Project: Apache Jena
          Issue Type: Bug
          Components: Text
    Affects Versions: Jena 3.0.1
            Reporter: Osma Suominen
            Assignee: Osma Suominen


After the optimizations in JENA-999, the text:query property function, when 
asked for stored literal values, no longer returns all matching literals. 
Instead, each subject is returned only once, with an arbitrary TextHit (i.e. 
score+literal pair). This is a problem for me because I want to show the user 
the most relevant reason why the search matched a particular SKOS concept 
(there may be many matching labels in various languages), or in some cases all 
the reasons. 

Also, the returned match may not be the one with the highest score, which could 
be a problem for anyone interested in the score (I'm not).

For example, with storeLiterals enabled and this data:

{noformat}
ex:subject rdfs:label "one reason", "another reason" .
{noformat}

this query

{noformat}
(?s ?score ?literal) text:query "reason" .
{noformat}

will return a single binding where ?literal is bound to either "one reason" or 
"another reason".

Before JENA-999 it returned two bindings, one per literal.

The culprit is the post-JENA-999 code in the TextQueryPF.exec method, 
particularly around this line that suppresses subsequent hits with the same 
subject URI:
https://github.com/apache/jena/blob/master/jena-text/src/main/java/org/apache/jena/query/text/TextQueryPF.java#L188
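
The effect of suppressing hits by subject can be illustrated with a minimal, 
self-contained sketch. This is not the actual TextQueryPF code: the Hit class, 
the HitFilterSketch class, both method names and the score values are 
hypothetical and exist only for illustration. The first method mimics 
deduplication by subject alone; the second keeps one hit per (subject, literal) 
pair, which is roughly the behaviour I would like to get back.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class HitFilterSketch {

    // Hypothetical stand-in for an index hit; NOT the jena-text TextHit class.
    static final class Hit {
        final String subject;
        final float score;
        final String literal;

        Hit(String subject, float score, String literal) {
            this.subject = subject;
            this.score = score;
            this.literal = literal;
        }
    }

    // Keeps only the first hit seen for each subject, dropping later hits
    // with the same subject regardless of their literal or score.
    static List<Hit> firstHitPerSubject(List<Hit> hits) {
        Set<String> seenSubjects = new HashSet<>();
        List<Hit> kept = new ArrayList<>();
        for (Hit hit : hits) {
            if (seenSubjects.add(hit.subject)) {
                kept.add(hit);
            }
        }
        return kept;
    }

    // Keeps one hit per (subject, literal) pair, so a subject with several
    // matching literals yields several results.
    static List<Hit> oneHitPerLiteral(List<Hit> hits) {
        Set<List<String>> seenPairs = new HashSet<>();
        List<Hit> kept = new ArrayList<>();
        for (Hit hit : hits) {
            if (seenPairs.add(List.of(hit.subject, hit.literal))) {
                kept.add(hit);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Scores are made-up placeholders, not real Lucene scores.
        List<Hit> hits = List.of(
                new Hit("ex:subject", 1.0f, "one reason"),
                new Hit("ex:subject", 1.0f, "another reason"));
        System.out.println(firstHitPerSubject(hits).size()); // prints 1
        System.out.println(oneHitPerLiteral(hits).size());   // prints 2
    }
}
```

With the example data from this report, the subject-only filter keeps whichever 
hit happens to come first, so the literal that survives depends on the order of 
the hits rather than on relevance.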

I already have a failing unit test that shows what I'd like to accomplish. I 
will try to make a PR at some point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
