Code Ferret created JENA-1459:
---------------------------------

             Summary: add highlighting support to jena-text
                 Key: JENA-1459
                 URL: https://issues.apache.org/jira/browse/JENA-1459
             Project: Apache Jena
          Issue Type: Improvement
          Components: Jena, Text
    Affects Versions: Jena 3.6.0
            Reporter: Code Ferret
            Assignee: Code Ferret


This issue proposes an improvement to jena-text to include optional 
highlighting of results via:

{{org.apache.lucene.search.highlight.Highlighter}}

and 

{{org.apache.lucene.search.highlight.SimpleHTMLFormatter}}

The improvement adds an optional input argument to {{TextQueryPF}} that 
signals that highlighting should be performed on the Lucene search results. 
The same argument can optionally specify the _start_ and _end_ character 
sequences that delimit a highlighted term, the maximum number of fragments to 
highlight, and a fragment separator.

The highlighted results are bound to the {{?literal}} output argument of  
{{TextQueryPF}}.

The change adds only a simple extraction of the _highlight_ option string and 
a single test for its presence, so the impact is minimal when highlighting is 
not used. The _highlight_ option string is passed directly to 
{{TextIndex.query(...)}} and so can be used from code other than 
{{TextQueryPF}}.
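
As a rough illustration of the extraction and presence test described above, the following sketch pulls a {{highlight:}} option out of a list of string arguments. It is not the jena-text implementation: the class {{HighlightArg}} and its method are hypothetical names, and the rule that the option is recognised by its {{highlight:}} prefix is an assumption based on the examples in this issue.

```java
import java.util.List;

// Hypothetical helper for illustration only; not part of jena-text.
final class HighlightArg {
    // Assumed marker for the option, taken from the examples in this issue.
    static final String PREFIX = "highlight:";

    // Returns the text following "highlight:" (possibly empty), or null
    // when none of the arguments carries the option.
    static String extract(List<String> args) {
        for (String arg : args) {
            if (arg.startsWith(PREFIX)) {
                return arg.substring(PREFIX.length()).trim();
            }
        }
        return null;
    }
}
```

With this shape, the caller needs only one {{null}} check to decide whether any highlighting work should be done at all. An argument of just {{"highlight:"}} yields an empty string, meaning highlighting with defaults.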

The simplest use of highlighting is:
{code}
select ?s ?lit
where {
  (?s ?sc ?lit) text:query (skos:prefLabel "one" 100 "lang:en" "highlight:") .
}
{code}
which will produce results such as:
{code}
"another ↦one↤ abc"@en
{code}
The right-arrow (\u21a6) and left-arrow (\u21a4) are the default _start_ and 
_end_ highlighting character sequences, chosen because they are very unlikely 
to occur in literals. They can be changed via {{"s:"}} and {{"e:"}} in the 
highlight options, for example:
{code}
select ?s ?lit
where {
  (?s ?sc ?lit) text:query (skos:prefLabel "one" 100 "lang:en" "highlight: s:<em class='hilite'> | e:</em>") .
}
{code}
which will produce results such as:
{code}
"another <em class='hilite'>one</em> abc"@en
{code}
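
To make the option syntax concrete, here is a minimal sketch of how a highlight option string of the form shown above might be split into its _start_ and _end_ sequences. It is an assumption-laden illustration, not the code in the proposed PR: the class {{HighlightOptions}} is hypothetical, the {{|}} separator and the {{s:}} and {{e:}} keys are inferred from the example, and trimming whitespace around each part is a guess. Keys for the other options are not named in this issue, so the sketch ignores anything it does not recognise.

```java
// Hypothetical parser for illustration only; not part of jena-text.
final class HighlightOptions {
    // Defaults stated in this issue: right-arrow and left-arrow.
    static final String DEFAULT_START = "\u21a6";
    static final String DEFAULT_END = "\u21a4";

    final String start;
    final String end;

    HighlightOptions(String start, String end) {
        this.start = start;
        this.end = end;
    }

    // Parses e.g. "highlight: s:<em class='hilite'> | e:</em>".
    static HighlightOptions parse(String option) {
        String body = option.trim();
        if (body.startsWith("highlight:")) {
            body = body.substring("highlight:".length());
        }
        String start = DEFAULT_START;
        String end = DEFAULT_END;
        // Assumed syntax: parts separated by '|', each of the form key:value.
        for (String part : body.split("\\|")) {
            String p = part.trim();
            if (p.startsWith("s:")) {
                start = p.substring(2);
            } else if (p.startsWith("e:")) {
                end = p.substring(2);
            }
            // Any other key is ignored in this sketch.
        }
        return new HighlightOptions(start, end);
    }
}
```

Under these assumptions, {{HighlightOptions.parse("highlight:")}} keeps the arrow defaults, while the option string from the second query yields {{<em class='hilite'>}} and {{</em>}}.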

Coding of this improvement is complete and a PR can be issued if there is 
agreement that this improvement should be included in jena-text.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)