Code Ferret created JENA-1459:
---------------------------------

Summary: add highlighting support to jena-text
Key: JENA-1459
URL: https://issues.apache.org/jira/browse/JENA-1459
Project: Apache Jena
Issue Type: Improvement
Components: Jena, Text
Affects Versions: Jena 3.6.0
Reporter: Code Ferret
Assignee: Code Ferret

This issue proposes an improvement to jena-text to include optional highlighting of results via:

{{org.apache.lucene.search.highlight.Highlighter}}

and

{{org.apache.lucene.search.highlight.SimpleHTMLFormatter}}

The improvement will add an optional input argument to {{TextQueryPF}} that:
* signals that highlighting should be performed on the Lucene search results;
* optionally indicates the _start_ and _end_ char sequences of a highlighted term;
* optionally indicates the maximum number of fragments to highlight; and
* optionally indicates a fragment separator.

The highlighted results are bound to the {{?literal}} output argument of {{TextQueryPF}}.

The improvement adds only a simple extraction of the _highlight_ option string and a single test for its presence, so the impact is minimal when highlighting is not used.

The _highlight_ option string is passed directly to {{TextIndex.query(...)}} and so can be used from code other than {{TextQueryPF}}.

The simplest use of highlighting is:

{code}
select ?s ?lit
where {
  (?s ?sc ?lit) text:query (skos:prefLabel "one" 100 "lang:en" "highlight:") .
}
{code}

which will produce results such as:

{code}
"another ↦one↤ abc"@en
{code}

The right-arrow (\u21a6) and left-arrow (\u21a4) are the default _start_ and _end_ highlighting character sequences. They are chosen to be very unlikely to occur in literals. They can be changed via {{"s:"}} and {{"e:"}} in the highlight options, for example:

{code}
select ?s ?lit
where {
  (?s ?sc ?lit) text:query (skos:prefLabel "one" 100 "lang:en" "highlight: s:<em class='hilite'> | e:</em>") .
}
{code}

which will produce results such as:

{code}
"another <em class='hilite'>one</em> abc"@en
{code}

Coding of this improvement is complete and a PR can be issued if there is agreement that this improvement should be included in jena-text.
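
To make the option syntax in this issue concrete, here is a minimal sketch of how a _highlight_ option string might be split into its _start_ and _end_ sequences. It is not the jena-text implementation. The class {{HighlightOptions}} and its methods {{parse}} and {{mark}} are hypothetical names, the handling of the {{|}} separator is inferred only from the example above, and the regex-based wrapping is a naive stand-in for what Lucene's {{Highlighter}} and {{SimpleHTMLFormatter}} would do on real search results.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical sketch, NOT the jena-text code: parses a "highlight:" option
 * string and wraps matched terms with the chosen start/end sequences.
 */
final class HighlightOptions {
    static final String PREFIX = "highlight:";
    static final String DEFAULT_START = "\u21a6"; // default start sequence from the issue
    static final String DEFAULT_END = "\u21a4";   // default end sequence from the issue

    final String start;
    final String end;

    private HighlightOptions(String start, String end) {
        this.start = start;
        this.end = end;
    }

    /** Returns null when the string is not a highlight option (the single presence test). */
    static HighlightOptions parse(String option) {
        if (option == null || !option.startsWith(PREFIX)) {
            return null;
        }
        String start = DEFAULT_START;
        String end = DEFAULT_END;
        // Assumption: settings are separated by '|' and surrounding whitespace is ignored.
        for (String part : option.substring(PREFIX.length()).split("\\|")) {
            String p = part.trim();
            if (p.startsWith("s:")) {
                start = p.substring(2);
            } else if (p.startsWith("e:")) {
                end = p.substring(2);
            }
        }
        return new HighlightOptions(start, end);
    }

    /** Naive stand-in for Lucene highlighting: wraps whole-word, case-insensitive matches. */
    String mark(String text, String term) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(term) + "\\b",
                                    Pattern.CASE_INSENSITIVE);
        String replacement = Matcher.quoteReplacement(start) + "$0"
                           + Matcher.quoteReplacement(end);
        return p.matcher(text).replaceAll(replacement);
    }

    public static void main(String[] args) {
        HighlightOptions defaults = parse("highlight:");
        System.out.println(defaults.mark("another one abc", "one"));

        HighlightOptions html = parse("highlight: s:<em class='hilite'> | e:</em>");
        System.out.println(html.mark("another one abc", "one"));
    }
}
```

With the two option strings from the queries above, this sketch yields the same highlighted literals shown in the issue. Because it trims each setting and splits on {{|}}, it cannot express start or end sequences that contain {{|}} or leading and trailing whitespace; a real implementation may make different choices.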