Highlighter component should expose snippet character offsets and the score.
----------------------------------------------------------------------------
Key: SOLR-1954
URL: https://issues.apache.org/jira/browse/SOLR-1954
Project: Solr
Issue Type: New Feature
Components: highlighter
Reporter: David Smiley
Priority: Minor
The Highlighter Component currently exposes neither the snippet character
offsets nor the score. There is a TODO in DefaultSolrHighlighter indicating
the intention to add this eventually. This information is needed when doing
highlighting on external content. The data is already there, so it's pretty
easy to output it in some way. The challenge is deciding on the output format
and its ramifications for backwards compatibility. Unfortunately, the current
highlighter component response structure doesn't lend itself to adding any
new data, and all the highlighting tests assume this structure. I wish the
original implementer had had some foresight. Here is a snippet of the current
response structure, from searching Solr's sample data for "sdram", for
reference:
{code:xml}
<lst name="highlighting">
  <lst name="VS1GB400C3">
    <arr name="text">
      <str>CORSAIR ValueSelect 1GB 184-Pin DDR <em>SDRAM</em>
Unbuffered DDR 400 (PC 3200) System Memory - Retail</str>
    </arr>
  </lst>
</lst>
{code}
Perhaps, as a little hack, we could introduce a pseudo field such as
text_startCharOffset, whose name is the concatenation of the matching field
name and "_startCharOffset". This would be an array of ints, one per snippet.
Likewise, there would be further arrays for endCharOffset and score.
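To make the idea concrete, here is a rough sketch of what the response might
look like with the pseudo fields added. Nothing here is implemented: the names
text_endCharOffset and text_score are only my guess at how the naming would
extend, the score is assumed to be a float, and the offset and score values
are made up for illustration.
{code:xml}
<lst name="highlighting">
  <lst name="VS1GB400C3">
    <arr name="text">
      <str>CORSAIR ValueSelect 1GB 184-Pin DDR <em>SDRAM</em>
Unbuffered DDR 400 (PC 3200) System Memory - Retail</str>
    </arr>
    <!-- hypothetical pseudo fields; one entry per snippet in "text" -->
    <arr name="text_startCharOffset">
      <int>0</int>
    </arr>
    <arr name="text_endCharOffset">
      <int>93</int>
    </arr>
    <arr name="text_score">
      <float>1.0</float>
    </arr>
  </lst>
</lst>
{code}
Existing clients that only read the "text" array would keep working, since the
new arrays are siblings that line up with the snippets by position.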
Thoughts?
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.