[jira] Commented: (SOLR-1954) Highlighter component should expose snippet character offsets and the score.

Hoss Man (JIRA) Wed, 16 Jun 2010 16:33:50 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879580#action_12879580
 ]


Hoss Man commented on SOLR-1954:
--------------------------------

If the structure is poor and makes it hard to add additional metadata that
would benefit new users, let's change it.

As long as there is an option people can turn on to force the legacy behavior,
there's nothing wrong with that.

In its simplest form, we can just add a new Highlighting Component (with a
different class name) that is registered by default as the component
"highlight", and document in CHANGES.txt that if people need/want the old one
they should modify their solrconfig.xml to register it explicitly.
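
As a rough sketch (not taken from any patch), that explicit registration in
solrconfig.xml might look something like the following.  It assumes the old
implementation keeps its current class name, solr.HighlightComponent, which
has not been decided:

{code:xml}
<!-- Hypothetical: explicitly register the legacy highlighter under the
     default component name "highlight" to keep the old response structure. -->
<searchComponent name="highlight" class="solr.HighlightComponent"/>
{code}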

Alternatively, we can keep using the existing class and modify it so that it
changes its behavior based on some init param; the previous comments about
default behavior and CHANGES.txt apply here too.

(Back compat should be *easy* on upgrade, but I'd rather tell existing users
"add this one line to your config if you really need the exact same response
structure instead of this new better structure" than tell new and existing
users "this is the really clunky hoop you have to jump through to make sense of
all this hot new data we are returning".)

> Highlighter component should expose snippet character offsets and the score.
> ----------------------------------------------------------------------------
>
>                 Key: SOLR-1954
>                 URL: https://issues.apache.org/jira/browse/SOLR-1954
>             Project: Solr
>          Issue Type: New Feature
>          Components: highlighter
>            Reporter: David Smiley
>            Priority: Minor
>         Attachments: SOLR-1954_start_and_end_offsets.patch
>
>
> The Highlighter Component does not currently expose the snippet character
> offsets or the score.  There is a TODO in DefaultSolrHighlighter indicating
> the intention to add this eventually.  This information is needed when doing
> highlighting on external content.  The data is there, so it's pretty easy to
> output it in some way.  The challenge is deciding on the output and its
> ramifications for backwards compatibility.  The current highlighter component
> response structure doesn't lend itself to adding any new data, unfortunately.
> I wish the original implementer had some foresight.  Unfortunately, all the
> highlighting tests assume this structure.  Here is a snippet of the current
> response structure in Solr's sample data searching for "sdram" for reference:
> {code:xml}
> <lst name="highlighting">
>  <lst name="VS1GB400C3">
>   <arr name="text">
>       <str>CORSAIR ValueSelect 1GB 184-Pin DDR &lt;em&gt;SDRAM&lt;/em&gt; 
> Unbuffered DDR 400 (PC 3200) System Memory - Retail</str>
>   </arr>
>  </lst>
> </lst>
> {code}
> Perhaps as a little hack, we introduce a pseudo field called 
> text_startCharOffset which is the concatenation of the matching field and 
> "_startCharOffset".  This would be an array of ints.  Likewise, there would 
> be another array for endCharOffset and score.
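> As a rough sketch of that idea, the response might then look something like
> the following.  The names text_endCharOffset and text_score are assumed by
> analogy with text_startCharOffset, and the offset and score values are
> illustrative placeholders, not real output:
> {code:xml}
> <lst name="highlighting">
>  <lst name="VS1GB400C3">
>   <arr name="text">
>       <!-- highlighted snippet string, as in the current structure -->
>   </arr>
>   <!-- hypothetical pseudo fields, one array entry per snippet -->
>   <arr name="text_startCharOffset"><int>0</int></arr>
>   <arr name="text_endCharOffset"><int>93</int></arr>
>   <arr name="text_score"><float>1.0</float></arr>
>  </lst>
> </lst>
> {code}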
> Thoughts?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


