[ 
https://issues.apache.org/jira/browse/SOLR-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618367#comment-13618367
 ] 

Erick Erickson commented on SOLR-4656:
--------------------------------------

I plan on committing this Tuesday or so unless there are objections....

hl.maxMultiValuedToMatch   - stops examining the values of a multiValued field 
once N matches have been found. Defaults to Integer.MAX_VALUE.

hl.maxMultiValuedToExamine - stops examining the values of a multiValued field 
once N values have been examined, regardless of how many matches have been 
found (finding no matches is perfectly reasonable). Defaults to 
Integer.MAX_VALUE.

If both are specified, whichever limit is reached first stops the examination.
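
To illustrate how the two limits might interact, here is a minimal sketch. It 
is not the code from the patch: the class and method names are hypothetical, a 
plain substring test stands in for the real highlighter, and it assumes a 
"match" is counted once per matching value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MultiValuedLimitSketch {

    /**
     * Walks the values of one multiValued field and collects those that
     * contain the term, stopping as soon as either limit is reached.
     * Hypothetical helper for illustration only, not Solr's highlighter.
     */
    public static List<String> collectMatches(Iterable<String> values, String term,
                                              int maxToExamine, int maxToMatch) {
        List<String> matches = new ArrayList<>();
        int examined = 0;
        for (String value : values) {
            // Whichever limit is reached first ends the scan.
            if (examined >= maxToExamine || matches.size() >= maxToMatch) {
                break;
            }
            examined++;
            if (value.contains(term)) { // stand-in for real highlighting
                matches.add(value);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> values =
            Arrays.asList("red apple", "green pear", "apple pie", "apple tart");
        // Match limit of 2, no examine limit.
        System.out.println(collectMatches(values, "apple", Integer.MAX_VALUE, 2));
        // Examine limit of 2, no match limit.
        System.out.println(collectMatches(values, "apple", 2, Integer.MAX_VALUE));
    }
}
```

With both limits left at Integer.MAX_VALUE the loop visits every value, which 
corresponds to the defaults described above.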

The patch also restructures how the fields in the document are traversed so we 
aren't copying things around so much. I'd particularly like someone to glance 
at that code. All tests pass, but a second set of eyes would be welcome.

Also, along the way I found a parameter I'd never seen before, 
hl.preserveMulti, and added it to the highlighting parameters page 
(http://wiki.apache.org/solr/HighlightingParameters) with the explanation 
taken from a comment in the code. Some clarification there might be a good 
thing.

Fortunately, the changes are actually relatively minor; most of the bulk of 
the patch is additional tests.
                
> Add hl.maxMultiValuedToExamine to limit the number of multiValued entries 
> examined while highlighting
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-4656
>                 URL: https://issues.apache.org/jira/browse/SOLR-4656
>             Project: Solr
>          Issue Type: Improvement
>          Components: highlighter
>    Affects Versions: 4.3, 5.0
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: SOLR-4656-4x.patch, SOLR-4656.patch, 
> SOLR-4656-trunk.patch
>
>
> I'm looking at an admittedly pathological case of many, many entries in a 
> multiValued field, and trying to implement a way to limit the number 
> examined, analogous to maxAnalyzedChars; see the patch.
> Along the way, I noticed that we do what looks like unnecessary copying of 
> the fields to be examined. We call Document.getFields, which copies all of 
> the fields and values to the returned array. Then we copy all of those to 
> another array, converting them to Strings. Then we actually examine them. 
> This a> doesn't seem very efficient and b> reduces the benefit from limiting 
> the number of mv values examined.
> So the attached does two things:
> 1> attempts to fix this
> 2> implements hl.maxMultiValuedToExamine
> I'd _really_ love it if someone who knows the highlighting code took a peek 
> at the fix to see if I've messed things up; the changes are actually pretty 
> minimal.
