Interesting, and I see that Muir put in Solr support in 4.1. But in
this particular case they don't have the field indexed, so I don't
think that will help them? (it's a long story, but they have good
reasons for this)...
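
For reference, the hl.mvBailCount idea from the original message quoted
below could be sketched roughly as follows. This is plain Java, not the
Solr or Lucene highlighter API: hl.mvBailCount is only a proposed
parameter, and the class, enum, and method names here are made up for
illustration. A case-insensitive substring test stands in for real
analysis and scoring.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Sketch of the proposed (hypothetical) hl.mvBailCount behavior: stop
 * examining the values of a multivalued field once a limit is reached.
 */
class MvBailSketch {

    /** Which events count against the limit. */
    enum BailOn { MATCHES, VALUES_EXAMINED }

    /** Case-insensitive indexOf; stands in for real term matching. */
    static int indexOfIgnoreCase(String text, String term) {
        for (int i = 0; i + term.length() <= text.length(); i++) {
            if (text.regionMatches(true, i, term, 0, term.length())) {
                return i;
            }
        }
        return -1;
    }

    /**
     * Returns highlighted values in field order, bailing out early once
     * bailCount matches were found or bailCount values were examined,
     * depending on mode. No attempt is made to find the "best" snippets.
     */
    static List<String> highlight(List<String> values, String term,
                                  int bailCount, BailOn mode) {
        List<String> snippets = new ArrayList<>();
        int examined = 0;
        for (String value : values) {
            if (mode == BailOn.VALUES_EXAMINED && examined >= bailCount) break;
            if (mode == BailOn.MATCHES && snippets.size() >= bailCount) break;
            examined++;
            int idx = indexOfIgnoreCase(value, term);
            if (idx >= 0) {
                int end = idx + term.length();
                snippets.add(value.substring(0, idx) + "<em>"
                        + value.substring(idx, end) + "</em>"
                        + value.substring(end));
            }
        }
        return snippets;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList(
                "red apple", "green pear", "apple pie", "apple tart");
        // Stop after 2 matches: the fourth value is never examined.
        System.out.println(highlight(values, "apple", 2, BailOn.MATCHES));
        // Stop after 2 values examined, whether or not they matched.
        System.out.println(highlight(values, "apple", 2, BailOn.VALUES_EXAMINED));
    }
}
```

Either mode bounds the per-document work by N rather than by the number
of values, at the cost of possibly missing better snippets later in the
field.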

Thanks,
Erick

On Thu, Mar 28, 2013 at 11:10 AM, Michael McCandless
<[email protected]> wrote:
> Maybe try PostingsHighlighter?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Thu, Mar 28, 2013 at 10:09 AM, Erick Erickson
> <[email protected]> wrote:
>> I have a situation where a client has a zillion small entries in a
>> multivalued field. When the number of values is large (and "values"
>> here means the number of times doc.addField(bigfield, value) is called
>> on a doc), highlighting becomes very slow. Doing highlighting on the same data
>> but all concatenated into a single field is much faster. I'm guessing
>> the slowdown is due to setup/analyze being done for each entry and the
>> attendant bookkeeping.
>>
>> We've already talked about using FVH, and we're going to see if the
>> index bloat is acceptable, but meanwhile....
>>
>> I'm _guessing_ that the issue is analogous to scoring. You have to
>> examine every entry to find the "best" snippet. This particular
>> application would be fine with not worrying about "best", and willing
>> to stop after the first N.
>>
>> hl.snippets doesn't seem to work. If I'm reading the code correctly
>> (and I admit I didn't look very deeply), all the entries need to be
>> examined to find the top N "best" fits.
>>
>> hl.maxAnalyzedChars doesn't seem to help either, since it's applied to
>> each individual value, not to the field as a whole.
>>
>> So here's my question: What are the options besides FVH? Is there
>> something I'm overlooking? The client is quite willing to write custom
>> components and also will consider donating them back to the community
>> FWIW.
>>
>> One possibility that's come up is to have something analogous to
>> maxAnalyzedChars but where the unit is MV entries, something like
>> hl.mvBailCount=N (it would need a per-field override). If specified,
>> highlighting would stop examining values after N matches (or perhaps
>> after N values had been examined, whether any matches occurred or not).
>> This seems relatively safe: it wouldn't change existing behavior but
>> would still be available when needed.
>>
>> Or am I just misunderstanding highlighting and there's really an
>> underlying bug in the highlighting code and we'd expect highlighting
>> times to be consistent regardless of whether the field was MV or not?
>>
>> Thanks,
>> Erick
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>>
