Interesting, and I see that Muir put in Solr support in 4.1. But in this particular case they don't have the field indexed, so I don't think that will help them? (It's a long story, but they have good reasons for this.)

Thanks,
Erick

On Thu, Mar 28, 2013 at 11:10 AM, Michael McCandless <[email protected]> wrote:
> Maybe try PostingsHighlighter?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Thu, Mar 28, 2013 at 10:09 AM, Erick Erickson
> <[email protected]> wrote:
>> I have a situation where a client has a zillion small entries in a
>> multivalued field. When the number of values is large (values here
>> means the number of times doc.addField(bigfield, value) is called on
>> a doc), highlighting becomes very slow. Doing highlighting on the same
>> data but all concatenated into a single field is much faster. I'm
>> guessing the slowdown is due to setup/analyze being done for each
>> entry and the attendant bookkeeping.
>>
>> We've already talked about using FVH, and we're going to see if the
>> index bloat is acceptable, but meanwhile....
>>
>> I'm _guessing_ that the issue is analogous to scoring. You have to
>> examine every entry to find the "best" snippet. This particular
>> application would be fine with not worrying about "best", and willing
>> to stop after the first N.
>>
>> hl.snippets doesn't seem to work. If I'm reading the code correctly
>> (and I admit I didn't look very deeply), all the entries need to be
>> examined to find the top N "best" fits.
>>
>> hl.maxAnalyzedChars doesn't seem to help either, since it's applied to
>> each individual value, not the field as a whole.
>>
>> So here's my question: What are the options besides FVH? Is there
>> something I'm overlooking? The client is quite willing to write custom
>> components and also will consider donating them back to the community
>> FWIW.
>>
>> One possibility that's come up is to have something analogous to
>> maxAnalyzedChars but where the unit was MV entries, something like
>> hl.mvBailCount=N (needs to be per-field override) where, if it was
>> specified, we'd stop examining the values after N matches (or
>> perhaps after N values were examined whether any matches occurred or
>> not). This seems relatively safe, it wouldn't change existing behavior
>> but still would be available when needed.
>>
>> Or am I just misunderstanding highlighting and there's really an
>> underlying bug in the highlighting code and we'd expect highlighting
>> times to be consistent regardless of whether the field was MV or not?
>>
>> Thanks,
>> Erick

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
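
To make the hl.mvBailCount idea from the quoted message a bit more concrete, here is a rough sketch of the bail-out loop in plain Java. Everything in it is an assumption for illustration only: the class MvBailSketch, the method firstSnippets, and the maxValues/maxSnippets limits are hypothetical names, the match is a simple case-sensitive substring test rather than real analysis, and none of it touches actual Solr or Lucene highlighter code. It covers both variants mentioned above (stop after N matches, or stop after N values examined).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the hl.mvBailCount idea; not Solr or Lucene code. */
class MvBailSketch {

    /**
     * Walks the values of one multivalued field in order and stops early
     * instead of scoring every value to find the "best" snippets.
     *
     * @param values      stored values of the field, one per addField call
     * @param term        term to mark (case-sensitive substring match, an
     *                    assumption standing in for real analysis)
     * @param maxValues   stop once this many values have been examined
     * @param maxSnippets stop once this many matching values were collected
     */
    static List<String> firstSnippets(List<String> values, String term,
                                      int maxValues, int maxSnippets) {
        List<String> snippets = new ArrayList<>();
        int examined = 0;
        for (String value : values) {
            // Bail out as soon as either limit is reached.
            if (examined >= maxValues || snippets.size() >= maxSnippets) {
                break;
            }
            examined++;
            int at = value.indexOf(term);
            if (at >= 0) {
                int end = at + term.length();
                // Wrap the first occurrence only; good enough for a sketch.
                snippets.add(value.substring(0, at)
                        + "<em>" + value.substring(at, end) + "</em>"
                        + value.substring(end));
            }
        }
        return snippets;
    }
}
```

The point of the design is only that the loop never looks past the first N values or the first N hits, so the cost is bounded by the limits rather than by the number of MV entries. The trade-off is the one already noted: the snippets returned are the first ones found, not the best ones.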
