Maybe try PostingsHighlighter?

Mike McCandless

http://blog.mikemccandless.com

On Thu, Mar 28, 2013 at 10:09 AM, Erick Erickson <[email protected]> wrote:
> I have a situation where a client has a zillion small entries in a
> multivalued field. When the number of values, and values here means
> number of times doc.addField(bigfield, value) is called on a doc,
> highlighting becomes very slow. Doing highlighting on the same data
> but all concatenated into a single field is much faster. I'm guessing
> the slowdown is due to setup/analyze being done for each entry and the
> attendant bookkeeping.
>
> We've already talked about using FVH, and we're going to see if the
> index bloat is acceptable, but meanwhile....
>
> I'm _guessing_ that the issue is analogous to scoring. You have to
> examine every entry to find the "best" snippet. This particular
> application would be fine with not worrying about "best", and willing
> to stop after the first N.
>
> hl.snippets doesn't seem to work. If I'm reading the code correctly
> (and I admit I didn't look very deeply), all the entries need to be
> examined to find the top N "best" fits.
>
> hl.maxAnalyzedChars doesn't seem to help either, since it's applied to
> each individual value not the field as a whole.
>
> So here's my question: What are the options besides FVH? Is there
> something I'm overlooking? The client is quite willing to write custom
> components and also will consider donating them back to the community
> FWIW.
>
> One possibility that's come up is to have something analogous to
> maxAnalyzedChars but where the unit was MV entries, something like
> hl.mvBailCount=N (needs to be per-field override) where if it was
> specified, terminate after examining the values after N matches (or
> perhaps N values were examined whether any matches occurred or not).
> This seems relatively safe, it wouldn't change existing behavior but
> still would be available when needed.
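
For illustration only, here is a rough sketch of the bail-out idea in the
paragraph above, in plain Java with no Lucene or Solr dependency. Nothing
here is an existing highlighter API: the class MvBailSketch, the method
highlightFirstN and the maxMatches argument (standing in for the proposed
hl.mvBailCount) are all hypothetical names, and the whitespace token match
is a stand-in for real analysis. It only shows the control flow: walk the
values in order and stop as soon as N of them have produced a snippet,
rather than scoring every value to find the "best" ones.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: not part of Lucene or Solr.
class MvBailSketch {

    /**
     * Highlights multivalued entries in index order and stops as soon as
     * maxMatches entries have matched (the assumed hl.mvBailCount behavior).
     */
    static List<String> highlightFirstN(List<String> values, String term, int maxMatches) {
        List<String> snippets = new ArrayList<>();
        if (maxMatches <= 0) {
            return snippets;
        }
        String needle = term.toLowerCase(Locale.ROOT);
        for (String value : values) {
            String highlighted = highlight(value, needle);
            if (highlighted != null) {
                snippets.add(highlighted);
                if (snippets.size() >= maxMatches) {
                    break; // bail out: remaining values are never analyzed
                }
            }
        }
        return snippets;
    }

    /** Wraps whole-token, case-insensitive matches in <em> tags; returns null if none match. */
    static String highlight(String value, String needle) {
        String[] tokens = value.split(" ");
        StringBuilder sb = new StringBuilder();
        boolean matched = false;
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            if (tokens[i].toLowerCase(Locale.ROOT).equals(needle)) {
                sb.append("<em>").append(tokens[i]).append("</em>");
                matched = true;
            } else {
                sb.append(tokens[i]);
            }
        }
        return matched ? sb.toString() : null;
    }
}
```

Called with the values "red apple", "green pear", "apple pie", "apple tart",
the term "apple" and maxMatches=2, this returns two snippets and never looks
at "apple tart". The other variant mentioned above (stop after N values
examined, matched or not) would count loop iterations instead of snippets.
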
>
> Or am I just misunderstanding highlighting and there's really an
> underlying bug in the highlighting code and we'd expect highlighting
> times to be consistent regardless of whether the field was MV or not?
>
> Thanks,
> Erick
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
