Come on dude :) Spend half an ounce of effort first. Mike's time is too
valuable!
Luckily mine is not.
There is no default impl - the class is dead simple (and the class has
been pointed out like 3 times in this thread - I'm not even fully
following and I know where to find it):
public static abstract class IndexReaderWarmer {
  public abstract void warm(IndexReader reader) throws IOException;
}
Now pass in something that warms the reader. Load a FieldCache - do a
search. Do the hokey pokey and turn yourself around ...
Investigation time: 5 seconds.
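
A minimal, self-contained sketch of that idea follows; it is not real Lucene code. FakeSegmentReader, ReaderWarmer, and ToyFieldCache are hypothetical stand-ins invented so the example runs without the Lucene jar, and ReaderWarmer only mirrors the shape of IndexReaderWarmer above. The sketch assumes the warmer is handed the reader for the newly merged segment, so loading a cache keyed by that reader is what makes the warming per-segment:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.WeakHashMap;

public class WarmerSketch {

    /** Hypothetical stand-in for one segment's reader; NOT a Lucene class. */
    public static final class FakeSegmentReader {
        final String name;
        final String[] fieldValues; // one value per document in the segment

        public FakeSegmentReader(String name, String... fieldValues) {
            this.name = name;
            this.fieldValues = fieldValues;
        }
    }

    /** Mirrors the shape of IndexReaderWarmer, but takes the stand-in reader. */
    public static abstract class ReaderWarmer {
        public abstract void warm(FakeSegmentReader reader) throws IOException;
    }

    /** Toy cache keyed by reader identity, so untouched segments keep their entry. */
    public static final class ToyFieldCache {
        private final Map<FakeSegmentReader, String[]> entries = new WeakHashMap<>();
        private int loads = 0;

        public synchronized String[] get(FakeSegmentReader reader) {
            String[] values = entries.get(reader);
            if (values == null) {
                // The expensive per-segment load would happen here.
                values = reader.fieldValues.clone();
                entries.put(reader, values);
                loads++;
            }
            return values;
        }

        public synchronized int loads() {
            return loads;
        }
    }

    /** A warmer that only touches the cache, so the first search finds it loaded. */
    public static ReaderWarmer cacheLoadingWarmer(final ToyFieldCache cache) {
        return new ReaderWarmer() {
            @Override
            public void warm(FakeSegmentReader reader) {
                cache.get(reader);
            }
        };
    }

    /** Warms a freshly "merged" segment, then searches it; returns the load count. */
    public static int runDemo() {
        ToyFieldCache cache = new ToyFieldCache();
        FakeSegmentReader merged = new FakeSegmentReader("_merged", "apple", "pear");
        try {
            cacheLoadingWarmer(cache).warm(merged); // done before the segment goes live
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        cache.get(merged); // the later "search" hits the already-loaded entry
        return cache.loads();
    }

    public static void main(String[] args) {
        System.out.println("cache loads: " + runDemo());
    }
}
```

With the real API you would presumably extend IndexReaderWarmer instead, load whatever FieldCache entries your sorts need inside warm(), and register the instance through the IndexWriter.setMergedSegmentWarmer call Mike mentions in the quoted mail.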
John Wang wrote:
> Hi Michael:
>
> Thanks for the pointer!
>
> Pardon my ignorance, but I am still not seeing the connection
> between this API and per-segment loading of FieldCache. (The API takes
> in an IndexReader instead of, say, SegmentReader[].)
>
> Can you point me to maybe the default impl of IndexReaderWarmer
> to help me understand?
>
> Thanks
>
> -John
>
> On Wed, Sep 23, 2009 at 7:17 AM, Michael McCandless wrote:
>
> This is exactly why we added IndexWriter.setMergedSegmentWarmer -- you
> can warm the reader w/o blocking ongoing updates.
>
> Mike
>
> On Tue, Sep 22, 2009 at 7:15 PM, Mark Miller wrote:
> > Right - when a large segment is invalidated, you will have a bigger
> > fieldcache piece to reload - pre 2.9, you'd be reloading the *whole*
> > field cache every time though. Sounds like you are trying to deal with
> > those large segments changing anyway :) They are always an issue when
> > doing RT it seems.
> >
> > I don't believe deletes invalidate a field cache - terms from deleted
> > docs stay in a field cache and segmentreaders use their freqStream as
> > the fieldcache key. Only when the deletes are merged out would they
> > invalidate - but because you're writing a new segment anyway ...
> >
> > - Mark
> >
> > John Wang wrote:
> >> I understand what you are saying. Let me detail what I am trying to say:
> >>
> >> When "currently processed segments" are flushed down, merge may
> >> happen. When merges happen, some of those "stable segments" will be
> >> invalidated, and so will the fieldcache data keyed by them.
> >>
> >> In a high update environment, such scenarios can happen quite often.
> >>
> >> The way the default mergePolicy works is that small segments get
> >> merged into the larger segments. Eventually, what will be invalidated
> >> would be a large segment, and when that happens, a large chunk of the
> >> field cache would be invalidated.
> >>
> >> Furthermore, in the case where there are high updates, the stable
> >> segments can be invalidated much sooner when there are deletes in those
> >> segments, and I would guess the corresponding FieldCache needs to be
> >> adjusted. Not sure how it is handled right now.
> >>
> >> Just my two cents, and of course when I find the time I will need to
> >> run some tests to see.
> >>
> >> -John
> >>
> >> On Tue, Sep 22, 2009 at 3:59 PM, Uwe Schindler wrote:
> >>
> >> The NRT reader coming from IndexWriter.getReader() has changes only
> >> in the currently processed segments; the other segments stay
> >> stable (and so do their IndexReader keys used for the
> >> FieldCache). For the consumer it looks like a normal reader (it is
> >> in fact a ReadOnlyDirectoryReader) supporting
> >> getSequentialSubReaders() and so on.
> >>
> >>
> >>
> >> -----
> >> Uwe Schindler
> >> H.-H.-Meier-Allee 63, D-28213 Bremen
> >> http://www.thetaphi.de
> >> eMail: [email protected]
> >>
> >> ------------------------------------------------------------------------
> >>
> >> *From:* John Wang [mailto:[email protected]]
> >> *Sent:* Tuesday, September 22, 2009 9:32 AM
> >> *To:* [email protected]
> >> *Subject:* Re: 2.9 NRT w.r.t. sorting and field cache
> >>
> >>
> >>
> >> Thanks Mark for the pointer!
> >>
> >> I guess my point is with NRT, and when segment files change often,
> >> this would be an issue, no?
> >>
> >> Anyway, I can run some tests.
> >>
> >> Thanks
> >>
> >> -John
> >>
> >> On Tue, Sep 22, 2009 at 3:21 PM, Mark Miller wrote:
> >>
> >> LUCENE-1483 - IndexSearcher pulls out a reader's subreaders
> >> (SegmentReaders) and sends a collector over them one by one,
> >> rather than using the MultiReader. So only the FieldCache entries for
> >> segment readers that change need to be reloaded.
> >>
> >> - Mark
> >>
> >>
> >>
> >> http://www.lucidimagination.com (mobile)
> >>
> >>
> >> On Sep 22, 2009, at 1:27 AM, John Wang wrote:
> >>
> >>> Hi Yonik:
> >>>
> >>> Actually that is what I am looking for. Can you please point
> >>> me to where/how sorting is done per-segment?
> >>>
> >>> When heavy indexing introduces or modifies segments, would
> >>> it cause reloading of FieldCache at query time and thus
> >>> impact search performance?
> >>>
> >>> thanks
> >>>
> >>> -John
> >>>
> >>> On Tue, Sep 22, 2009 at 1:05 PM, Yonik Seeley wrote:
> >>>
> >>> On Tue, Sep 22, 2009 at 12:56 AM, John Wang wrote:
> >>> > Looking at the code, seems there is a disconnect between how/when field
> >>> > cache is loaded when IndexWriter.getReader() is called.
> >>>
> >>> I'm not sure what you mean by "disconnect"
> >>>
> >>> > Is FieldCache updated?
> >>>
> >>> FieldCache entries are populated on demand, as they always have been.
> >>>
> >>>
> >>> > Otherwise, are we reloading FieldCache for each
> >>> > reader instance?
> >>>
> >>> Searching/sorting is now per-segment, and so is the use of the
> >>> FieldCache. Segments that don't change shouldn't have to reload
> >>> their FieldCache entries.
> >>>
> >>> -Yonik
> >>> http://www.lucidimagination.com
> >>>
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: [email protected]
> >>> For additional commands, e-mail: [email protected]
> >>>
> >>>
> >>>
> >>
> >>
> >>
> >
> >
> > --
> > - Mark
> >
> > http://www.lucidimagination.com
> >
> >
> >
> >
> >
> >
> >
>
>
>
--
- Mark
http://www.lucidimagination.com