Yonik Seeley wrote:
On Tue, Sep 9, 2008 at 5:28 AM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
Yonik Seeley wrote:
What about something like term freq? Would it need to count the
number of docs after the local maxDoc or is there a better way?
Good question...
I think we'd have to take a full copy of the term -> termFreq on
reopen? I don't see how else to do it (I don't understand your
suggestion above). So, this will clearly add to the cost of reopen.
One could adjust the freq by iterating over the term's documents...
skipTo(localMaxDoc) and count how many are after that, then subtract
from the freq. I didn't say it was a *good* idea :-)
Ahh, OK :)
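To make the skipTo-and-count idea concrete, here is a minimal sketch. It
assumes a sorted int[] of doc IDs as a stand-in for one term's postings
(it is not Lucene's TermDocs), and the class and method names are
hypothetical. The binary search plays the role of skipTo(localMaxDoc);
the loop counts the postings at or after the limit, which is the
per-term cost this approach would add.

```java
import java.util.Arrays;

class LocalDocFreq {
    /**
     * Returns the doc freq as seen by a reader limited to doc IDs below
     * localMaxDoc. 'postings' is a sorted array of unique doc IDs for one
     * term (a hypothetical stand-in for the term's postings list).
     */
    static int adjustedDocFreq(int[] postings, int localMaxDoc) {
        int globalDocFreq = postings.length;

        // Emulate skipTo(localMaxDoc): first position with docID >= localMaxDoc.
        int pos = Arrays.binarySearch(postings, localMaxDoc);
        if (pos < 0) {
            pos = -pos - 1; // not found: convert to the insertion point
        }

        // Count the postings at or after the local limit.
        int after = 0;
        for (int i = pos; i < postings.length; i++) {
            after++;
        }
        return globalDocFreq - after;
    }

    public static void main(String[] args) {
        int[] postings = {1, 4, 7, 12, 15};
        // Docs 12 and 15 are not visible to a reader with localMaxDoc = 10.
        System.out.println(adjustedDocFreq(postings, 10)); // prints 3
    }
}
```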
For reading stored fields and term vectors, which are now flushed
immediately to disk, we need to somehow get an IndexInput from the
IndexOutputs that IndexWriter holds open on these files. Or, maybe,
just open new IndexInputs?
Hmmm, seems like a case of our nice and simple Directory model not
having quite enough features in this case.
I think we can simply open IndexInputs on these files. I believe Java
does the right thing on Windows, such that if we are already writing
to the file, it does not prevent another file handle from opening the
file for reading.
Yeah, I think the underlying RandomAccessFile might do the right
thing, but IndexInput isn't required to see any changes on the fly
(and current implementations don't) so at a minimum it would be a
change of IndexInput semantics. Maybe there would need to be a
refresh() function added, or we would need to require a specific
Directory impl?
OR, if all writes are append-only, perhaps we don't ever need to
invalidate the read buffer and would just need to remove the current
logic that caches the file length and then let the underlying
RandomAccessFile do the EOF checking.
All writes to these files are append only, and, when we open the
IndexInput we would never read beyond its current length (once we
flush our IndexOutput) because that's the local maxDocID limit.
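As a rough illustration of the append-only case, the sketch below uses
plain java.io.RandomAccessFile rather than Lucene's IndexInput and
IndexOutput. The class name, the 4-byte record layout and the temp file
are assumptions made for the example. The reader does not cache the
file length; it seeks and lets RandomAccessFile do the EOF check, so it
can read a record that was appended after the reader was opened.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

class AppendOnlyReadDemo {
    /** Reads the 4-byte record at 'index'; EOF checking is left to the file. */
    static int readRecord(RandomAccessFile in, int index) throws IOException {
        in.seek((long) index * 4);
        return in.readInt(); // throws EOFException if the record is not there yet
    }

    /** Returns the two values read while the writer is still open. */
    static int[] demo() {
        try {
            File f = File.createTempFile("stored", ".bin");
            f.deleteOnExit();
            try (RandomAccessFile writer = new RandomAccessFile(f, "rw");
                 RandomAccessFile reader = new RandomAccessFile(f, "r")) {
                // Writer appends two records; RandomAccessFile writes are unbuffered.
                writer.writeInt(10);
                writer.writeInt(20);
                int first = readRecord(reader, 1);

                // Writer appends again after the reader was opened.
                writer.writeInt(30);
                int second = readRecord(reader, 2);

                return new int[] {first, second};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] values = demo();
        System.out.println(values[0] + " " + values[1]); // prints 20 30
    }
}
```

In this sketch nothing already read is ever invalidated, since existing
bytes never change; only the upper bound on what may be read moves, and
that bound would be the local maxDocID limit.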
Mike