One correction: I use 2.3.0, not 2.3.1.

On Wed, Mar 19, 2008 at 4:25 PM, Shai Erera <[EMAIL PROTECTED]> wrote:
> Hi
>
> I have a question on the setting of RAMBufferSizeMB on IndexWriter. It may
> sound like it belongs to the user list, but I actually think there is a
> problem with it, so I'm posting it to the dev list.
>
> I'm using 2.3.1 to index a set of documents (500K Amazon books to be
> exact). I don't use norms and most of the fields I index are also stored.
> I'm setting IndexWriter like this:
>     indexwriter.setRAMBufferSizeMB(128);
>     indexwriter.setMaxBufferedDocs(IndexWriter.DISABLE_AUTO_FLUSH);
>
> AFAIU, the first line would set the RAM usage by IW to 128MB and the
> second would disable flushing by doc count. Naturally, I'd expect nothing to
> be written to the file system until those 128MB are consumed. However, that
> does not seem to be the case. I watch the file system and refresh it
> periodically (Windows), and I notice that stuff gets written to the disk
> (.fdt file) every few KB. Task Manager shows the application is not
> consuming 128MB ...
> So I debug-traced the application and noticed the following:
> - DocumentsWriter calls fieldsWriter.flushDocument in writeDocument(),
>   passing a RAMOutputStream instance (fdtLocal).
> - FieldsWriter calls RAMOutputStream.writeTo() and passes fieldsStream,
>   which is of type FSIndexOutput.
> - FSIndexOutput maintains an internal buffer of size 16KB (fixed) and
>   eventually flushes the buffer to the RandomAccessFile it maintains.
>
> So far, the 128MB setting was not applied anywhere, AFAIK.
>
> Can someone please explain to me how this works? Am I missing something
> (maybe a patch post 2.3.1)?
>
> One other thing I forgot to mention: I started this investigation after
> playing with the RAM usage and maxBufferedDocs settings. Setting MBD to
> 10,000 resulted in the same performance as setting RAM to 128MB, however it
> consumed much less RAM (~70MB according to Windows' Task Manager, which is
> not the most accurate thing).
>
> Thanks in advance,
> Shai
>

--
Regards,
Shai Erera
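
P.S. To make the trace above concrete, here is a minimal sketch in plain Java of a fixed-size output buffer. It is not Lucene code: the names FixedBufferFlushDemo, FixedBufferOutput and simulate are hypothetical, and the model only assumes what the trace describes, namely a 16KB buffer that is pushed to the file whenever it fills. Under that assumption, bytes reach the "file" every 16KB of stored-field data, independently of any separate RAM budget.

```java
// Hypothetical model of a fixed-size output buffer; NOT Lucene's FSIndexOutput.
class FixedBufferFlushDemo {

    /** Collects bytes in a fixed buffer and "flushes" whenever it is full. */
    static final class FixedBufferOutput {
        private final byte[] buffer;
        private int used = 0;
        int flushes = 0;        // how many times the buffer was pushed to the "file"
        long bytesFlushed = 0;  // total bytes pushed so far

        FixedBufferOutput(int bufferBytes) {
            buffer = new byte[bufferBytes];
        }

        void write(byte[] data) {
            int off = 0;
            while (off < data.length) {
                // Copy as much as fits, then flush if the buffer became full.
                int n = Math.min(data.length - off, buffer.length - used);
                System.arraycopy(data, off, buffer, used, n);
                used += n;
                off += n;
                if (used == buffer.length) {
                    flush();
                }
            }
        }

        void flush() {
            if (used > 0) {
                flushes++;
                bytesFlushed += used;
                used = 0;
            }
        }
    }

    /** Writes docCount documents of docBytes each; no final flush is forced. */
    static FixedBufferOutput simulate(int bufferBytes, int docBytes, int docCount) {
        FixedBufferOutput out = new FixedBufferOutput(bufferBytes);
        byte[] doc = new byte[docBytes];
        for (int i = 0; i < docCount; i++) {
            out.write(doc);
        }
        return out;
    }

    public static void main(String[] args) {
        // 1000 documents with 1KB of stored fields each, 16KB buffer.
        FixedBufferOutput out = simulate(16 * 1024, 1024, 1000);
        System.out.println("flushes=" + out.flushes + " bytesFlushed=" + out.bytesFlushed);
    }
}
```

In this model the number of flushes depends only on the buffer size and the volume of data written, which matches the observation that the .fdt file grows every few KB while the process stays well below 128MB.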