Re: Explanation on RAMBufferSizeMB

2008-03-21 Thread Shai Erera
Thanks a lot for your responses on this. If I have more results on this issue, I'll post them back here. Shai On Fri, Mar 21, 2008 at 11:38 PM, Michael McCandless < [EMAIL PROTECTED]> wrote: > > Shai Erera wrote: > > What do you mean by "does your test do any merging"? > > All I do is create

Re: Explanation on RAMBufferSizeMB

2008-03-21 Thread Michael McCandless
Shai Erera wrote: What do you mean by "does your test do any merging"? All I do is create IndexWriter w/ the RAM and MBD settings as I've described before. Then I just call addDocument. At the end I call optimize() (it is a one time created index, after that I need it optimized for search).
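
To make the two flush triggers discussed in this thread concrete, here is a toy model in plain Java. It is only a sketch and not Lucene code: the class FlushSketch, the method countFlushes, and the 4 KB per-document size are all hypothetical, chosen for illustration. It assumes the behavior described in the IndexWriter documentation, namely that when both a RAM limit and a max-buffered-docs limit are set, a flush happens on whichever limit is reached first.

```java
public class FlushSketch {
    // Hypothetical sentinel meaning "this trigger is switched off".
    static final int DISABLED = -1;

    /**
     * Counts how many segments a writer would flush while buffering
     * docCount documents of docBytes bytes each, given a RAM limit
     * (in MB) and a buffered-document limit. Either limit may be DISABLED.
     */
    static int countFlushes(int docCount, long docBytes,
                            double ramBufferMB, int maxBufferedDocs) {
        long ramLimit = (ramBufferMB == DISABLED)
                ? Long.MAX_VALUE
                : (long) (ramBufferMB * 1024 * 1024);
        long bufferedBytes = 0;
        int bufferedDocs = 0;
        int flushes = 0;

        for (int i = 0; i < docCount; i++) {
            bufferedBytes += docBytes;
            bufferedDocs++;

            boolean ramFull = bufferedBytes >= ramLimit;
            boolean docsFull = maxBufferedDocs != DISABLED
                    && bufferedDocs >= maxBufferedDocs;

            // Flush on whichever limit is hit first, then reset the buffer.
            if (ramFull || docsFull) {
                flushes++;
                bufferedBytes = 0;
                bufferedDocs = 0;
            }
        }
        // Whatever is still buffered is flushed when the writer is closed.
        if (bufferedDocs > 0) {
            flushes++;
        }
        return flushes;
    }

    public static void main(String[] args) {
        int docs = 500_000;     // document count mentioned in the thread
        long docBytes = 4096;   // assumed average buffered size per document

        System.out.println("RAM only (128 MB):      "
                + countFlushes(docs, docBytes, 128, DISABLED));
        System.out.println("Doc count only (10000): "
                + countFlushes(docs, docBytes, DISABLED, 10_000));
        System.out.println("Both limits set:        "
                + countFlushes(docs, docBytes, 128, 10_000));
    }
}
```

The model ignores merging, stored-field files being written ahead of a flush, and the real per-document memory accounting, so it says nothing about indexing time. It only shows how the number of flushed segments depends on which limit trips first.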

Re: Explanation on RAMBufferSizeMB

2008-03-21 Thread Shai Erera
What do you mean by "does your test do any merging"? All I do is create IndexWriter w/ the RAM and MBD settings as I've described before. Then I just call addDocument. At the end I call optimize() (it is a one time created index, after that I need it optimized for search). I guess Lucene performs s

Re: Explanation on RAMBufferSizeMB

2008-03-21 Thread Michael McCandless
Shai Erera wrote: Besides the content field, everything is stored, so that may explain the large CFS files. You could run w/o CFS turned on and then look at the size of fdt/fdx to see if this explains the size. Regarding the RAM-usage performance, I tried setting it to 128, 256 and 512, a

Re: Explanation on RAMBufferSizeMB

2008-03-20 Thread Shai Erera
Besides the content field, everything is stored, so that may explain the large CFS files. Regarding the RAM-usage performance, I tried setting it to 128, 256 and 512, all gave the same time measurements (give or take ~5%) as the MBD (set to 10,000) run. I think it needs further investigation. Was it

Re: Explanation on RAMBufferSizeMB

2008-03-19 Thread Michael McCandless
Shai Erera wrote: I think you misunderstood me - ultimately, the process reached 128MB. However it was flushing the .fdt file before it reached that. Your explanation on stored fields explains that behavior, but it did consume 128MB. Ahh, phew. Also, the CFS files that were written were of siz

Re: Explanation on RAMBufferSizeMB

2008-03-19 Thread Shai Erera
I think you misunderstood me - ultimately, the process reached 128MB. However it was flushing the .fdt file before it reached that. Your explanation on stored fields explains that behavior, but it did consume 128MB. Also, the CFS files that were written were of size >200MB (but less than 256) - whi

Re: Explanation on RAMBufferSizeMB

2008-03-19 Thread Michael McCandless
Shai Erera wrote: Thanks for clearing that up. I thought I missed something :-) No .. I don't use term vectors, only stored fields and indexed ones, no norms or term vectors. Hmm, then it's hard to explain why, when you set the buffer to 128 MB, you never saw the process get up to that usage.

Re: Explanation on RAMBufferSizeMB

2008-03-19 Thread Shai Erera
Thanks for clearing that up. I thought I missed something :-) No .. I don't use term vectors, only stored fields and indexed ones, no norms or term vectors. As for the efficiency of RAM usage by IndexWriter - what would perform better: setting the RAM limit to 128MB, or create a RAMDirectory and

Re: Explanation on RAMBufferSizeMB

2008-03-19 Thread Michael McCandless
Shai Erera wrote: Hi I have a question on the setting of RAMBufferSizeMB on IndexWriter. It may sound like it belongs to the user list, but I actually think there is a problem with it, so I'm posting it to the dev list. I'm using 2.3.1 to index a set of documents (500K Amazon books to b

Re: Explanation on RAMBufferSizeMB

2008-03-19 Thread Shai Erera
One correction - I use 2.3.0 and not 2.3.1 On Wed, Mar 19, 2008 at 4:25 PM, Shai Erera <[EMAIL PROTECTED]> wrote: > Hi > > I have a question on the setting of RAMBufferSizeMB on IndexWriter. It may > sound like it belongs to the user list, but I actually think there is a > problem with it, so I'm

Explanation on RAMBufferSizeMB

2008-03-19 Thread Shai Erera
Hi I have a question on the setting of RAMBufferSizeMB on IndexWriter. It may sound like it belongs to the user list, but I actually think there is a problem with it, so I'm posting it to the dev list. I'm using 2.3.1 to index a set of documents (500K Amazon books to be exact). I don't use norms