Re: docMap array in SegmentMergeInfo

2005-10-13 Thread Peter Keegan
Hi Yonik, Your patch has corrected the thread thrashing problem on multi-cpu systems. I've tested it with both 1.4.3 and 1.9. I haven't seen a 100X performance gain, but that's because I'm caching QueryFilters and Lucene is caching the sort fields. Thanks for the fast response! btw, I had

Re: docMap array in SegmentMergeInfo

2005-10-12 Thread Peter Keegan
Here is one stack trace:

Full thread dump Java HotSpot(TM) Client VM (1.5.0_03-b07 mixed mode):

Thread-6 prio=5 tid=0x6cf7a7f0 nid=0x59e50 waiting for monitor entry [0x6d2cf000..0x6d2cfd6c]
    at org.apache.lucene.index.SegmentReader.isDeleted(SegmentReader.java:241)
    - waiting to lock 0x04e40278 (a

Re: docMap array in SegmentMergeInfo

2005-10-12 Thread Yonik Seeley
Thanks for the trace Peter, and great catch! It certainly does look like avoiding the construction of the docMap for a MultiTermEnum will be a significant optimization. -Yonik
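
One way to picture "avoiding the construction of the docMap" is to defer building it until a caller actually asks for it. The sketch below is only an illustration under that assumption, not the contents of the actual patch: `StubReader`, `MergeInfo`, `getDocMap`, and the `isDeletedCalls` counter are hypothetical names, and the lazy getter assumes each `MergeInfo` is used by a single thread.

```java
class LazyDocMapSketch {
    /** Hypothetical stand-in for a segment reader whose isDeleted is synchronized. */
    static class StubReader {
        private final boolean[] deleted;
        int isDeletedCalls = 0; // counts monitor acquisitions made through isDeleted

        StubReader(boolean[] deleted) {
            this.deleted = deleted;
        }

        int maxDoc() {
            return deleted.length;
        }

        synchronized boolean isDeleted(int doc) {
            isDeletedCalls++;
            return deleted[doc];
        }
    }

    /** Hypothetical holder that builds the docMap on first request only. */
    static class MergeInfo {
        private final StubReader reader;
        private int[] docMap; // stays null until getDocMap() is first called

        MergeInfo(StubReader reader) {
            this.reader = reader;
        }

        // Not thread-safe: assumes one thread per MergeInfo instance.
        int[] getDocMap() {
            if (docMap == null) {
                int maxDoc = reader.maxDoc();
                int[] map = new int[maxDoc];
                int j = 0;
                for (int i = 0; i < maxDoc; i++) {
                    // one synchronized call per document in the segment
                    map[i] = reader.isDeleted(i) ? -1 : j++;
                }
                docMap = map;
            }
            return docMap;
        }
    }

    public static void main(String[] args) {
        StubReader reader = new StubReader(new boolean[] {false, true, false});
        MergeInfo info = new MergeInfo(reader);
        System.out.println("calls before getDocMap: " + reader.isDeletedCalls);
        info.getDocMap();
        System.out.println("calls after getDocMap: " + reader.isDeletedCalls);
    }
}
```

A caller that never needs the map never pays for the `maxDoc` synchronized calls, which is the cost the stack trace points at.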

Re: docMap array in SegmentMergeInfo

2005-10-12 Thread Yonik Seeley
Here's the patch: http://issues.apache.org/jira/browse/LUCENE-454 It resulted in quite a performance boost indeed!

Re: docMap array in SegmentMergeInfo

2005-10-11 Thread Peter Keegan
On a multi-cpu system, this loop to build the docMap array can cause severe thread thrashing because of the synchronized method 'isDeleted'. I have observed this on an index with over 1 million documents (which contains a few thousand deleted docs) when multiple threads perform a search with

Re: docMap array in SegmentMergeInfo

2005-10-11 Thread Yonik Seeley
I'm not sure that looks like a safe patch. Synchronization does more than help prevent races... it also introduces memory barriers. Removing synchronization to objects that can change is very tricky business (witness the double-checked locking antipattern). -Yonik

docMap array in SegmentMergeInfo

2005-07-13 Thread Lokesh Bajaj
I noticed the following code that builds the docMap array in SegmentMergeInfo.java for the case where some documents might be deleted from an index:

    // build array which maps document numbers around deletions
    if (reader.hasDeletions()) {
      int maxDoc = reader.maxDoc();
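
The quoted code is cut off here, so the following is only a minimal sketch of what "maps document numbers around deletions" can mean, not the rest of the original method. It assumes that deleted documents are marked with -1 and surviving documents are renumbered consecutively; the class `DocMapSketch`, the method `buildDocMap`, and the use of a `BitSet` in place of the reader are all hypothetical.

```java
import java.util.Arrays;
import java.util.BitSet;

class DocMapSketch {
    /**
     * Builds an array indexed by old document number.
     * Assumption: deleted documents map to -1, live documents
     * map to their position among the live documents.
     */
    static int[] buildDocMap(BitSet deleted, int maxDoc) {
        int[] docMap = new int[maxDoc];
        int next = 0; // next free document number after compaction
        for (int i = 0; i < maxDoc; i++) {
            docMap[i] = deleted.get(i) ? -1 : next++;
        }
        return docMap;
    }

    public static void main(String[] args) {
        BitSet deleted = new BitSet();
        deleted.set(1);
        deleted.set(3);
        // prints [0, -1, 1, -1, 2]
        System.out.println(Arrays.toString(buildDocMap(deleted, 5)));
    }
}
```

Note that the array has one entry per document in the segment, deleted or not, which is why its size follows `maxDoc` rather than the number of live documents.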

Re: docMap array in SegmentMergeInfo

2005-07-13 Thread Doug Cutting
Lokesh Bajaj wrote:
For a very large index where we might want to delete/replace some documents, this would require a lot of memory (for 100 million documents, this would need 381 MB of memory). Is there any reason why this was implemented this way?

In practice this has not been an issue. A
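
The 381 MB figure quoted above can be reproduced with a short calculation, assuming the docMap is a Java `int[]` with one 4-byte entry per document and ignoring the small array header:

```latex
\[
10^{8}\ \text{documents} \times 4\ \text{bytes} = 4 \times 10^{8}\ \text{bytes},
\qquad
\frac{4 \times 10^{8}\ \text{bytes}}{2^{20}\ \text{bytes/MB}} \approx 381.5\ \text{MB}
\]
```

The quoted number therefore uses binary megabytes (1 MB = 2^20 bytes); in decimal units the same array is 400 MB.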