RE: Lucene1.4.1 + OutOf Memory
Hi Guys, apologies. I am NOT using the sorting code

    hits = multiSearcher.search(query, new Sort(new SortField("filename", SortField.STRING)));

but plain

    hits = multiSearcher.search(query);

in the core files setup, and I am still getting the error. More advice required.

Karthik

-----Original Message-----
From: [EMAIL PROTECTED]
Sent: Wednesday, November 10, 2004 12:46 PM
To: Lucene Users List
Subject: Re: Lucene1.4.1 + OutOf Memory

There is a memory leak in the sorting code of Lucene 1.4.1. 1.4.2 has the fix!

--- Karthik N S [EMAIL PROTECTED] wrote:

Hi Guys, apologies.

History:
- 1st setup: 4 subindexes + MultiSearcher + search on the Content field only, for 2000 hits = Exception [Too many files open]
- 2nd setup: 40 merged indexes [1000 subindexes each] + MultiSearcher / ParallelMultiSearcher + search on the Content field only, for 2 hits = Exception [OutOfMemory]

System config (same for both setups):
- AMD processor (high-end, single)
- RAM: 1 GB
- OS: Linux (Gentoo)
- App server: Tomcat 5.05
- JDK: IBM Blackdown-1.4.1-01 (== JDK 1.4.1)

The index contains 15 fields; search is done on only 1 field; 11 corresponding fields are retrieved; 3 fields hold debug details. I switched from the 1st setup to the 2nd. Can somebody suggest why this is happening? Thanks in advance.

WITH WARM REGARDS
HAVE A NICE DAY
[N.S.KARTHIK]

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
RE: Lucene1.4.1 + OutOf Memory
The exception "too many files open" means:
- the searcher object is not closed after query execution, or
- the limit on open file handles is too low.

Regards,
J.

-----Original Message-----
From: Karthik N S
To: Lucene Users List
Sent: 10.11.2004 09:41
Subject: RE: Lucene1.4.1 + OutOf Memory

Hi Guys, apologies. I am NOT using the sorting code (hits = multiSearcher.search(query, new Sort(new SortField("filename", SortField.STRING)))) but plain multiSearcher.search(query) in the core files setup, and I am still getting the error. More advice required. Karthik
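Iouli's first point (searchers not being closed) is the most common cause of the file-handle exhaustion. Below is a minimal sketch against the Lucene 1.4-era API of a search that guarantees the searcher is closed even when the query throws; the index path, field name, and query string are hypothetical:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Searcher;

public class SearchOnce {
    public static void main(String[] args) throws Exception {
        // hypothetical index path; replace with a real index directory
        Searcher searcher = new IndexSearcher("/path/to/index");
        try {
            Query query = QueryParser.parse("foo", "contents", new StandardAnalyzer());
            Hits hits = searcher.search(query);
            System.out.println("hits: " + hits.length());
        } finally {
            searcher.close(); // releases the index files even if search throws
        }
    }
}
```

For a busy webapp it is usually better to keep one long-lived searcher per index than to open and close one per request, since opening an IndexSearcher is comparatively expensive.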
RE: Lucene1.4.1 + OutOf Memory
Hi Guys, apologies. That's why somebody on the forum asked me to switch to: 40 merged indexes [1000 subindexes each] + MultiSearcher / ParallelMultiSearcher + search on the Content field only. The "too many files open" problem was solved, since now only 40 merged indexes are open [1 merged index holds 1000 subindexes] instead of 40,000 individual subindexes. Now I am getting the OutOfMemory exception. Any idea how to solve this problem? Thanks in advance.

Karthik

-----Original Message-----
From: [EMAIL PROTECTED]
Sent: Wednesday, November 10, 2004 2:16 PM
To: Lucene Users List
Subject: RE: Lucene1.4.1 + OutOf Memory

The exception "too many files open" means: the searcher object is not closed after query execution, or the limit on open file handles is too low. Regards, J.
Re: Lucene1.4.1 + OutOf Memory
On Nov 10, 2004, at 1:55 AM, Karthik N S wrote:

> Hi Guys Apologies..

No need to apologize for asking questions.

> History Ist type : 4 subindexes + MultiSearcher + Search on Content Field

You've got 40,000 indexes aggregated under a MultiSearcher and you're wondering why you're running out of memory?! :O

> Exception [ Too many Files Open ]

Are you using the compound file format?

Erik
RE: Lucene1.4.1 + OutOf Memory
to deal with. Maybe not correctly addressed in this newsgroup, after all... Anyway: any idea if there is an API command to re-init the caches?

Thanks,
Daniel

-----Original Message-----
From: Erik Hatcher
Sent: 10 November 2004 09:35
To: Lucene Users List
Subject: Re: Lucene1.4.1 + OutOf Memory

You've got 40,000 indexes aggregated under a MultiSearcher and you're wondering why you're running out of memory?! Are you using the compound file format? Erik
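On the cache question: I am not aware of a direct "re-init caches" call in the 1.4.x API. The field caches are keyed on the underlying IndexReader, so the practical way to drop them (a sketch under that assumption; the path and the refresh method are hypothetical) is to close the old searcher and open a fresh one:

```java
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Searcher;

public class ReopenSearcher {
    // Hypothetical refresh routine: closing the old searcher releases its
    // reader, letting per-reader caches (e.g. FieldCache entries) be collected.
    static Searcher refresh(Searcher old, String indexPath) throws Exception {
        if (old != null) {
            old.close();
        }
        return new IndexSearcher(IndexReader.open(indexPath));
    }
}
```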
RE: Lucene1.4.1 + OutOf Memory
Hi Guys, apologies. Yes Erik, since the day I switched from Lucene 1.3.1 to Lucene 1.4.1 we have been using the compound file format, via

    writer.setUseCompoundFile(true);

Some more advice please. Thanks in advance.

-----Original Message-----
From: Erik Hatcher
Sent: Wednesday, November 10, 2004 3:05 PM
To: Lucene Users List
Subject: Re: Lucene1.4.1 + OutOf Memory

Are you using the compound file format? Erik
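For reference, the two knobs that cut down the number of files a searcher must hold open are the compound format (one .cfs file per segment instead of many per-segment files) and optimize() (merging all segments into one). A sketch against the 1.4-era API, with a hypothetical index path:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class CompactIndex {
    public static void main(String[] args) throws Exception {
        // open an existing index (hypothetical path); false = don't re-create it
        IndexWriter writer = new IndexWriter("/path/to/index", new StandardAnalyzer(), false);
        writer.setUseCompoundFile(true); // write segments as single .cfs files
        writer.optimize();               // merge all segments into one
        writer.close();
    }
}
```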
RE: Lucene1.4.1 + OutOf Memory
Hi Rupinder Singh Mazara, apologies. Can you paste the code into the mail instead of an attachment? [I am not able to get the attachment on the company's mail.] Thanks in advance.

Karthik

-----Original Message-----
From: Rupinder Singh Mazara
Sent: Wednesday, November 10, 2004 3:10 PM
To: Lucene Users List
Subject: RE: Lucene1.4.1 + OutOf Memory

Hi all, I had a similar problem with JDK 1.4.1. Doug had sent me a patch, which I am attaching. The following is the mail from Doug:

"It sounds like the ThreadLocal in TermInfosReader is not getting correctly garbage collected when the TermInfosReader is collected. Researching a bit, this was a bug in JVMs prior to 1.4.2, so my guess is that you're running an older JVM. Is that right? I've attached a patch which should fix this. Please tell me if it works for you. Doug"

Daniel Taurat wrote: Okay, that (1.4rc3) worked fine, too! Got only 257 SegmentTermEnums for 1900 objects. Now I will go for the final test on the production server with the 1.4rc3 version and about 40,000 objects. Daniel

Daniel Taurat schrieb: Hi all, here is some update for you: I switched back to Lucene 1.3-final and now the number of SegmentTermEnum objects is controlled by gc again: it goes up to about 1000 and then down to 254 after indexing my 1900 test objects. Stay tuned, I will try 1.4RC3 now, the last version before FieldCache was introduced...

Rupinder Singh Mazara schrieb: Hi all, I had a similar problem. I have a database of documents with 24 fields and an average content of 7 KB, with 16M+ records. I had to split the job into slabs of 1M each and merge the resulting indexes. Submissions to our job queue looked like:

    java -Xms100M -Xcompactexplicitgc -cp $CLASSPATH lucene.Indexer 22

and I still had OutOfMemory exceptions. The solution I came up with was, after every 200K documents, to create a temp directory and merge the indexes together. This was done for the first production run; updates are now handled incrementally.

    Exception in thread main java.lang.OutOfMemoryError
      at org.apache.lucene.store.RAMOutputStream.flushBuffer(RAMOutputStream.java(Compiled Code))
      at org.apache.lucene.store.OutputStream.flush(OutputStream.java(Inlined Compiled Code))
      at org.apache.lucene.store.OutputStream.writeByte(OutputStream.java(Inlined Compiled Code))
      at org.apache.lucene.store.OutputStream.writeBytes(OutputStream.java(Compiled Code))
      at org.apache.lucene.index.CompoundFileWriter.copyFile(CompoundFileWriter.java(Compiled Code))
      at org.apache.lucene.index.CompoundFileWriter.close(CompoundFileWriter.java(Compiled Code))
      at org.apache.lucene.index.SegmentMerger.createCompoundFile(SegmentMerger.java(Compiled Code))
      at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java(Compiled Code))
      at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java(Compiled Code))
      at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:366)
      at lucene.Indexer.doIndex(CDBIndexer.java(Compiled Code))
      at lucene.Indexer.main(CDBIndexer.java:168)

-----Original Message-----
From: Daniel Taurat
Sent: 10 September 2004 14:42
To: Lucene Users List
Subject: Re: Out of memory in lucene 1.4.1 when re-indexing large number of documents

Hi Pete, good hint, but we actually do have 4 GB of physical memory on the system. But then: we have also experienced that the gc of the IBM JDK 1.3.1 we use sometimes behaves strangely with too large a heap space anyway (the limit seems to be 1.2 GB). I can say that gc is not collecting these objects, since I forced gc runs every now and then while indexing (when parsing pdf-type objects, that is): no effect. Regards, Daniel

Pete Lewis wrote: Hi all, reading the thread with interest, there is another way I've come across out-of-memory errors when indexing large batches of documents. If you have your heap space settings too high, then you get swapping (which impacts performance), plus you never reach the trigger for garbage collection, hence you don't garbage collect and hence you run out of memory. Can you check whether or not your garbage collection is being triggered? Anomalously, therefore, if this is the case, reducing the heap space can both improve performance and get rid of the out-of-memory errors. Cheers, Pete Lewis

----- Original Message -----
From: Daniel Taurat
To: Lucene Users List
Sent: Friday, September 10, 2004 1:10 PM
Subject: Re: Out of memory in lucene 1.4.1 when re-indexing large number of documents

Daniel Aber schrieb: On Thursday 09 September 2004 19:47, Daniel Taurat wrote: "I am facing an out of memory problem using Lucene 1.4.1." Could you try with a recent CVS version? There has been a fix about
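Rupinder's slab-and-merge scheme above can be sketched with the 1.4-era addIndexes API: build each slab in its own directory, then merge the slabs into a single index. The paths and slab count here are hypothetical:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class MergeSlabs {
    public static void main(String[] args) throws Exception {
        // slab directories produced by earlier indexing runs (hypothetical paths)
        Directory[] slabs = new Directory[] {
            FSDirectory.getDirectory("/tmp/slab0", false),
            FSDirectory.getDirectory("/tmp/slab1", false),
        };
        // true = create a fresh target index, then pull the slabs in
        IndexWriter writer = new IndexWriter("/tmp/merged", new StandardAnalyzer(), true);
        writer.addIndexes(slabs); // merges the slab indexes into the target
        writer.close();
    }
}
```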
RE: Lucene1.4.1 + OutOf Memory
Karthik, I think the core problem in your case is the use of compound files. It would be best to switch it off, or alternatively to issue an optimize as soon as the indexing is over. I am copying the file contents between the file tags; the patch is to be applied to TermInfosReader.java. This was done to help with out-of-memory exceptions while indexing.

file

    Index: src/java/org/apache/lucene/index/TermInfosReader.java
    ===================================================================
    RCS file: /home/cvs/jakarta-lucene/src/java/org/apache/lucene/index/TermInfosReader.java,v
    retrieving revision 1.9
    diff -u -r1.9 TermInfosReader.java
    --- src/java/org/apache/lucene/index/TermInfosReader.java	6 Aug 2004 20:50:29 -0000	1.9
    +++ src/java/org/apache/lucene/index/TermInfosReader.java	10 Sep 2004 17:46:47 -0000
    @@ -45,6 +45,11 @@
         readIndex();
       }
    
    +  protected final void finalize() {
    +    // patch for pre-1.4.2 JVMs, whose ThreadLocals leak
    +    enumerators.set(null);
    +  }
    +
       public int getSkipInterval() { return origEnum.skipInterval; }

/file

However, Tomcat does react in strange ways to too many open files. Try to restrict the number of IndexReader or Searchable objects that you create while doing searches; I usually keep one object to handle all my user requests:

    public static Searcher fetchCitationSearcher(HttpServletRequest request)
        throws Exception {
      Searcher rval = (Searcher) request.getSession().getServletContext()
          .getAttribute("luceneSearchable");
      if (rval == null) {
        rval = new IndexSearcher(fetchCitationReader(request));
        request.getSession().getServletContext()
            .setAttribute("luceneSearchable", rval);
      }
      return rval;
    }

-----Original Message-----
From: Karthik N S
Sent: 10 November 2004 11:41
To: Lucene Users List
Subject: RE: Lucene1.4.1 + OutOf Memory

Hi Rupinder Singh Mazara, apologies. Can you paste the code into the mail instead of an attachment? [I am not able to get the attachment on the company's mail.] Thanks in advance. Karthik
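The one-line patch above works because setting the ThreadLocal to null drops the per-thread reference to the (potentially large) enumerator, so the garbage collector can reclaim it even on pre-1.4.2 JVMs whose ThreadLocal maps leak. The effect of the set(null) call can be seen in plain Java:

```java
public class ThreadLocalClear {
    public static void main(String[] args) {
        ThreadLocal cache = new ThreadLocal();
        cache.set(new byte[1024 * 1024]);        // large per-thread value
        System.out.println(cache.get() != null); // prints: true

        // what the patch does in finalize(): drop the thread's reference
        // so the old value is no longer pinned and can be collected
        cache.set(null);
        System.out.println(cache.get() == null); // prints: true
    }
}
```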
Re: Lucene1.4.1 + OutOf Memory
There is a memory leak in the sorting code of Lucene 1.4.1. 1.4.2 has the fix!

--- Karthik N S [EMAIL PROTECTED] wrote:

Hi Guys, apologies. History: 1st setup (4 subindexes + MultiSearcher, search on the Content field only, 2000 hits) = Exception [Too many files open]; 2nd setup (40 merged indexes of 1000 subindexes each + MultiSearcher / ParallelMultiSearcher, 2 hits) = Exception [OutOfMemory]. Can somebody suggest why this is happening? Thanks in advance. [N.S.KARTHIK]