RE: Lucene1.4.1 + OutOf Memory

2004-11-10 Thread Karthik N S
Hi Guys,

Apologies.

  I am NOT using the sorting code

  hits = multiSearcher.search(query, new Sort(new SortField("filename",
SortField.STRING)));

 but am using multiSearcher.search(query)

 in the core files setup, and I am still getting the error.

 More advice required.


Karthik



-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 10, 2004 12:46 PM
To: Lucene Users List
Subject: Re: Lucene1.4.1 + OutOf Memory


There is a memory leak in the sorting code of Lucene 1.4.1.
1.4.2 has the fix!

--- Karthik N S [EMAIL PROTECTED] wrote:


 Hi Guys

 Apologies..

 History

 1st type: 4 subindexes + MultiSearcher + Search on Content Field
 only, for 2000 hits

 =========
 Exception [ Too many Files Open ]

 2nd type: 40 Merged Indexes [1000 subindexes each] + MultiSearcher
 /ParallelSearcher + Search on Content Field only, for 2 hits

 =========
 Exception [ OutOf Memory ]

 System Config [same for both types]

 AMD Processor [High End, Single]
 RAM 1GB
 O/S Linux (jantoo type)
 Appserver Tomcat 5.05
 JDK [ IBM Blackdown-1.4.1-01 (== JDK 1.4.1) ]

 Index contains 15 fields
 Search done on only 1 field
 Retrieve 11 corresponding fields
 3 fields are for debug details

 Switched from 1st type to 2nd type

 Can somebody suggest why this is happening?

 Thx in advance

   WITH WARM REGARDS
   HAVE A NICE DAY
   [ N.S.KARTHIK]








RE: Lucene1.4.1 + OutOf Memory

2004-11-10 Thread iouli.golovatyi

An exception "too many files open" means:
- the searcher object is not closed after query execution, or
- too few file handles are available

Regards
J.
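
A minimal sketch of the first point (the index path, field name, and query
are placeholders, not from this thread): open the searcher, search, and
always close it in a finally block so its file handles are released.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;

public class SearchOnce {
  public static void main(String[] args) throws Exception {
    // placeholder path to an existing index
    IndexSearcher searcher = new IndexSearcher("/path/to/index");
    try {
      Query query = QueryParser.parse("lucene", "contents",
                                      new StandardAnalyzer());
      Hits hits = searcher.search(query);
      System.out.println("hits: " + hits.length());
    } finally {
      searcher.close();  // releases the index files held open
    }
  }
}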



 
From: Karthik N S [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
cc: (bcc: Iouli Golovatyi/X/GP/Novartis)
Date: 10.11.2004 09:41
Subject: RE: Lucene1.4.1 + OutOf Memory
(Please respond to Lucene Users List)

 

 




Hi Guys,

Apologies.

  I am NOT using the sorting code

  hits = multiSearcher.search(query, new Sort(new SortField("filename",
SortField.STRING)));

 but am using multiSearcher.search(query)

 in the core files setup, and I am still getting the error.

 More advice required.


Karthik



-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 10, 2004 12:46 PM
To: Lucene Users List
Subject: Re: Lucene1.4.1 + OutOf Memory


There is a memory leak in the sorting code of Lucene 1.4.1.
1.4.2 has the fix!

--- Karthik N S [EMAIL PROTECTED] wrote:


 Hi Guys

 Apologies..

 History

 1st type: 4 subindexes + MultiSearcher + Search on Content Field
 only, for 2000 hits

 =========
 Exception [ Too many Files Open ]

 2nd type: 40 Merged Indexes [1000 subindexes each] + MultiSearcher
 /ParallelSearcher + Search on Content Field only, for 2 hits

 =========
 Exception [ OutOf Memory ]

 System Config [same for both types]

 AMD Processor [High End, Single]
 RAM 1GB
 O/S Linux (jantoo type)
 Appserver Tomcat 5.05
 JDK [ IBM Blackdown-1.4.1-01 (== JDK 1.4.1) ]

 Index contains 15 fields
 Search done on only 1 field
 Retrieve 11 corresponding fields
 3 fields are for debug details

 Switched from 1st type to 2nd type

 Can somebody suggest why this is happening?

 Thx in advance

   WITH WARM REGARDS
   HAVE A NICE DAY
   [ N.S.KARTHIK]








RE: Lucene1.4.1 + OutOf Memory

2004-11-10 Thread Karthik N S
Hi Guys,

Apologies.

  That's why somebody on the forum asked me to switch to:

  40 Merged Indexes [1000 subindexes each] + MultiSearcher /
ParallelSearcher + Search on Content Field only, for 2

  The problem of too many files open was solved, since now there were only 40
merged indexes [1 merged index has 1000 subindexes]

  instead of 4 subindexes.

  Now I am getting an Out of Memory exception.

  Any idea on how to solve this problem?



Thx in Advance






-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 10, 2004 2:16 PM
To: Lucene Users List
Subject: RE: Lucene1.4.1 + OutOf Memory



An exception "too many files open" means:
- the searcher object is not closed after query execution, or
- too few file handles are available

Regards
J.



From: Karthik N S [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
cc: (bcc: Iouli Golovatyi/X/GP/Novartis)
Date: 10.11.2004 09:41
Subject: RE: Lucene1.4.1 + OutOf Memory
(Please respond to Lucene Users List)






Hi Guys,

Apologies.

  I am NOT using the sorting code

  hits = multiSearcher.search(query, new Sort(new SortField("filename",
SortField.STRING)));

 but am using multiSearcher.search(query)

 in the core files setup, and I am still getting the error.

 More advice required.


Karthik



-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 10, 2004 12:46 PM
To: Lucene Users List
Subject: Re: Lucene1.4.1 + OutOf Memory


There is a memory leak in the sorting code of Lucene 1.4.1.
1.4.2 has the fix!

--- Karthik N S [EMAIL PROTECTED] wrote:


 Hi Guys

 Apologies..

 History

 1st type: 4 subindexes + MultiSearcher + Search on Content Field
 only, for 2000 hits

 =========
 Exception [ Too many Files Open ]

 2nd type: 40 Merged Indexes [1000 subindexes each] + MultiSearcher
 /ParallelSearcher + Search on Content Field only, for 2 hits

 =========
 Exception [ OutOf Memory ]

 System Config [same for both types]

 AMD Processor [High End, Single]
 RAM 1GB
 O/S Linux (jantoo type)
 Appserver Tomcat 5.05
 JDK [ IBM Blackdown-1.4.1-01 (== JDK 1.4.1) ]

 Index contains 15 fields
 Search done on only 1 field
 Retrieve 11 corresponding fields
 3 fields are for debug details

 Switched from 1st type to 2nd type

 Can somebody suggest why this is happening?

 Thx in advance

   WITH WARM REGARDS
   HAVE A NICE DAY
   [ N.S.KARTHIK]








Re: Lucene1.4.1 + OutOf Memory

2004-11-10 Thread Erik Hatcher
On Nov 10, 2004, at 1:55 AM, Karthik N S wrote:

 Hi Guys

 Apologies..

No need to apologize for asking questions.

 History

 1st type: 4 subindexes + MultiSearcher + Search on Content Field

You've got 40,000 indexes aggregated under a MultiSearcher and you're
wondering why you're running out of memory?!  :O

 Exception [ Too many Files Open ]

Are you using the compound file format?

	Erik


RE: Lucene1.4.1 + OutOf Memory

2004-11-10 Thread Rupinder Singh Mazara
 to deal with.
 Maybe not  correctly addressed in this newsgroup, after all...

 Anyway: any idea if there is an API command to re-init caches?

 Thanks,

 Daniel





-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: 10 November 2004 09:35
To: Lucene Users List
Subject: Re: Lucene1.4.1 + OutOf Memory


On Nov 10, 2004, at 1:55 AM, Karthik N S wrote:

 Hi
 Guys

 Apologies..

No need to apologize for asking questions.

 History

 1st type: 4 subindexes + MultiSearcher + Search on Content Field

You've got 40,000 indexes aggregated under a MultiSearcher and you're
wondering why you're running out of memory?!  :O

 Exception  [ Too many Files Open ]

Are you using the compound file format?

   Erik



RE: Lucene1.4.1 + OutOf Memory

2004-11-10 Thread Karthik N S
Hi Guys,

Apologies..

 Yes, Erik.

  Since the day I switched from Lucene 1.3.1 to Lucene 1.4.1 we have been
using the compound file format, via

writer.setUseCompoundFile(true);

Some more advice, please.


Thx in advance

-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 10, 2004 3:05 PM
To: Lucene Users List
Subject: Re: Lucene1.4.1 + OutOf Memory


On Nov 10, 2004, at 1:55 AM, Karthik N S wrote:

 Hi
 Guys

 Apologies..

No need to apologize for asking questions.

 History

 1st type: 4 subindexes + MultiSearcher + Search on Content Field

You've got 40,000 indexes aggregated under a MultiSearcher and you're
wondering why you're running out of memory?!  :O

 Exception  [ Too many Files Open ]

Are you using the compound file format?

Erik





RE: Lucene1.4.1 + OutOf Memory

2004-11-10 Thread Karthik N S
Hi Rupinder Singh Mazara,

Apologies.

  Can you paste the code into the mail instead of an attachment?

  [ Because I am not able to get the attachment on the company's mail ]


 Thx in advance
Karthik


-----Original Message-----
From: Rupinder Singh Mazara [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 10, 2004 3:10 PM
To: Lucene Users List
Subject: RE: Lucene1.4.1 + OutOf Memory


hi all,

 I had a similar problem with JDK 1.4.1; Doug had sent me a patch, which I am
attaching. Following is the mail from Doug:

 It sounds like the ThreadLocal in TermInfosReader is not getting
correctly garbage collected when the TermInfosReader is collected.
Researching a bit, this was a bug in JVMs prior to 1.4.2, so my guess is
that you're running in an older JVM.  Is that right?

I've attached a patch which should fix this.  Please tell me if it works
for you.

Doug

Daniel Taurat wrote:
 Okay, that (1.4rc3) worked fine, too!
 Got only 257 SegmentTermEnums for 1900 objects.

 Now I will go for the final test on the production server with the
 1.4rc3 version  and about 40.000 objects.

 Daniel

 Daniel Taurat schrieb:

 Hi all,
 here is some update for you:
 I switched back to Lucene 1.3-final and now the  number of the
 SegmentTermEnum objects is controlled by gc again:
 it goes up to about 1000 and then it is down again to 254 after
 indexing my 1900 test-objects.
 Stay tuned, I will try 1.4RC3 now, the last version before FieldCache
 was introduced...

 Daniel


 Rupinder Singh Mazara schrieb:

 hi all,
  I had a similar problem: I have a database of documents with 24
 fields and an average content of 7K, with 16M+ records.

  I had to split the job into slabs of 1M each and merge the
 resulting indexes; submissions to our job queue looked like

  java -Xms100M -Xcompactexplicitgc -cp $CLASSPATH lucene.Indexer 22

 and I still had an OutOfMemory exception. The solution I came up with
 was, after every 200K documents, to create a temp directory and merge
 the indexes together; this was done for the first production run, and updates
 are now being handled incrementally.



 Exception in thread "main" java.lang.OutOfMemoryError
   at org.apache.lucene.store.RAMOutputStream.flushBuffer(RAMOutputStream.java(Compiled Code))
   at org.apache.lucene.store.OutputStream.flush(OutputStream.java(Inlined Compiled Code))
   at org.apache.lucene.store.OutputStream.writeByte(OutputStream.java(Inlined Compiled Code))
   at org.apache.lucene.store.OutputStream.writeBytes(OutputStream.java(Compiled Code))
   at org.apache.lucene.index.CompoundFileWriter.copyFile(CompoundFileWriter.java(Compiled Code))
   at org.apache.lucene.index.CompoundFileWriter.close(CompoundFileWriter.java(Compiled Code))
   at org.apache.lucene.index.SegmentMerger.createCompoundFile(SegmentMerger.java(Compiled Code))
   at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java(Compiled Code))
   at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java(Compiled Code))
   at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:366)
   at lucene.Indexer.doIndex(CDBIndexer.java(Compiled Code))
   at lucene.Indexer.main(CDBIndexer.java:168)
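
A minimal sketch of that slab-merge step (not the poster's actual code;
paths and the analyzer are placeholders), using IndexWriter.addIndexes(),
which copies and merges existing indexes into a target index:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class SlabMerger {
  // merge previously built slab indexes into one final index
  public static void mergeSlabs(String[] slabPaths, String finalPath)
      throws Exception {
    Directory[] slabs = new Directory[slabPaths.length];
    for (int i = 0; i < slabPaths.length; i++) {
      slabs[i] = FSDirectory.getDirectory(slabPaths[i], false); // open existing
    }
    IndexWriter writer =
        new IndexWriter(finalPath, new StandardAnalyzer(), true); // create target
    writer.addIndexes(slabs);  // merges every slab into the new index
    writer.close();
  }
}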



-----Original Message-----
From: Daniel Taurat [mailto:[EMAIL PROTECTED]
Sent: 10 September 2004 14:42
To: Lucene Users List
Subject: Re: Out of memory in lucene 1.4.1 when re-indexing large number
of documents


 Hi Pete,
 good hint, but we actually do have 4 GB of physical memory on the
 system. But then, we have also experienced that the gc of the IBM
 JDK 1.3.1 that we use sometimes behaves strangely with too large a
 heap space anyway (the limit seems to be 1.2 GB).
 I can say that gc is not collecting these objects, since I forced gc
 runs every now and then while indexing (when parsing pdf-type
 objects, that is): no effect.

 regards,

 Daniel


 Pete Lewis wrote:

 Hi all

 Reading the thread with interest, there is another way I've come across out
 of memory errors when indexing large batches of documents.

 If you have your heap space settings too high, then you get swapping (which
 impacts performance) plus you never reach the trigger for garbage
 collection, hence you don't garbage collect and hence you run out of memory.

 Can you check whether or not your garbage collection is being triggered?

 Anomalously, therefore, if this is the case, by reducing the heap space you
 can improve performance and get rid of the out of memory errors.

 Cheers
 Pete Lewis
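
A hedged way to check that: -verbose:gc is the standard JDK flag that
prints a line for every collection, so running the indexer with it (the
heap sizes here are illustrative, not a recommendation) shows whether gc
fires at all:

  java -Xms100M -Xmx512M -verbose:gc -cp $CLASSPATH lucene.Indexer 22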

 ----- Original Message -----
 From: Daniel Taurat [EMAIL PROTECTED]
 To: Lucene Users List [EMAIL PROTECTED]
 Sent: Friday, September 10, 2004 1:10 PM
 Subject: Re: Out of memory in lucene 1.4.1 when re-indexing large number of
 documents






 Daniel Aber schrieb:

  On Thursday 09 September 2004 19:47, Daniel Taurat wrote:

   I am facing an out of memory problem using Lucene 1.4.1.

  Could you try with a recent CVS version? There has been a fix about

RE: Lucene1.4.1 + OutOf Memory

2004-11-10 Thread Rupinder Singh Mazara
Karthik,

 I think the core problem in your case is the use of compound files; it would
be best to switch it off,
 or alternatively to issue an optimize as soon as the indexing is over, as in
the sketch below.
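
A minimal sketch of that advice (the index path and analyzer are
placeholders, not from this thread):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class NonCompoundIndexing {
  public static void main(String[] args) throws Exception {
    IndexWriter writer =
        new IndexWriter("/path/to/index", new StandardAnalyzer(), true);
    writer.setUseCompoundFile(false);  // write plain multi-file segments
    // ... writer.addDocument(...) calls go here ...
    writer.optimize();  // merge everything down to a single segment
    writer.close();
  }
}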

  I am copying the file contents between file tags. The patch is to be
 applied to TermInfosReader.java; it was done to help with out-of-memory
 exceptions while doing indexing.
<file>
Index: src/java/org/apache/lucene/index/TermInfosReader.java
===================================================================
RCS file: /home/cvs/jakarta-lucene/src/java/org/apache/lucene/index/TermInfosReader.java,v
retrieving revision 1.9
diff -u -r1.9 TermInfosReader.java
--- src/java/org/apache/lucene/index/TermInfosReader.java	6 Aug 2004 20:50:29 -0000	1.9
+++ src/java/org/apache/lucene/index/TermInfosReader.java	10 Sep 2004 17:46:47 -0000
@@ -45,6 +45,11 @@
     readIndex();
   }
 
+  protected final void finalize() {
+    // patch for pre-1.4.2 JVMs, whose ThreadLocals leak
+    enumerators.set(null);
+  }
+
   public int getSkipInterval() {
     return origEnum.skipInterval;
   }
</file>



 However, Tomcat does react in strange ways to too many open files;
 try to restrict the number of IndexReader or Searchable objects
 that you create while doing searches.
 I usually keep one object to handle all my user requests:

 public static Searcher fetchCitationSearcher(HttpServletRequest request)
     throws Exception {
   // look up the searcher cached in the servlet context
   Searcher rval = (Searcher) request.getSession().getServletContext()
       .getAttribute("luceneSearchable");
   if (rval == null) {
     // first request: open one searcher and cache it for the whole webapp
     rval = new IndexSearcher(fetchCitationReader(request));
     request.getSession().getServletContext()
         .setAttribute("luceneSearchable", rval);
   }
   return rval;
 }
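
A hypothetical caller (not from this mail: the servlet class, the
SearchUtil wrapper assumed to hold the method above, and the field name
are illustrative), showing the shared searcher being reused per request
and deliberately never closed there:

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Searcher;

public class CitationSearchServlet extends HttpServlet {
  public void doGet(HttpServletRequest request, HttpServletResponse response)
      throws ServletException, IOException {
    try {
      // one shared Searcher per webapp: no new file handles per request
      Searcher searcher = SearchUtil.fetchCitationSearcher(request);
      Query query = QueryParser.parse(request.getParameter("q"),
                                      "contents", new StandardAnalyzer());
      Hits hits = searcher.search(query);
      response.getWriter().println("hits: " + hits.length());
    } catch (Exception e) {
      throw new ServletException(e);
    }
  }
}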




-----Original Message-----
From: Karthik N S [mailto:[EMAIL PROTECTED]
Sent: 10 November 2004 11:41
To: Lucene Users List
Subject: RE: Lucene1.4.1 + OutOf Memory


Hi Rupinder Singh Mazara,

Apologies.

  Can you paste the code into the mail instead of an attachment?

  [ Because I am not able to get the attachment on the company's mail ]


 Thx in advance
Karthik


-----Original Message-----
From: Rupinder Singh Mazara [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 10, 2004 3:10 PM
To: Lucene Users List
Subject: RE: Lucene1.4.1 + OutOf Memory


hi all,

 I had a similar problem with JDK 1.4.1; Doug had sent me a patch, which I am
attaching. Following is the mail from Doug:

 It sounds like the ThreadLocal in TermInfosReader is not getting
correctly garbage collected when the TermInfosReader is collected.
Researching a bit, this was a bug in JVMs prior to 1.4.2, so my guess is
that you're running in an older JVM.  Is that right?

I've attached a patch which should fix this.  Please tell me if it works
for you.

Doug

Daniel Taurat wrote:
 Okay, that (1.4rc3) worked fine, too!
 Got only 257 SegmentTermEnums for 1900 objects.

 Now I will go for the final test on the production server with the
 1.4rc3 version  and about 40.000 objects.

 Daniel

 Daniel Taurat schrieb:

 Hi all,
 here is some update for you:
 I switched back to Lucene 1.3-final and now the  number of the
 SegmentTermEnum objects is controlled by gc again:
 it goes up to about 1000 and then it is down again to 254 after
 indexing my 1900 test-objects.
 Stay tuned, I will try 1.4RC3 now, the last version before FieldCache
 was introduced...

 Daniel


 Rupinder Singh Mazara schrieb:

 hi all,
  I had a similar problem: I have a database of documents with 24
 fields and an average content of 7K, with 16M+ records.

  I had to split the job into slabs of 1M each and merge the
 resulting indexes; submissions to our job queue looked like

  java -Xms100M -Xcompactexplicitgc -cp $CLASSPATH lucene.Indexer 22

 and I still had an OutOfMemory exception. The solution I came up with
 was, after every 200K documents, to create a temp directory and merge
 the indexes together; this was done for the first production run, and updates
 are now being handled incrementally.



 Exception in thread "main" java.lang.OutOfMemoryError
   at org.apache.lucene.store.RAMOutputStream.flushBuffer(RAMOutputStream.java(Compiled Code))
   at org.apache.lucene.store.OutputStream.flush(OutputStream.java(Inlined Compiled Code))
   at org.apache.lucene.store.OutputStream.writeByte(OutputStream.java(Inlined Compiled Code))
   at org.apache.lucene.store.OutputStream.writeBytes(OutputStream.java(Compiled Code))
   at org.apache.lucene.index.CompoundFileWriter.copyFile(CompoundFileWriter.java(Compiled Code))
   at org.apache.lucene.index.CompoundFileWriter.close(CompoundFileWriter.java(Compiled Code))
   at org.apache.lucene.index.SegmentMerger.createCompoundFile(SegmentMerger.java(Compiled Code))
   at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java(Compiled Code))
   at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java(Compiled Code))
   at org.apache.lucene.index.IndexWriter.optimize

Re: Lucene1.4.1 + OutOf Memory

2004-11-09 Thread yahootintin-lucene
There is a memory leak in the sorting code of Lucene 1.4.1. 
1.4.2 has the fix!

--- Karthik N S [EMAIL PROTECTED] wrote:

 
 Hi Guys

 Apologies..

 History

 1st type: 4 subindexes + MultiSearcher + Search on Content Field
 only, for 2000 hits

 =========
 Exception [ Too many Files Open ]

 2nd type: 40 Merged Indexes [1000 subindexes each] + MultiSearcher
 /ParallelSearcher + Search on Content Field only, for 2 hits

 =========
 Exception [ OutOf Memory ]

 System Config [same for both types]

 AMD Processor [High End, Single]
 RAM 1GB
 O/S Linux (jantoo type)
 Appserver Tomcat 5.05
 JDK [ IBM Blackdown-1.4.1-01 (== JDK 1.4.1) ]

 Index contains 15 fields
 Search done on only 1 field
 Retrieve 11 corresponding fields
 3 fields are for debug details

 Switched from 1st type to 2nd type

 Can somebody suggest why this is happening?

 Thx in advance

   WITH WARM REGARDS
   HAVE A NICE DAY
   [ N.S.KARTHIK]
 
 
 
 
