Re: OutOfMemory example
On Tuesday 14 September 2004 08:32, Jiří Kuhn wrote:

> The error is thrown in exactly the same point as before. This morning I
> downloaded Lucene from CVS, now the jar is lucene-1.5-rc1-dev.jar, JVM
> is 1.4.2_05-b04, both Linux and Windows.

Now I can reproduce the problem. I first tried running the code inside Eclipse, but the exception doesn't occur there. It does occur on the command line.

Regards
 Daniel

--
http://www.danielnaber.de

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
RE: OutOfMemory example
The error is thrown in exactly the same point as before. This morning I downloaded Lucene from CVS, now the jar is lucene-1.5-rc1-dev.jar, JVM is 1.4.2_05-b04, both Linux and Windows.

Jiri.

-----Original Message-----
From: Daniel Naber [mailto:[EMAIL PROTECTED]
Sent: Monday, September 13, 2004 10:58 PM
To: Lucene Users List
Subject: Re: OutOfMemory example

On Monday 13 September 2004 15:06, Jiří Kuhn wrote:

> I think I can reproduce memory leaking problem while reopening
> an index. Lucene version tested is 1.4.1, version 1.4 final works OK. My
> JVM is:

Could you try with the latest Lucene version from CVS? I cannot reproduce your problem with that version (Sun's Java 1.4.2_03, Linux).

Regards
 Daniel

--
http://www.danielnaber.de
Re: OutOfMemory example
Daniel Naber wrote:

> On Monday 13 September 2004 15:06, Jiří Kuhn wrote:
>> I think I can reproduce memory leaking problem while reopening
>> an index. Lucene version tested is 1.4.1, version 1.4 final works OK. My
>> JVM is:
> Could you try with the latest Lucene version from CVS? I cannot reproduce
> your problem with that version (Sun's Java 1.4.2_03, Linux).

I verified it w/ the latest Lucene code from CVS under Win XP.

Regards
 Daniel
Re: OutOfMemory example
Jiří Kuhn wrote:

> Hi,
>
> I think I can reproduce memory leaking problem while reopening an index.
> Lucene version tested is 1.4.1, version 1.4 final works OK. My JVM is:
>
>   $ java -version
>   java version "1.4.2_05"
>   Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_05-b04)
>   Java HotSpot(TM) Client VM (build 1.4.2_05-b04, mixed mode)
>
> The code you can test is below, there are only 3 iterations for me if I
> use -Xmx5m, the 4th fails.

At least this test seems tied to the Sort API... I removed the sort under Lucene 1.3 and it worked fine...

Kevin

--
Please reply using PGP.

    http://peerfear.org/pubkey.asc

    NewsMonster - http://www.newsmonster.org/

Kevin A. Burton, Location - San Francisco, CA, Cell - 415.595.9965
AIM/YIM - sfburtonator, Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
IRC - freenode.net #infoanarchy | #p2p-hackers | #newsmonster
Re: OptimizeIt -- Re: force gc idiom - Re: OutOfMemory example
David Spencer wrote:

> Jiří Kuhn wrote:
>> This doesn't work either!
> You're right. I'm running under JDK 1.5 and trying larger values for -Xmx
> and it still fails. Running under (Borland's) OptimizeIt shows the number
> of Term and TermInfo instances (both in org.apache.lucene.index) increase
> every time thru the loop, by several hundred instances each.

Yes... I'm running into a similar situation on JDK 1.4.2 with Lucene 1.3... I used the JMP debugger and all my memory is taken by Term and TermInfo...

> I can trace thru some Term instances on the reference graph of OptimizeIt
> but it's unclear to me what's right. One *guess* is that maybe the
> WeakHashMap in either SegmentReader or FieldCacheImpl is the problem.

Kevin
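David's guess about the WeakHashMap points at the crux: a cache keyed by reader objects leaks if the keys are held strongly across reopens. A minimal stand-alone sketch of that difference (FakeReader is a hypothetical stand-in for IndexReader; no Lucene classes are involved), contrasting a strongly-keyed static map, like the Comparators cache discussed in this thread, with a WeakHashMap:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical stand-in for an IndexReader; only its identity matters here.
class FakeReader {}

public class CacheLeakDemo {
    // Strongly-keyed static cache: entries survive even after the reader is
    // discarded, which is the leak pattern suspected in this thread.
    static final Map<FakeReader, String> strongCache = new HashMap<>();
    // Weakly-keyed cache: entries are expunged once the reader is unreachable.
    static final Map<FakeReader, String> weakCache = new WeakHashMap<>();

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 5; i++) {
            FakeReader r = new FakeReader();   // "reopen" the index
            strongCache.put(r, "comparator");
            weakCache.put(r, "comparator");
        }                                      // no live reader references remain
        // Encourage collection of the weak keys (a hint only, not a guarantee).
        for (int i = 0; i < 50 && !weakCache.isEmpty(); i++) {
            System.gc();
            Thread.sleep(10);
        }
        System.out.println("strong: " + strongCache.size()
                + ", weak: " + weakCache.size());
    }
}
```

The strong map deterministically retains all 5 entries; the weak map typically drops to 0 once GC runs, which is why a weakly-keyed cache would let discarded SegmentReaders be reclaimed.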
Re: OutOfMemory example
On Monday 13 September 2004 15:06, Jiří Kuhn wrote:

> I think I can reproduce memory leaking problem while reopening
> an index. Lucene version tested is 1.4.1, version 1.4 final works OK. My
> JVM is:

Could you try with the latest Lucene version from CVS? I cannot reproduce your problem with that version (Sun's Java 1.4.2_03, Linux).

Regards
 Daniel

--
http://www.danielnaber.de
SegmentReader - Re: FieldSortedHitQueue.Comparators -- Re: force gc idiom - Re: OutOfMemory example
Another clue: the SegmentReaders are piling up too, which may be why the Comparators map is increasing in size, because SegmentReaders are the keys into Comparators... though again, I don't know enough about the Lucene internals to know which refs to SegmentReaders are valid and which ones may be causing this leak.

David Spencer wrote:

> David Spencer wrote:
>> Just noticed something else suspicious. FieldSortedHitQueue has a field
>> called Comparators and it seems like things are never removed from it.
> Replying to my own post... this could be the problem. If I put in a print
> statement here in FieldSortedHitQueue, recompile, and run w/ the new jar,
> then I see Comparators.size() go up after every iteration thru
> ReopenTest's loop, and the size() never goes down...
>
>   static Object store (IndexReader reader, String field, int type, Object factory, Object value) {
>     FieldCacheImpl.Entry entry = (factory != null)
>       ? new FieldCacheImpl.Entry (field, factory)
>       : new FieldCacheImpl.Entry (field, type);
>     synchronized (Comparators) {
>       HashMap readerCache = (HashMap)Comparators.get(reader);
>       if (readerCache == null) {
>         readerCache = new HashMap();
>         Comparators.put(reader, readerCache);
>         System.out.println("*\t* NOW: " + Comparators.size());
>       }
>       return readerCache.put (entry, value);
>     }
>   }
>
> Jiří Kuhn wrote:
>> This doesn't work either! Let's concentrate on the first version of my
>> code. I believe that the code should run endlessly (I have said it
>> before: in version 1.4 final it does).
>>
>> Jiri.
Re: FieldSortedHitQueue.Comparators -- Re: force gc idiom - Re: OutOfMemory example
David Spencer wrote:

> Just noticed something else suspicious. FieldSortedHitQueue has a field
> called Comparators and it seems like things are never removed from it.

Replying to my own post... this could be the problem. If I put in a print statement here in FieldSortedHitQueue, recompile, and run w/ the new jar, then I see Comparators.size() go up after every iteration thru ReopenTest's loop, and the size() never goes down...

  static Object store (IndexReader reader, String field, int type, Object factory, Object value) {
    FieldCacheImpl.Entry entry = (factory != null)
      ? new FieldCacheImpl.Entry (field, factory)
      : new FieldCacheImpl.Entry (field, type);
    synchronized (Comparators) {
      HashMap readerCache = (HashMap)Comparators.get(reader);
      if (readerCache == null) {
        readerCache = new HashMap();
        Comparators.put(reader, readerCache);
        System.out.println("*\t* NOW: " + Comparators.size());
      }
      return readerCache.put (entry, value);
    }
  }

Jiří Kuhn wrote:

> This doesn't work either! Let's concentrate on the first version of my
> code. I believe that the code should run endlessly (I have said it
> before: in version 1.4 final it does).
>
> Jiri.
FieldSortedHitQueue.Comparators -- Re: force gc idiom - Re: OutOfMemory example
Just noticed something else suspicious. FieldSortedHitQueue has a field called Comparators and it seems like things are never removed from it.

Jiří Kuhn wrote:

> This doesn't work either! Let's concentrate on the first version of my
> code. I believe that the code should run endlessly (I have said it
> before: in version 1.4 final it does).
>
> Jiri.
OptimizeIt -- Re: force gc idiom - Re: OutOfMemory example
Jiří Kuhn wrote:

> This doesn't work either!

You're right. I'm running under JDK 1.5 and trying larger values for -Xmx and it still fails. Running under (Borland's) OptimizeIt shows the number of Term and TermInfo instances (both in org.apache.lucene.index) increase every time thru the loop, by several hundred instances each.

I can trace thru some Term instances on the reference graph of OptimizeIt but it's unclear to me what's right. One *guess* is that maybe the WeakHashMap in either SegmentReader or FieldCacheImpl is the problem.

> Let's concentrate on the first version of my code. I believe that the code
> should run endlessly (I have said it before: in version 1.4 final it does).
>
> Jiri.
Re: OutOfMemory example
Then it is probably my mistake... I haven't read all the emails in the thread. So... your goal is to produce errors... I try to avoid them :))

All the best,

Sergiu

Jiří Kuhn wrote:

> You don't see the point of my post. I sent an application which everyone
> can run with only the Lucene jar and which produces OutOfMemoryError in a
> deterministic way. That's all.
>
> Jiri.
RE: force gc idiom - Re: OutOfMemory example
This doesn't work either! Let's concentrate on the first version of my code. I believe that the code should run endlessly (I have said it before: in version 1.4 final it does).

Jiri.
force gc idiom - Re: OutOfMemory example
Jiří Kuhn wrote:

> Thanks for the bug's id, it seems like my problem and I have stand-alone
> code with main().
>
> What about the slow garbage collector? That looks like a wrong suggestion
> to me.

I've seen this written up before (JavaWorld?) as a way to probably "force" GC instead of just a System.gc() call. I think the 2nd gc() call is supposed to clean up junk from the runFinalization() call...

  System.gc();
  Thread.sleep(100);
  System.runFinalization();
  Thread.sleep(100);
  System.gc();

> Let's change the code once again:
>
>   ...
>   public static void main(String[] args) throws IOException, InterruptedException
>   {
>       Directory directory = create_index();
>
>       for (int i = 1; i < 100; i++) {
>           System.err.println("loop " + i + ", index version: " +
>               IndexReader.getCurrentVersion(directory));
>           search_index(directory);
>           add_to_index(directory, i);
>           System.gc();
>           Thread.sleep(1000); // whatever value you want
>       }
>   }
>   ...
>
> and in the 4th iteration java.lang.OutOfMemoryError appears again.
>
> Jiri.
>
> -----Original Message-----
> From: John Moylan [mailto:[EMAIL PROTECTED]
> Sent: Monday, September 13, 2004 4:53 PM
> To: Lucene Users List
> Subject: Re: OutOfMemory example
>
> http://issues.apache.org/bugzilla/show_bug.cgi?id=30628
>
> you can close the index, but the Garbage Collector still needs to
> reclaim the memory and it may be taking longer than your loop to do so.
>
> John
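The force-GC idiom quoted above can be wrapped in a small helper. A plain-Java sketch (no Lucene needed); note that System.gc() is only a hint to the VM, so this encourages collection but cannot be relied on to fix a real leak, as the thread itself demonstrates:

```java
public class ForceGc {
    // The "force GC" idiom from the thread: gc, let finalizers run, gc again.
    // The second gc() is meant to reclaim objects whose finalizers just ran.
    static void forceGc() throws InterruptedException {
        System.gc();
        Thread.sleep(100);
        System.runFinalization();
        Thread.sleep(100);
        System.gc();
    }

    public static void main(String[] args) throws InterruptedException {
        byte[] junk = new byte[1 << 20];   // allocate 1 MB of garbage
        junk = null;                       // drop the only reference
        long before = Runtime.getRuntime().freeMemory();
        forceGc();
        long after = Runtime.getRuntime().freeMemory();
        // Free memory usually grows after forceGc(), but the VM spec
        // guarantees nothing, so no particular numbers are asserted here.
        System.out.println("free before: " + before + ", after: " + after);
    }
}
```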
RE: OutOfMemory example
You don't see the point of my post. I sent an application which everyone can run with only the Lucene jar and which produces OutOfMemoryError in a deterministic way. That's all.

Jiri.
RE: OutOfMemory example
Thanks for the bug's id, it seems like my problem and I have stand-alone code with main().

What about the slow garbage collector? That looks like a wrong suggestion to me.

Let's change the code once again:

  ...
  public static void main(String[] args) throws IOException, InterruptedException
  {
      Directory directory = create_index();

      for (int i = 1; i < 100; i++) {
          System.err.println("loop " + i + ", index version: " +
              IndexReader.getCurrentVersion(directory));
          search_index(directory);
          add_to_index(directory, i);
          System.gc();
          Thread.sleep(1000); // whatever value you want
      }
  }
  ...

and in the 4th iteration java.lang.OutOfMemoryError appears again.

Jiri.

-----Original Message-----
From: John Moylan [mailto:[EMAIL PROTECTED]
Sent: Monday, September 13, 2004 4:53 PM
To: Lucene Users List
Subject: Re: OutOfMemory example

http://issues.apache.org/bugzilla/show_bug.cgi?id=30628

you can close the index, but the Garbage Collector still needs to reclaim the memory and it may be taking longer than your loop to do so.

John
Re: OutOfMemory example
I have a few comments regarding your code...

1. Why do you use RAMDirectory and not the hard disk?

2. As John said, you should reuse the index instead of creating it each time in the main function:

  if (!indexExists(indexFile))
      writer = new IndexWriter(directory, new StandardAnalyzer(), true);
  else
      writer = new IndexWriter(directory, new StandardAnalyzer(), false);

(in some cases indexExists can be as simple as verifying if the file exists on the hard disk)

3. You iterate in a loop over 10,000 times and you create a lot of objects:

  for (int i = 0; i < 365 * 30; i++) {
      Document doc = new Document();
      doc.add(Field.Keyword("date", df.format(new Date(c.getTimeInMillis()))));
      doc.add(Field.Keyword("id", "AB" + String.valueOf(i)));
      doc.add(Field.Text("text", "Tohle je text " + i));
      writer.addDocument(doc);
      c.add(Calendar.DAY_OF_YEAR, 1);
  }

All the underlined lines of code create new objects, and all of them are kept in memory. That is a lot of memory allocated by this loop alone; I think you create more than 100,000 objects in it. What do you think? And none of them can be released (collected by GC) until you close the index writer.

No one says that your code is complicated, but all programmers should understand that this is a poor design... And more than that, your information is kept in a RAMDirectory: when you close the writer you will still keep the information in memory...

Sorry if I was too aggressive with my comments, but I cannot see what you were thinking when you wrote that code... If you are trying to make a test then I suggest you replace the hard-coded 365 value with a variable, to iterate over it and to test the power of your machine (PC + JVM) :))

I wish you luck,

Sergiu

Jiří Kuhn wrote:

> I disagree or I don't understand. I can change the code as it is shown
> below. Now I must reopen the index to see the changes, but the memory
> problem remains. I really don't know what I'm doing wrong, the code is
> so simple.
>
> Jiri.
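Sergiu's indexExists suggestion can be just a file check. A minimal stand-alone sketch, assuming a Lucene 1.x-style index whose directory contains a "segments" marker file (the helper name indexExists and the demo directory path are illustrative, and the IndexWriter line is left as a comment since it needs the Lucene jar):

```java
import java.io.File;

public class IndexExists {
    // In Lucene 1.x an initialized index directory contains a "segments"
    // file, so a plain existence check can decide IndexWriter's create flag.
    static boolean indexExists(File indexDir) {
        return new File(indexDir, "segments").exists();
    }

    public static void main(String[] args) {
        File dir = new File(System.getProperty("java.io.tmpdir"), "lucene-demo-index");
        dir.mkdirs();
        boolean create = !indexExists(dir);   // true until an index is built there
        System.out.println("create flag: " + create);
        // writer = new IndexWriter(dir, new StandardAnalyzer(), create);
    }
}
```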
Re: OutOfMemory example
http://issues.apache.org/bugzilla/show_bug.cgi?id=30628

you can close the index, but the Garbage Collector still needs to reclaim the memory and it may be taking longer than your loop to do so.

John

Jiří Kuhn wrote:

> I disagree or I don't understand. I can change the code as it is shown
> below. Now I must reopen the index to see the changes, but the memory
> problem remains. I really don't know what I'm doing wrong, the code is
> so simple.
>
> Jiri.
>
>   ...
>   public static void main(String[] args) throws IOException
>   {
>       Directory directory = create_index();
>
>       for (int i = 1; i < 100; i++) {
>           System.err.println("loop " + i + ", index version: " +
>               IndexReader.getCurrentVersion(directory));
>           search_index(directory);
>           add_to_index(directory, i);
>       }
>   }
>
>   private static void add_to_index(Directory directory, int i) throws IOException
>   {
>       IndexWriter writer = new IndexWriter(directory, new StandardAnalyzer(), false);
>
>       SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd");
>       Document doc = new Document();
>
>       doc.add(Field.Keyword("date", df.format(new Date(System.currentTimeMillis()))));
>       doc.add(Field.Keyword("id", "CD" + String.valueOf(i)));
>       doc.add(Field.Text("text", "Tohle neni text " + i));
>       writer.addDocument(doc);
>
>       System.err.println("index size: " + writer.docCount());
>       writer.close();
>   }
>   ...
>
> -----Original Message-----
> From: John Moylan [mailto:[EMAIL PROTECTED]
> Sent: Monday, September 13, 2004 3:25 PM
> To: Lucene Users List
> Subject: Re: OutOfMemory example
>
> You should reuse your old index (as e.g. an application variable) unless
> it has changed - use getCurrentVersion to check the index for updates.
> This has come up before.
>
> John
RE: OutOfMemory example
I disagree, or I don't understand. I can change the code as shown below. Now I must reopen the index to see the changes, but the memory problem remains. I really don't know what I'm doing wrong, the code is so simple.

Jiri.

...
public static void main(String[] args) throws IOException {
    Directory directory = create_index();
    for (int i = 1; i < 100; i++) {
        System.err.println("loop " + i + ", index version: " + IndexReader.getCurrentVersion(directory));
        search_index(directory);
        add_to_index(directory, i);
    }
}

private static void add_to_index(Directory directory, int i) throws IOException {
    IndexWriter writer = new IndexWriter(directory, new StandardAnalyzer(), false);
    SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd");
    Document doc = new Document();
    doc.add(Field.Keyword("date", df.format(new Date(System.currentTimeMillis()))));
    doc.add(Field.Keyword("id", "CD" + String.valueOf(i)));
    doc.add(Field.Text("text", "Tohle neni text " + i));
    writer.addDocument(doc);
    System.err.println("index size: " + writer.docCount());
    writer.close();
}
...

-----Original Message-----
From: John Moylan [mailto:[EMAIL PROTECTED]]
Sent: Monday, September 13, 2004 3:25 PM
To: Lucene Users List
Subject: Re: OutOfMemory example

You should reuse your old index (e.g. as an application variable) unless it has changed - use getCurrentVersion to check the index for updates. This has come up before.

John
Re: OutOfMemory example
You should reuse your old index (e.g. as an application variable) unless it has changed - use getCurrentVersion to check the index for updates. This has come up before.

John

Jiří Kuhn wrote:
> Hi,
>
> I think I can reproduce a memory leaking problem while reopening an index. The Lucene version tested is 1.4.1; version 1.4 final works OK. My JVM is:
>
> $ java -version
> java version "1.4.2_05"
> Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_05-b04)
> Java HotSpot(TM) Client VM (build 1.4.2_05-b04, mixed mode)
>
> The code you can test is below; there are only 3 iterations for me if I use -Xmx5m, the 4th fails.
>
> Jiri.
>
> [ReopenTest code snipped; it appears in full in the original message]
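John's suggestion - keep one reader open and reopen only when getCurrentVersion reports a change - can be sketched as follows. This is a minimal, hypothetical illustration of the pattern, not Lucene code: the Index and Reader classes here are stand-ins, so the idea is runnable without the 1.4 jars. With Lucene you would compare IndexReader.getCurrentVersion(directory) against the version the cached IndexReader was opened at, and close the old reader before opening a new one.

```java
// Sketch of the "reuse until the version changes" pattern.
// Index and Reader are hypothetical stand-ins for a Lucene
// Directory and IndexReader.
class Index {
    long version = 0;                 // bumped on every write, like an index version

    void addDocument() { version++; }
}

class Reader {
    final long openedAtVersion;       // version the index had when this reader was opened

    Reader(Index idx) { openedAtVersion = idx.version; }
}

class ReaderCache {
    static Reader cached;
    static int reopens = 0;           // counts how often we really had to reopen

    // Return the cached reader unless the index changed since it was opened.
    static Reader get(Index idx) {
        if (cached == null || cached.openedAtVersion != idx.version) {
            cached = new Reader(idx); // with Lucene: close the old reader first
            reopens++;
        }
        return cached;
    }
}
```

A search loop then goes through ReaderCache.get(...) instead of opening a fresh reader on every iteration, so an unchanged index costs no new reader (and no new per-reader sort caches).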
OutOfMemory example
Hi,

I think I can reproduce a memory leaking problem while reopening an index. The Lucene version tested is 1.4.1; version 1.4 final works OK. My JVM is:

$ java -version
java version "1.4.2_05"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_05-b04)
Java HotSpot(TM) Client VM (build 1.4.2_05-b04, mixed mode)

The code you can test is below; there are only 3 iterations for me if I use -Xmx5m, the 4th fails.

Jiri.

package test;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

/**
 * Run this test with Lucene 1.4.1 and -Xmx5m
 */
public class ReopenTest {
    private static long mem_last = 0;

    public static void main(String[] args) throws IOException {
        Directory directory = create_index();
        for (int i = 1; i < 100; i++) {
            System.err.println("loop " + i);
            search_index(directory);
        }
    }

    private static void search_index(Directory directory) throws IOException {
        IndexReader reader = IndexReader.open(directory);
        Searcher searcher = new IndexSearcher(reader);
        print_mem("search 1");
        SortField[] fields = new SortField[2];
        fields[0] = new SortField("date", SortField.STRING, true);
        fields[1] = new SortField("id", SortField.STRING, false);
        Sort sort = new Sort(fields);
        TermQuery query = new TermQuery(new Term("text", "\"text 5\""));
        print_mem("search 2");
        Hits hits = searcher.search(query, sort);
        print_mem("search 3");
        for (int i = 0; i < hits.length(); i++) {
            Document doc = hits.doc(i);
            System.out.println("doc " + i + ": " + doc.toString());
        }
        print_mem("search 4");
        searcher.close();
        reader.close();
    }

    private static void print_mem(String log) {
        long mem_free = Runtime.getRuntime().freeMemory();
        long mem_total = Runtime.getRuntime().totalMemory();
        long mem_max = Runtime.getRuntime().maxMemory();
        long delta = (mem_last - mem_free) * -1;
        System.out.println(log + "= delta: " + delta + ", free: " + mem_free
                + ", used: " + (mem_total - mem_free) + ", total: " + mem_total
                + ", max: " + mem_max);
        mem_last = mem_free;
    }

    private static Directory create_index() throws IOException {
        print_mem("create 1");
        Directory directory = new RAMDirectory();
        Calendar c = Calendar.getInstance();
        SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd");
        IndexWriter writer = new IndexWriter(directory, new StandardAnalyzer(), true);
        for (int i = 0; i < 365 * 30; i++) {
            Document doc = new Document();
            doc.add(Field.Keyword("date", df.format(new Date(c.getTimeInMillis()))));
            doc.add(Field.Keyword("id", "AB" + String.valueOf(i)));
            doc.add(Field.Text("text", "Tohle je text " + i));
            writer.addDocument(doc);
            c.add(Calendar.DAY_OF_YEAR, 1);
        }
        writer.optimize();
        System.err.println("index size: " + writer.docCount());
        writer.close();
        print_mem("create 2");
        return directory;
    }
}
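One caveat about the numbers print_mem reports: Runtime.freeMemory() is sampled without first running the collector, so the deltas include garbage that is merely unreclaimed, not leaked - which is exactly the ambiguity John points out. A best-effort refinement (System.gc() is only a hint to the VM, so the result is approximate, not exact) is to request a few collections before sampling:

```java
// Best-effort "used heap" sample: ask the VM to collect a few times,
// then read the counters. System.gc() is only a hint, so treat the
// result as approximate rather than a precise leak measurement.
class MemSample {
    static long usedAfterGc() {
        Runtime rt = Runtime.getRuntime();
        for (int i = 0; i < 3; i++) {   // a few rounds, letting finalizers settle
            rt.gc();
            try { Thread.sleep(50); } catch (InterruptedException e) { /* ignore */ }
        }
        return rt.totalMemory() - rt.freeMemory();
    }
}
```

If the used-after-GC figure still grows by a steady amount per loop iteration, that is much stronger evidence of a real leak than a raw freeMemory() delta.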