RE: OutOfMemory example

2004-09-14 Thread Jiri Kuhn
The error is thrown at exactly the same point as before. This morning I downloaded 
Lucene from CVS; now the jar is lucene-1.5-rc1-dev.jar, the JVM is 1.4.2_05-b04, on 
both Linux and Windows.

Jiri.

-Original Message-
From: Daniel Naber [mailto:[EMAIL PROTECTED]
Sent: Monday, September 13, 2004 10:58 PM
To: Lucene Users List
Subject: Re: OutOfMemory example


On Monday 13 September 2004 15:06, Jiri Kuhn wrote:

 I think I can reproduce a memory leak while reopening
 an index. The Lucene version tested is 1.4.1; version 1.4 final works OK. My
 JVM is:

Could you try with the latest Lucene version from CVS? I cannot reproduce 
your problem with that version (Sun's Java 1.4.2_03, Linux).

Regards
 Daniel

-- 
http://www.danielnaber.de

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: OutOfMemory example

2004-09-14 Thread Daniel Naber
On Tuesday 14 September 2004 08:32, Jiri Kuhn wrote:

 The error is thrown at exactly the same point as before. This morning I
 downloaded Lucene from CVS; now the jar is lucene-1.5-rc1-dev.jar, the JVM
 is 1.4.2_05-b04, on both Linux and Windows.

Now I can reproduce the problem. I first tried running the code inside 
Eclipse, but the Exception doesn't occur there. It does occur on the 
command line.

Regards
 Daniel

-- 
http://www.danielnaber.de




Re: OutOfMemory example

2004-09-13 Thread John Moylan
You should reuse your old index (e.g. as an application variable) unless 
it has changed; use getCurrentVersion to check the index for updates. 
This has come up before.

John
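John's suggestion (keep the searcher around and reopen it only when getCurrentVersion changes) can be sketched roughly as follows. This is a hypothetical, Lucene-free sketch: the AtomicLong stands in for IndexReader.getCurrentVersion(directory), and the plain Object for a real IndexSearcher.

```java
import java.util.concurrent.atomic.AtomicLong;

public class SearcherCache {
    private final AtomicLong indexVersion; // stand-in for IndexReader.getCurrentVersion()
    private long cachedVersion = -1;
    private Object cachedSearcher;         // stand-in for a real IndexSearcher

    public SearcherCache(AtomicLong indexVersion) {
        this.indexVersion = indexVersion;
    }

    // Reopen only when the index version has moved on; otherwise reuse the old instance.
    public synchronized Object getSearcher() {
        long current = indexVersion.get();
        if (cachedSearcher == null || current != cachedVersion) {
            cachedSearcher = "searcher@" + current; // would be: new IndexSearcher(directory)
            cachedVersion = current;
        }
        return cachedSearcher;
    }

    public static void main(String[] args) {
        AtomicLong version = new AtomicLong(1);
        SearcherCache cache = new SearcherCache(version);
        Object first = cache.getSearcher();
        System.out.println(first == cache.getSearcher()); // same instance: version unchanged
        version.set(2);                                   // simulates an index update
        System.out.println(first == cache.getSearcher()); // reopened: version changed
    }
}
```

The point is that the per-reopen allocation in the loop disappears whenever the index has not actually changed.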
Jiri Kuhn wrote:
Hi,
I think I can reproduce a memory leak while reopening an index. 
The Lucene version tested is 1.4.1; version 1.4 final works OK. My JVM is:
$ java -version
java version "1.4.2_05"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_05-b04)
Java HotSpot(TM) Client VM (build 1.4.2_05-b04, mixed mode)
The code you can test is below. With -Xmx5m there are only 3 iterations for me; the 4th fails.
Jiri.
package test;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
/**
 * Run this test with Lucene 1.4.1 and -Xmx5m
 */
public class ReopenTest
{
    private static long mem_last = 0;

    public static void main(String[] args) throws IOException
    {
        Directory directory = create_index();
        for (int i = 1; i < 100; i++) {
            System.err.println("loop " + i);
            search_index(directory);
        }
    }

    private static void search_index(Directory directory) throws IOException
    {
        IndexReader reader = IndexReader.open(directory);
        Searcher searcher = new IndexSearcher(reader);
        print_mem("search 1");
        SortField[] fields = new SortField[2];
        fields[0] = new SortField("date", SortField.STRING, true);
        fields[1] = new SortField("id", SortField.STRING, false);
        Sort sort = new Sort(fields);
        TermQuery query = new TermQuery(new Term("text", "\"text 5\""));
        print_mem("search 2");
        Hits hits = searcher.search(query, sort);
        print_mem("search 3");
        for (int i = 0; i < hits.length(); i++) {
            Document doc = hits.doc(i);
            System.out.println("doc " + i + ": " + doc.toString());
        }
        print_mem("search 4");
        searcher.close();
        reader.close();
    }

    private static void print_mem(String log)
    {
        long mem_free = Runtime.getRuntime().freeMemory();
        long mem_total = Runtime.getRuntime().totalMemory();
        long mem_max = Runtime.getRuntime().maxMemory();
        long delta = (mem_last - mem_free) * -1;
        System.out.println(log + " = delta: " + delta + ", free: " + mem_free + ", used: " +
            (mem_total - mem_free) + ", total: " + mem_total + ", max: " + mem_max);

        mem_last = mem_free;
    }

    private static Directory create_index() throws IOException
    {
        print_mem("create 1");
        Directory directory = new RAMDirectory();
        Calendar c = Calendar.getInstance();
        SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd");
        IndexWriter writer = new IndexWriter(directory, new StandardAnalyzer(), true);
        for (int i = 0; i < 365 * 30; i++) {
            Document doc = new Document();
            doc.add(Field.Keyword("date", df.format(new Date(c.getTimeInMillis()))));
            doc.add(Field.Keyword("id", "AB" + String.valueOf(i)));
            doc.add(Field.Text("text", "Tohle je text " + i));
            writer.addDocument(doc);
            c.add(Calendar.DAY_OF_YEAR, 1);
        }
        writer.optimize();
        System.err.println("index size: " + writer.docCount());
        writer.close();
        print_mem("create 2");
        return directory;
    }
}
**
The information in this e-mail is confidential and may be legally privileged.
It is intended solely for the addressee. Access to this e-mail by anyone else
is unauthorised. If you are not the intended recipient, any disclosure,
copying, distribution, or any action taken or omitted to be taken in reliance
on it, is prohibited and may be unlawful.
Please note that emails to, from and within RT may be subject to the Freedom
of Information Act 1997 and may be liable to disclosure.
**


RE: OutOfMemory example

2004-09-13 Thread Jiri Kuhn
I disagree, or I don't understand. 

I can change the code as shown below. Now I must reopen the index to see the 
changes, but the memory problem remains. I really don't know what I'm doing wrong; the 
code is so simple.

Jiri.

...

public static void main(String[] args) throws IOException
{
    Directory directory = create_index();

    for (int i = 1; i < 100; i++) {
        System.err.println("loop " + i + ", index version: " +
            IndexReader.getCurrentVersion(directory));
        search_index(directory);
        add_to_index(directory, i);
    }
}

private static void add_to_index(Directory directory, int i) throws IOException
{
    IndexWriter writer = new IndexWriter(directory, new StandardAnalyzer(), false);

    SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd");
    Document doc = new Document();

    doc.add(Field.Keyword("date", df.format(new Date(System.currentTimeMillis()))));
    doc.add(Field.Keyword("id", "CD" + String.valueOf(i)));
    doc.add(Field.Text("text", "Tohle neni text " + i));
    writer.addDocument(doc);

    System.err.println("index size: " + writer.docCount());
    writer.close();
}

...

-Original Message-
From: John Moylan [mailto:[EMAIL PROTECTED]
Sent: Monday, September 13, 2004 3:25 PM
To: Lucene Users List
Subject: Re: OutOfMemory example


You should reuse your old index (as eg an application variable) unless 
it has changed - use getCurrentVersion to check the index for updates. 
This has come up before.

John





Re: OutOfMemory example

2004-09-13 Thread John Moylan
http://issues.apache.org/bugzilla/show_bug.cgi?id=30628
You can close the index, but the garbage collector still needs to 
reclaim the memory, and it may be taking longer than your loop to do so.

John


Re: OutOfMemory example

2004-09-13 Thread sergiu gordea
I have a few comments regarding your code ...
1. Why do you use RAMDirectory and not the hard disk?
2. As John said, you should reuse the index instead of creating it each 
time in the main function:

   if (!indexExists(indexFile))
       writer = new IndexWriter(directory, new StandardAnalyzer(), true);
   else
       writer = new IndexWriter(directory, new StandardAnalyzer(), false);

   (in some cases indexExists can be as simple as verifying whether the file 
exists on the hard disk)

3. You iterate in a loop over 10,000 times and create a lot of objects:

for (int i = 0; i < 365 * 30; i++) {
    Document doc = new Document();
    doc.add(Field.Keyword("date", df.format(new Date(c.getTimeInMillis()))));
    doc.add(Field.Keyword("id", "AB" + String.valueOf(i)));
    doc.add(Field.Text("text", "Tohle je text " + i));
    writer.addDocument(doc);

    c.add(Calendar.DAY_OF_YEAR, 1);
}

All the underlined lines of code create new objects, and all of them are 
kept in memory. That is a lot of memory allocated by this loop alone; I think 
you create more than 100,000 objects in it. What do you think?
And none of them can be released (collected by the GC) until you close 
the index writer.

No one says that your code is complicated, but all programmers should 
understand that this is poor design...
More than that, your information is kept in a RAMDirectory, so even 
when you close the writer you will still keep the information 
in memory ...

Sorry if I was too aggressive with my comments, but ... I cannot see 
what you were thinking when you wrote that code ...

If you are trying to make a test, then I suggest you replace the 
hard-coded 365 value with a variable, iterate over it, and test the power 
of your machine (PC + JVM) :))

I wish you luck,
Sergiu
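Sergiu's indexExists check really can be that simple. A minimal sketch, assuming the on-disk layout of Lucene indexes of that era, where an index directory contains a "segments" file (Lucene 1.x also offers IndexReader.indexExists() for this):

```java
import java.io.File;

public class IndexCheck {
    // Heuristic: treat an index as present when its directory holds the
    // "segments" file that Lucene writes. The file name is an assumption
    // about the on-disk format of Lucene 1.x.
    static boolean indexExists(File indexDir) {
        return indexDir.isDirectory() && new File(indexDir, "segments").exists();
    }

    public static void main(String[] args) {
        System.out.println(indexExists(new File("no-such-index"))); // prints false
    }
}
```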




RE: OutOfMemory example

2004-09-13 Thread Jiri Kuhn
Thanks for the bug id; it seems like my problem, and I have stand-alone code with 
main().

What about the slow garbage collector? That looks to me like a wrong suggestion.

Let's change the code once again:

...
public static void main(String[] args) throws IOException, InterruptedException
{
    Directory directory = create_index();

    for (int i = 1; i < 100; i++) {
        System.err.println("loop " + i + ", index version: " +
            IndexReader.getCurrentVersion(directory));
        search_index(directory);
        add_to_index(directory, i);
        System.gc();
        Thread.sleep(1000); // whatever value you want
    }
}
...

and in the 4th iteration java.lang.OutOfMemoryError appears again.

Jiri.


-Original Message-
From: John Moylan [mailto:[EMAIL PROTECTED]
Sent: Monday, September 13, 2004 4:53 PM
To: Lucene Users List
Subject: Re: OutOfMemory example


http://issues.apache.org/bugzilla/show_bug.cgi?id=30628

you can close the index, but the Garbage Collector still needs to 
reclaim the memory and it may be taking longer than your loop to do so.

John



RE: OutOfMemory example

2004-09-13 Thread Jiri Kuhn
You don't see the point of my post. I sent an application that anyone can run with 
only the Lucene jar, and that deterministically produces an OutOfMemoryError.

That's all.

Jiri.


-Original Message-
From: sergiu gordea [mailto:[EMAIL PROTECTED]
Sent: Monday, September 13, 2004 5:16 PM
To: Lucene Users List
Subject: Re: OutOfMemory example





force gc idiom - Re: OutOfMemory example

2004-09-13 Thread David Spencer
Jiri Kuhn wrote:
 Thanks for the bug id; it seems like my problem, and I have stand-alone code with 
 main().
 What about the slow garbage collector? That looks to me like a wrong suggestion.

I've seen this written up before (JavaWorld?) as a way to more reliably 
force GC instead of just a System.gc() call. I think the 2nd gc() call 
is supposed to clean up junk from the runFinalization() call...

System.gc();
Thread.sleep( 100);
System.runFinalization();
Thread.sleep( 100);
System.gc();


RE: force gc idiom - Re: OutOfMemory example

2004-09-13 Thread Jiri Kuhn
This doesn't work either!

Let's concentrate on the first version of my code. I believe the code should run 
endlessly (as I said before, with version 1.4 final it does).

Jiri.

-Original Message-
From: David Spencer [mailto:[EMAIL PROTECTED]
Sent: Monday, September 13, 2004 5:34 PM
To: Lucene Users List
Subject: force gc idiom - Re: OutOfMemory example





Re: OutOfMemory example

2004-09-13 Thread sergiu gordea
Then it is probably my mistake ... I haven't read all the emails in the thread.
So ... your goal is to produce the error ... I try to avoid them :))
  All the best,
 Sergiu
 



OptimizeIt -- Re: force gc idiom - Re: OutOfMemory example

2004-09-13 Thread David Spencer
Jiri Kuhn wrote:
 This doesn't work either!

You're right.
I'm running under JDK 1.5 and trying larger values for -Xmx, and it still 
fails.

Running under (Borland's) OptimizeIt shows the number of Term and 
TermInfo instances (both in org.apache.lucene.index) increasing every time through the 
loop, by several hundred instances each.

I can trace through some Term instances on the reference graph in 
OptimizeIt, but it's unclear to me what's right. One *guess* is that 
the WeakHashMap in either SegmentReader or FieldCacheImpl is the 
problem.
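The WeakHashMap guess is at least consistent with how weak-keyed maps behave: an entry becomes collectable only once nothing else strongly references its key. A minimal stand-alone demonstration (plain JDK, no Lucene; the "reader" objects are hypothetical stand-ins) of a weak-keyed cache whose keys are still strongly held elsewhere:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

public class WeakCacheDemo {
    public static void main(String[] args) {
        Map<Object, String> cache = new WeakHashMap<>();
        List<Object> stillReferenced = new ArrayList<>();

        for (int i = 0; i < 100; i++) {
            Object reader = new Object();     // stand-in for a SegmentReader
            cache.put(reader, "comparator-" + i);
            stillReferenced.add(reader);      // a lingering strong reference elsewhere
        }
        System.gc();
        // Every key is still strongly reachable, so no entry is eligible for
        // collection: the weak-keyed cache keeps all 100 entries.
        System.out.println(cache.size());     // prints 100
    }
}
```

So a WeakHashMap only helps if the last strong reference to each reader actually goes away; if something retains the readers, the cache grows regardless.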





FieldSortedHitQueue.Comparators -- Re: force gc idiom - Re: OutOfMemory example

2004-09-13 Thread David Spencer
Just noticed something else suspicious:
FieldSortedHitQueue has a static field called Comparators, and it seems like 
things are never removed from it.



Re: FieldSortedHitQueue.Comparators -- Re: force gc idiom - Re: OutOfMemory example

2004-09-13 Thread David Spencer
David Spencer wrote:
 Just noticed something else suspicious:
 FieldSortedHitQueue has a static field called Comparators, and it seems like 
 things are never removed from it.

Replying to my own post... this could be the problem.
If I put a print statement here in FieldSortedHitQueue, recompile, 
and run with the new jar, then I see Comparators.size() go up after every 
iteration through ReopenTest's loop, and the size() never goes down...

  static Object store (IndexReader reader, String field, int type, 
      Object factory, Object value) {
    FieldCacheImpl.Entry entry = (factory != null)
      ? new FieldCacheImpl.Entry (field, factory)
      : new FieldCacheImpl.Entry (field, type);
    synchronized (Comparators) {
      HashMap readerCache = (HashMap) Comparators.get(reader);
      if (readerCache == null) {
        readerCache = new HashMap();
        Comparators.put(reader, readerCache);
        System.out.println("*\t* NOW: " + Comparators.size());
      }
      return readerCache.put (entry, value);
    }
  }

Ji Kuhn wrote:
This doesn't work either!
Lets concentrate on the first version of my code. I believe that the 
code should run endlesly (I have said it before: in version 1.4 final 
it does).

Jiri.
-Original Message-
From: David Spencer [mailto:[EMAIL PROTECTED]
Sent: Monday, September 13, 2004 5:34 PM
To: Lucene Users List
Subject: force gc idiom - Re: OutOfMemory example
Ji Kuhn wrote:

Thanks for the bug's id, it seems like my problem and I have a 
stand-alone code with main().

What about slow garbage collector? This looks for me as wrong 
suggestion.


I've seen this written up before (javaworld?) as a way to probably 
force GC instead of just a System.gc() call. I think the 2nd gc() 
call is supposed to clean up junk from the runFinalization() call...

System.gc();
Thread.sleep( 100);
System.runFinalization();
Thread.sleep( 100);
System.gc();
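The idiom above can be packaged as a small self-contained helper (the class and method names below are ours, not from the thread's code):

```java
// Self-contained sketch of the forced-GC idiom quoted above. Note that all
// of these calls are hints: the JVM is free to ignore them.
public class ForceGc {

    // Best-effort full collection: gc, drain pending finalizers, gc again.
    public static void forceGc() {
        try {
            System.gc();
            Thread.sleep(100);
            System.runFinalization();
            Thread.sleep(100);
            System.gc();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
        }
    }

    public static void main(String[] args) {
        byte[] garbage = new byte[1 << 20]; // 1 MB that becomes unreachable
        garbage = null;
        forceGc();
        System.out.println("requested a full collection");
    }
}
```

Even a "successful" collection here cannot reclaim objects that are still strongly reachable, which is why it does not help with a cache that never drops its keys.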

Let's change the code once again:
...
    public static void main(String[] args) throws IOException, InterruptedException
    {
        Directory directory = create_index();

        for (int i = 1; i < 100; i++) {
            System.err.println("loop " + i + ", index version: "
                + IndexReader.getCurrentVersion(directory));
            search_index(directory);
            add_to_index(directory, i);
            System.gc();
            Thread.sleep(1000); // whatever value you want
        }
    }
...

and in the 4th iteration java.lang.OutOfMemoryError appears again.
Jiri.
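The loop above opens a fresh searcher every iteration. The reuse advice from earlier in the thread (keep one reader open, reopen only when `IndexReader.getCurrentVersion` reports a change) can be sketched in plain Java; the class below and the String standing in for an open IndexReader are hypothetical:

```java
// Plain-Java sketch of the reader-reuse pattern: reopen only when the index
// version actually changes, otherwise hand back the already-open handle.
public class VersionedReaderHolder {
    private long openVersion = -1;
    private String reader; // hypothetical stand-in for an open IndexReader

    public String acquire(long currentVersion) {
        if (reader == null || currentVersion != openVersion) {
            reader = "reader@v" + currentVersion; // "reopen" the index here
            openVersion = currentVersion;
        }
        return reader; // unchanged version: reuse the open handle
    }
}
```

With this pattern the old reader is only replaced (and becomes collectable) when the index really changed, instead of once per loop iteration.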
-Original Message-
From: John Moylan [mailto:[EMAIL PROTECTED]
Sent: Monday, September 13, 2004 4:53 PM
To: Lucene Users List
Subject: Re: OutOfMemory example
http://issues.apache.org/bugzilla/show_bug.cgi?id=30628
you can close the index, but the Garbage Collector still needs to 
reclaim the memory and it may be taking longer than your loop to do so.

John



SegmentReader - Re: FieldSortedHitQueue.Comparators -- Re: force gc idiom - Re: OutOfMemory example

2004-09-13 Thread David Spencer
Another clue: the SegmentReaders are piling up too, which may be why the 
Comparators map is increasing in size, because SegmentReaders are the keys 
to Comparators... though again, I don't know enough about the Lucene 
internals to know which refs to SegmentReaders are valid and which ones 
may be causing this leak.

David Spencer wrote:
David Spencer wrote:
Just noticed something else suspicious.
FieldSortedHitQueue has a field called Comparators and it seems like 
things are never removed from it.

Replying to my own post... this could be the problem.
If I put in a print statement here in FieldSortedHitQueue, recompile, 
and run w/ the new jar then I see Comparators.size() go up after every 
iteration thru ReopenTest's loop and the size() never goes down...

 static Object store (IndexReader reader, String field, int type,
                      Object factory, Object value) {
    FieldCacheImpl.Entry entry = (factory != null)
      ? new FieldCacheImpl.Entry (field, factory)
      : new FieldCacheImpl.Entry (field, type);
    synchronized (Comparators) {
      HashMap readerCache = (HashMap) Comparators.get(reader);
      if (readerCache == null) {
        readerCache = new HashMap();
        Comparators.put(reader, readerCache);
        System.out.println("*\t* NOW: " + Comparators.size());
      }
      return readerCache.put(entry, value);
    }
  }


Ji Kuhn wrote:
This doesn't work either!
Let's concentrate on the first version of my code. I believe that the 
code should run endlessly (I have said it before: in version 1.4 final 
it does).

Jiri.
-Original Message-
From: David Spencer [mailto:[EMAIL PROTECTED]
Sent: Monday, September 13, 2004 5:34 PM
To: Lucene Users List
Subject: force gc idiom - Re: OutOfMemory example
Ji Kuhn wrote:

Thanks for the bug's id, it seems like my problem and I have a 
stand-alone code with main().

What about slow garbage collector? This looks for me as wrong 
suggestion.


I've seen this written up before (javaworld?) as a way to probably 
force GC instead of just a System.gc() call. I think the 2nd gc() 
call is supposed to clean up junk from the runFinalization() call...

System.gc();
Thread.sleep( 100);
System.runFinalization();
Thread.sleep( 100);
System.gc();

Let's change the code once again:
...
    public static void main(String[] args) throws IOException, InterruptedException
    {
        Directory directory = create_index();

        for (int i = 1; i < 100; i++) {
            System.err.println("loop " + i + ", index version: "
                + IndexReader.getCurrentVersion(directory));
            search_index(directory);
            add_to_index(directory, i);
            System.gc();
            Thread.sleep(1000); // whatever value you want
        }
    }
...

and in the 4th iteration java.lang.OutOfMemoryError appears again.
Jiri.
-Original Message-
From: John Moylan [mailto:[EMAIL PROTECTED]
Sent: Monday, September 13, 2004 4:53 PM
To: Lucene Users List
Subject: Re: OutOfMemory example
http://issues.apache.org/bugzilla/show_bug.cgi?id=30628
you can close the index, but the Garbage Collector still needs to 
reclaim the memory and it may be taking longer than your loop to do so.

John



Re: OutOfMemory example

2004-09-13 Thread Daniel Naber
On Monday 13 September 2004 15:06, Ji Kuhn wrote:

 I think I can reproduce memory leaking problem while reopening
 an index. Lucene version tested is 1.4.1, version 1.4 final works OK. My
 JVM is:

Could you try with the latest Lucene version from CVS? I cannot reproduce 
your problem with that version (Sun's Java 1.4.2_03, Linux).

Regards
 Daniel

-- 
http://www.danielnaber.de




Re: OptimizeIt -- Re: force gc idiom - Re: OutOfMemory example

2004-09-13 Thread Kevin A. Burton
David Spencer wrote:
Ji Kuhn wrote:
This doesn't work either!

You're right.
I'm running under JDK1.5 and trying larger values for -Xmx and it 
still fails.

Running under (Borland's) OptimizeIt shows the number of Terms and 
TermInfos (both in org.apache.lucene.index) increase every time thru 
the loop, by several hundred instances each.
Yes... I'm running into a similar situation on JDK 1.4.2 with Lucene 
1.3... I used the JMP debugger and all my memory is taken by Terms and 
TermInfo...

I can trace thru some Term instances on the reference graph of 
OptimizeIt but it's unclear to me what's right. One *guess* is that 
maybe the WeakHashMap in either SegmentReader or FieldCacheImpl is the 
problem.
Kevin
--
Please reply using PGP.
   http://peerfear.org/pubkey.asc
   
   NewsMonster - http://www.newsmonster.org/
   
Kevin A. Burton, Location - San Francisco, CA, Cell - 415.595.9965
  AIM/YIM - sfburtonator,  Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
 IRC - freenode.net #infoanarchy | #p2p-hackers | #newsmonster



Re: OutOfMemory example

2004-09-13 Thread Kevin A. Burton
Ji Kuhn wrote:
Hi,
I think I can reproduce memory leaking problem while reopening an index. 
Lucene version tested is 1.4.1, version 1.4 final works OK. My JVM is:
$ java -version
java version 1.4.2_05
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_05-b04)
Java HotSpot(TM) Client VM (build 1.4.2_05-b04, mixed mode)
The code you can test is below: with -Xmx5m there are only 3 iterations 
for me, the 4th fails.

At least this test seems tied to the Sort API... I removed the sort 
under Lucene 1.3 and it worked fine...

Kevin
--
Please reply using PGP.
   http://peerfear.org/pubkey.asc
   
   NewsMonster - http://www.newsmonster.org/
   
Kevin A. Burton, Location - San Francisco, CA, Cell - 415.595.9965
  AIM/YIM - sfburtonator,  Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
 IRC - freenode.net #infoanarchy | #p2p-hackers | #newsmonster



Re: OutOfMemory example

2004-09-13 Thread David Spencer
Daniel Naber wrote:
On Monday 13 September 2004 15:06, Ji Kuhn wrote:

   I think I can reproduce memory leaking problem while reopening
an index. Lucene version tested is 1.4.1, version 1.4 final works OK. My
JVM is:

Could you try with the latest Lucene version from CVS? I cannot reproduce 
your problem with that version (Sun's Java 1.4.2_03, Linux).
I verified it w/ the latest lucene code from CVS under win xp.
Regards
 Daniel
