Hi Chris,
I do not have stats, but I think the performance
is reasonable. I use xpdf for PDF and wvWare for DOC.
The size of my index is ~2GB (it is not limited to
PDF and DOC only). To avoid memory problems, I have
set an upper bound on the size of the documents that
can be indexed. For
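A minimal sketch of such a size guard, checked before handing a file to xpdf/wvWare and the IndexWriter (the 5 MB limit and the class name are illustrative assumptions, not figures from the mail above):

import java.io.File;

public class SizeGuard {
    // Skip files above a fixed byte limit before text extraction.
    private static final long MAX_BYTES = 5L * 1024 * 1024;

    static boolean indexable(File f) {
        return f.isFile() && f.length() <= MAX_BYTES;
    }
}

(On the Lucene side, IndexWriter's maxFieldLength setting, a public field in the 1.4-era API if memory serves, similarly caps the number of tokens indexed per field.)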
Hi Doug,
you are absolutely right about the older version of the JDK: it is 1.3.1
(ibm).
Unfortunately we cannot upgrade since we are bound to IBM Portalserver 4
environment.
Results:
I patched the Lucene1.4.1:
it has not improved much: after indexing 1897 objects, the number of
SegmentTermEnum
Hi guys,
Apologies...
My task is to build the index folder using Lucene, with a simple
build.xml for Ant.
The problem: the same build.xml should be used for different OSes
[ Win / Linux ].
The glitch is that the respective jar files, such as lucene-1.4.jar and other jar
files, are not
Hi,
I recently had the same kind of problem, but it was due to the way I was dealing with
Hits.
Obtaining a Hits object from a Query is very fast, but then I was looping over ALL the
hits to retrieve information on the documents before displaying the result to the
user.
It was not necessary
You should reuse your old index (e.g. as an application variable) unless
it has changed - use getCurrentVersion to check the index for updates.
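A minimal sketch of that approach, assuming a file-system index path (the class and field names are illustrative, not from the thread):

import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;

public class SearcherCache {
    private IndexSearcher searcher;
    private long version = -1;
    private final String indexPath;

    public SearcherCache(String indexPath) {
        this.indexPath = indexPath;
    }

    // Reopen the searcher only when the index version has changed.
    public synchronized IndexSearcher get() throws IOException {
        long current = IndexReader.getCurrentVersion(indexPath);
        if (searcher == null || current != version) {
            if (searcher != null) searcher.close();
            searcher = new IndexSearcher(indexPath);
            version = current;
        }
        return searcher;
    }
}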
This has come up before.
John
Jiri Kuhn wrote:
Hi,
I think I can reproduce a memory leak problem while reopening an index.
Lucene version tested
Okay, the reference test is done:
on JDK 1.4.2, Lucene 1.4.1 really seems to run fine: just a moderate
number of SegmentTermEnums, which is kept under control by GC (about 500
for the 1900 test objects).
Daniel Taurat wrote:
Hi Doug,
you are absolutely right about the older version of the JDK: it is
1.3.1
I disagree or I don't understand.
I can change the code as shown below. Now I must reopen the index to see the
changes, but the memory problem remains. I really don't know what I'm doing wrong; the
code is so simple.
Jiri.
...
public static void main(String[] args) throws
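The posted code is cut off here; as a rough reconstruction for readers who want to follow along, a loop of this shape (the details are guesses, not the original posting) exercises the same reopen-and-search path the thread is discussing, against Lucene 1.4.1:

import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.RAMDirectory;

public class ReopenLoop {
    public static void main(String[] args)
            throws IOException, InterruptedException {
        RAMDirectory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);
        Document doc = new Document();
        doc.add(Field.Keyword("id", "1"));
        writer.addDocument(doc);
        writer.close();

        while (true) { // reopen forever; watch the heap with -verbose:gc
            IndexSearcher searcher = new IndexSearcher(dir);
            searcher.search(new TermQuery(new Term("id", "1")), new Sort("id"));
            searcher.close();
        }
    }
}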
http://issues.apache.org/bugzilla/show_bug.cgi?id=30628
you can close the index, but the Garbage Collector still needs to
reclaim the memory and it may be taking longer than your loop to do so.
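For illustration, a loop could yield to the collector between iterations, though this is only a hint and guarantees nothing:

public class GcPause {
    // Closing the index only drops references; reclamation happens
    // on the collector's own schedule, so give it a moment.
    static void pauseForGc() throws InterruptedException {
        System.gc();
        System.runFinalization();
        Thread.sleep(100); // arbitrary pause; the VM may still defer collection
    }
}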
John
Jiri Kuhn wrote:
I disagree or I don't understand.
I can change the code as it is shown below.
I have a few comments regarding your code ...
1. Why do you use RAMDirectory and not the hard disk?
2. As John said, you should reuse the index instead of creating it each
time in the main function:
if (!indexExists(indexFile))
    IndexWriter writer = new IndexWriter(directory, new
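Spelled out, the idea might look like this (a sketch; the class and method names are illustrative, and the boolean tells IndexWriter whether to build a fresh index):

import java.io.File;
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;

public class IndexOpener {
    // Create the index only if it does not exist yet; otherwise append.
    static IndexWriter open(File indexDir) throws IOException {
        boolean create = !IndexReader.indexExists(indexDir);
        return new IndexWriter(indexDir, new StandardAnalyzer(), create);
    }
}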
Thanks for the bug id; it seems like my problem, and I have stand-alone code with
main().
What about the slow garbage collector? This looks like a wrong suggestion to me.
Let's change the code once again:
...
public static void main(String[] args) throws IOException, InterruptedException
{
You don't see the point of my post. I sent an application which everyone can run
with only the Lucene jar and which deterministically produces an OutOfMemoryError.
That's all.
Jiri.
-----Original Message-----
From: sergiu gordea [mailto:[EMAIL PROTECTED]
Sent: Monday, September 13, 2004 5:16 PM
To:
Jiri Kuhn wrote:
Thanks for the bug id; it seems like my problem, and I have stand-alone code with
main().
What about the slow garbage collector? This looks like a wrong suggestion to me.
I've seen this written up before (JavaWorld?) as a way to more reliably
force GC than just a System.gc() call. I
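The idiom being half-remembered is probably the WeakReference sentinel trick: keep collecting until a weakly held object actually disappears, which is a stronger signal than a single System.gc() call. A sketch, not the cited article's exact code:

import java.lang.ref.WeakReference;

public class GcForcer {
    static void forceGc() {
        Object sentinel = new Object();
        WeakReference ref = new WeakReference(sentinel);
        sentinel = null; // make the sentinel weakly reachable
        while (ref.get() != null) {
            System.gc(); // loop until the sentinel is actually collected
        }
    }
}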
This doesn't work either!
Let's concentrate on the first version of my code. I believe that the code should run
endlessly (I have said it before: in version 1.4 final it does).
Jiri.
-----Original Message-----
From: David Spencer [mailto:[EMAIL PROTECTED]
Sent: Monday, September 13, 2004 5:34 PM
then it is probably my mistake ... I haven't read all the emails in the thread.
So ... your goal is to produce errors ... I try to avoid them :))
All the best,
Sergiu
Jiri Kuhn wrote:
You don't see the point of my post. I sent an application which everyone can run
with only the Lucene jar and in
Jiri Kuhn wrote:
This doesn't work either!
You're right.
I'm running under JDK 1.5 and trying larger values for -Xmx, and it still
fails.
Running under (Borland's) OptimizeIt shows that the number of Terms and
TermInfos (both in org.apache.lucene.index) increases every time through the
loop, by several
Just noticed something else suspicious.
FieldSortedHitQueue has a field called Comparators and it seems like
things are never removed from it
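For readers unfamiliar with the pattern being described: a static cache keyed by reader that is never pruned pins every reader for the life of the VM. The sketch below illustrates the shape of the problem and one conventional fix (weak keys); it is not Lucene's actual source:

import java.util.Map;
import java.util.WeakHashMap;

public class ComparatorCache {
    // WeakHashMap lets an entry vanish once its reader key is otherwise
    // unreachable; a plain HashMap here would hold readers forever.
    private static final Map CACHE = new WeakHashMap();

    static synchronized Object lookup(Object reader) {
        Object comparator = CACHE.get(reader);
        if (comparator == null) {
            comparator = new Object(); // stand-in for building a comparator
            CACHE.put(reader, comparator);
        }
        return comparator;
    }
}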
Jiri Kuhn wrote:
This doesn't work either!
Let's concentrate on the first version of my code. I believe that the code should run
endlessly (I have said
David Spencer wrote:
Just noticed something else suspicious.
FieldSortedHitQueue has a field called Comparators and it seems like
things are never removed from it
Replying to my own post... this could be the problem.
If I put in a print statement here in FieldSortedHitQueue, recompile,
and
Another clue: the SegmentReaders are piling up too, which may be why the
Comparator map is increasing in size, because SegmentReaders are the
keys to Comparators... though again, I don't know enough about the Lucene
internals to know which refs to SegmentReaders are valid and which ones
may be
On Monday 13 September 2004 15:06, Jiri Kuhn wrote:
I think I can reproduce a memory leak problem while reopening
an index. The Lucene version tested is 1.4.1; version 1.4 final works OK. My
JVM is:
Could you try with the latest Lucene version from CVS? I cannot reproduce
your problem with that
David Spencer wrote:
Jiri Kuhn wrote:
This doesn't work either!
You're right.
I'm running under JDK 1.5 and trying larger values for -Xmx, and it
still fails.
Running under (Borland's) OptimizeIt shows that the number of Terms and
TermInfos (both in org.apache.lucene.index) increases every time through
the
Jiri Kuhn wrote:
Hi,
I think I can reproduce a memory leak problem while reopening an index.
The Lucene version tested is 1.4.1; version 1.4 final works OK. My JVM is:
$ java -version
java version "1.4.2_05"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_05-b04)
Java HotSpot(TM)
On Friday 10 September 2004 15:48, Chas Emerick wrote:
PDFTextStream should be added to the 'Document Converters' section,
with this URL: http://snowtide.com, and perhaps this heading:
'PDFTextStream -- PDF text and metadata extraction'. The 'Author'
field should probably be left blank,
Daniel Naber wrote:
On Monday 13 September 2004 15:06, Jiri Kuhn wrote:
I think I can reproduce a memory leak problem while reopening
an index. The Lucene version tested is 1.4.1; version 1.4 final works OK. My
JVM is:
Could you try with the latest Lucene version from CVS? I cannot reproduce
Hi,
I was looking through the score computation when running search, and I
think there may be a discrepancy between what is _documented_ in the
org.apache.lucene.search.Similarity class overview Javadocs and what
actually occurs in the code.
I believe the problem is only with the documentation.
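For reference, the Similarity overview Javadocs of that era present the score roughly as follows (reproduced here from memory, so treat the exact factors as approximate):

\mathrm{score}(q,d) = \mathrm{coord}(q,d) \cdot \mathrm{queryNorm}(q) \cdot \sum_{t \in q} \mathrm{tf}(t,d) \cdot \mathrm{idf}(t)^2 \cdot \mathrm{boost}(t) \cdot \mathrm{norm}(t,d)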