Re: Improving jvm heap consumption when indexing

2007-12-27 Thread tgospodinov
I am using Lucene 2.2.0 on Java SE 1.6.0_03-b05 in an Eclipse RCP app. I will try reusing documents and see how it'll affect heap. In the meantime, please let me know if there's anything else I can do. Thanks for the quick reply. Grant Ingersoll-6 wrote: > > This question is better for java-

Improving jvm heap consumption when indexing

2007-12-27 Thread tgospodinov
I am indexing a collection of 100,000+ sentences in memory, as part of a client app. I tested the JVM heap consumption, and it increased by about 40 megs. I tried indexing on disk (which I can't do in a production environment) just to test the heap usage, and I came up with about 20-25 megs. I have

Re: SinkTokenizer: next(Token) vs. next()

2007-12-27 Thread Grant Ingersoll
On Dec 26, 2007, at 6:20 PM, Doron Cohen wrote: Working on Lucene-1101 I checked if SinkTokenizer.next(Token) should also call Token.clear(). (It shouldn't, because it ignores the input token.) However I think that calls to next() would end up creating Tokens for nothing (by TokenStream.ne

Re: Improving jvm heap consumption when indexing

2007-12-27 Thread Grant Ingersoll
This question is better for java-user, but some questions for you: What version of Lucene are you using? The trunk now has the ability to reuse Tokens/Documents, etc. to save on allocations that may help reduce the amount of heap needed. The other thing to do is try profiling to see where

Re: Improving jvm heap consumption when indexing

2007-12-27 Thread tgospodinov
Reusing documents and invoking garbage collection slowed performance, but improved heap usage big time (from an average of 20+ megs to about 12+ megs). Is there anything else that can be done? Grant Ingersoll-6 wrote: > > This question is better for java-user, but some questio

Re: Improving jvm heap consumption when indexing

2007-12-27 Thread Grant Ingersoll
You will also want to reuse the Token using Lucene 2.3-dev (i.e., the latest from Subversion). -Grant On Dec 27, 2007, at 10:26 AM, tgospodinov wrote: Reusing documents and invoking garbage collection slowed performance, but improved heap usage big time (From an average of 20 megs+ to abou

RE: site javadocs link broken

2007-12-27 Thread Steven A Rowe
Hi Doron, All of these worked for me when I clicked on them just now from the site: http://lucene.apache.org/java/2_2_0/api/index.html http://lucene.apache.org/java/2_1_0/api/index.html http://lucene.apache.org/java/2_0_0/api/index.html http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightl

Re: Improving jvm heap consumption when indexing

2007-12-27 Thread robert engels
Did you try running with -Xmx8m? If you don't limit the heap, the JVM will use as much as you allow it... On Dec 27, 2007, at 8:01 AM, tgospodinov wrote: I am indexing a collection of 100,000+ sentences in memory, as part of a client app. I tested the jvm heap consumption and it increase

[jira] Updated: (LUCENE-1093) SpanFirstQuery modification to aid term boosting based on position.

2007-12-27 Thread Peter Keegan (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Keegan updated LUCENE-1093: - Attachment: TestBasics20071227.patch Here is a patch to 'TestBasics' that adds a test case for '

[jira] Updated: (LUCENE-1093) SpanFirstQuery modification to aid term boosting based on position.

2007-12-27 Thread Paul Elschot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Elschot updated LUCENE-1093: - Priority: Minor (was: Major) Lucene Fields: [New, Patch Available] (was: [New]) With

[jira] Resolved: (LUCENE-1068) Invalid behavior of StandardTokenizerImpl

2007-12-27 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved LUCENE-1068. - Resolution: Fixed Committed. > Invalid behavior of StandardTokenizerImpl > ---

Build failed in Hudson: Lucene-Nightly #316

2007-12-27 Thread hudson
See http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/316/changes -- [...truncated 763 lines...] A contrib/snowball/src/test/org/apache/lucene/analysis A contrib/snowball/src/test/org/apache/lucene/analysis/snowball AU con