Re: Out of memory - CachingWrappperFilter and multiple threads

2008-02-19 Thread markharw00d
I now think the main issue here is that a busy JVM gets into trouble trying to find large free blocks of memory for large bitsets. In my index of 64 million documents, ~8 MB of contiguous free memory must be found for each bitset allocated. The terms I was trying to cache had 14 million entries
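
The ~8 MB figure follows from one bit per document. A minimal sketch of that arithmetic, assuming a dense bitset backed by a single `long[]` (as `java.util.BitSet` is); the class and method names are illustrative and not part of Lucene:

```java
public class BitSetFootprint {
    // Bytes needed for a dense bitset holding one bit per document,
    // rounded up to whole 64-bit words. Assumption: the bitset is backed
    // by one contiguous long[], so this is also the size of the single
    // block the JVM has to find.
    public static long denseBitSetBytes(long maxDoc) {
        long words = (maxDoc + 63) / 64; // round up to full words
        return words * 8;                // 8 bytes per long
    }

    public static void main(String[] args) {
        // 64 million documents, the figure quoted in the message above.
        System.out.println(denseBitSetBytes(64_000_000L) + " bytes"); // prints "8000000 bytes"
    }
}
```

Array headers and allocator alignment add a little on top, so this is a lower bound rather than an exact heap measurement.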

Re: Out of memory - CachingWrappperFilter and multiple threads

2008-02-19 Thread eks dev
Hi Mark, just out of curiosity, do you know the distribution of set bits in the terms you tried to cache? Maybe this simple tip could help. If you are lucky like we were, such terms typically used for filters are good candidates to be used to sort your index before indexing (once in a

Re: Lucene 2.3.1 release date

2008-02-19 Thread Michael McCandless
OK, SOLR-342 is now resolved: the user reported back that the current head of the 2.3 branch in fact resolves the issue he was seeing. So I think we can ship 2.3.1 now. Thanks Michael! Mike On 2/18/08, Michael McCandless <[EMAIL PROTECTED]> wrote: > The only hesitation I have is SOLR-342, where

Re: Lucene 2.3.1 release date

2008-02-19 Thread Michael Busch
Michael McCandless wrote: > OK, SOLR-342 is now resolved: the user reported back that the current > head of the 2.3 branch in fact resolves the issue he was seeing. > > So I think we can ship 2.3.1 now. Thanks Michael! > > Mike > Cool, I will build 2.3.1 later today! -Michael ---

Re: Out of memory - CachingWrappperFilter and multiple threads

2008-02-19 Thread Paul Elschot
Allocating large blocks while also allocating more smaller blocks is a known problem for memory allocators, so adding a pool with preallocated blocks sounds like a good idea. With 14 million of 64 million bits set, there may not be much room to decrease the memory needed. When the set bits are ra

Re: Out of memory - CachingWrappperFilter and multiple threads

2008-02-19 Thread robert engels
You could always use an array of byte[]. Each sub-array will be allocated on its own, making the contiguous memory needed much smaller. With proper coding the offset calculation is a simple shift, so the performance cost should be negligible given the other code. On Feb 19, 2008, at 1:48 PM, Paul Elsch
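
The array-of-byte[] idea can be sketched roughly as follows. This is an illustration only: `PagedBitSet` is a hypothetical name, not a Lucene class, and the 64 KiB page size is an assumption chosen for the example. Each page is allocated separately, and locating a bit takes one shift and one mask.

```java
public class PagedBitSet {
    // Assumption for this sketch: 2^19 bits (64 KiB) per page.
    private static final int PAGE_SHIFT = 19;
    private static final int PAGE_BITS = 1 << PAGE_SHIFT;
    private static final int PAGE_MASK = PAGE_BITS - 1;

    private final byte[][] pages;
    private final int size;

    public PagedBitSet(int numBits) {
        if (numBits < 0) {
            throw new IllegalArgumentException("numBits must be >= 0");
        }
        size = numBits;
        // Round up to whole pages; every page is its own small allocation.
        int numPages = (numBits + PAGE_BITS - 1) >>> PAGE_SHIFT;
        pages = new byte[numPages][];
        for (int i = 0; i < numPages; i++) {
            pages[i] = new byte[PAGE_BITS >>> 3];
        }
    }

    public void set(int index) {
        checkIndex(index);
        // High bits select the page, low bits select the byte and the bit.
        pages[index >>> PAGE_SHIFT][(index & PAGE_MASK) >>> 3] |= 1 << (index & 7);
    }

    public boolean get(int index) {
        checkIndex(index);
        return (pages[index >>> PAGE_SHIFT][(index & PAGE_MASK) >>> 3] & (1 << (index & 7))) != 0;
    }

    public int pageCount() {
        return pages.length;
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
    }

    public static void main(String[] args) {
        PagedBitSet bits = new PagedBitSet(64_000_000);
        bits.set(63_999_999);
        System.out.println(bits.get(63_999_999)); // prints "true"
    }
}
```

The total memory is about the same as for one large array; what changes is that the largest single block requested from the JVM is one page.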

[jira] Commented: (LUCENE-1070) DateTools with DAY resoltion dosn't work depending on your timezone

2008-02-19 Thread Kevin Conaway (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12570412#action_12570412 ] Kevin Conaway commented on LUCENE-1070: --- If the behavior is correct, it is extremely

Re: Out of memory - CachingWrappperFilter and multiple threads

2008-02-19 Thread robert engels
You should probably limit each array segment to 64 KB or so. That being said, I am having doubts that this is really the problem. We perform extensive image processing and use byte[] arrays much larger than 8 MB, and lots of them, continually allocating and deallocating. Also, the GC can m

Re: Out of memory - CachingWrappperFilter and multiple threads

2008-02-19 Thread eks dev
Hi Paul, >Allocating large blocks while also allocating more smaller >blocks is a known problem for memory allocators, so adding a >pool with preallocated blocks sounds like a good idea. Sure, reducing allocation pressure on the JVM is always good for performance, always and everywhere. >Btw. ther

Re: Clover reports missing from hudson?

2008-02-19 Thread Grant Ingersoll
It was disabled. I think I have fixed it, so let's see tonight. -Grant On Feb 15, 2008, at 4:34 PM, Chris Hostetter wrote: not sure when they stopped working, possibly a side effect of the move to the hudson zone that has been missed until now? http://hudson.zones.apache.org/hudson/job/Lu

[jira] Commented: (LUCENE-794) Extend contrib Highlighter to properly support PhraseQuery, SpanQuery, ConstantScoreRangeQuery

2008-02-19 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12570472#action_12570472 ] Mark Miller commented on LUCENE-794: Hey Mark H, any chance you will have some time to

[VOTE] Release Lucene 2.3.1

2008-02-19 Thread Michael Busch
Hi Team, I built release artifacts from the current 2.3 branch (rev. 629191) which contain fixes for bugs that were found in the 2.3.0 release. You can find a list of changes here: http://svn.apache.org/viewvc/lucene/java/branches/lucene_2_3/CHANGES.txt?revision=629191 Please vote to officially r

[jira] Commented: (LUCENE-794) Extend contrib Highlighter to properly support PhraseQuery, SpanQuery, ConstantScoreRangeQuery

2008-02-19 Thread Mark Harwood (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12570478#action_12570478 ] Mark Harwood commented on LUCENE-794: - Will do. I'm taking a quick look now but should

Memory requirements for filters (was Re: Out of memory - CachingWrappperFilter and multiple threads)

2008-02-19 Thread Paul Elschot
Eks, On Tuesday 19 February 2008 21:48:03 eks dev wrote: ... > > >Btw. there is some room in SortedVIntList to add interval > >coding. Normally the VInt value 0 cannot occur in the current > >version, and this could be used as a prefix to encode a run of > >set bits. > > > > I like this! I was j
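
One way the quoted idea could look in code is sketched below. This is not the SortedVIntList implementation and not necessarily the layout Paul had in mind: the class name, the MIN_RUN threshold, and the record layout (a 0 marker, then the gap to the start of the run, then the run length) are all assumptions made for illustration. Gaps are measured from a previous value that starts at -1, so an ordinary gap is always at least 1 and the value 0 is free to act as the run marker.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

public class RunAwareVIntList {
    // Hypothetical threshold: a run record costs at least 3 bytes,
    // so shorter runs are written as plain gaps.
    private static final int MIN_RUN = 3;

    // Standard variable-length int: 7 data bits per byte, high bit = "more follows".
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    static int readVInt(byte[] data, int[] pos) {
        int value = 0;
        int shift = 0;
        byte b;
        do {
            b = data[pos[0]++];
            value |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return value;
    }

    // Input: strictly increasing, non-negative document numbers.
    public static byte[] encode(int[] sortedDocs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = -1;
        int i = 0;
        while (i < sortedDocs.length) {
            int j = i; // find the end of the run of consecutive values starting at i
            while (j + 1 < sortedDocs.length && sortedDocs[j + 1] == sortedDocs[j] + 1) {
                j++;
            }
            int runLength = j - i + 1;
            if (runLength >= MIN_RUN) {
                writeVInt(out, 0);                    // marker: a run record follows
                writeVInt(out, sortedDocs[i] - prev); // gap to the first doc of the run
                writeVInt(out, runLength);
                prev = sortedDocs[j];
                i = j + 1;
            } else {
                writeVInt(out, sortedDocs[i] - prev); // ordinary gap, always >= 1
                prev = sortedDocs[i];
                i++;
            }
        }
        return out.toByteArray();
    }

    public static List<Integer> decode(byte[] data) {
        List<Integer> docs = new ArrayList<>();
        int[] pos = {0};
        int prev = -1;
        while (pos[0] < data.length) {
            int gap = readVInt(data, pos);
            if (gap == 0) {
                int start = prev + readVInt(data, pos);
                int length = readVInt(data, pos);
                for (int k = 0; k < length; k++) {
                    docs.add(start + k);
                }
                prev = start + length - 1;
            } else {
                prev += gap;
                docs.add(prev);
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        int[] docs = {2, 10, 11, 12, 13, 14, 20, 21};
        byte[] encoded = encode(docs);
        System.out.println(encoded.length + " bytes for " + docs.length + " docs"); // prints "6 bytes for 8 docs"
        System.out.println(decode(encoded)); // prints "[2, 10, 11, 12, 13, 14, 20, 21]"
    }
}
```

With this layout a long run of set bits costs a few bytes regardless of its length, which is the case where a plain gap list spends one byte per document.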

[jira] Commented: (LUCENE-1039) Bayesian classifiers using Lucene as data store

2008-02-19 Thread Paul Elschot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12570494#action_12570494 ] Paul Elschot commented on LUCENE-1039: -- Did you consider using Lucene's term vectors?

[jira] Commented: (LUCENE-794) Extend contrib Highlighter to properly support PhraseQuery, SpanQuery, ConstantScoreRangeQuery

2008-02-19 Thread Mark Harwood (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12570505#action_12570505 ] Mark Harwood commented on LUCENE-794: - Couple of quick comments from a first look. * I

[jira] Commented: (LUCENE-794) Extend contrib Highlighter to properly support PhraseQuery, SpanQuery, ConstantScoreRangeQuery

2008-02-19 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12570515#action_12570515 ] Mark Miller commented on LUCENE-794: Good catch right off Mark. Appreciate you looking

Re: Clover reports missing from hudson?

2008-02-19 Thread Grant Ingersoll
Check that, Nigel and I had a little snafu on this one. I will try to work it out in the coming days. I notice the code coverage is failing in the Berkeley contrib, which is the cause of the problem. There is also a bit of a change on Hudson during the migration to the new servers that ne