ngroups question

2012-08-24 Thread reikje
I have a question regarding expected memory consumption when using field
collapsing with the ngroups parameter. We have indexed a forum with 500.000
threads. Each thread is a group, so we can have at most 500.000 groups. I read
somewhere that for each group an org.apache.lucene.util.BytesRef is created
and added to an ArrayList. What is the content of the byte[] that the BytesRef
is created with? Knowing this will help me estimate how much memory is used in
the worst case, where all groups are returned (which is unlikely).



--
View this message in context: 
http://lucene.472066.n3.nabble.com/ngroups-question-tp4003093.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: ngroups question

2012-08-24 Thread Erick Erickson
I think the memory size is about (number of groups) * ((size of
key) + (a little memory for the bucket that holds the members of that
group)). The latter is (I'm guessing here) quite small.
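
As a rough illustration of the estimate above, here is a minimal sketch in
Python. The function name, the 16-byte average key size and the 64-byte
per-group overhead are assumptions made up for the example, not measured
Lucene or Solr values; plug in numbers from your own index.

```python
def estimate_ngroups_memory(num_groups, avg_key_bytes, per_group_overhead_bytes=64):
    """Back-of-the-envelope estimate: groups * (key size + per-group bookkeeping).

    per_group_overhead_bytes is a hypothetical placeholder for object headers,
    list slots and similar bookkeeping; it is not a measured value.
    """
    return num_groups * (avg_key_bytes + per_group_overhead_bytes)


# Worst case from the question: every one of the 500.000 threads is a group.
worst_case_bytes = estimate_ngroups_memory(500_000, avg_key_bytes=16)
print(f"{worst_case_bytes / (1024 * 1024):.1f} MiB")
```

With these placeholder numbers the worst case stays in the tens of
megabytes; the result scales linearly with both the group count and the key
size.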

Sure, you can have all 500.000 groups consume memory, quite easily:
q=*:* (OK, that one wouldn't be scored, but you get the idea). Whether
they're returned or not is not germane; they all have to be counted
(Martijn may jump all over _that_). Consider some group X with a
low-scoring document in it. When could you _know_ that you don't need
to return that group? Unfortunately, not until the very last document
is scored, since it could be a perfect match for the query.
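
The point about having to look at every matching document can be shown
with a toy model in Python. This is not how Solr implements grouping; the
function names and the (group_key, score) input shape are assumptions
chosen only to illustrate the argument.

```python
def count_groups(matching_docs):
    """Count distinct groups; every matching document has to be visited."""
    seen = set()
    for group_key, _score in matching_docs:
        seen.add(group_key)  # one key is held in memory per distinct group
    return len(seen)


def top_groups(matching_docs, n):
    """Return the n group keys ranked by their best-scoring document."""
    best = {}
    for group_key, score in matching_docs:
        if score > best.get(group_key, float("-inf")):
            best[group_key] = score
    return sorted(best, key=best.get, reverse=True)[:n]


# Hypothetical matches: group "t2" looks weak until its last document.
docs = [("t1", 0.9), ("t2", 0.1), ("t1", 0.3), ("t3", 0.5), ("t2", 1.0)]
print(count_groups(docs))
print(top_groups(docs, 1))
```

In this sample, group "t2" only reaches the top once its final document is
scored, so no group can be discarded early, and the count needs one key per
distinct group regardless of how many groups are returned.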

Best
Erick
