On Mon, Nov 2, 2009 at 9:27 PM, Fuad Efendi f...@efendi.ca wrote:
I believe this is correct estimate:
C. [maxdoc] x [4 bytes ~ (int) Lucene Document ID]
same as
[String1_Document_Count + ... + String10_Document_Count + ...]
x [4 bytes per DocumentID]
That's right.
Except: as Mark said,
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: November-03-09 5:00 AM
To: solr-user@lucene.apache.org
Subject: Re: Lucene FieldCache memory requirements
Any thoughts regarding the subject? I hope FieldCache doesn't use more than
6 bytes per document-field instance... I am too lazy to research Lucene
source code, I hope someone can provide exact answer... Thanks
Subject: Lucene FieldCache memory requirements
Hi,
Can anyone confirm Lucene FieldCache memory requirements? I have 100
million docs with a non-tokenized field country (10
Subject: Lucene FieldCache memory requirements
Which FieldCache API are you using? getStrings? or getStringIndex
(which is used, under the hood, if you sort by this field).
Mike
On Mon, Nov 2, 2009 at 2:27 PM, Fuad Efendi f...@efendi.ca wrote:
Any thoughts regarding the subject? I hope
) SOLR query for all documents *:* - in this case it will be fully
populated...
-Original Message-
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: November-02-09 6:00 PM
To: solr-user@lucene.apache.org
Subject: Re: Lucene FieldCache memory requirements
OK I think someone who knows how Solr uses the fieldCache for this
type of field will have to pipe up
it is (int) Document ID...
-Original Message-
From: Mark Miller [mailto:markrmil...@gmail.com]
Sent: November-02-09 6:52 PM
To: solr-user@lucene.apache.org
Subject: Re: Lucene FieldCache memory requirements
It also briefly requires more memory than just that - it allocates
Fuad Efendi wrote:
Simple field (10 different values: Canada, USA, UK, ...), 64-bit JVM... no
difference between maxdoc and maxdoc + 1 for such estimate... difference is
between 0.4Gb and 1.2Gb...
I'm not sure I understand - but I didn't mean to imply the +1 on maxdoc
meant anything. The
I just did some tests in a completely new index (Slave), sort by
low-distributed non-tokenized Field (such as Country) takes milliseconds,
but sort (ascending) on tokenized field with heavy distribution took 30
seconds (initially). Second sort (descending) took milliseconds. Generic
query *:*;
Mark,
I don't understand this:
so with a ton of docs and a few uniques, you get a temp boost in the RAM
reqs until it sizes it down.
Sizes down??? Why is it called a cache, then? And how does SOLR use it if it is
not a cache?
And this:
A pointer for each doc.
Why can't we use (int) DocumentID?
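The questions above ("sizes it down", "a pointer for each doc") may be easier to follow with a small sketch. This is plain Java written for illustration, not Lucene source; the class name, field names, and the convention that slot 0 means "no value" are assumptions. It shows one int per document (indexed by document ID) pointing into a table of distinct values, where that table is first allocated for the worst case of maxDoc + 1 entries and then trimmed to the number of distinct values actually seen.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Illustration only: plain Java, not Lucene source. All names are hypothetical.
final class OrdinalIndexSketch {
    final int[] order;     // order[docId] -> ordinal into lookup (0 = no value)
    final String[] lookup; // lookup[ordinal] -> distinct field value, in sorted order

    private OrdinalIndexSketch(int[] order, String[] lookup) {
        this.order = order;
        this.lookup = lookup;
    }

    static OrdinalIndexSketch build(String[] valuePerDoc) {
        int maxDoc = valuePerDoc.length;
        int[] order = new int[maxDoc];
        // Worst-case sizing: one slot per document plus slot 0 for "no value".
        // With only a few distinct values, most of this array is never used.
        String[] terms = new String[maxDoc + 1];

        // Collect the distinct values in sorted order.
        TreeSet<String> sorted = new TreeSet<>();
        for (String v : valuePerDoc) {
            if (v != null) {
                sorted.add(v);
            }
        }
        Map<String, Integer> ordinalOf = new HashMap<>();
        int t = 1; // slot 0 stays null
        for (String term : sorted) {
            terms[t] = term;
            ordinalOf.put(term, t);
            t++;
        }
        // One int per document, addressed by document ID.
        for (int doc = 0; doc < maxDoc; doc++) {
            String v = valuePerDoc[doc];
            order[doc] = (v == null) ? 0 : ordinalOf.get(v);
        }
        // "Sizing down": keep only the slots that were actually filled.
        return new OrdinalIndexSketch(order, Arrays.copyOf(terms, t));
    }

    public static void main(String[] args) {
        OrdinalIndexSketch idx = build(new String[] {"USA", "Canada", "UK", "Canada"});
        System.out.println(idx.lookup.length);        // 4: the null slot plus 3 distinct values
        System.out.println(idx.lookup[idx.order[0]]); // USA
    }
}
```

In this sketch the per-document cost that remains after trimming is the int array; the oversized String array is only a temporary peak, which is one way to read the "temp boost in the RAM reqs" remark.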
:
[512Mb ~ 1Gb] + [non_tokenized_fields_count] x [maxdoc] x [8 bytes]
-Fuad
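The variable part of the formula above is simple arithmetic, and a small calculator makes the numbers from this thread easy to reproduce. This is only a back-of-the-envelope sketch: the class and method names are hypothetical, and the bytes-per-document figure is an input, since the thread itself discusses 4 and 8 bytes. The fixed [512Mb ~ 1Gb] baseline is left out.

```java
// Back-of-the-envelope estimate only; class and method names are hypothetical.
final class FieldCacheEstimate {

    /** bytes = fieldCount x maxDoc x bytesPerDoc (per-document array entries only). */
    static long perDocArrayBytes(long maxDoc, int fieldCount, int bytesPerDoc) {
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact(Math.multiplyExact(maxDoc, (long) fieldCount), (long) bytesPerDoc);
    }

    public static void main(String[] args) {
        long maxDoc = 100_000_000L; // document count quoted in the thread
        System.out.println(perDocArrayBytes(maxDoc, 1, 4)); // 400000000 (one int per doc)
        System.out.println(perDocArrayBytes(maxDoc, 1, 8)); // 800000000 (8 bytes per doc, as in the formula)
    }
}
```

With one field and 100,000,000 documents, 4 bytes per document gives roughly the 0.4Gb figure mentioned earlier in the thread.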
-Original Message-
From: Fuad Efendi [mailto:f...@efendi.ca]
Sent: November-02-09 7:37 PM
To: solr-user@lucene.apache.org
Subject: RE: Lucene FieldCache memory requirements
Simple field (10 different values
will it size down in purely
Lucene-based heavy-loaded production system? Especially if this cache is
used for query optimizations.
-Original Message-
From: Mark Miller [mailto:markrmil...@gmail.com]
Sent: November-02-09 8:53 PM
To: solr-user@lucene.apache.org
Subject: Re: Lucene FieldCache
Even in simplistic scenario, when it is Garbage Collected, we still
_need_to_be_able_ to allocate enough RAM to FieldCache on demand... linear
dependency on document count...
Hi Mark,
Yes, I understand it now; however, how will StringIndexCache size down in
a
production system faceting by
FieldCache uses WeakHashMap internally... nothing wrong with that, but no amount
of Garbage Collection tuning will help if the allocated RAM is not enough
for replacing Weak** with Strong**, especially for SOLR faceting... 10%-15%
of CPU time taken by GC was reported...
-Fuad
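For readers unfamiliar with the WeakHashMap point above, here is a minimal sketch of a cache keyed weakly by a reader-like object. It is not Lucene's implementation; the class name, the loader function, and the load counter are hypothetical. The entry can be dropped by the garbage collector once nothing else references the key, but a later lookup has to rebuild the whole value, so the heap still needs room for the full array on demand.

```java
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

// Illustration only: not Lucene source. All names are hypothetical.
final class WeakReaderCacheSketch<K, V> {
    // Keys are held weakly: an entry becomes collectable once the key
    // (for example an index reader) is no longer strongly referenced elsewhere.
    private final Map<K, V> cache = new WeakHashMap<>();
    private final Function<K, V> loader;
    private int loads;

    WeakReaderCacheSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    synchronized V get(K readerKey) {
        V value = cache.get(readerKey);
        if (value == null) {
            // Cache miss: the full value must be allocated again.
            value = loader.apply(readerKey);
            cache.put(readerKey, value);
            loads++;
        }
        return value;
    }

    synchronized int loadCount() {
        return loads;
    }

    public static void main(String[] args) {
        Object reader = new Object(); // stands in for an index reader
        WeakReaderCacheSketch<Object, int[]> cache =
                new WeakReaderCacheSketch<>(key -> new int[1_000]);
        int[] first = cache.get(reader);
        int[] second = cache.get(reader);
        System.out.println(first == second);   // true while the reader is strongly held
        System.out.println(cache.loadCount()); // 1
    }
}
```

Note that the cached value must not hold a strong reference to its key, otherwise the entry can never be collected.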
Hi,
Can anyone confirm Lucene FieldCache memory requirements? I have 100
million docs with a non-tokenized field country (10 different countries); I
expect it requires an array of (int, long) of size 100,000,000,
regardless of the length of the country field;
it requires 600,000,000 bytes: int