RE: Lucene FieldCache memory requirements

Fuad Efendi Mon, 02 Nov 2009 12:58:26 -0800

I am not using the Lucene API directly; I am using SOLR, which uses the Lucene
FieldCache for faceting on non-tokenized fields...
I think this cache is loaded lazily, until a user executes a SOLR query sorted
by this field over all documents (*:*); in that case it will be fully
populated...



> Subject: Re: Lucene FieldCache memory requirements
> 
> Which FieldCache API are you using?  getStrings?  or getStringIndex
> (which is used, under the hood, if you sort by this field).
> 
> Mike
> 
> On Mon, Nov 2, 2009 at 2:27 PM, Fuad Efendi <f...@efendi.ca> wrote:
> > Any thoughts regarding the subject? I hope FieldCache doesn't use more
> > than 6 bytes per document-field instance... I am too lazy to research
> > Lucene source code, I hope someone can provide exact answer... Thanks
> >
> >
> >> Subject: Lucene FieldCache memory requirements
> >>
> >> Hi,
> >>
> >>
> >> Can anyone confirm Lucene FieldCache memory requirements? I have 100
> >> million docs with a non-tokenized field "country" (10 different
> >> countries); I expect it requires an array of ("int", "long") pairs, of
> >> size 100,000,000, without any impact from the "country" field length;
> >>
> >> it requires 600,000,000 bytes: "int" is a pointer to the document (Lucene
> >> document ID), and "long" is a pointer to the String value...
> >>
> >> Am I right, is it 600 MB just for this "country" (indexed, non-tokenized,
> >> non-boolean) field and 100 million docs? I need to calculate exact
> >> minimum RAM requirements...
> >>
> >> I believe it shouldn't depend on cardinality (distribution) of field...
> >>
> >> Thanks,
> >> Fuad
> >>
> >>
> >>
> >>
> >
> >
> >
> >
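The estimate in the original question can be checked with a little arithmetic. The sketch below is only a back-of-the-envelope calculation, not a measurement of what Lucene actually allocates. It compares three per-document layouts: the 6 bytes per document hoped for in the thread, a literal ("int", "long") pair (4 + 8 = 12 bytes in Java), and an assumed layout with one int ordinal per document plus one String per distinct value, which is how I understand getStringIndex to work (an assumption to verify against the Lucene source). The class name, method names, and the 64-byte per-value cost are hypothetical.

```java
public class FieldCacheEstimate {

    /** Bytes needed for an array holding one fixed-size entry per document. */
    public static long perDocBytes(long numDocs, int bytesPerDoc) {
        return numDocs * bytesPerDoc;
    }

    /**
     * Assumed ordinal layout: one int per document plus the distinct values.
     * avgBytesPerValue is a hypothetical per-String cost, not a measured one.
     */
    public static long ordinalLayoutBytes(long numDocs, int distinctValues,
                                          int avgBytesPerValue) {
        return perDocBytes(numDocs, Integer.BYTES)
                + (long) distinctValues * avgBytesPerValue;
    }

    public static void main(String[] args) {
        long numDocs = 100_000_000L; // 100 million docs, as in the question

        // 6 bytes per document, the figure hoped for in the thread.
        System.out.println(perDocBytes(numDocs, 6)); // 600000000

        // A literal (int, long) pair per document: 4 + 8 = 12 bytes.
        System.out.println(perDocBytes(numDocs, Integer.BYTES + Long.BYTES)); // 1200000000

        // Assumed ordinal layout with 10 countries at ~64 bytes each.
        System.out.println(ordinalLayoutBytes(numDocs, 10, 64)); // 400000640
    }
}
```

Under these assumptions the per-document array dominates: with only 10 distinct countries, the cost of the values themselves is negligible, so the total depends on the number of documents far more than on field length or cardinality. JVM array headers and object overhead are ignored here.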
