Sent: Wednesday, November 09, 2005 9:23 PM
To: java-user@lucene.apache.org
Subject: Re: Sorting: string vs int
The FieldCache (which is used for sorting) uses arrays of size
maxDoc() to cache field values. String sorting will involve caching a
String[] (or StringIndex
Here is a snippet of the current StringIndex class:
public static class StringIndex {
    /** All the term values, in natural order. */
    public final String[] lookup;

    /** For each document, an index into the lookup array. */
    public final int[] order;
}
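To make the two arrays concrete, here is a minimal sketch of how such an index could be built and used to compare documents. It is not Lucene's implementation: the class name StringIndexSketch, the build() and compareDocs() methods, and the sample values are all hypothetical, and every document is assumed to have exactly one non-null value for the field.

```java
import java.util.Arrays;
import java.util.TreeSet;

public class StringIndexSketch {
    /** All the term values, in natural order. */
    final String[] lookup;
    /** For each document, an index into the lookup array. */
    final int[] order;

    StringIndexSketch(String[] lookup, int[] order) {
        this.lookup = lookup;
        this.order = order;
    }

    /** Hypothetical builder: valuesByDoc[doc] is the field value of document doc. */
    static StringIndexSketch build(String[] valuesByDoc) {
        // Collect the unique values, sorted in natural order.
        String[] lookup = new TreeSet<>(Arrays.asList(valuesByDoc)).toArray(new String[0]);
        int[] order = new int[valuesByDoc.length];
        for (int doc = 0; doc < valuesByDoc.length; doc++) {
            // Position of this document's value in the sorted unique values.
            order[doc] = Arrays.binarySearch(lookup, valuesByDoc[doc]);
        }
        return new StringIndexSketch(lookup, order);
    }

    /** Compares two documents by their ints only; no String comparison is needed. */
    int compareDocs(int docA, int docB) {
        return Integer.compare(order[docA], order[docB]);
    }

    public static void main(String[] args) {
        StringIndexSketch idx = build(new String[] {"pear", "apple", "pear", "fig"});
        System.out.println(Arrays.toString(idx.lookup)); // [apple, fig, pear]
        System.out.println(Arrays.toString(idx.order));  // [2, 0, 2, 1]
        System.out.println(idx.compareDocs(1, 3) < 0);   // true
    }
}
```

Documents 0 and 2 share the value "pear", so it is stored once in lookup and both documents point at the same slot.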
The order field is used for
: I guess it would be nice to have some way of telling the searcher (and
: the fieldcache) whether the actual string values are needed or not...
: it could save a lot of memory when there are a lot of unique terms.
You're talking about something like LUCENE-457, right? ... but make it
optional so
Hi all. I have a question about sorting. Lucene in Action says: For
numeric types, each field being sorted for each document in the index
requires that four bytes be cached. For String types, each unique term is
also cached for each document.
I want to make sure I'm understanding this
The FieldCache (which is used for sorting) uses arrays of size
maxDoc() to cache field values. String sorting will involve caching a
String[] (or StringIndex) and int sorting will involve caching an
int[]. Unique string values are shared in the array, but the String
values plus the String[]
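As a rough back-of-the-envelope comparison of the two caches, the sketch below estimates their sizes for a StringIndex-style layout. The class and method names are hypothetical, and the reference size, per-String overhead, and 2 bytes per character are assumptions that vary by JVM; only the 4 bytes per int and the array lengths come from the discussion above.

```java
public class SortCacheEstimate {
    /** int sorting: one int (4 bytes) per document, array length maxDoc. */
    static long intSortBytes(long maxDoc) {
        return 4L * maxDoc;
    }

    /**
     * StringIndex-style sorting: one int per document for the order array,
     * plus one reference and the character data for each unique term.
     * refBytes and perStringOverhead are assumed values, not measured ones.
     */
    static long stringSortBytes(long maxDoc, long uniqueTerms, long avgChars,
                                long refBytes, long perStringOverhead) {
        long orderArray = 4L * maxDoc;              // int[] order
        long lookupArray = refBytes * uniqueTerms;  // String[] lookup (references only)
        long stringData = uniqueTerms * (perStringOverhead + 2L * avgChars);
        return orderArray + lookupArray + stringData;
    }

    public static void main(String[] args) {
        long maxDoc = 1_000_000L;
        System.out.println(intSortBytes(maxDoc));                           // 4000000
        System.out.println(stringSortBytes(maxDoc, 50_000L, 10L, 4L, 40L)); // 7200000
    }
}
```

With these assumed numbers the string cache costs the same 4 bytes per document as the int cache, plus an amount that grows with the number of unique terms rather than with the number of documents.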