The FieldCache (which is used for sorting), uses arrays of size maxDoc() to cache field values. String sorting will involve caching a String[] (or StringIndex) and int sorting will involve caching an int[]. Unique string values are shared in the array, but the String values plus the String[] will always take up more room than the int[].
-Yonik Now hiring -- http://forms.cnet.com/slink?231706 On 11/9/05, Monsur Hossain <[EMAIL PROTECTED]> wrote: > Hi all. I have a question about sorting. Lucene in Action says: "For > numeric types, each field being sorted for each document in the index > requires that four bytes be cached. For String types, each unique term is > also cached for each document." > > I want to make sure I'm understanding this correctly. Lets say I have a > document with some text and a date; a typical document might look like this: > > DOCUMENT #1: > text = hello world > date = 20050401 > > Lets say I index 10,000 of these documents into a single Lucene index. I > then create two IndexSearchers on this index and do a search. The first > IndexSearcher sorts by date as an int, the other sorts by date as a string: > > IndexSearcher #1 = date sort on INT > IndexSearcher #2 = date sort in STRING > > If I understand the quoted sentence correctly, IndexSearcher #1 will have an > int array storing one date per document, while IndexSearcher #2 will have a > string array with only unique dates? If so, is there a particular reason > why sorting as an int doesn't cache unique dates? > > The reason I ask this is consider an index with 10,000 documents, where I > store year, month, and day as separte fields (for simplicity lets assume I > only store the years 2000 - 2005 only). When searching as an int, if each > field of each document needs to be cached, that's 10,000 documents * 3 > fields = 30,000 cached ints. If terms are uniquely cached, that's just 6 > (for each year) + 12 (for each month) + 31 (for each day) = 49 cached ints. > Am I interpreting any of this correctly? > > Thanks, > Monsur --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]