Hi Paul,

Thanks for your reply. Regarding your previous email about the need for a disk-based sorting solution, I largely agree with your points. One incentive for your approach is that we would no longer need to warm up the index, which matters when the index is huge.
In our application we have to sync the index quite frequently, and warming it up each time is killing performance.

To address your concern about a single sort locale, what about creating a sort field for each sort locale? If you have, say, 10 locales, you would have 10 sort fields, each using the same mechanism that is used to construct the norms.

At query time, in the HitCollector, you can load the field value (an integer) through the IndexReader for each matching doc id. (The IndexReader would need to be enhanced to load the sort field values.) You can then use that value to accept or reject the doc, or factor it into the score.

What do you think?

Jian

On 4/9/07, Paul Smith <[EMAIL PROTECTED]> wrote:
> > Now, if we could use integers to represent the sort field values, which is
> > typically the case for most applications, maybe we can afford to have the
> > sort field values stored in the disk and do disk lookup for each document
> > matched? The look up of the sort field value will be as simple as docNo * 4
> > * offset.
> >
> > This way, we use the same approach as constructing the norms (proper merging
> > for incremental indexing), but, at search time, we don't load the sort field
> > values into memory, instead, just store them in disk.
> >
> > Will this approach be good enough?
>
> While a nifty idea, I think this only works for a single sort locale. I
> initially came up with a similar idea that the terms are already stored in
> 'sorted' order and one might be able to use the terms position for sorting,
> it's just that the terms ordering position is different in different locales.
>
> Paul
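
To make the per-locale idea concrete, here is a rough sketch in plain Java. It is not Lucene code and does not use any Lucene API: the class SortOrdinalSketch, its methods, and BASE_OFFSET are all hypothetical names made up for illustration. It assumes one file per sort locale holding one 4-byte integer per document, and it reads the "docNo * 4 * offset" lookup quoted above as "base offset plus docNo * 4", which is my assumption about what was meant.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical sketch: one fixed-width ordinal file per sort locale.
class SortOrdinalSketch {
    // Assumed: no header, so the first ordinal starts at byte 0.
    static final long BASE_OFFSET = 0L;
    static final int BYTES_PER_DOC = 4; // one int per document

    // Rank each document's value under the locale's collation rules.
    // Values that collate as equal share the same rank.
    static int[] ordinals(String[] valuesByDoc, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        int n = valuesByDoc.length;
        Integer[] docs = new Integer[n];
        for (int i = 0; i < n; i++) {
            docs[i] = i;
        }
        // Sort doc ids by their value in this locale's order.
        Arrays.sort(docs, (a, b) -> collator.compare(valuesByDoc[a], valuesByDoc[b]));
        int[] ords = new int[n];
        int rank = 0;
        for (int i = 0; i < n; i++) {
            if (i > 0 && collator.compare(valuesByDoc[docs[i - 1]], valuesByDoc[docs[i]]) != 0) {
                rank++;
            }
            ords[docs[i]] = rank;
        }
        return ords;
    }

    // Write the ordinals in doc id order, 4 bytes each (big-endian).
    static void write(Path file, int[] ords) throws IOException {
        try (RandomAccessFile out = new RandomAccessFile(file.toFile(), "rw")) {
            out.setLength(0);
            out.seek(BASE_OFFSET);
            for (int ord : ords) {
                out.writeInt(ord);
            }
        }
    }

    // Look up one document's ordinal with a single seek; nothing is
    // loaded into memory up front, so no warm-up is needed.
    static int lookup(RandomAccessFile in, int docNo) throws IOException {
        in.seek(BASE_OFFSET + (long) docNo * BYTES_PER_DOC);
        return in.readInt();
    }
}
```

With 10 locales you would call ordinals() and write() once per locale and get 10 such files. A collector would then open the file for the requested locale and call lookup() for each matching doc id. The cost is one seek per hit, so whether this is fast enough for large result sets would need measuring.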