On Wed, 2008-05-14 at 18:05 -0400, Yonik Seeley wrote:
> On Wed, May 14, 2008 at 5:47 PM, Mark Miller <[EMAIL PROTECTED]> wrote:
> > How difficult would it be to share stored fields across two indexes? I
> > am thinking about this for a stemmed and un-stemmed index. I know that
> > you could use two fields, but this affects the term stats for your index
>
> What stats are you concerned about? idf is field specific, so extra
> fields don't affect scoring.

Ah, good point. I was confusing this with the method of putting a sentinel on the term to tell the stemmed and unstemmed forms apart when both are stored at the same place in the same index.

> > and bloats the index for searching.
>
> The space would be the same (having the two fields in the same index),
> and the only searching difference should be when trying to find a
> term, an additional step in the binary search will be needed to find
> the nearest index term. That should be negligible.

This was my bigger concern, as some of the indexes I will deal with will be very large. I buy your argument in theory, but I have seen many databases (yes, not the same as Lucene) that perform much better with multiple tables than with one giant table. By the same argument, that shouldn't be the case, right? Maybe that example is not always true (I am also searching for that answer :) ), but I know I have seen it in at least SQL Server, and I have seen it mentioned as an optimization while looking for details on the web. For now, taking your better judgment is quicker than trying to benchmark, so I'll take it.

- Mark

>
> -Yonik
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
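
For anyone who wants to sanity-check the "one additional step" claim without a full benchmark, here is a rough sketch in plain Python (not Lucene code). It assumes terms are kept sorted by field name and then by term text, and it models the lookup as a simple binary search over every term. The field names `body` and `body_stemmed`, the term counts, and the helper `count_probes` are all made up for illustration. It also ignores that a real term index may sample only a subset of the terms, so treat the numbers as indicative only.

```python
def count_probes(sorted_terms, target):
    """Lower-bound binary search; returns (insertion index, number of probes)."""
    lo, hi = 0, len(sorted_terms)
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1  # one comparison against the term list per iteration
        if sorted_terms[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return lo, probes


# Hypothetical vocabulary: 1024 distinct terms in a single field.
one_field = [("body", f"term{i:05d}") for i in range(1024)]

# Same vocabulary duplicated into a second (stemmed) field in the same index.
# Tuples sort by field name first, then by term text.
two_fields = sorted(one_field + [("body_stemmed", t) for _, t in one_field])

# Worst-case number of probes over every term that could be looked up.
worst_one = max(count_probes(one_field, t)[1] for t in one_field)
worst_two = max(count_probes(two_fields, t)[1] for t in two_fields)

print("one field :", len(one_field), "terms, worst case", worst_one, "probes")
print("two fields:", len(two_fields), "terms, worst case", worst_two, "probes")
```

Doubling the number of terms adds one probe to the worst case in this model, since the search cost grows with the logarithm of the term count. Whether real-world effects such as caching or disk seeks change the picture on a very large index is something only a benchmark would show.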
