Hi,

This is probably a question for the user list. However, since it relates to performance and also to the Lucene index format, I think it is better to ask the gurus on this list ;-)

In my application, I have implemented a quality score for each document. For each search, the relevancy score is first computed using Lucene scoring, and then it is combined with the quality score to produce the final score for the document.

To store the quality score, I could use the FieldCache feature and load the quality scores into memory as a byte array when warming up the index. However, I then pay the price of the warm-up.

Alternatively, I could store the quality score in the term index, as in:

term, <docId, qualityscore>+

This way, there is no need to warm up the index. But I guess the index would be significantly bigger, since a document's quality score would be stored again for every term the document contains.

I haven't done any testing yet to see which way is better. In general, could anyone give me some advice on which way is preferable? I think it is a classic time vs. space trade-off in computer science, but I would still like to get the opinions of you gurus.

Thanks in advance.

Jian
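
P.S. To make the two layouts concrete, here is a rough sketch in plain Java. It is not Lucene API: the class, the method names, and the way relevancy and quality are combined are all made up for illustration, and I am assuming one byte of quality score per entry in both cases.

```java
// Illustrative sketch only; none of these names are Lucene classes or methods.
final class QualityScoreSketch {

    // Hypothetical combination rule: scale the relevancy score by the
    // quality byte mapped from [0, 255] to [0, 1].
    static float combine(float relevancy, byte quality) {
        return relevancy * ((quality & 0xFF) / 255.0f);
    }

    // Option 1 (FieldCache-like): one byte per document, loaded once at
    // warm-up and then looked up by docId for every hit.
    static float scoreWithCache(byte[] qualityByDoc, int docId, float relevancy) {
        return combine(relevancy, qualityByDoc[docId]);
    }

    // Option 2: every posting carries its own copy of the quality byte,
    // i.e. term -> <docId, qualityscore>+
    static final class Posting {
        final int docId;
        final byte quality;

        Posting(int docId, byte quality) {
            this.docId = docId;
            this.quality = quality;
        }
    }

    // No per-document lookup needed: the quality arrives with the posting.
    static float scoreWithPosting(Posting posting, float relevancy) {
        return combine(relevancy, posting.quality);
    }

    // Extra memory for option 1: one byte per document.
    static long cacheBytes(int numDocs) {
        return numDocs;
    }

    // Extra index size for option 2: one byte per posting, i.e. the sum of
    // the document frequencies over all terms.
    static long postingBytes(int[] docFreqPerTerm) {
        long total = 0;
        for (int docFreq : docFreqPerTerm) {
            total += docFreq;
        }
        return total;
    }
}
```

Both options give the same final score for a hit. The difference is where the byte lives: option 1 costs one byte per document in memory plus the load at warm-up, while option 2 costs one byte per posting on disk, which grows with the number of distinct terms per document.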