I want to test several strategies for document boosting. As far as I can tell, the only way to change a boost is to reindex every Document and call setBoost() on it, which will take a long time. I have an idea for doing this without reindexing, and I'm curious whether there is a better strategy or whether there are additional points I should consider with this approach:
1) Optimize the index

2) Get the internal Lucene doc id for each document

3) Update the boosts

    import java.util.Collection;
    import java.util.Iterator;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Similarity;

    IndexReader ir = IndexReader.open(indexDir);
    IndexSearcher searcher = new IndexSearcher(ir);
    Similarity sim = searcher.getSimilarity();
    // After optimizing there are no deleted docs, so maxDoc() == numDocs().
    int numDocs = ir.maxDoc();

    // Names of all indexed fields
    Collection indexedFields = ir.getFieldNames(true);
    Iterator it = indexedFields.iterator();
    while (it.hasNext()) {
        String f = (String) it.next();
        byte[] norms = ir.norms(f);
        for (int i = 0; i < numDocs; i++) {
            float oldNorm = sim.decodeNorm(norms[i]);
            // newDocBoost[] and oldDocBoost[] are indexed by internal doc id (step 2)
            float newNorm = oldNorm * (newDocBoost[i] / oldDocBoost[i]);
            norms[i] = sim.encodeNorm(newNorm);
        }
    }

4) Write new norms files

Does this become prohibitively complicated when the index uses the compound file format?

Comments?

Thanks,
Dan
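
P.S. To make the arithmetic in step 3 concrete, here is a minimal sketch using plain floats. It assumes, as I understand it, that the stored norm is a product in which the document boost appears as one factor, so multiplying by newBoost / oldBoost should give the same value that reindexing with the new boost would. The class name, the rescaleNorm helper, and all the numbers are made up for illustration; nothing here calls Lucene. In a real index the norm is stored in a single byte, so the decode/encode round trip loses precision and the rescaled value would only approximate the reindexed one.

```java
public class NormRescaleSketch {

    // Hypothetical helper (not a Lucene API): scale an already-decoded norm
    // from the old document boost to the new one.
    public static float rescaleNorm(float oldNorm, float oldBoost, float newBoost) {
        if (oldBoost == 0.0f) {
            // A zero boost cannot be divided back out of the stored norm.
            throw new IllegalArgumentException("old boost must be non-zero");
        }
        return oldNorm * (newBoost / oldBoost);
    }

    public static void main(String[] args) {
        float lengthNorm = 0.25f; // made-up length normalization factor
        float oldBoost = 2.0f;    // boost used when the document was indexed
        float newBoost = 3.0f;    // boost I want to try next

        float oldNorm = oldBoost * lengthNorm;   // value assumed to be in the index
        float reindexed = newBoost * lengthNorm; // value reindexing would produce
        float rescaled = rescaleNorm(oldNorm, oldBoost, newBoost);

        System.out.println("reindexed = " + reindexed + ", rescaled = " + rescaled);
    }
}
```

If that assumption about the norm is wrong, or if a document was indexed with a boost of zero, the rescaling in step 3 would not be valid for that document.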