We were experimenting with SortingMergePolicy and came across an alternative
to TimSorting each postings list, using a FixedBitSet (FBS) and a GrowableWriter.
I have attached the relevant code snippet. It would be nice if someone could
clarify whether this is a good idea to implement.
public class SortingAtomicReader {
  …
  …
  class SortingDocsEnum {
    // The last two parameters, newdoclist and olddocToFreq, are new to the
    // constructor. They are assumed to be initialized once when the merge
    // starts and then reused until the merge completes.
    public SortingDocsEnum(int maxDoc, final DocsEnum in, boolean withFreqs,
        final Sorter.DocMap docMap, FixedBitSet newdoclist,
        GrowableWriter olddocToFreq) throws IOException {
      ….
      …
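      while (true) {
        // Instead of TimSorting as in the existing code
        doc = in.nextDoc();
        if (doc == NO_MORE_DOCS) {
          break; // stop once the wrapped enum is exhausted
        }
        int newdoc = docMap.oldToNew(doc);
        newdoclist.set(newdoc);
        if (withFreqs) {
          olddocToFreq.set(doc, in.freq());
        }
        // Assumption: upto counts the docs collected here, since
        // nextDoc() compares docIt against it.
        upto++;
      }
    }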
    @Override
    public int nextDoc() throws IOException {
      if (++docIt >= upto) {
        return NO_MORE_DOCS;
      }
      currDoc = newdoclist.nextSetBit(currDoc + 1);
      if (currDoc == -1) {
        return NO_MORE_DOCS;
      }
      // Clear the bit before returning, so the reused bitset is empty
      // again once this postings list has been consumed.
      newdoclist.clear(currDoc);
      return currDoc;
    }
    @Override
    public int freq() throws IOException {
      if (withFreqs && docIt < upto) {
        return (int) olddocToFreq.getMutable()
            .get(docMap.newToOld(currDoc));
      }
      return 1;
    }
  }
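
To make the idea easier to discuss outside of Lucene, here is a minimal standalone sketch of the same technique. It is only an illustration under stated assumptions: java.util.BitSet stands in for FixedBitSet, a plain int[] stands in for GrowableWriter, and the oldToNew / newToOld arrays are hypothetical stand-ins for Sorter.DocMap. The class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

final class BitSetRemapSketch {

    /**
     * Re-orders one postings list by new doc ID without a comparison sort.
     *
     * oldDocs / freqs : parallel arrays, one entry per posting, in old doc order.
     * oldToNew / newToOld : hypothetical stand-ins for Sorter.DocMap.
     * Returns {newDoc, freq} pairs in ascending newDoc order.
     */
    static List<int[]> remap(int[] oldDocs, int[] freqs,
                             int[] oldToNew, int[] newToOld) {
        int maxDoc = oldToNew.length;
        BitSet newDocList = new BitSet(maxDoc); // stands in for FixedBitSet
        int[] oldDocToFreq = new int[maxDoc];   // stands in for GrowableWriter

        // Pass 1: mark each remapped doc ID; keep the freq keyed by OLD doc ID.
        for (int i = 0; i < oldDocs.length; i++) {
            newDocList.set(oldToNew[oldDocs[i]]);
            oldDocToFreq[oldDocs[i]] = freqs[i];
        }

        // Pass 2: walking the set bits yields new doc IDs in ascending order.
        List<int[]> out = new ArrayList<>();
        for (int doc = newDocList.nextSetBit(0); doc >= 0;
             doc = newDocList.nextSetBit(doc + 1)) {
            out.add(new int[] {doc, oldDocToFreq[newToOld[doc]]});
            newDocList.clear(doc); // leave the bitset empty for reuse
        }
        return out;
    }

    public static void main(String[] args) {
        int[] oldToNew = {3, 0, 4, 1, 2};
        int[] newToOld = {1, 3, 4, 0, 2};
        // Postings for one term: old docs 0, 2, 3 with freqs 5, 1, 7.
        for (int[] p : remap(new int[] {0, 2, 3}, new int[] {5, 1, 7},
                             oldToNew, newToOld)) {
            System.out.println("newDoc=" + p[0] + " freq=" + p[1]);
        }
    }
}
```

One thing the sketch makes visible: the second pass walks the bitset across the doc-ID range rather than across the postings only, so its cost depends on maxDoc as well as on the number of postings for the term. That trade-off is probably the main thing to weigh against sorting.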