Along the lines of LUCENE-550, what about having a MemoryIndex that accepts multiple documents and then, during close, writes the index once in the Lucene file format (so that it can be merged)?
When documents are added using an IndexWriter, a new segment is created for each document, and the segments are then periodically merged in memory and/or with disk segments. When constructing an index or updating a large number of documents in an existing index, this write, read, merge cycle seems inefficient; if the document and field information were maintained in sorted order (in TreeMaps), the process should be more efficient. With a memory index, the memory needed during an update would increase dramatically, but it could still be bounded: a disk-based index segment would be written whenever the memory index holds too many documents (max buffered documents). Does this sound like an improvement? Has anyone else tried something like this?
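To make the idea concrete, here is a minimal sketch of the buffering scheme in plain Java. It does not use any Lucene API: the class `SortedMemoryBuffer`, its methods, the whitespace tokenizer, and the in-memory "segment" (a list of strings standing in for a real segment file) are all hypothetical and only illustrate keeping postings sorted in TreeMaps and flushing once a max-buffered-documents limit is reached.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch, not part of Lucene: sorted in-memory postings with bounded buffering. */
final class SortedMemoryBuffer {
    // term -> (docId -> term frequency); TreeMaps keep terms and doc ids in sorted order.
    private final TreeMap<String, TreeMap<Integer, Integer>> postings = new TreeMap<>();
    // Stand-in for segments written to disk; a real version would write Lucene's file format.
    private final List<List<String>> flushedSegments = new ArrayList<>();
    private final int maxBufferedDocs;
    private int bufferedDocs = 0;
    private int nextDocId = 0; // assumption: doc ids are global here, not per segment

    SortedMemoryBuffer(int maxBufferedDocs) {
        if (maxBufferedDocs < 1) {
            throw new IllegalArgumentException("maxBufferedDocs must be >= 1");
        }
        this.maxBufferedDocs = maxBufferedDocs;
    }

    /** Adds one document (lowercased, split on whitespace) and flushes if the buffer is full. */
    void addDocument(String text) {
        int docId = nextDocId++;
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            postings.computeIfAbsent(token, t -> new TreeMap<>())
                    .merge(docId, 1, Integer::sum);
        }
        bufferedDocs++;
        if (bufferedDocs >= maxBufferedDocs) {
            flush();
        }
    }

    /** Emits the buffered postings as one sorted "segment" and clears the buffer. */
    void flush() {
        if (bufferedDocs == 0) {
            return;
        }
        List<String> segment = new ArrayList<>();
        for (Map.Entry<String, TreeMap<Integer, Integer>> e : postings.entrySet()) {
            segment.add(e.getKey() + " -> " + e.getValue());
        }
        flushedSegments.add(segment);
        postings.clear();
        bufferedDocs = 0;
    }

    /** Flushes whatever is still buffered, mirroring "write the index once during close". */
    void close() {
        flush();
    }

    List<List<String>> segments() {
        return flushedSegments;
    }
}
```

With `maxBufferedDocs` set to 2, adding three documents and then calling `close()` yields two segments: one for the first two documents and one for the third. Because the terms are already sorted when a segment is emitted, each flush is a single sequential pass; whether that actually beats the existing per-document segment merging would have to be measured.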