This question has two parts:
1. Scalability. How exactly does Lucene scale? Is the index distributed across multiple servers, or is there a single index on a shared filesystem that all Lucene servers read from? If it is the latter, I/O will be the bottleneck. Either way, please elaborate on how Lucene scales and how you set it up.
2. High availability. How would one go about making Lucene redundant?