On Wed, 2015-09-02 at 08:30 -0700, Erick Erickson wrote:
> Because I routinely see 50M docs on a single node and I've seen over 300M docs
> on a single node with sub-second responses.

For what it's worth, we also do article-based search of newspaper-based material (old OCR'ed papers). We use a single replicated shard for that and it works fine (response times < 1s for 98.5% of the searches), with faceting on 4 fields as well as grouping. There are 66M articles in a 340GB shard.

It is always hard to compare indexes, but I agree with Erick that having performance problems with 10M documents calls for locating the bottlenecks before trying to scale the problem away.

- Toke Eskildsen, State and University Library, Denmark