I think HathiTrust has a few terabytes of index. They do full-text search on 10 million books.
http://www.hathitrust.org/blogs/Large-scale-Search

wunder

On Apr 26, 2014, at 8:36 AM, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:

>> Anyone with experience, suggestions or lessons learned in the 10-100 TB
>> scale they'd like to share? Researching optimum design for a Solr Cloud
>> with, say, about 20TB index.
>
> We're building a web archive with a projected index size of 20TB (distributed
> in 20 shards). Some test results and a short write-up at
> http://sbdevel.wordpress.com/2013/12/06/danish-webscale/ - feel free to ask
> for more details.
>
> tl;dr: We're saying to hell with RAM for caching and putting it all on SSDs
> on a single big machine. Results so far (some distributed tests with 200GB &
> 400GB indexes, some single tests with a production-index of 1TB) are very
> promising, both for plain keyword-search, grouping and faceting (DocValues
> rocks).
>
> - Toke Eskildsen
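For anyone following along: the "DocValues rocks" remark above refers to enabling docValues on the fields you facet or group on, which keeps those per-document values in a column-oriented on-disk structure instead of the fieldCache on the JVM heap. A minimal schema.xml sketch (the field names and types here are illustrative, not taken from the actual web-archive setup):

```xml
<!-- Hypothetical schema.xml fragment: docValues-backed fields for
     faceting and grouping. Field names are examples only. -->
<field name="domain"     type="string" indexed="true" stored="true" docValues="true"/>
<field name="crawl_date" type="date"   indexed="true" stored="true" docValues="true"/>
```

With docValues enabled, faceting on these fields reads memory-mapped files rather than building large heap-resident caches, which is what makes the SSD-heavy, low-RAM design described above viable.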