I think HathiTrust has a few terabytes of index. They do full-text search on 
10 million books.

http://www.hathitrust.org/blogs/Large-scale-Search

wunder

On Apr 26, 2014, at 8:36 AM, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:

>> Anyone with experience, suggestions or lessons learned in the 10 -100 TB 
>> scale they'd like to share?
>> Researching optimum design for a Solr Cloud with, say, about 20TB index.
> 
> We're building a web archive with a projected index size of 20TB (distributed 
> in 20 shards). Some test results and a short write-up at 
> http://sbdevel.wordpress.com/2013/12/06/danish-webscale/ - feel free to ask 
> for more details.
> 
> tl;dr: We're saying to hell with RAM for caching and putting it all on SSDs 
> on a single big machine. Results so far (some distributed tests with 200GB & 
> 400GB indexes, some single tests with a production index of 1TB) are very 
> promising, for plain keyword search as well as for grouping and faceting 
> (DocValues rocks).
> 
> - Toke Eskildsen
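
The DocValues-backed faceting and grouping Toke mentions depend on the relevant fields being declared with docValues enabled in the Solr schema. A minimal sketch of such a declaration (field names and types are hypothetical, not taken from the thread):

```xml
<!-- schema.xml fragment: docValues="true" builds a column-oriented
     on-disk structure that Solr uses for faceting, grouping and
     sorting without loading the whole field into the JVM heap. -->
<field name="domain"     type="string" indexed="true" stored="true" docValues="true"/>
<field name="crawl_date" type="date"   indexed="true" stored="true" docValues="true"/>
```

Keeping these structures on disk rather than on the heap is part of what makes the SSD-heavy, RAM-light design described above plausible: fast random reads and the OS page cache stand in for large in-heap field caches.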


