On Wed, 2015-02-04 at 23:31 +0100, Arumugam, Suresh wrote:
> We are trying to do a POC for searching our log files with a single
> node Solr(396 GB RAM with 14 TB Space).

We're running an index of 7 billion documents, each larger than a typical
log entry, on a machine of similar size, and it serves our needs well:
https://sbdevel.wordpress.com/net-archive-search/

With your (I assume) tiny documents, the number 14 billion does not seem
too scary for your machine. Of course it depends on the types of queries
you are issuing and your requirements for throughput & latency.

Perhaps you could state your performance requirements as well as the
types of queries you will be issuing?


Besides the hard requirement of < 2 billion documents / shard, you are
free to choose your shard size. While the general advice of 100M/shard
is not bad, I would guess that 300-500M/shard could also work for you,
as having fewer shards lowers the merging overhead. What works best
also depends on the queries you make; faceting especially can be tricky
with a high number of documents / shard.
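
To make the shard arithmetic concrete, here is a rough sketch of how many
shards 14 billion documents would need at the sizes mentioned above. The
function name and the constants are my own illustration, not anything
from Solr; the 2 billion bound is just the round figure used in this
mail, not an exact limit.

```python
# Hypothetical helper for back-of-the-envelope shard planning.
# The bound below is the round "< 2 billion documents / shard" figure
# from this mail, used as an assumption rather than an exact limit.
HARD_LIMIT_PER_SHARD = 2_000_000_000

# Total document count discussed in this thread.
TOTAL_DOCS = 14_000_000_000


def shards_needed(total_docs: int, docs_per_shard: int) -> int:
    """Return the number of shards needed to hold total_docs."""
    if docs_per_shard <= 0:
        raise ValueError("docs_per_shard must be positive")
    if docs_per_shard >= HARD_LIMIT_PER_SHARD:
        raise ValueError("docs_per_shard must stay below the per-shard limit")
    # Integer ceiling division, avoids floating point rounding.
    return -(-total_docs // docs_per_shard)


if __name__ == "__main__":
    for docs_per_shard in (100_000_000, 300_000_000, 500_000_000):
        count = shards_needed(TOTAL_DOCS, docs_per_shard)
        print(f"{docs_per_shard:>11,} docs/shard -> {count} shards")
```

So 100M/shard means 140 shards on the one machine, while 300-500M/shard
brings that down to somewhere between 47 and 28. Whether the larger
shards still perform acceptably is something only a test with your own
queries can show.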

- Toke Eskildsen, State and University Library, Denmark

