Re: Indexing throughput

2018-05-02 Thread Shawn Heisey
> th a defined list of values. The expected peak indexing
> throughput is to be about *5* documents per second (expected to be done
> at off-peak hours so that search requests will be minimal during this time)
> and the average throughput around *1* documents (normal business
> hours)

Re: Indexing throughput

2018-05-02 Thread Greenhorn Techie
. Basically the indexing throughput is gated by two things:
1> the number of shards. Indexing throughput essentially scales up reasonably linearly with the number of shards.
2> the indexing program that pushes data to Solr. Before thinking Solr is the bottleneck, check how fast your ETL process is p

Re: Indexing throughput

2018-05-02 Thread Erick Erickson
I've seen 1.5 M docs/second. Basically the indexing throughput is gated by two things:
1> the number of shards. Indexing throughput essentially scales up reasonably linearly with the number of shards.
2> the indexing program that pushes data to Solr. Before thinking Solr is the bottleneck,
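
One way to check the second point is to time the indexing program with the Solr call taken out, so you know how many documents per second it can produce on its own. The following is only a rough sketch: `fake_etl` is a hypothetical stand-in for whatever actually reads and transforms your data, and the field names are made up for illustration.

```python
import time
from typing import Iterable, Iterator, Tuple


def measure_etl_rate(docs: Iterable[dict]) -> Tuple[int, float]:
    """Drain the document source without sending anything to Solr.

    Returns (document count, documents per second).
    """
    start = time.perf_counter()
    count = 0
    for _ in docs:
        count += 1
    elapsed = time.perf_counter() - start
    # Guard against a zero-length interval on very small inputs.
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, rate


def fake_etl(n: int) -> Iterator[dict]:
    """Hypothetical ETL source; replace with your real extract/transform step."""
    for i in range(n):
        yield {"id": str(i), "category_s": "A" if i % 2 == 0 else "B"}


if __name__ == "__main__":
    count, rate = measure_etl_rate(fake_etl(100_000))
    print(f"ETL alone produced {count} docs at about {rate:,.0f} docs/second")
```

If the rate measured this way is already below the target throughput, adding shards will not help, because the indexing program cannot feed Solr fast enough.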

Re: Indexing throughput

2018-05-02 Thread Walter Underwood
> r our production cluster is 20 nodes, each
> with 24 cores and 256 GB memory. Data being indexed is very structured in
> nature and is about 30 columns or so, half of which are
> categorical with a defined list of values. The expected peak indexing
> throughput is to be about *5

Indexing throughput

2018-05-02 Thread Greenhorn Techie
Hi, The current hardware profile for our production cluster is 20 nodes, each with 24 cores and 256 GB memory. Data being indexed is very structured in nature and is about 30 columns or so, half of which are categorical with a defined list of values. The expected peak indexing