Re: Indexing 700 docs per second

2016-04-20 Thread Mark Robinson
Thank you all for your very valuable suggestions. I will try out the options shared once our setup is ready and will probably report back on my experience once that is done. Thanks! Mark. On Wed, Apr 20, 2016 at 9:54 AM, Bram Van Dam wrote: > > I have a requirement to index

Re: Indexing 700 docs per second

2016-04-20 Thread Bram Van Dam
> I have a requirement to index (mainly updates) 700 docs per second. > Suppose I have a 128GB RAM, 32 CPU machine, with each doc size around 260 > bytes (6 fields, of which only 2 will be updated at the above > rate). This collection has around 122 million docs and that count is pretty >

Re: Indexing 700 docs per second

2016-04-19 Thread Jeff Wartes
I have no numbers to back this up, but I’d expect Atomic Updates to be slightly slower than a full update, since the atomic approach has to retrieve the fields you didn't specify before it can write the new (updated) document. On 4/19/16, 11:54 AM, "Tim Robertson"
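
To make the difference concrete, here is a rough sketch of the two request bodies in Solr's JSON update format. The `{"set": value}` modifier is Solr's atomic-update syntax; the field names (`price`, `stock`, `title`), the `id` unique key, and the helper functions are hypothetical and only for illustration.

```python
import json


def atomic_update(doc_id, changes):
    """Build an atomic-update doc: only the listed fields are sent.

    Solr has to look up the other stored fields itself before it
    rewrites the document, which is the extra work described above.
    """
    doc = {"id": doc_id}  # assumes the uniqueKey field is called "id"
    for field, value in changes.items():
        doc[field] = {"set": value}
    return doc


def full_update(doc_id, fields):
    """Build a full-update doc: the client sends every field."""
    doc = {"id": doc_id}
    doc.update(fields)
    return doc


# Hypothetical example values.
atomic_body = json.dumps([atomic_update("doc-1", {"price": 99, "stock": 12})])
full_body = json.dumps(
    [full_update("doc-1", {"title": "widget", "price": 99, "stock": 12})]
)
```

The atomic body is smaller on the wire, but the server does a read before the write; which one wins in practice would need measuring.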

Re: Indexing 700 docs per second

2016-04-19 Thread Tim Robertson
Hi Mark, We were inserting and updating docs of around 20-25 indexed fields (mainly INTs, but some Strings and multivalued fields) at >1000/sec on far less powerful hardware, with a total of 600 million docs (batch updates, of course), while also serving live queries for a website which had about 30

Re: Indexing 700 docs per second

2016-04-19 Thread Erick Erickson
Make very sure you batch updates though. Here's a benchmark I ran: https://lucidworks.com/blog/2015/10/05/really-batch-updates-solr-2/ NOTE: it's not entirely clear that you want to put 122M docs on a single shard. Depending on the queries you'll run you may want 2 or more shards, but that
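
A minimal sketch of what batching means on the client side: group documents into chunks and send one update request per chunk rather than one per document. The `batches` helper and the batch size of 100 are assumptions for illustration, not numbers from the benchmark.

```python
from itertools import islice


def batches(docs, size):
    """Yield successive lists of at most `size` docs from any iterable."""
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# One second's worth of updates at the rate mentioned in this thread.
one_second_of_docs = [{"id": str(i)} for i in range(700)]

# Number of update requests needed per second for each approach.
requests_unbatched = len(list(batches(one_second_of_docs, 1)))
requests_batched = len(list(batches(one_second_of_docs, 100)))
```

With these assumed numbers the request count per second drops from 700 to 7; the actual throughput gain depends on the setup.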

Re: Indexing 700 docs per second

2016-04-19 Thread Susheel Kumar
It sounds achievable with your machine configuration, and I would suggest trying atomic updates. Use SolrJ with multi-threaded indexing for a higher indexing rate. Thanks, Susheel On Tue, Apr 19, 2016 at 9:27 AM, Tom Evans wrote: > On Tue, Apr 19, 2016 at 10:25
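
The multi-threaded idea can be sketched without SolrJ. The following is a Python illustration of the pattern only, not SolrJ code: several worker threads each send whole batches. `index_concurrently` and `fake_send` are hypothetical names, and `fake_send` stands in for a real HTTP POST to the update handler.

```python
from concurrent.futures import ThreadPoolExecutor


def index_concurrently(batch_list, send_batch, workers=4):
    """Send each batch on a worker thread; return the total docs sent.

    `send_batch` must accept one batch (a list of docs) and return the
    number of docs it sent.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(send_batch, batch_list))


def fake_send(batch):
    # Placeholder for the real network call; just reports the batch size.
    return len(batch)


total_sent = index_concurrently([[1, 2], [3], [4, 5, 6]], fake_send, workers=2)
```

The worker count is a tuning knob; 4 is an arbitrary default here, not a recommendation from this thread.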

Re: Indexing 700 docs per second

2016-04-19 Thread Tom Evans
On Tue, Apr 19, 2016 at 10:25 AM, Mark Robinson wrote: > Hi, > > I have a requirement to index (mainly updates) 700 docs per second. > Suppose I have a 128GB RAM, 32 CPU machine, with each doc size around 260 > bytes (6 fields, of which only 2 will be updated at

Indexing 700 docs per second

2016-04-19 Thread Mark Robinson
Hi, I have a requirement to index (mainly updates) 700 docs per second. Suppose I have a 128GB RAM, 32 CPU machine, with each doc size around 260 bytes (6 fields, of which only 2 will be updated at the above rate). This collection has around 122 million docs and that count is pretty much
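
For scale, a back-of-envelope calculation from the figures in this message. It counts raw document bytes only and reads "260 byes" as 260 bytes; the on-disk index size and per-request overhead are not covered and would need measuring.

```python
DOCS_PER_SECOND = 700
DOC_SIZE_BYTES = 260  # assumes the stated doc size is in bytes
TOTAL_DOCS = 122_000_000

# Raw payload arriving per second at the stated update rate.
bytes_per_second = DOCS_PER_SECOND * DOC_SIZE_BYTES

# Raw size of all documents, ignoring any index overhead.
raw_corpus_bytes = TOTAL_DOCS * DOC_SIZE_BYTES

print(f"{bytes_per_second / 1000:.0f} kB/s of raw updates")   # 182 kB/s
print(f"{raw_corpus_bytes / 1e9:.2f} GB of raw documents")    # 31.72 GB
```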