RE: Hardware config for SOLR

2008-09-21 Thread Andrey Shulinskiy
Grant,

Thanks a lot for the answers. Please see my replies below.

  1) Should we do sharding or not?
  If we start without sharding, how hard will it be to enable it?
  Is it just some config changes + the index rebuild or is it more?
 
 There will be operations setup, etc.  And you'll have to add in the
 appropriate query stuff.
 
 Your install and requirements aren't that large, so I doubt you'll
 need sharding, but it always depends on your exact configuration.
 I've seen indexes as big as 80 million docs on a single machine, but
 the docs were smaller in size.
 
  My personal opinion is to go without sharding at first and enable it
  later if we do get a lot of documents.
 
 Sounds reasonable.
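
(For reference, the "query stuff" Grant mentions is, in Solr 1.3,
essentially the shards parameter sent with each search request -- the
hostnames below are made up:

  http://box1:8983/solr/select?q=foo&shards=box1:8983/solr,box2:8983/solr

Each box holds one shard of the index, and whichever box receives the
request fans the query out to the others and merges the results.)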

One more question - is it worth trying to keep the whole index in
memory and sharding when it doesn't fit anymore? To me it seems like a
bit of overhead, but I may be very wrong here.
What's a recommended ratio between the parts kept in RAM and those on
the HDDs?

  2) How should we organize our clusters to ensure redundancy?
 
  Should we have 2 or more identical Masters (means that all the
  updates/optimisations/etc. are done for every one of them)?
 
  An alternative, afaik, is to reconfigure one slave to become the new
  Master, how hard is that?
 
 I don't have a good answer here, maybe someone else can chime in.  I
 know master failover is a concern, but I'm not sure how others handle
 it right now.  Would be good to have people share their approach.
 That being said, it seems reasonable to me to have identical masters.

I found this thread related to this issue:
http://www.nabble.com/High-Availability-deployment-to13094489.html#a13098729

I guess it depends on how easily we can fill the gap between the last
commit and the time the Master goes down. Most likely, we'll have to
have 2 Masters.


  3) Basically, we can get servers of two kinds:
  * Single Processor, Dual Core Opteron 2214HE
  * 2 GB DDR2 SDRAM
  * 1 x 250 GB (7200 RPM) SATA Drive(s)
 
  * Dual Processor, Quad Core 5335
  * 16 GB Memory (Fully Buffered)
  * 2 x 73 GB (10k RPM) 2.5" SAS Drive(s), RAID 1
 
  The second - more powerful - one is more expensive, of course.
 
 Get as much RAM as you can afford.  Surely there is an in-between
 machine as well that might balance cost and capabilities.  The first
 machine seems a bit light, especially in memory.

Fair enough.

  How can we take advantage of the multiprocessor/multicore servers?
 
  Is there some special setup required to make, say, 2 instances of SOLR
  run on the same server using different processors/cores?
 
 See the Core Admin stuff http://wiki.apache.org/solr/CoreAdmin.  Solr
 is thread-safe by design (so it's a bug, if you hit issues).  You can
 send it documents on multiple threads and it will be fine.

Hmmm, it seems that several cores are supposed to handle different
indexes:
http://wiki.apache.org/solr/MultipleIndexes#head-e517417ef9b96e32168b2cf35ab6ff393f360d59
 Solr1.3 added support for multiple Solr Cores in a single
deployment of Solr -- each Solr Core has its own index. For more
information please see CoreAdmin.
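
(For concreteness, a minimal solr.xml for such a multicore deployment
in Solr 1.3 -- the core names here are made up:

  <solr persistent="true">
    <cores adminPath="/admin/cores">
      <core name="core0" instanceDir="core0" />
      <core name="core1" instanceDir="core1" />
    </cores>
  </solr>

Each core gets its own instanceDir with its own conf/ and index.)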

As we are going to have just one index, the only way to use this that I
see is to configure a Master on Core1 and a Slave on Core2, or 2 slaves
on 2 cores.

Am I missing something here?

  4) Does it make much difference to get a more powerful Master?
 
  Or, on the contrary, as slaves will be queried more often, should they
  be the better ones? Maybe just the HDDs for the slaves should be as
  fast as possible?
 
 Depends on where your bottlenecks are.  Are you getting a lot of
 queries or a lot of updates?

Both, but more queries than updates. That means we shouldn't neglect
the slaves, I guess?


 As for HDDs, people have noted some nice speedups in Lucene using
 Solid-state drives, if you can afford them.  Fast I/O is good if
 you're retrieving whole documents, but once things are warmed up more
 RAM is most important, I think, as many things can be cached.


  5) How many slaves does it make sense to have per one Master?
  What's (roughly) the performance gain from 1 to 2, 2 to 3, etc.?
  When does it stop making sense to add more slaves?
 
 I suppose it's when you can handle your peak load, but I don't have
 numbers.  One of the keys is to incrementally test and see what makes
 sense for your scenario.

Right, the numbers given in other responses (thanks Karl and Lars) look
impressive, so we'll consider this option.

  As far as I understand, it depends mainly on the size of the index.
  However, I'd guess the time required to do a push to too many slaves
  can be a problem too, correct?
 
 The biggest problem for slaves is if the master does an optimization,
 in which case the whole snapshot must be downloaded, whereas
 incremental additions can be handled by getting just the deltas.

Our initial idea is to send batch updates several times per day rather
than individual real-time updates, then commit and run an optimization
after each batch, as advised here:
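
(A sketch of that batch flow using SolrJ, the Java client that ships
with Solr 1.3 -- the URL and the batch-loading helper are hypothetical:

  import java.util.ArrayList;
  import java.util.List;
  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.common.SolrInputDocument;

  public class BatchIndexer {
    public static void main(String[] args) throws Exception {
      SolrServer master = new CommonsHttpSolrServer("http://master:8983/solr");
      List<SolrInputDocument> batch = loadNextBatch(); // app-specific source
      master.add(batch);   // one request for the whole batch
      master.commit();     // make the batch searchable
      master.optimize();   // merges segments; see the snapshot note below
    }

    // Hypothetical stand-in for wherever the documents actually come from.
    static List<SolrInputDocument> loadNextBatch() {
      List<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "doc-1");          // hypothetical schema fields
      doc.addField("text", "example body");
      docs.add(doc);
      return docs;
    }
  }

Note that the optimize() at the end is exactly what triggers the
full-snapshot download on the slaves that Grant warns about above.)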

Re: Hardware config for SOLR

2008-09-21 Thread Otis Gospodnetic
Hi Andrey,

Responses inlined.



----- Original Message -----
 From: Andrey Shulinskiy [EMAIL PROTECTED]
 To: solr-user@lucene.apache.org
 Sent: Sunday, September 21, 2008 11:23:00 PM
 Subject: RE: Hardware config for SOLR
 
 [...]
 
 One more question - is it worth trying to keep the whole index in
 memory and sharding when it doesn't fit anymore? To me it seems like a
 bit of overhead, but I may be very wrong here.
 What's a recommended ratio between the parts kept in RAM and those on
 the HDDs?

It's well worth trying to keep the index buffered (i.e. in memory).  Yes, once 
you can't fit the hot parts of the index in RAM it's time to think about 
sharding (or buying more RAM).  However, it's not as simple as looking at the 
index size and RAM size, as not all parts of the index need to be cached.

 [...]
 
 I guess it depends on how easily we can fill the gap between the last
 commit and the time the Master goes down. Most likely, we'll have to
 have 2 Masters.

Or you could simply have 2 masters and index the same data on both of them.
Then, in case #1 fails, you simply get your slaves to start copying from #2.
You could have the slaves talk to the master via an LB VIP, so a switch from
#1 to #2 can be done quickly in the LB and the slaves don't have to be
changed.  Or you could have the masters keep the index on some sort of
shared storage (e.g. a SAN).
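
(To illustrate the LB VIP idea: the slaves are configured against the
VIP hostname rather than a physical master. With the rsync-based
replication scripts that ship with Solr 1.3, that hostname lives in the
slaves' conf/scripts.conf; with the newer Java ReplicationHandler, the
equivalent solrconfig.xml entry would look roughly like this -- the
hostname is made up:

  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="slave">
      <str name="masterUrl">http://master-vip.example.com:8983/solr/replication</str>
      <str name="pollInterval">00:05:00</str>
    </lst>
  </requestHandler>

Failing over from master #1 to #2 then means repointing the VIP, with
no slave-side changes.)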

 [...]
 
 Hmmm, it seems that several cores are supposed to handle different
 indexes:
 http://wiki.apache.org/solr/MultipleIndexes#head-e517417ef9b96e32168b2cf35ab6ff393f360d59

Yes.

  Solr1.3 added support for multiple Solr Cores in a single
 deployment of Solr -- each Solr Core has its own index. For more
 information please see CoreAdmin.
 
 As we are going to have just one index, the only way to use this that I
 see is to configure a Master on Core1 and a Slave on Core2, or 2 slaves
 on 2 cores.
 
 Am I missing something here?

It sounds like you are talking about a single server hosting both the master
and the slave(s).  That's not what you want to do, though: master and
slave(s) each live on their own server.  But I think you are aware of this.
You don't need to think about Solr's multicore functionality if you have but
a single index.


Re: Hardware config for SOLR

2008-09-20 Thread Otis Gospodnetic
I have not worked with SSDs, though I've read all the good information that's
trickling to us from Denmark.  One thing that I've been wondering all along is:
what about writes wearing out the SSD?  How quickly does that happen and, when
it does, what are the symptoms?  For example, does it happen after N write
operations?  Do writes start failing, and does one start getting IOExceptions
in the case of Lucene and Solr?


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message -----
 From: Karl Wettin [EMAIL PROTECTED]
 To: solr-user@lucene.apache.org
 Sent: Friday, September 19, 2008 6:15:53 PM
 Subject: Re: Hardware config for SOLR
 
 I've seen the average response time cut 5-10 times when switching to
 SSD. [...]



Re: Hardware config for SOLR

2008-09-20 Thread Lars Kotthoff
 [...] what about writes wearing out the SSD?  How quickly does that
 happen and, when it does, what are the symptoms?

With modern SSDs you get something in the region of 500,000 to 1,000,000 write
cycles per memory cell. Additionally, they all use wear leveling, i.e. the
writes are spread over the whole disk, so you can write to a given file system
block many more times. One of the manufacturers of high-end SSDs [1] claims
that at a sustained write rate of 50GB per day their drives will last more than
140 years, i.e. it's much more likely that something else will fail first ;)

When the write cycles are exhausted, much the same thing happens as with a bad
conventional disk -- you'll see lots of write errors. If the wear leveling is
perfect (i.e. all memory locations have exactly the same number of writes),
it's even possible that the whole disk will fail at once.
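
(Back-of-envelope, under idealized assumptions: a 64GB drive whose cells
take 1,000,000 write cycles can absorb about 64GB x 1,000,000 =
64,000,000 GB of writes with perfect wear leveling. At 50GB/day that's
roughly 1,280,000 days, or ~3,500 years, so the manufacturer's 140-year
figure still leaves a large margin for write amplification and imperfect
leveling.)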

Lars

[1] http://www.mtron.net


Re: Hardware config for SOLR

2008-09-19 Thread Grant Ingersoll

Inline below.

On Sep 17, 2008, at 6:32 PM, Andrey Shulinskiy wrote:

 Hello,

 First, some numbers we're expecting.
 - The average size of a doc: ~100K
 - The number of indexes: 1
 - The query response time we're looking for: 200 - 300ms
 - The number of stored docs:
   1st year: 500K - 1M
   2nd year: 2-3M
 - The estimated number of concurrent users per second:
   1st year: 15 - 25
   2nd year: 40 - 60
 - The estimated number of queries:
   1st year: 15 - 25
   2nd year: 40 - 60

 Now the questions

 1) Should we do sharding or not?
 If we start without sharding, how hard will it be to enable it?
 Is it just some config changes + the index rebuild or is it more?

There will be operations setup, etc.  And you'll have to add in the
appropriate query stuff.

Your install and requirements aren't that large, so I doubt you'll
need sharding, but it always depends on your exact configuration.
I've seen indexes as big as 80 million docs on a single machine, but
the docs were smaller in size.

 My personal opinion is to go without sharding at first and enable it
 later if we do get a lot of documents.

Sounds reasonable.

 2) How should we organize our clusters to ensure redundancy?
 Should we have 2 or more identical Masters (meaning that all the
 updates/optimisations/etc. are done on every one of them)?
 An alternative, afaik, is to reconfigure one slave to become the new
 Master; how hard is that?

I don't have a good answer here, maybe someone else can chime in.  I
know master failover is a concern, but I'm not sure how others handle
it right now.  Would be good to have people share their approach.
That being said, it seems reasonable to me to have identical masters.

 3) Basically, we can get servers of two kinds:

 * Single Processor, Dual Core Opteron 2214HE
 * 2 GB DDR2 SDRAM
 * 1 x 250 GB (7200 RPM) SATA Drive(s)

 * Dual Processor, Quad Core 5335
 * 16 GB Memory (Fully Buffered)
 * 2 x 73 GB (10k RPM) 2.5" SAS Drive(s), RAID 1

 The second - more powerful - one is more expensive, of course.

Get as much RAM as you can afford.  Surely there is an in-between
machine as well that might balance cost and capabilities.  The first
machine seems a bit light, especially in memory.

 How can we take advantage of the multiprocessor/multicore servers?
 Is there some special setup required to make, say, 2 instances of SOLR
 run on the same server using different processors/cores?

See the Core Admin stuff: http://wiki.apache.org/solr/CoreAdmin.  Solr
is thread-safe by design (so it's a bug if you hit issues).  You can
send it documents on multiple threads and it will be fine.
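
(A sketch of what multi-threaded indexing looks like with SolrJ, Solr
1.3's Java client -- the URL, thread count, document count and field
names are all made up:

  import java.util.concurrent.*;
  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.common.SolrInputDocument;

  public class ParallelIndexer {
    public static void main(String[] args) throws Exception {
      // CommonsHttpSolrServer is thread-safe, so one shared instance is fine.
      final SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
      ExecutorService pool = Executors.newFixedThreadPool(4); // e.g. one per CPU core
      for (int i = 0; i < 1000; i++) {
        final int n = i;
        pool.submit(new Runnable() {
          public void run() {
            try {
              SolrInputDocument doc = new SolrInputDocument();
              doc.addField("id", "doc-" + n);          // hypothetical schema fields
              doc.addField("text", "body of doc " + n);
              server.add(doc);
            } catch (Exception e) {
              e.printStackTrace();
            }
          }
        });
      }
      pool.shutdown();
      pool.awaitTermination(1, TimeUnit.HOURS);
      server.commit(); // single commit once all adds are done
    }
  }
)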







 4) Does it make much difference to get a more powerful Master?
 Or, on the contrary, as slaves will be queried more often, should they
 be the better ones? Maybe just the HDDs for the slaves should be as
 fast as possible?

Depends on where your bottlenecks are.  Are you getting a lot of
queries or a lot of updates?

As for HDDs, people have noted some nice speedups in Lucene using
solid-state drives, if you can afford them.  Fast I/O is good if
you're retrieving whole documents, but once things are warmed up more
RAM is most important, I think, as many things can be cached.

 5) How many slaves does it make sense to have per one Master?
 What's (roughly) the performance gain from 1 to 2, 2 to 3, etc.?
 When does it stop making sense to add more slaves?

I suppose it's when you can handle your peak load, but I don't have
numbers.  One of the keys is to incrementally test and see what makes
sense for your scenario.

 As far as I understand, it depends mainly on the size of the index.
 However, I'd guess the time required to do a push to too many slaves
 can be a problem too, correct?

The biggest problem for slaves is if the master does an optimization,
in which case the whole snapshot must be downloaded, whereas
incremental additions can be handled by getting just the deltas.
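
(The stock way to ship those snapshots in Solr 1.3 is the rsync-based
collection distribution scripts: snapshooter runs on the master after
each commit/optimize -- usually wired up as a postCommit/postOptimize
listener in solrconfig.xml -- and each slave runs snappuller followed
by snapinstaller from cron, with the scripts taking their settings
from conf/scripts.conf.)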



HTH,
Grant

--
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ


Re: Hardware config for SOLR

2008-09-19 Thread Karl Wettin


On 19 Sep 2008, at 23.22, Grant Ingersoll wrote:

As for HDDs, people have noted some nice speedups in Lucene using  
Solid-state drives, if you can afford them.


I've seen the average response time cut 5-10 times when switching to
SSD. A 64GB SSD starts at EUR 200, so it can be a lot cheaper to
replace the disk than to get more servers, given you can fit your
index on one of them.



 karl


Re: Hardware config for SOLR

2008-09-19 Thread Lars Kotthoff
  As for HDDs, people have noted some nice speedups in Lucene using
  Solid-state drives, if you can afford them.
 
 I've seen the average response time cut 5-10 times when switching to
 SSD. [...]

For some concrete numbers, see http://wiki.statsbiblioteket.dk/summa/Hardware

Lars


Re: Hardware config for SOLR

2008-09-18 Thread Matthew Runo
I can't speak to a lot of this - but regarding the servers, I'd go with
the more powerful ones, if only for the amount of RAM. Your index will
likely be larger than 1 gig, and with only two gigs you'll have a lot of
your index not held in RAM, which will slow down your QPS.


Thanks for your time!

Matthew Runo
Software Engineer, Zappos.com
[EMAIL PROTECTED] - 702-943-7833

On Sep 17, 2008, at 3:32 PM, Andrey Shulinskiy wrote:


 [...]


RE: Hardware config for SOLR

2008-09-18 Thread Andrey Shulinskiy
Matthew,

Thanks, a very good point.

Andrey.

 -----Original Message-----
 From: Matthew Runo [mailto:[EMAIL PROTECTED]]
 Sent: Thursday, September 18, 2008 11:38 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Hardware config for SOLR
 
 I can't speak to a lot of this - but regarding the servers, I'd go
 with the more powerful ones, if only for the amount of RAM. [...]



Hardware config for SOLR

2008-09-17 Thread Andrey Shulinskiy
Hello,

We're planning to use SOLR for our project, got some questions.

So I asked some Qs yesterday and got no answers whatsoever. Wondering if
they didn't make sense, or if the e-mail was too long... :-)
Anyway, I'll try to ask them again and hope for some answers this time.
It's a very new experience for us, so any help is really appreciated.

First, some numbers we're expecting.
- The average size of a doc: ~100K
- The number of indexes: 1
- The query response time we're looking for: 200 - 300ms
- The number of stored docs:
  1st year: 500K - 1M
  2nd year: 2-3M
- The estimated number of concurrent users per second:
  1st year: 15 - 25
  2nd year: 40 - 60
- The estimated number of queries:
  1st year: 15 - 25
  2nd year: 40 - 60
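
(For scale: 500K-1M docs at ~100K each is roughly 50-100 GB of raw
document data in year one, and 200-300 GB by year two. The Lucene index
over that text is typically a good deal smaller than the raw corpus,
unless the full content is also stored in the index for retrieval.)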

 

Now the questions

1) Should we do sharding or not?
If we start without sharding, how hard will it be to enable it?
Is it just some config changes + the index rebuild or is it more?
My personal opinion is to go without sharding at first and enable it
later if we do get a lot of documents.

2) How should we organize our clusters to ensure redundancy?
Should we have 2 or more identical Masters (meaning that all the
updates/optimisations/etc. are done on every one of them)?
An alternative, afaik, is to reconfigure one slave to become the new
Master; how hard is that?

3) Basically, we can get servers of two kinds:

* Single Processor, Dual Core Opteron 2214HE
* 2 GB DDR2 SDRAM
* 1 x 250 GB (7200 RPM) SATA Drive(s)

* Dual Processor, Quad Core 5335
* 16 GB Memory (Fully Buffered)
* 2 x 73 GB (10k RPM) 2.5" SAS Drive(s), RAID 1

The second - more powerful - one is more expensive, of course.

How can we take advantage of the multiprocessor/multicore servers?
Is there some special setup required to make, say, 2 instances of SOLR
run on the same server using different processors/cores?

 

4) Does it make much difference to get a more powerful Master?
Or, on the contrary, as slaves will be queried more often, should they
be the better ones? Maybe just the HDDs for the slaves should be as fast
as possible?

5) How many slaves does it make sense to have per one Master?
What's (roughly) the performance gain from 1 to 2, 2 to 3, etc.?
When does it stop making sense to add more slaves?
As far as I understand, it depends mainly on the size of the index.
However, I'd guess the time required to do a push to too many slaves
can be a problem too, correct?

Thanks,
Andrey.