Hi Daniel,

well, I assume there is a performance difference on host B between

a) getting some ready-made segments from host A (master, taking care of indexing) to host B (slave, taking care of answering queries)

and

b) host B (along with host A) doing all the work necessary to prepare incoming SolrDocument objects into a segment and make it searchable.

I am talking here about a setup where during peak loads the CPUs on host B are sweating at >80% and I assume the following:

i) Indexing will draw more than 20% CPU and would thereby start competing with query answering.

ii) Merely copying finished segments to the query-answering node will not draw more than 20% CPU and will thereby not compete with query answering.
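
To make the arithmetic behind i) and ii) concrete, here is a minimal sketch. Only the 80% query load comes from the setup described above; the 30% and 5% figures for indexing and segment copying are made-up placeholders, not measurements.

```python
def competes_with_queries(query_cpu, extra_cpu, capacity=1.0):
    """Return True if the extra work pushes total CPU demand past capacity."""
    return query_cpu + extra_cpu > capacity


QUERY_PEAK = 0.80     # from the message: CPUs on host B at >80% during peaks
INDEXING = 0.30       # assumption: indexing needs more than the 20% headroom
SEGMENT_COPY = 0.05   # assumption: copying finished segments needs far less

indexing_competes = competes_with_queries(QUERY_PEAK, INDEXING)
copying_competes = competes_with_queries(QUERY_PEAK, SEGMENT_COPY)

print("indexing competes with queries:", indexing_competes)
print("segment copy competes with queries:", copying_competes)
```

Whether the real numbers fall on either side of the 20% headroom is exactly what would have to be measured on host B.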

Index consistency is not an issue: given the number of documents and the number of different, hard-to-get-at sources we will be indexing, the sources will always be out of sync with the index anyway. Adding an hour or two here is the least of my problems.

Harald.

On 30.07.2014 11:58, Daniel Collins wrote:
Working backwards slightly, what do you think SolrCloud is going to give
you, apart from the consistency of the index (which you want to turn off)?
What are "all the other benefits of SolrCloud", if you are querying
separate instances that aren't guaranteed to be in sync (since you want to
use the traditional-style master-slave for indexing)?

And secondly, why don't you want to use SolrCloud for indexing everywhere?
Again, what do you think master-slave methodology gains you? You have
said you want all the resources of the slaves to be for querying, which
makes sense, but the slaves have to get the new updates somehow, surely?
Whether that is from SolrCloud directly, or via a master-slave replication,
the work has to be done at some point?

If you don't have NRT, and you set your commit frequency to something
reasonably large, then I don't see the "cost" of SolrCloud, but I guess it
depends on the frequency of your updates.


On 30 July 2014 08:22, Harald Kirsch <harald.kir...@raytion.com> wrote:

Thanks Erick,

for the confirmation.

You say "traditional" but the docs call it "legacy". Not being a native
speaker, I might misinterpret the meaning slightly, but to me it conveys the
notion of "don't use this stuff if you don't have to".


"SolrCloud indexes to all nodes all the time, there's no real way to turn
that off."

which is really a pity when only query-load must be scaled and NRT is not
necessary. :-/

Harald.


On 29.07.2014 18:16, Erick Erickson wrote:

bq: What if I don't need NRT and in particular want the slave to use all
resources for query answering, i.e. only the master shall index. But at the
same time I want all the other benefits of SolrCloud.

You want all the benefits of SolrCloud without... using SolrCloud?

Your only two choices are traditional master/slave or SolrCloud. SolrCloud
indexes to all nodes all the time, there's no real way to turn that off.
You _can_ control the frequency of commits but you can't turn off the
indexing to all the nodes.

FWIW,
Erick


On Tue, Jul 29, 2014 at 5:41 AM, Mikhail Khludnev <mkhlud...@griddynamics.com> wrote:

I never did it, but always liked the idea.

http://lucene.472066.n3.nabble.com/Best-practice-for-rebuild-index-in-SolrCloud-td4054574.html
From time to time such recipes are mentioned on the list.


On Tue, Jul 29, 2014 at 12:39 PM, Harald Kirsch <harald.kir...@raytion.com> wrote:

Hi all,

from the Solr documentation I find two options for how replication of an
index is handled:

a) SolrCloud indexes on master and all slaves in parallel to support NRT
(near realtime search)

b) Legacy replication where only the master does the indexing and the slaves
receive index copies once in a while.
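
As a rough illustration of option b): in legacy replication a slave's replication handler can be steered over HTTP, for example to pause polling during peak load and pull a fresh index copy afterwards. The sketch below only builds the request URLs. The host name, port and core name are assumptions for illustration; `disablepoll`, `enablepoll` and `fetchindex` are commands of Solr's replication handler.

```python
from urllib.parse import urlencode


def replication_url(base_url, core, command, **params):
    """Build a URL for a core's /replication handler with the given command."""
    query = urlencode({"command": command, **params})
    return f"{base_url}/solr/{core}/replication?{query}"


SLAVE = "http://host-b:8983"   # hypothetical query-answering node
CORE = "collection1"           # hypothetical core name

pause_polling = replication_url(SLAVE, CORE, "disablepoll")
pull_index = replication_url(SLAVE, CORE, "fetchindex")
resume_polling = replication_url(SLAVE, CORE, "enablepoll")

for url in (pause_polling, pull_index, resume_polling):
    print(url)
```

Sending these requests (for instance with `urllib.request.urlopen`) against a real slave would be the next step; how often to pull is a trade-off between index freshness and load on the query node.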

What if I don't need NRT and in particular want the slave to use all
resources for query answering, i.e. only the master shall index. But at the
same time I want all the other benefits of SolrCloud.

Is this setup possible? Is it somewhere described in the docs?

Thanks,
Harald.




--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
   <mkhlud...@griddynamics.com>


