Re: SolrCloud Shard console shows roughly same number of documents?

Erick Erickson Thu, 26 May 2016 11:31:12 -0700

Q1: Not quite sure what you mean. Let's say I have 2 shards with 3
replicas each and 16 docs in each shard. I _think_ you're
talking about the "core selector", which shows the docs on that
particular core: 16 in our case, not 48.

Q2: Yes, that's how SolrCloud is designed. It has to be for HA/DR.
Every replica in a shard has all the docs for that shard, 16 as above.
Otherwise, if one of your machines went down, there would be no way to
even attempt a guarantee against data loss.
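
To make the arithmetic concrete, here is a minimal sketch of the counts
in the example above. The class and method names are made up for
illustration (they are not part of Solr), and it assumes the docs are
spread evenly across shards:

```java
// Hypothetical helper, not a Solr API: document-count arithmetic for a
// collection whose docs are assumed to be evenly spread across shards.
class ReplicaCounts {
    // Distinct docs in the whole collection (what a distributed query counts).
    static long collectionTotal(int shards, long docsPerShard) {
        return shards * docsPerShard;
    }

    // Copies held by one shard: every replica stores all of that shard's docs.
    static long copiesPerShard(int replicasPerShard, long docsPerShard) {
        return replicasPerShard * docsPerShard;
    }

    // Copies stored across every core in the cluster.
    static long totalCopies(int shards, int replicasPerShard, long docsPerShard) {
        return shards * copiesPerShard(replicasPerShard, docsPerShard);
    }

    public static void main(String[] args) {
        int shards = 2;
        int replicasPerShard = 3;
        long docsPerShard = 16;

        // Each core shows only its own shard's docs.
        System.out.println("per core:          " + docsPerShard);
        System.out.println("collection total:  " + collectionTotal(shards, docsPerShard));
        System.out.println("copies per shard:  " + copiesPerShard(replicasPerShard, docsPerShard));
        System.out.println("copies in cluster: " + totalCopies(shards, replicasPerShard, docsPerShard));
    }
}
```

So a single core reports 16, one shard holds 48 copies across its 3
replicas, and the collection as a whole has 32 distinct docs.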

Q3: Yes, indexing will be slower when there is more than one replica
per shard, since the raw document is forwarded from the leader to all
followers before the leader acks back. With multiple shards, though,
you (potentially) have a bunch more machines doing the indexing, so
total throughput can be higher.

Why do you care? Is there a problem or is this just general background
info? There are a number of techniques for speeding up indexing. The
first is to use SolrJ and CloudSolrClient and send batches of docs at
once rather than one at a time.
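
A rough sketch of the batching idea, in plain Java so it runs without
Solr on the classpath. BatchIndexer, sendInBatches, and the batch size of
1000 are assumptions for illustration only. With SolrJ, the sink would be
roughly a call to CloudSolrClient's add(collection, docs) with a list of
SolrInputDocument, wrapped to handle its checked exceptions:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of SolrJ: groups docs into batches and
// hands each batch to a sink instead of sending docs one at a time.
class BatchIndexer {
    // Sends docs to the sink in batches of at most batchSize.
    // Returns the number of batches sent.
    static <T> int sendInBatches(List<T> docs, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        for (int start = 0; start < docs.size(); start += batchSize) {
            int end = Math.min(start + batchSize, docs.size());
            // Copy the slice so the sink gets a list independent of the source.
            sink.accept(new ArrayList<>(docs.subList(start, end)));
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            docs.add("doc-" + i);
        }
        // Stand-in sink that only reports the batch size. A real indexer
        // would send the batch to Solr here.
        int sent = sendInBatches(docs, 1000,
                batch -> System.out.println("sending " + batch.size() + " docs"));
        System.out.println(sent + " batches sent");
    }
}
```

The right batch size depends on document size and hardware, so treat
1000 as a starting point to measure from, not a recommendation.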

Best,
Erick

On Wed, May 25, 2016 at 1:54 PM, Siddhartha Singh Sandhu
<sandhus...@gmail.com> wrote:
> Hi,
>
> I recently moved to a SolrCloud config. I had a few questions:
>
> Q1. Does a shard show cumulative number of documents or documents present
> in that particular shard on the admin console of respective shard?
>
> Q2. If 1's answer is non-cumulative then my shards(on different servers)
> are indexing all the documents on each instance of shard. Is this natural?
> I created the shards with compositeId.
>
> Q3. If the answer to 1 is cumulative then my indexing was slower than a
> single core instance which was on the same machine, of which I have 2
> now (my shards). What could I be missing while configuring Solr?
>
>
> I am using Solr 6.0.0 on Ubuntu 14.04 with external zookeeper.
>
> Regards,
>
> Sid.
