Well, two thoughts:

1. If you’re not using SolrCloud, presumably you don’t have any replicas. If 
you are, presumably you do. This makes for a biased comparison, because 
SolrCloud won’t acknowledge a write until it’s been safely written to all 
replicas. In short, SolrCloud write time is max(per-replica write time). The 
more replicas you add, the bigger the chance that some replica randomly takes 
longer (a GC pause, perhaps?), and the longer your overall write time, assuming 
a fixed number of indexing threads.
2. The parallelism of the optimize operation across replicas has gone back and 
forth a bit, and I’m not sure what it was doing in 4.9. However, at one point 
the optimize happened per-replica, serially. So it’d do shard1_replica1, then 
when that was done, do shard1_replica2, then shard2_replica1, etc. Other 
versions of Solr would do those at the same time. Again, I don’t know if you’re 
comparing to a non-replicated Solr index, but that could explain some of the 
difference.
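
To make the two points above concrete, here is a rough Python sketch. It is a toy model, not a measurement of Solr: the function names are made up for illustration, and the latency numbers (10 ms for a normal write, 200 ms with a 1% chance for something like a GC pause) are assumptions I picked only to show the shape of the effect.

```python
import random


def ack_time(replica_times):
    # A write is acknowledged only after every replica has it,
    # so the slowest replica determines the acknowledged time.
    return max(replica_times)


def optimize_time(replica_times, serial):
    # Serial: replicas are optimized one after another (sum of times).
    # Parallel: all replicas are optimized at once (slowest one wins).
    return sum(replica_times) if serial else max(replica_times)


def mean_ack_time(num_replicas, trials=10_000, seed=0):
    # Hypothetical latency model (an assumption, not Solr data):
    # each replica write takes 10 ms, or 200 ms with 1% probability.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        times = [200.0 if rng.random() < 0.01 else 10.0
                 for _ in range(num_replicas)]
        total += ack_time(times)
    return total / trials


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, "replicas: mean ack time", round(mean_ack_time(n), 2), "ms")
    # Four replicas that each take 30 (arbitrary units) to optimize:
    print("serial optimize:  ", optimize_time([30, 30, 30, 30], serial=True))
    print("parallel optimize:", optimize_time([30, 30, 30, 30], serial=False))
```

Under this model the mean acknowledged write time creeps up as replicas are added, even though no single replica got slower, and a serial optimize costs roughly the sum of the per-replica times rather than the maximum.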

There’s a sort of obligatory comment at this point that optimize doesn’t 
necessarily save you a lot. There are certainly cases where it does, but if you 
haven’t already, you’ll want to validate that yours is one of them and that 
you’re not just doing unnecessary work.


On 7/12/16, 7:41 AM, "Kent Mu" <solr.st...@gmail.com> wrote:

>hello, does anybody also come across the issue? can anybody help me?
>
>2016-07-11 23:17 GMT+08:00 Kent Mu <solr.st...@gmail.com>:
>
>> Hi friends!
>>
>> solr version: 4.9.0.
>>
>> we use solr and solrcloud in our project, that means we use solr and
>> solrcloud at the same time.
>> but we find a phenomenon that solrcloud consumes more time than solr when
>> writing the index. it takes nearly 5 or more times longer. I wonder why that is?
>>
>> in our project, we have a scheduler job to add index, and then execute
>> the method of "optimize(false, true, 2)" to optimize the added index.
>> I wonder if it is caused by solrcloud internals: when writing the index,
>> solrcloud needs to decide which shard it should be stored in? and when
>> optimizing, the replica needs to take some time to synchronize the data
>> from the leader?
>>
>> and I wonder what about query?  will solrcloud also take more time than
>> solr when query data?
>>
