During replication, check disk, network, and CPU utilization. One of them
will be the bottleneck.

If the disk is at 100%, you are OK: the hardware is the limit. The same goes
for the network at 100%. If neither is saturated but there is heavy CPU use
(up to 100% of one core), then Solr itself is the bottleneck and needs more
performance work.
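
Any OS-level monitor will show these numbers. As a rough sketch only,
assuming the Python psutil package is installed (the five-second sampling
window is an arbitrary choice), run something like this on the slave while
it is pulling the index:

# Rough utilization sampler for a replica during replication.
# Assumes the psutil package (pip install psutil).
import psutil

SAMPLE_SECONDS = 5

disk0 = psutil.disk_io_counters()
net0 = psutil.net_io_counters()
# cpu_percent() with an interval blocks and measures over that window
per_core = psutil.cpu_percent(interval=SAMPLE_SECONDS, percpu=True)
disk1 = psutil.disk_io_counters()
net1 = psutil.net_io_counters()

disk_mb_s = (disk1.write_bytes - disk0.write_bytes) / SAMPLE_SECONDS / 1e6
net_mb_s = (net1.bytes_recv - net0.bytes_recv) / SAMPLE_SECONDS / 1e6

print("busiest core: %.0f%% CPU" % max(per_core))
print("disk writes:  %.1f MB/s" % disk_mb_s)
print("network in:   %.1f MB/s" % net_mb_s)
# Disk or network throughput at the hardware limit means the hardware is
# the bottleneck; neither saturated but one core near 100% points at Solr.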

We are using New Relic for monitoring. That makes this sort of check very easy.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 8, 2017, at 8:24 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> 
> On 3/8/2017 5:30 AM, Caruana, Matthew wrote:
>> After upgrading to 6.4.2 from 6.4.1, we’ve seen replication time for a
>> 200GB index decrease from 45 hours to 1.5 hours.
> 
> Just to check how long it takes to move a large amount of data over a
> network, I started a copy of a 32GB directory over a 100Mb/s network
> using a Windows client and a Samba server.  It said it would take 50
> minutes.  At this rate, copying 200GB would take over five hours.  This
> is quite a bit longer than I expected, but I hadn't done the math to
> check transfer rate against size.
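
(As a sanity check on the arithmetic above, assuming decimal gigabytes and
the ideal line rate with no protocol overhead, so a real SMB copy runs
somewhat slower:

# Back-of-the-envelope transfer times at a given link speed.
def transfer_minutes(size_gb, link_mbps):
    bytes_total = size_gb * 1e9          # decimal gigabytes
    bytes_per_sec = link_mbps * 1e6 / 8  # megabits -> bytes per second
    return bytes_total / bytes_per_sec / 60

print("%.0f minutes for 32GB at 100Mb/s" % transfer_minutes(32, 100))
print("%.1f hours for 200GB at 100Mb/s" % (transfer_minutes(200, 100) / 60))

That gives about 43 minutes and 4.4 hours, consistent with the 50 minutes
and "over five hours" above once overhead is added.)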
> 
> Assuming that you actually intended to use the word "replication" there
> (and not something like "rebuild"), this tells me that your network is
> considerably faster than 100 megabits per second, probably gigabit, and
> that the bottleneck is the speed of the disks.
> 
> I see a previous thread where you asked about optimization performance,
> so it sounds like you are optimizing the master index, which causes a
> full replication to slaves.  This is one of the reasons that
> optimization is generally not recommended except on very small indexes
> or indexes that do not change very often.
> 
> Thanks,
> Shawn
> 
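
For reference, the kind of explicit optimize call Shawn is warning about
looks like this (a sketch assuming the Python requests package; host, port,
and core name are placeholders):

# Triggering an optimize (full merge) on a core; placeholders throughout.
import requests

SOLR_URL = "http://localhost:8983/solr/mycore/update"

# optimize=true asks Solr to merge the index down to a single segment,
# which is why the slaves then have to replicate the entire index.
resp = requests.get(SOLR_URL, params={"optimize": "true"})
resp.raise_for_status()
print(resp.text)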
