On 10/30/2013 1:49 PM, Shalom Ben-Zvi Kazaz wrote:
We are continuously getting this exception during replication from
master to slave. Our index size is 9.27 GB and we are trying to replicate
a slave from scratch.
It's a different file each time. Sometimes we get to 60% replication
before it fails and sometimes only 10%; we have never managed a successful
replication.

<snip>

this is the master setup:

<requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="master">
      <str name="replicateAfter">commit</str>
      <str name="replicateAfter">startup</str>
      <str name="confFiles">stopwords.txt,spellings.txt,synonyms.txt,protwords.txt,elevate.xml,currency.xml</str>
      <str name="commitReserveDuration">00:00:50</str>
    </lst>
</requestHandler>

I assume that you're probably doing commits fairly often, resulting in a lot of merge activity that frequently deletes segments. If a segment file is deleted on the master while the slave is still fetching it, the replication fails, which would explain why a different file is involved each time. That "commitReserveDuration" parameter needs to be made larger. I would imagine that it takes a lot more than 50 seconds to do the replication - even if you've got an extremely fast network, replicating 9.27 GB probably takes several minutes.

From the wiki page on replication: "If your commits are very frequent and network is particularly slow, you can tweak an extra attribute <str name="commitReserveDuration">00:00:10</str>. This is roughly the time taken to download 5MB from master to slave. Default is 10 secs."

http://wiki.apache.org/solr/SolrReplication#Master

You've said that your network is not slow, but with that much data, all networks are slow.
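
A minimal sketch of the change, reusing the master section quoted above. The 00:10:00 value is only an assumption for illustration, not a measured number; time an actual full copy between the two hosts and pick something comfortably longer than that:

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="master">
      <str name="replicateAfter">commit</str>
      <str name="replicateAfter">startup</str>
      <str name="confFiles">stopwords.txt,spellings.txt,synonyms.txt,protwords.txt,elevate.xml,currency.xml</str>
      <!-- Assumed value (HH:MM:SS): raised from 00:00:50 so the commit point
           stays reserved while the slave is still copying files. -->
      <str name="commitReserveDuration">00:10:00</str>
    </lst>
</requestHandler>
```

The core needs to be reloaded (or Solr restarted) on the master for the new value to take effect.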

Thanks,
Shawn