On 10/30/2013 1:49 PM, Shalom Ben-Zvi Kazaz wrote:
We are continuously getting this exception during replication from
master to slave. Our index size is 9.27 GB and we are trying to replicate
a slave from scratch.
It's a different file each time. Sometimes we get to 60% replication
before it fails and sometimes only 10%. We have never managed a successful
replication.
<snip>
This is the master setup:
<requestHandler name="/replication" class="solr.ReplicationHandler" >
<lst name="master">
<str name="replicateAfter">commit</str>
<str name="replicateAfter">startup</str>
<str name="confFiles">stopwords.txt,spellings.txt,synonyms.txt,protwords.txt,elevate.xml,currency.xml</str>
<str name="commitReserveDuration">00:00:50</str>
</lst>
</requestHandler>
I assume that you're doing commits fairly often, resulting in a
lot of merge activity that frequently deletes segments. If the master
deletes a segment file before the slave has fetched it, the download
fails, which would explain why a different file fails each time. That
"commitReserveDuration" parameter needs to be made larger. I would
imagine that it takes a lot more than 50 seconds to do the replication.
Even if you've got an extremely fast network, replicating 9.27GB probably
takes several minutes.
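As a rough sketch, the master section with a longer reserve might look
like the following. The 00:10:00 value is only an illustrative
assumption on my part, not a measured number. Pick something comfortably
longer than a full replication actually takes on your network. The
format is the same HH:MM:SS you are already using, and the rest of
your handler config stays as it is.

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler" >
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="replicateAfter">startup</str>
    <!-- confFiles entry unchanged, omitted here for brevity -->
    <!-- Assumed value for illustration: 10 minutes instead of 50 seconds.
         Tune it to exceed your real full-index transfer time. -->
    <str name="commitReserveDuration">00:10:00</str>
  </lst>
</requestHandler>
```

Timing one full copy of the index directory between the two machines
should give you a reasonable lower bound for that value.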
From the wiki page on replication: "If your commits are very frequent
and network is particularly slow, you can tweak an extra attribute
<str name="commitReserveDuration">00:00:10</str>. This is roughly the
time taken to download 5MB from master to slave. Default is 10 secs."
http://wiki.apache.org/solr/SolrReplication#Master
You've said that your network is not slow, but with that much data, all
networks are slow.
Thanks,
Shawn