Re: Anyone having these Replication issues as well?

2011-05-18 Thread kenf_nc
Thanks, Markus, for your patience in getting the response through, as well as
for the comments.

This is my dev environment. I'm actually going to be setting up a new
master-slave configuration in a different environment today, so I'll see
whether the problem is environment-specific. One thing I didn't mention,
because I wasn't sure it was germane, is that these servers are in Amazon EC2.
Also, the master is currently on a 32-bit OS while the slaves are on 64-bit
OSes; that is just the order in which the servers are getting upgraded in dev.

The master has AutoCommit turned on at 30 second intervals. Even if nothing
is getting indexed, could an AutoCommit occurring during a replication
request cause a failed replication?

Ken

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Anyone-having-these-Replication-issues-as-well-tp2954365p2957127.html
Sent from the Solr - User mailing list archive at Nabble.com.


Anyone having these Replication issues as well?

2011-05-17 Thread kenf_nc
Is it just me or is Replication a POS? (Solr 1.4.1, Tomcat 6.0.32)

1) I had set my pollInterval to 60 seconds, but it appeared to fire constantly,
so I set it to 5 minutes. In the Tomcat logs I now see the replication check
fire at intervals of anywhere from 2 to 4 1/2 minutes, never anything remotely
consistent and never approaching 5 minutes. What kind of timer is being used,
a sundial?

2) When it does fire, it seems to run the check between slave and master
anywhere from 3 to 8 times for a single poll interval. I have 3 slaves and
1 master, and the master gets pounded by replication check queries: it
should get 3 every 5 minutes, but it gets up to 24 every couple of minutes.

3) Worst of all, there is a replication.properties file on the slaves. It
constantly shows errors, but the Tomcat logs on both the slaves and the
master are error free. Below is a representative sample. The timesFailed
number just keeps climbing. The one below went from 10 to 32 in about 8
minutes on the same server, even though it should only attempt once every
5 minutes.

#Replication details
#Tue May 17 17:10:00 EDT 2011
replicationFailedAtList= {some long string of large numbers}
previousCycleTimeInSeconds=0
timesFailed=10
indexReplicatedAtList= {some long string of large numbers}
indexReplicatedAt=130500335
replicationFailedAt=130500335
timesIndexReplicated=10
lastCycleBytesDownloaded=0
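
To watch these counters over time instead of eyeballing the file, a minimal
Python sketch along the following lines may help. It is an illustration only:
the function name parse_replication_properties is hypothetical, the parser
handles just simple key=value lines and # comments (not the full Java
properties syntax), and the sample text is copied from the excerpt above with
the two elided lists left out.

```python
def parse_replication_properties(text):
    """Parse simple key=value lines; skip blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no '=' at all
            props[key.strip()] = value.strip()
    return props


# Sample copied from the post; the elided *AtList entries are omitted.
SAMPLE = """\
#Replication details
#Tue May 17 17:10:00 EDT 2011
previousCycleTimeInSeconds=0
timesFailed=10
indexReplicatedAt=130500335
replicationFailedAt=130500335
timesIndexReplicated=10
lastCycleBytesDownloaded=0
"""

props = parse_replication_properties(SAMPLE)
failed = int(props["timesFailed"])
replicated = int(props["timesIndexReplicated"])
print("timesFailed=%d timesIndexReplicated=%d" % (failed, replicated))
```

Running something like this on a schedule and logging the two counters would
show how quickly timesFailed climbs relative to the configured poll interval.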

Keep in mind, replication actually works! If I add/update a document on the
master, I see it on the slaves eventually. So the errors above are especially
frustrating.

Any help on any or all of these issues would be greatly appreciated.
Thanks,
Ken


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Anyone-having-these-Replication-issues-as-well-tp2954365p2954365.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Anyone having these Replication issues as well?

2011-05-17 Thread Bill Bell
Sundial? Ha ha

Bill Bell
Sent from mobile


Re: Anyone having these Replication issues as well?

2011-05-17 Thread Markus Jelsma
Third and last attempt, Apache spam filter seems to hate me!

Hi,

I remember a reported issue on the mailing list mentioning the funky
interval you describe, but it had no replies. I've done several setups with
replication, one of which is a very high-load service with a pollInterval of 2
seconds. The other setups have a much higher interval. I've never seen this
behaviour before in any setup. Is there something else going on? Can you
reproduce this weird behaviour with the same index, software versions, etc. in a
development environment?
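
One thing that may be worth double-checking on the slaves: the Solr replication
documentation, as best recalled here, gives pollInterval in HH:mm:ss form
rather than as a plain number of seconds. The Python sketch below shows that
conversion for the intervals mentioned in this thread. The helper name
to_hh_mm_ss is made up for illustration and is not part of Solr.

```python
def to_hh_mm_ss(total_seconds):
    """Format a non-negative number of whole seconds as HH:mm:ss."""
    if total_seconds < 0:
        raise ValueError("interval must be non-negative")
    hours, remainder = divmod(int(total_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    return "%02d:%02d:%02d" % (hours, minutes, seconds)


print(to_hh_mm_ss(60))   # the 60-second interval from the original post
print(to_hh_mm_ss(300))  # the 5-minute interval
print(to_hh_mm_ss(2))    # the 2-second interval on the high-load setup
```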

About the number of failed replications in the replication.properties file: I
might not remember correctly, but I think this value is incremented when a
replication fails. A replication can fail when the slave is trying to download
a (large) list of large files while, in the meantime, the master merges some
segments. This specific issue can be remedied using the commitReserveDuration
replication property. However, if this occurs, there should be an exception in
your log.
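
As a rough way to pick a commitReserveDuration value, here is a small Python
sketch. It rests on an assumption taken from the Solr replication
documentation as remembered here: that the setting is described as roughly the
time needed to download 5 MB from master to slave, with a default of 00:00:10.
The function name suggest_commit_reserve and the throughput figure are
hypothetical, not measurements from the servers in this thread.

```python
import math

FIVE_MB = 5 * 1024 * 1024  # bytes; rule-of-thumb size, an assumption (see above)


def suggest_commit_reserve(bytes_per_second):
    """Return an HH:mm:ss string covering a 5 MB transfer at the given rate."""
    if bytes_per_second <= 0:
        raise ValueError("throughput must be positive")
    # Round up so the reserve is never shorter than the estimated transfer.
    seconds = int(math.ceil(FIVE_MB / float(bytes_per_second)))
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return "%02d:%02d:%02d" % (hours, minutes, secs)


# Hypothetical master-to-slave throughput of 512 KB/s.
print(suggest_commit_reserve(512 * 1024))
```

A slower link gives a longer suggested reserve; the result is only a starting
point to test against the real environment.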



Re: Anyone having these Replication issues as well?

2011-05-17 Thread Erick Erickson
Markus:

I've had much better luck with the spam filter after switching to plain text
rather than HTML-ized e-mail.

FWIW
Erick