On Thu, Nov 19, 2009 at 9:40 PM, Giovanni Di Milia <gdimi...@cfa.harvard.edu> wrote:
>
> On Nov 19, 2009, at 3:03 PM, Andrew Beekhof wrote:
>
> > Another problem has appeared:
> > after the reboot of one server I often have a cluster partition and both
> > servers elect themselves DC.
> > Even if the partition doesn't appear just after the reboot of one server
> > (i.e. serverA), if I try to restart corosync on the other server (i.e.
> > serverB), the partition appears.
> > Then if I also restart corosync on the first server (serverA) everything
> > works fine again.
> > But if I restart corosync on the second server (serverB) nothing changes and
> > the partition appears again.
> > It seems to me that there is still something wrong with the first run of
> > corosync just after the server reboot.
>
> I've found that it starts a bit too early by default.
> Various systems seem to like messing with the network stack (Xen is
> one but there are others) which confuses corosync.
>
> I wrote a shell script that "manually starts" corosync 5 minutes after the
> server starts and in this case the problem appears every time!
> It's driving me crazy, because I can see that my script starts a while after
> the server is up and I'm pretty sure everything is running!
> On the other hand, if I start corosync manually just after the server is up,
> everything works fine!

I wonder if there is something in the environment. Perhaps have your
script dump the output of env | sort to a file and compare it to the
logged-in case.

>
> You're not getting addresses from a DHCP server are you?
> That's another common cause, since there can be a significant delay in
> obtaining the address - which again messes with corosync.
>
> Absolutely not!
> I have two servers with static public IPs.
> I also added the two servers in the /etc/hosts file: in general I followed
> all the guidelines I found in the documentation.
>
> > I didn't configure any fencing method, because I think that my configuration
> > is really simple and I don't need it.
>
> Do you need your data though?
>
> Do you mean it's better to configure a fencing method anyway?

Yes.

_______________________________________________
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
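
A minimal sketch of the environment comparison suggested in the reply above. The helper names (dump_env, compare_env) and the file paths in the usage note are hypothetical and not part of corosync or Pacemaker; only env, sort and diff are assumed to be available.

```shell
#!/bin/sh

# Write the current environment, sorted by variable name, to the file
# given as the first argument.
dump_env() {
    env | sort > "$1"
}

# Print a unified diff of two environment dumps.
# diff exits with status 1 when the files differ, which is the
# interesting case here, so a non-zero status is not an error.
compare_env() {
    diff -u "$1" "$2"
}
```

One way to use it: call `dump_env /tmp/env.script` in the delayed start script just before it starts corosync, run `dump_env /tmp/env.login` from a logged-in shell where a manual start works, then run `compare_env /tmp/env.script /tmp/env.login`. Variables such as PATH or locale settings that appear on only one side are the first things worth checking.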