OK, I have changed that to no_path_retry fail, but I don't think this has
anything to do with the errors I am seeing. They seem to be related to
sbd's link with my cluster, not with disk I/O.

Tom

On 23/04/14 03:11 AM, emmanuel segura wrote:
> the first thing, you are using no_path_retry in wrong way in your
> multipath, try to read this
> http://www.novell.com/documentation/oes2/clus_admin_lx/data/bl9ykz6.html
>
>
> 2014-04-22 20:41 GMT+02:00 Tom Parker <[email protected]>:
>
>> I have attached the config files to this e-mail. The sbd dump is below
>>
>> [LIVE] qaxen1:~ # sbd -d /dev/mapper/qa-xen-sbd dump
>> ==Dumping header on disk /dev/mapper/qa-xen-sbd
>> Header version     : 2.1
>> UUID               : ae835596-3d26-4681-ba40-206b4d51149b
>> Number of slots    : 255
>> Sector size        : 512
>> Timeout (watchdog) : 45
>> Timeout (allocate) : 2
>> Timeout (loop)     : 1
>> Timeout (msgwait)  : 90
>> ==Header on disk /dev/mapper/qa-xen-sbd is dumped
>>
>> On 22/04/14 02:30 PM, emmanuel segura wrote:
>>> you are missing cluster configuration and sbd configuration and
>>> multipath config
>>>
>>>
>>> 2014-04-22 20:21 GMT+02:00 Tom Parker <[email protected]>:
>>>
>>>> Has anyone seen this? Do you know what might be causing the flapping?
>>>>
>>>> Apr 21 22:03:03 qaxen6 sbd: [12962]: info: Watchdog enabled.
>>>> Apr 21 22:03:03 qaxen6 sbd: [12973]: info: Servant starting for device /dev/mapper/qa-xen-sbd
>>>> Apr 21 22:03:03 qaxen6 sbd: [12974]: info: Monitoring Pacemaker health
>>>> Apr 21 22:03:03 qaxen6 sbd: [12973]: info: Device /dev/mapper/qa-xen-sbd uuid: ae835596-3d26-4681-ba40-206b4d51149b
>>>> Apr 21 22:03:03 qaxen6 sbd: [12974]: info: Legacy plug-in detected, AIS quorum check enabled
>>>> Apr 21 22:03:03 qaxen6 sbd: [12974]: info: Waiting to sign in with cluster ...
>>>> Apr 21 22:03:04 qaxen6 sbd: [12971]: notice: Using watchdog device: /dev/watchdog
>>>> Apr 21 22:03:04 qaxen6 sbd: [12971]: info: Set watchdog timeout to 45 seconds.
>>>> Apr 21 22:03:04 qaxen6 sbd: [12974]: info: Waiting to sign in with cluster ...
>>>> Apr 21 22:03:06 qaxen6 sbd: [12974]: info: We don't have a DC right now.
>>>> Apr 21 22:03:08 qaxen6 sbd: [12974]: WARN: Node state: UNKNOWN
>>>> Apr 21 22:03:09 qaxen6 sbd: [12974]: info: Node state: online
>>>> Apr 21 22:03:09 qaxen6 sbd: [12971]: info: Pacemaker health check: OK
>>>> Apr 21 22:03:10 qaxen6 sbd: [12974]: WARN: Node state: pending
>>>> Apr 21 22:03:11 qaxen6 sbd: [12974]: info: Node state: online
>>>> Apr 21 22:15:01 qaxen6 sbd: [12974]: WARN: AIS: Quorum outdated!
>>>> Apr 21 22:15:01 qaxen6 sbd: [12971]: WARN: Pacemaker health check: UNHEALTHY
>>>> Apr 21 22:16:37 qaxen6 sbd: [12974]: info: Node state: online
>>>> Apr 21 22:16:37 qaxen6 sbd: [12971]: info: Pacemaker health check: OK
>>>> Apr 21 22:25:08 qaxen6 sbd: [12974]: WARN: AIS: Quorum outdated!
>>>> Apr 21 22:25:08 qaxen6 sbd: [12971]: WARN: Pacemaker health check: UNHEALTHY
>>>> Apr 21 22:26:44 qaxen6 sbd: [12974]: info: Node state: online
>>>> Apr 21 22:26:44 qaxen6 sbd: [12971]: info: Pacemaker health check: OK
>>>> Apr 21 22:39:24 qaxen6 sbd: [12974]: WARN: AIS: Quorum outdated!
>>>> Apr 21 22:39:24 qaxen6 sbd: [12971]: WARN: Pacemaker health check: UNHEALTHY
>>>> Apr 21 22:42:44 qaxen6 sbd: [12974]: info: Node state: online
>>>> Apr 21 22:42:44 qaxen6 sbd: [12971]: info: Pacemaker health check: OK
>>>> Apr 22 01:36:24 qaxen6 sbd: [12974]: WARN: AIS: Quorum outdated!
>>>> Apr 22 01:36:24 qaxen6 sbd: [12971]: WARN: Pacemaker health check: UNHEALTHY
>>>> Apr 22 01:36:34 qaxen6 sbd: [12974]: info: Node state: online
>>>> Apr 22 01:36:34 qaxen6 sbd: [12971]: info: Pacemaker health check: OK
>>>> Apr 22 06:53:15 qaxen6 sbd: [12974]: WARN: AIS: Quorum outdated!
>>>> Apr 22 06:53:15 qaxen6 sbd: [12971]: WARN: Pacemaker health check: UNHEALTHY
>>>> Apr 22 06:54:03 qaxen6 sbd: [12974]: info: Node state: online
>>>> Apr 22 06:54:03 qaxen6 sbd: [12971]: info: Pacemaker health check: OK
>>>> Apr 22 09:57:21 qaxen6 sbd: [12974]: WARN: AIS: Quorum outdated!
>>>> Apr 22 09:57:21 qaxen6 sbd: [12971]: WARN: Pacemaker health check: UNHEALTHY
>>>> Apr 22 09:58:12 qaxen6 sbd: [12974]: info: Node state: online
>>>> Apr 22 09:58:12 qaxen6 sbd: [12971]: info: Pacemaker health check: OK
>>>> Apr 22 10:59:49 qaxen6 sbd: [12974]: WARN: AIS: Quorum outdated!
>>>> Apr 22 10:59:49 qaxen6 sbd: [12971]: WARN: Pacemaker health check: UNHEALTHY
>>>> Apr 22 11:00:41 qaxen6 sbd: [12974]: info: Node state: online
>>>> Apr 22 11:00:41 qaxen6 sbd: [12971]: info: Pacemaker health check: OK
>>>> Apr 22 11:50:55 qaxen6 sbd: [12974]: WARN: AIS: Quorum outdated!
>>>> Apr 22 11:50:55 qaxen6 sbd: [12971]: WARN: Pacemaker health check: UNHEALTHY
>>>> Apr 22 11:51:06 qaxen6 sbd: [12974]: info: Node state: online
>>>> Apr 22 11:51:06 qaxen6 sbd: [12971]: info: Pacemaker health check: OK
>>>> Apr 22 13:09:12 qaxen6 sbd: [12974]: WARN: AIS: Quorum outdated!
>>>> Apr 22 13:09:12 qaxen6 sbd: [12971]: WARN: Pacemaker health check: UNHEALTHY
>>>> Apr 22 13:09:35 qaxen6 sbd: [12974]: info: Node state: online
>>>> Apr 22 13:09:35 qaxen6 sbd: [12971]: info: Pacemaker health check: OK
>>>> Apr 22 13:31:35 qaxen6 sbd: [12974]: WARN: AIS: Quorum outdated!
>>>> Apr 22 13:31:35 qaxen6 sbd: [12971]: WARN: Pacemaker health check: UNHEALTHY
>>>> Apr 22 13:31:44 qaxen6 sbd: [12974]: info: Node state: online
>>>> Apr 22 13:31:44 qaxen6 sbd: [12971]: info: Pacemaker health check: OK
>>>> Apr 22 13:32:52 qaxen6 sbd: [12974]: WARN: AIS: Quorum outdated!
>>>> Apr 22 13:32:52 qaxen6 sbd: [12971]: WARN: Pacemaker health check: UNHEALTHY
>>>> Apr 22 13:33:01 qaxen6 sbd: [12974]: info: Node state: online
>>>> Apr 22 13:33:01 qaxen6 sbd: [12971]: info: Pacemaker health check: OK
>>>> Apr 22 13:44:39 qaxen6 sbd: [12974]: WARN: AIS: Quorum outdated!
>>>> Apr 22 13:44:39 qaxen6 sbd: [12971]: WARN: Pacemaker health check: UNHEALTHY
>>>> Apr 22 13:44:47 qaxen6 sbd: [12974]: info: Node state: online
>>>> Apr 22 13:44:47 qaxen6 sbd: [12971]: info: Pacemaker health check: OK
>>>> Apr 22 14:07:42 qaxen6 sbd: [12974]: WARN: AIS: Quorum outdated!
>>>> Apr 22 14:07:42 qaxen6 sbd: [12971]: WARN: Pacemaker health check: UNHEALTHY
>>>> Apr 22 14:07:51 qaxen6 sbd: [12974]: info: Node state: online
>>>> Apr 22 14:07:51 qaxen6 sbd: [12971]: info: Pacemaker health check: OK
>>>>
>>>> _______________________________________________
>>>> Linux-HA mailing list
>>>> [email protected]
>>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>>> See also: http://linux-ha.org/ReportingProblems
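
P.S. In case it helps anyone quantify the flapping in the log quoted above,
here is a rough sketch that pairs each "Pacemaker health check: UNHEALTHY"
line with the next "OK" line and reports how long each window lasted. The
function name unhealthy_windows and the regular expression are my own
illustration, not part of sbd. The year is an assumption too, since syslog
timestamps do not carry one. The sample lines are taken from the log above,
with the mail client's line wrapping undone.

```python
import re
from datetime import datetime

# Matches sbd health-check lines in the syslog format shown in this thread.
# The pattern is an illustration, not something sbd itself provides.
HEALTH_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ sbd: \[\d+\]: "
    r"\w+: Pacemaker health check: (?P<state>OK|UNHEALTHY)"
)


def unhealthy_windows(lines, year=2014):
    """Yield (start, end, seconds) for each UNHEALTHY -> OK transition.

    `year` is an assumption: syslog timestamps do not include one.
    """
    started = None
    for line in lines:
        m = HEALTH_RE.match(line.strip())
        if not m:
            continue  # not a health-check line
        ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
        if m.group("state") == "UNHEALTHY":
            if started is None:
                started = ts  # remember when the window opened
        elif started is not None:
            yield started, ts, int((ts - started).total_seconds())
            started = None


# A few unwrapped lines from the log in this thread.
SAMPLE = """\
Apr 21 22:15:01 qaxen6 sbd: [12974]: WARN: AIS: Quorum outdated!
Apr 21 22:15:01 qaxen6 sbd: [12971]: WARN: Pacemaker health check: UNHEALTHY
Apr 21 22:16:37 qaxen6 sbd: [12974]: info: Node state: online
Apr 21 22:16:37 qaxen6 sbd: [12971]: info: Pacemaker health check: OK
Apr 21 22:39:24 qaxen6 sbd: [12971]: WARN: Pacemaker health check: UNHEALTHY
Apr 21 22:42:44 qaxen6 sbd: [12971]: info: Pacemaker health check: OK
""".splitlines()

if __name__ == "__main__":
    for start, end, secs in unhealthy_windows(SAMPLE):
        print(f"{start:%b %d %H:%M:%S} -> {end:%H:%M:%S}  {secs}s")
```

Run against the full /var/log/messages (after filtering for "sbd:"), this
would give the duration of every unhealthy window, which may be easier to
compare against the watchdog and msgwait timeouts in the sbd dump than
reading the raw log.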
