pfu,
I just read through your email with all the information, only to conclude
that I cannot help much, as I'm using the new v2-style configuration
(which is cib.xml and OCF resource agents).
Perhaps there's somebody else who can help you with your problem.
If you try the XML-based configuration method, feel free to ask
me about my setup!
cheers,
raoul bhatia
Peter Petroff wrote:
I'm doing my first-ever install of DRBD + heartbeat and am having an
issue with DRBD. Any help would be great.
Let's say I lose power to the primary: the secondary will not become
primary, so my haresources resource group fails.
This is the status before I pull the power on trixbox1.local
SVN Revision: 2947 build by [EMAIL PROTECTED], 2007-09-29 06:28:57
0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
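The status line above packs three fields together: connection state (cs), roles (st) and disk states (ds), each as local/peer. A minimal sketch for pulling them apart, assuming the DRBD 8.0 line format shown here; `drbd_field` is a hypothetical helper name, not part of DRBD:

```shell
#!/bin/sh
# drbd_field NAME LINE: print the value of NAME (cs, st or ds) from a
# /proc/drbd status line. Hypothetical helper, not shipped with DRBD.
drbd_field() {
    printf '%s\n' "$2" | sed -n "s/.*[[:space:]]$1:\([^[:space:]]*\).*/\1/p"
}

# Status line copied from the output above.
line=' 0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---'
echo "connection: $(drbd_field cs "$line")"
echo "roles:      $(drbd_field st "$line")"
echo "disks:      $(drbd_field ds "$line")"
```

On a live node the line could be taken from `grep ' 0: ' /proc/drbd` instead of the hard-coded sample.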
Here is the log on trixbox2 after a power loss on trixbox1.local:
Oct 8 12:38:48 trixbox2 heartbeat: [2400]: info: Link
trixbox1.local:eth1 dead.
Oct 8 12:38:48 trixbox2 heartbeat: [2895]: info: No local resources
[/usr/share/heartbeat/ResourceManager listkeys trixbox2.local] to
acquire.
Oct 8 12:38:48 trixbox2 harc[2894]: info: Running /etc/ha.d/rc.d/status
status
Oct 8 12:38:48 trixbox2 kernel: tg3: eth1: Link is up at 100 Mbps, full
duplex.
Oct 8 12:38:48 trixbox2 kernel: tg3: eth1: Flow control is off for TX
and off for RX.
Oct 8 12:38:48 trixbox2 mach_down[2923]: info: Taking over resource
group xxx.xxx.xxx.xxx/27/eth0
Oct 8 12:38:48 trixbox2 ResourceManager[2949]: info: Acquiring resource
group: trixbox1.local xxx.xxx.xxx.xxx/27/eth0 drbddisk::drbd0
Filesystem::/dev/drbd0::/share::ext3 mysqld sendmail asterisk httpd ircd
xinetd
Oct 8 12:38:48 trixbox2 IPaddr[2976]: INFO: Resource is stopped
Oct 8 12:38:48 trixbox2 ResourceManager[2949]: info: Running
/etc/ha.d/resource.d/IPaddr xxx.xxx.xxx.xxx/27/eth0 start
Oct 8 12:38:48 trixbox2 IPaddr[3076]: INFO: Using calculated netmask
for xxx.xxx.xxx.xxx: 255.255.255.224
Oct 8 12:38:48 trixbox2 IPaddr[3076]: INFO: eval ifconfig eth0:0
xxx.xxx.xxx.xxx netmask 255.255.255.224 broadcast 70.97.159.127
Oct 8 12:38:48 trixbox2 IPaddr[3047]: INFO: Success
Oct 8 12:38:48 trixbox2 ResourceManager[2949]: info: Running
/etc/ha.d/resource.d/drbddisk drbd0 start
Oct 8 12:38:49 trixbox2 kernel: drbd0: PingAck did not arrive in time.
Oct 8 12:38:49 trixbox2 kernel: drbd0: peer( Secondary -> Unknown )
conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Oct 8 12:38:49 trixbox2 kernel: drbd0: asender terminated
Oct 8 12:38:49 trixbox2 kernel: drbd0: disk( UpToDate -> Outdated )
Oct 8 12:38:49 trixbox2 kernel: drbd0: outdate-peer helper broken,
returned 0
Oct 8 12:38:49 trixbox2 kernel: drbd0: State change failed: Refusing to
be Primary without at least one UpToDate disk
Oct 8 12:38:49 trixbox2 kernel: drbd0: state = { cs:NetworkFailure
st:Secondary/Unknown ds:Outdated/DUnknown r--- }
Oct 8 12:38:49 trixbox2 kernel: drbd0: wanted = { cs:NetworkFailure
st:Primary/Unknown ds:Outdated/DUnknown r--- }
Oct 8 12:38:49 trixbox2 kernel: drbd0: short read expecting header on
sock: r=-512
Oct 8 12:38:49 trixbox2 kernel: drbd0: tl_clear()
Oct 8 12:38:49 trixbox2 kernel: drbd0: Connection closed
Oct 8 12:38:49 trixbox2 kernel: drbd0: Writing meta data super block
now.
Oct 8 12:38:49 trixbox2 kernel: drbd0: conn( NetworkFailure ->
Unconnected )
Oct 8 12:38:49 trixbox2 kernel: drbd0: receiver terminated
Oct 8 12:38:49 trixbox2 kernel: drbd0: receiver (re)started
Oct 8 12:38:49 trixbox2 kernel: drbd0: conn( Unconnected ->
WFConnection )
Oct 8 12:38:50 trixbox2 kernel: drbd0: State change failed: Refusing to
be Primary without at least one UpToDate disk
Oct 8 12:38:50 trixbox2 kernel: drbd0: state = { cs:WFConnection
st:Secondary/Unknown ds:Outdated/DUnknown r--- }
Oct 8 12:38:50 trixbox2 kernel: drbd0: wanted = { cs:WFConnection
st:Primary/Unknown ds:Outdated/DUnknown r--- }
Oct 8 12:38:54 trixbox2 ResourceManager[2949]: CRIT: Giving up
resources due to failure of drbddisk::drbd0
Oct 8 12:38:54 trixbox2 ResourceManager[2949]: info: Releasing resource
group: trixbox1.local xxx.xxx.xxx.xxx/27/eth0 drbddisk::drbd0
Filesystem::/dev/drbd0::/share::ext3 mysqld sendmail asterisk httpd ircd
xinetd
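In the log above, the line that matters for the failed takeover is the CRIT message naming the resource heartbeat gave up on. A small sketch for pulling that resource name out of a heartbeat log, assuming each message sits on a single line in the real log file (the lines in this mail are wrapped); `failed_resource` is a hypothetical helper name:

```shell
#!/bin/sh
# failed_resource: read heartbeat log lines on stdin and print the resource
# named in each "Giving up resources due to failure of ..." message.
failed_resource() {
    sed -n 's/.*Giving up resources due to failure of \([^[:space:]]*\).*/\1/p'
}

# Sample line copied from the log above, unwrapped to a single line.
printf '%s\n' 'Oct 8 12:38:54 trixbox2 ResourceManager[2949]: CRIT: Giving up resources due to failure of drbddisk::drbd0' | failed_resource
```

With the ha.cf shown further down, the same function could be fed from the configured log file: `failed_resource < /var/log/ha-log`.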
Here is my setup. Step By step.
2 identical boxes
Trixbox1.local eth0 publicip eth1 10.0.10.2 160gb
Trixbox2.local eth0 publicip eth1 10.0.10.3 2x160gb hwRAID1
eth1 via crossover gigabit nic
Both are set up as
/boot 101mb
/ 75000mb
/drbd0 75000mb
/swap 2048mb
OS: CentOS 5 Final 2.6.18-8.1.14.el5 SMP
Drbd 8.0.4-1.el5
Kmod-Drbd 8.0.4-1.2.6.18_8.1.14.el5
/etc/drbd.conf on both
global {
dialog-refresh 5; # 5 seconds
usage-count yes;
}
common {
syncer { rate 120M; }
}
resource drbd0 {
protocol C;
handlers {
pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
}
startup {
wfc-timeout 60;
degr-wfc-timeout 120; # 2 minutes.
}
disk {
on-io-error detach;
fencing resource-and-stonith;
}
net {
after-sb-0pri disconnect;
after-sb-1pri disconnect;
after-sb-2pri disconnect;
rr-conflict call-pri-lost;
}
syncer {
rate 120M;
}
on trixbox1.local {
device /dev/drbd0;
disk /dev/sda4;
address 10.0.10.2:7788;
flexible-meta-disk internal;
}
on trixbox2.local {
device /dev/drbd0;
disk /dev/mapper/hpt45x_dejfaehcfp4;
address 10.0.10.3:7788;
meta-disk internal;
}
}
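The log above reports "outdate-peer helper broken, returned 0", and this drbd.conf sets `fencing resource-and-stonith` without an `outdate-peer` entry under `handlers`. Whether that combination is what blocks the promotion is an assumption, not something confirmed in this thread. A rough sketch that checks a drbd.conf for it, using naive line matching; `fencing_without_handler` is a hypothetical helper name:

```shell
#!/bin/sh
# fencing_without_handler: read a drbd.conf on stdin; exit 0 when a "fencing"
# policy other than dont-care is set but no "outdate-peer" handler line is
# present, exit 1 otherwise. Comments are stripped before matching.
fencing_without_handler() {
    awk '
        { sub(/#.*/, "") }
        /^[[:space:]]*fencing[[:space:]]/ && $2 !~ /^dont-care;?$/ { fencing = 1 }
        /^[[:space:]]*outdate-peer[[:space:]]/ { handler = 1 }
        END { exit !(fencing && !handler) }
    '
}

# Trimmed-down copy of the resource section above.
if fencing_without_handler <<'EOF'
resource drbd0 {
  handlers {
    local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
  }
  disk {
    on-io-error detach;
    fencing resource-and-stonith;
  }
}
EOF
then
    echo "fencing is set but no outdate-peer handler was found"
fi
```

The check only looks at whether the lines exist; it does not validate the handler command itself.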
-------------------------------------------------------------
Reboot
On trixbox1.local:
  drbdadm -- --overwrite-data-of-peer primary all
  mkfs.ext3 /dev/drbd0
On trixbox 1 & 2, in /etc/fstab:
  /dev/drbd0 /share ext3 noauto 0 0
On trixbox 1 & 2:
  mkdir /share
RESULTS:
[trixbox1.local ~]# cat /proc/drbd
version: 8.0.4 (api:86/proto:86)
SVN Revision: 2947 build by [EMAIL PROTECTED], 2007-09-29 06:28:57
0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
ns:75189868 nr:0 dw:1340508 dr:73849428 al:1138 bm:4665 lo:0 pe:0
ua:0 ap:0
resync: used:0/31 hits:4736102 misses:4940 starving:0 dirty:0
changed:4940
act_log: used:0/127 hits:333989 misses:2974 starving:2
dirty:1834 changed:1138
On both: yum install heartbeat -y
Installed: heartbeat.i386 0:2.1.2-3.el5.centos
Dependency Installed: heartbeat-pils.i386 0:2.1.2-3.el5.centos
heartbeat-stonith.i386 0:2.1.2-3.el5.centos
openhpi.i386 0:2.4.1-6.el5.1
On both, /etc/ha.d/authkeys:
auth 1
1 crc
On both: chmod 600 authkeys
On both, /etc/ha.d/ha.cf:
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 200ms
deadtime 2
warntime 1
initdead 120
udpport 694
bcast eth1
auto_failback on
node trixbox1.local
node trixbox2.local
On both, /etc/ha.d/haresources:
trixbox1.local xxx.xxx.xxx.xxx/27/eth0 drbddisk::drbd0
Filesystem::/dev/drbd0::/share::ext3 mysqld sendmail asterisk httpd ircd
xinetd
On both, I did the following:
chkconfig --levels 345 mysqld off
chkconfig --levels 345 sendmail off
chkconfig --levels 345 asterisk off
chkconfig --levels 345 httpd off
chkconfig --levels 345 ircd off
chkconfig --levels 345 xinetd off
chkconfig --levels 345 heartbeat on
service mysqld stop
service sendmail stop
service asterisk stop
service httpd stop
service ircd stop
service xinetd stop
service heartbeat start
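The commands above repeat the same two steps for every service named in haresources, so they can be written as a loop. This is only a sketch of the same sequence; the `run` wrapper and the `DRY_RUN` switch are hypothetical additions so the commands can be printed before being applied:

```shell
#!/bin/sh
# Services listed in haresources; heartbeat is meant to start and stop them.
SERVICES="mysqld sendmail asterisk httpd ircd xinetd"

# run: execute a command, or only print it when DRY_RUN=1 (hypothetical switch).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# hand_over_to_heartbeat: same order as the list above. Take the services out
# of runlevels 3, 4 and 5, enable heartbeat, stop the services, start heartbeat.
hand_over_to_heartbeat() {
    for svc in $SERVICES; do
        run chkconfig --levels 345 "$svc" off
    done
    run chkconfig --levels 345 heartbeat on
    for svc in $SERVICES; do
        run service "$svc" stop
    done
    run service heartbeat start
}

# Print the commands instead of running them; unset DRY_RUN to apply them.
DRY_RUN=1
hand_over_to_heartbeat
```

Keeping the service list in one variable also makes it easier to keep it in step with the haresources line.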
Peter Petroff
Sr. Systems Engineer
208-287-5524
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
--
____________________________________________________________________
DI (FH) Raoul Bhatia M.Sc. email. [EMAIL PROTECTED]
Technischer Leiter
IPAX - Aloy Bhatia Hava OEG web. http://www.ipax.at
Barawitzkagasse 10/2/2/11 email. [EMAIL PROTECTED]
1190 Wien tel. +43 1 3670030
FN 277995t HG Wien fax. +43 1 3670030 15
____________________________________________________________________