I don't know what the underlying problem is, but I have an idea for un-hanging DRBD. If you go to the primary node and disconnect the resource (drbdadm disconnect r1), the processes on the secondary may start responding again. That would save you a reboot.
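As a rough sketch of that idea (not a tested recovery procedure): run the disconnect on the primary with a bounded wait, so the shell does not sit there if the command blocks. The resource name r1 comes from this thread. The try_disconnect helper, the DRBDADM override (handy for a dry run) and the 15-second default are assumptions made for illustration, and the coreutils timeout command is assumed to be installed.

```shell
#!/bin/sh
# Sketch only: attempt "drbdadm disconnect <resource>" with a bounded wait.
# DRBDADM is a hypothetical override so the function can be dry-run
# (for example DRBDADM=echo); by default the real drbdadm is used.
DRBDADM="${DRBDADM:-drbdadm}"

try_disconnect() {
    res="$1"            # resource name, e.g. r1
    wait_s="${2:-15}"   # seconds to wait before giving up (assumed default)
    # timeout(1) from coreutils stops the command if it exceeds wait_s.
    if timeout "$wait_s" $DRBDADM disconnect "$res"; then
        echo "disconnected $res"
    else
        echo "disconnect of $res failed or timed out" >&2
        return 1
    fi
}
```

On the primary you would call try_disconnect r1, then look at /proc/drbd on the secondary to see whether the connection state has changed.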
Are you certain about the reliability of the network layer between the DRBD hosts?

Dan

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Abdelkarim Mateos Sanchez
Sent: Sunday, January 13, 2013 2:18 AM
To: [email protected]
Subject: Re: [DRBD-user] DRDB stalled and impossible restart, down...

Hi.

Any reply to this question? I'm desperate. On this machine I have to reboot the server every week because DRBD hangs.

Example:

cat /proc/drbd
version: 8.3.13 (api:88/proto:86-96)
GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by root@sighted, 2012-10-09 12:47:51
 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:0 dw:0 dr:335534008 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
 1: cs:VerifyS ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:1309972 dw:1309972 dr:51448776 al:0 bm:88 lo:1 pe:136721 ua:2048 ap:0 ep:1 wo:b oos:9459536
    [>....................] verified: 4.4% (48996/51196)M
    finish: 16317:48:56 speed: 0 (0) want: 40,960 K/sec (stalled)
 2: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:0 dw:0 dr:209708728 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:28051588
 3: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:0 dw:0 dr:209708728 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:90437120

root@pro01:~# /sbin/drbdadm verify all
Command '/sbin/drbdsetup 0 verify' did not terminate within 5 seconds
root@pro01:~#
root@pro01:~#
No response from the DRBD driver! Is the module loaded?

I would like to shut down DRBD, but I can't. I would like to detach r1, but I can't. I'm desperate.

On 11/01/2013, at 10:36, Abdelkarim Mateos Sanchez <[email protected]> wrote:

> Hi.
>
> I'm desperate.
>
> With DRBD 8.3 (latest minor version) on Proxmox 2.2, r1.res is stalled.
>
> In /proc/drbd:
> version: 8.3.13 (api:88/proto:86-96)
> GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by root@sighted, 2012-10-09 12:47:51
>  0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
>     ns:0 nr:0 dw:44628 dr:335534008 al:0 bm:39 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
>  1: cs:VerifyS ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
>     ns:0 nr:52427164 dw:52427164 dr:3246072 al:0 bm:3200 lo:1 pe:145893 ua:2048 ap:0 ep:1 wo:b oos:1309972
>     [>...................] verified: 6.2% (48036/51196)M
>     finish: 755:24:58 speed: 16 (96) want: 40,960 K/sec (stalled)
>  2: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
>     ns:0 nr:23024700 dw:127879064 dr:104854364 al:0 bm:8866 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
>  3: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
>     ns:0 nr:79852364 dw:184706728 dr:104854364 al:0 bm:11415 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
>
>
> I would like to disconnect the resource, bring it down, or find any other way out of this situation.
>
> But everything gets a timeout:
>
> cat /var/lock/drbd-147-1
> lock on /var/lock/drbd-147-1 currently held by pid:591161
> State change failed: (0)unknown error.
> change failed: (0)unknown error.
>
> service drbd restart
> Stopping all DRBD resources:
>
> No response from the DRBD driver! Is the module loaded?
>
> No response from the DRBD driver! Is the module loaded?
>
> But the module is loaded:
> lsmod | grep drbd
> drbd 342496 13
>
>
> Dec 31 17:52:31 pro01 kernel: block drbd1: [drbd1_worker/20189] sock_sendmsg time expired, ko = 4294961767
> Dec 31 17:52:37 pro01 kernel: block drbd1: [drbd1_worker/20189] sock_sendmsg time expired, ko = 4294961766
> Dec 31 17:52:43 pro01 kernel: block drbd1: [drbd1_worker/20189] sock_sendmsg time expired, ko = 4294961765
> Dec 31 17:52:49 pro01 kernel: block drbd1: [drbd1_worker/20189] sock_sendmsg time expired, ko = 4294961764
> Dec 31 17:52:55 pro01 kernel: block drbd1: [drbd1_worker/20189] sock_sendmsg time expired, ko = 4294961763
> Dec 31 17:53:01 pro01 kernel: block drbd1: [drbd1_worker/20189] sock_sendmsg time expired, ko = 4294961762
>
>
> I tried to kill the processes, but that does not work:
>
> ps aux |grep drbd1
> root 20189 0.0 0.0 0 0 ? S Dec28 0:17 [drbd1_worker]
> root 20207 0.0 0.0 0 0 ? S Dec28 3:21 [drbd1_receiver]
> root 20213 0.0 0.0 0 0 ? S Dec28 0:16 [drbd1_asender]
>
> I would appreciate any help.
>
>
> Abdelkarim Mateos Sánchez
> CEO Tamainout Hébergement, S.A.R.L. (Morocco)
> CET Tamainut IT, S.L. (Spain)
> Contact | [email protected] | Skype - mamateos
> Landline (Spain): +34.851000209 | Mobile (Morocco): +212.671819412
> islaserver.com | tamainut.tel
> This message is intended exclusively for its addressee and may contain
> information that is CONFIDENTIAL and protected by professional privilege. If
> you are not the intended recipient you are hereby notified that any
> dissemination, copy or disclosure of this communication is strictly
> prohibited by law.
> If this message has been received in error, please immediately notify us
> via e-mail and delete it.
>
> _______________________________________________
> drbd-user mailing list
> [email protected]
> http://lists.linbit.com/mailman/listinfo/drbd-user
