Massimo Cetra wrote:

Hi all,

I'm reviving this old post, which never got an answer.
I am experiencing exactly the same problem.

The 2 nodes are 2 KVM virtual machines on 2.6.30.5 that are supposed to work in dual-primary mode with OCFS on top of DRBD.

The testbed is:
- machines are up and running, synced and in dual primary mode.

- when I run shutdown -h on one of the two, everything comes up correctly.
- when I reboot (KVM is quite fast), I see that the rebooted node doesn't sync and disconnects.

I am having almost the same problem with 8.3.2, running in primary/secondary mode. If I stop the secondary, the primary goes to StandAlone and the log messages are the same as below. This does not happen with 8.3.0. My nodes are physical machines, not VMs.

This is strange because I don't see any configuration problem.
If I restart drbd, it comes up cleanly as well.

What is the problem, and how can I fix it?

This is the relevant portion of the log.

[    7.444342] drbd: initialised. Version: 8.3.2 (api:88/proto:86-90)
[    7.466196] drbd: GIT-hash: dd7985327f146f33b86d4bff5ca8c94234ce840e build by p...@fat-tyre, 2009-07-03 15:35:39
[    7.489085] drbd: registered as block device major 147
[    7.500701] drbd: minor_table @ 0xffff8800ddd5ac80
[    7.523703] block drbd1: Starting worker thread (from cqueue [1691])
[    7.535696] block drbd1: disk( Diskless -> Attaching )
[    7.551927] block drbd1: Found 6 transactions (6 active extents) in activity log.
[    7.574272] block drbd1: Method to ensure write ordering: barrier
[    7.585844] block drbd1: max_segment_size ( = BIO size ) = 32768
[    7.597424] block drbd1: drbd_bm_resize called with capacity == 104854328
[    7.609724] block drbd1: resync bitmap: bits=13106791 words=204794
[    7.632516] block drbd1: size = 50 GB (52427164 KB)
[    7.647568] block drbd1: recounting of set bits took additional 0 jiffies
[    7.659347] block drbd1: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
[    7.671181] block drbd1: disk( Attaching -> UpToDate )
[    7.682717] block drbd1: Barriers not supported on meta data device - disabling
[    7.773093] block drbd1: conn( StandAlone -> Unconnected )
[    7.785603] block drbd1: Starting receiver thread (from drbd1_worker [1698])
[    7.808437] block drbd1: receiver (re)started
[    7.824425] block drbd1: conn( Unconnected -> WFConnection )
[    7.836513] block drbd1: bind before connect failed, err = -99
[    7.852925] block drbd1: conn( WFConnection -> Disconnecting )
[    7.868108] block drbd1: role( Secondary -> Primary )
[    7.880114] block drbd1: Creating new current UUID
[    8.064182] block drbd1: Discarding network configuration.
[    8.075792] block drbd1: Connection closed
[    8.087158] block drbd1: conn( Disconnecting -> StandAlone )
[    8.098898] block drbd1: receiver terminated
[    8.110451] block drbd1: Terminating receiver thread
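
For reference, kernel code reports failures as negated errno values, so the -99 in the "bind before connect failed" line can be looked up like any other errno. On Linux, 99 is EADDRNOTAVAIL ("Cannot assign requested address"), which would be consistent with the configured address not yet being assigned to an interface when DRBD tried to connect. That reading is a guess from the log, not a confirmed diagnosis. A minimal Python sketch of the lookup follows; decode_kernel_err is a hypothetical helper name, and the result depends on the errno table of the platform it runs on:

```python
import errno
import os
import re


def decode_kernel_err(log_line):
    """Return (code, symbolic name, message) for an 'err = -N' log line.

    Returns None when the line carries no negated error code.
    """
    match = re.search(r"err = -(\d+)", log_line)
    if match is None:
        return None
    code = int(match.group(1))
    # errno.errorcode maps numeric codes to names on the local platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return code, name, os.strerror(code)


line = "[    7.836513] block drbd1: bind before connect failed, err = -99"
print(decode_kernel_err(line))
```

Run this on a Linux host, since errno numbers differ between operating systems.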

Thanks,
Max



Richard Hector wrote:
Hi all,

One of my 2 machines doesn't seem to connect at boot time - doesn't
matter whether it's configured to come up as primary or secondary. This,
at a guess, seems relevant:

...
[   29.196417] drbd0: conn( Unconnected -> WFConnection )
[   29.196417] drbd0: bind before connect failed, err = -99
[   29.196417] drbd0: conn( WFConnection -> Disconnecting )
...

I've tried to find any docs regarding this without success - that bind
is an internal kernel one, not bind(2) (the syscall), right? My
expertise at navigating the kernel source is rather limited ...

If I run /etc/init.d/drbd restart after boot, it comes up fine.
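
Since a restart after boot works, one possible workaround is to delay starting DRBD until the replication address can actually be bound. The following is only a sketch under that assumption: wait_until_bindable is a hypothetical helper, not part of DRBD or its init script, and 127.0.0.1 stands in for the address configured in drbd.conf.

```python
import errno
import socket
import time


def wait_until_bindable(address, timeout=30.0, interval=0.5):
    """Poll until a TCP socket can bind to `address`, or give up.

    Returns True once the bind succeeds, False if `timeout` seconds pass
    while the address is still unavailable (EADDRNOTAVAIL).
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
                # Port 0 lets the OS pick a free port; only the address matters.
                sock.bind((address, 0))
            return True
        except OSError as exc:
            # Retry only for "address not available"; anything else is a
            # different problem and should surface immediately.
            if exc.errno != errno.EADDRNOTAVAIL:
                raise
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    # Placeholder address; substitute the one from the resource definition.
    print(wait_until_bindable("127.0.0.1", timeout=1.0))
```

A boot script could call this before starting drbd and skip or log the start if it returns False.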

Both machines are Debian Lenny, amd64, with drbd 8.3 from backports.org.

Any suggestions? Any more useful info I can supply?


_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user

