Hi there,

I have an up-to-date Debian 10 system, with drbd 8.4.10, and ocfs2
1.8.5.

Looks like everything works, but when a node restarted, then my
/dev/drbd0 device doesn't mounted automatically. After the boot I
can log in to system, and can mount it by hand - the `mount -a`
command takes about 8 seconds...

# time mount -a

real    0m8,173s
user    0m0,009s
sys     0m0,019s

Before I mount the volume by hand, I checked the /proc/drbd, and
I think there is everything well:

# cat /proc/drbd 
version: 8.4.10 (api:1/proto:86-101)
srcversion: 9B4D87C5E865DF526864868 
 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:124 nr:16511 dw:16635 dr:646 al:2 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f 
oos:0

(this is what I see after the system booted and I logged in,
before I mount the device)

Here are the relevant configs:

# cat /etc/drbd.d/r0.res 
resource r0 {
  meta-disk internal;
  device /dev/drbd0;
  syncer {
    verify-alg sha1;
  }
  net {
    allow-two-primaries;
  }
  startup {
    become-primary-on both;
  }
  on t2app1 {
    disk /dev/vg-t2app1/lvdrbd0;
    address 192.168.72.21:7789;
  }
  on t2app2 {
    disk /dev/vg-t2app2/lvdrbd0;
    address 192.168.72.22:7789;
  }
}

(Note, that the configs are same on both nodes)

# cat /etc/ocfs2/cluster.conf 
cluster:
    node_count = 2
    name = data

node:
    ip_port = 7777
    ip_address = 192.168.72.21
    number = 1
    name = t2app1
    cluster = data

node:
    ip_port = 7777
    ip_address = 192.168.72.22
    number = 2
    name = t2app2
    cluster = data

(also the configs are same on both nodes)

# grep drbd /etc/fstab 
/dev/drbd0      /drbd   ocfs2   _netdev,defaults        0       0

on both nodes.

I've found some interesting lines in syslog from the boot
intervall:

# grep "\(o2\|ocfs\|drbd\)" /var/log/syslog


Sep  1 18:47:37 t2app2 systemd[1]: Stopping Mount ocfs2 Filesystems...
Sep  1 18:47:52 t2app2 systemd-modules-load[361]: Inserted module 'drbd'
Sep  1 18:47:52 t2app2 systemd[1]: Starting Load o2cb Modules...
Sep  1 18:47:52 t2app2 kernel: [    3.822815] drbd: initialized.  Version: 
8.4.10 (api:1/proto:86-101)
Sep  1 18:47:52 t2app2 kernel: [    3.822816] drbd: srcversion: 
9B4D87C5E865DF526864868 
Sep  1 18:47:52 t2app2 kernel: [    3.822817] drbd: registered as block device 
major 147
Sep  1 18:47:52 t2app2 o2cb[659]: checking debugfs...
Sep  1 18:47:53 t2app2 o2cb[659]: Loading filesystem "configfs": OK
Sep  1 18:47:53 t2app2 o2cb[659]: Mounting configfs filesystem at 
/sys/kernel/config: mount: /sys/kernel/config: configfs already mounted on 
/sys/kernel/config.
Sep  1 18:47:53 t2app2 o2cb[659]: Loading stack plugin "o2cb": OK
Sep  1 18:47:53 t2app2 kernel: [    4.544893] ocfs2: Registered cluster 
interface o2cb
Sep  1 18:47:53 t2app2 o2cb[659]: Loading filesystem "ocfs2_dlmfs": OK
Sep  1 18:47:53 t2app2 o2cb[659]: Mounting ocfs2_dlmfs filesystem at /dlm: OK
Sep  1 18:47:53 t2app2 o2cb[659]: Setting cluster stack "o2cb": OK
Sep  1 18:47:53 t2app2 drbd[696]: Starting DRBD resources:[
Sep  1 18:47:53 t2app2 drbd[696]:      create res: r0
Sep  1 18:47:53 t2app2 drbd[696]:    prepare disk: r0
Sep  1 18:47:53 t2app2 systemd[1]: Found device /dev/drbd0.
Sep  1 18:47:53 t2app2 systemd[1]: Mounting /drbd...
Sep  1 18:47:53 t2app2 mount[726]: mount.ocfs2: I/O error on channel while 
opening device /dev/drbd0
Sep  1 18:47:53 t2app2 systemd[1]: drbd.mount: Mount process exited, 
code=exited, status=1/FAILURE
Sep  1 18:47:53 t2app2 systemd[1]: drbd.mount: Failed with result 'exit-code'.
Sep  1 18:47:53 t2app2 systemd[1]: Failed to mount /drbd.
Sep  1 18:47:53 t2app2 kernel: [    4.581222] drbd r0: Starting worker thread 
(from drbdsetup-84 [729])
Sep  1 18:47:53 t2app2 drbd[696]:     adjust disk: r0
Sep  1 18:47:53 t2app2 kernel: [    4.581729] block drbd0: disk( Diskless -> 
Attaching ) 
Sep  1 18:47:53 t2app2 kernel: [    4.581782] drbd r0: Method to ensure write 
ordering: flush
Sep  1 18:47:53 t2app2 kernel: [    4.581784] block drbd0: max BIO size = 
1048576
Sep  1 18:47:53 t2app2 kernel: [    4.581786] block drbd0: drbd_bm_resize 
called with capacity == 20970808
Sep  1 18:47:53 t2app2 kernel: [    4.581827] block drbd0: resync bitmap: 
bits=2621351 words=40959 pages=80
Sep  1 18:47:53 t2app2 kernel: [    4.581828] block drbd0: size = 10 GB 
(10485404 KB)
Sep  1 18:47:53 t2app2 kernel: [    4.583737] block drbd0: recounting of set 
bits took additional 0 jiffies
Sep  1 18:47:53 t2app2 kernel: [    4.583738] block drbd0: 16 MB (4096 bits) 
marked out-of-sync by on disk bit-map.
Sep  1 18:47:53 t2app2 kernel: [    4.583741] block drbd0: disk( Attaching -> 
UpToDate ) 
Sep  1 18:47:53 t2app2 kernel: [    4.583743] block drbd0: attached to UUIDs 
1E63CB670138DCE9:0000000000000000:A7B3A04F6A72E7D9:A7B2A04F6A72E7D9
Sep  1 18:47:53 t2app2 kernel: [    4.586632] drbd r0: conn( StandAlone -> 
Unconnected ) 
Sep  1 18:47:53 t2app2 kernel: [    4.586642] drbd r0: Starting receiver thread 
(from drbd_w_r0 [731])
Sep  1 18:47:53 t2app2 kernel: [    4.586811] drbd r0: receiver (re)started
Sep  1 18:47:53 t2app2 kernel: [    4.586817] drbd r0: conn( Unconnected -> 
WFConnection ) 
Sep  1 18:47:53 t2app2 drbd[696]:      adjust net: r0
Sep  1 18:47:53 t2app2 drbd[696]: ]
Sep  1 18:47:53 t2app2 o2cb[659]: Registering O2CB cluster "data": OK
Sep  1 18:47:53 t2app2 o2cb[659]: Setting O2CB cluster timeouts : OK
Sep  1 18:47:53 t2app2 kernel: [    4.634584] o2hb: Heartbeat mode set to local
Sep  1 18:47:53 t2app2 systemd[1]: Started Load o2cb Modules.
Sep  1 18:47:53 t2app2 o2hbmonitor: Starting
Sep  1 18:47:53 t2app2 systemd[1]: Starting Mount ocfs2 Filesystems...
Sep  1 18:47:53 t2app2 ocfs2[794]: Starting Oracle Cluster File System (OCFS2) 
mount.ocfs2: I/O error on channel while opening device /dev/drbd0
Sep  1 18:47:53 t2app2 ocfs2[794]: Failed
Sep  1 18:47:53 t2app2 systemd[1]: Started Mount ocfs2 Filesystems.
Sep  1 18:47:53 t2app2 kernel: [    5.107027] drbd r0: Handshake successful: 
Agreed network protocol version 101
Sep  1 18:47:53 t2app2 kernel: [    5.107028] drbd r0: Feature flags enabled on 
protocol level: 0x7 TRIM THIN_RESYNC WRITE_SAME.
Sep  1 18:47:53 t2app2 kernel: [    5.107049] drbd r0: conn( WFConnection -> 
WFReportParams ) 
Sep  1 18:47:53 t2app2 kernel: [    5.107050] drbd r0: Starting ack_recv thread 
(from drbd_r_r0 [738])
Sep  1 18:47:53 t2app2 kernel: [    5.153635] block drbd0: drbd_sync_handshake:
Sep  1 18:47:53 t2app2 kernel: [    5.153637] block drbd0: self 
1E63CB670138DCE8:0000000000000000:A7B3A04F6A72E7D9:A7B2A04F6A72E7D9 bits:4096 
flags:0
Sep  1 18:47:53 t2app2 kernel: [    5.153638] block drbd0: peer 
DA6EC47089E7218F:1E63CB670138DCE9:A7B3A04F6A72E7D9:A7B2A04F6A72E7D9 bits:1 
flags:0
Sep  1 18:47:53 t2app2 kernel: [    5.153639] block drbd0: uuid_compare()=-1 by 
rule 50
Sep  1 18:47:53 t2app2 kernel: [    5.153643] block drbd0: peer( Unknown -> 
Primary ) conn( WFReportParams -> WFBitMapT ) disk( UpToDate -> Outdated ) 
pdsk( DUnknown -> UpToDate ) 
Sep  1 18:47:53 t2app2 kernel: [    5.154072] block drbd0: receive bitmap stats 
[Bytes(packets)]: plain 0(0), RLE 25(1), total 25; compression: 100.0%
Sep  1 18:47:53 t2app2 kernel: [    5.154125] block drbd0: send bitmap stats 
[Bytes(packets)]: plain 0(0), RLE 44(1), total 44; compression: 100.0%
Sep  1 18:47:53 t2app2 kernel: [    5.154127] block drbd0: conn( WFBitMapT -> 
WFSyncUUID ) 
Sep  1 18:47:53 t2app2 kernel: [    5.155755] block drbd0: updated sync uuid 
1E64CB670138DCE8:0000000000000000:A7B3A04F6A72E7D9:A7B2A04F6A72E7D9
Sep  1 18:47:53 t2app2 kernel: [    5.156150] block drbd0: helper command: 
/sbin/drbdadm before-resync-target minor-0
Sep  1 18:47:53 t2app2 kernel: [    5.157047] block drbd0: helper command: 
/sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
Sep  1 18:47:53 t2app2 kernel: [    5.157054] block drbd0: conn( WFSyncUUID -> 
SyncTarget ) disk( Outdated -> Inconsistent ) 
Sep  1 18:47:53 t2app2 kernel: [    5.157067] block drbd0: Began resync as 
SyncTarget (will sync 16384 KB [4096 bits set]).
Sep  1 18:47:53 t2app2 kernel: [    5.157500] block drbd0: role( Secondary -> 
Primary ) 
Sep  1 18:47:53 t2app2 drbd[696]: WARN: stdin/stdout is not a TTY; using 
/dev/console.
Sep  1 18:48:00 t2app2 kernel: [   11.508179] block drbd0: Resync done (total 6 
sec; paused 0 sec; 2728 K/sec)
Sep  1 18:48:00 t2app2 kernel: [   11.508183] block drbd0: updated UUIDs 
DA6EC47089E7218F:0000000000000000:1E64CB670138DCE9:1E63CB670138DCE9
Sep  1 18:48:00 t2app2 kernel: [   11.508186] block drbd0: conn( SyncTarget -> 
Connected ) disk( Inconsistent -> UpToDate ) 
Sep  1 18:48:00 t2app2 kernel: [   11.508516] block drbd0: helper command: 
/sbin/drbdadm after-resync-target minor-0
Sep  1 18:48:00 t2app2 kernel: [   11.509496] block drbd0: helper command: 
/sbin/drbdadm after-resync-target minor-0 exit code 0 (0x0)
Sep  1 18:48:58 t2app2 kernel: [   69.610107] o2net: Connected to node t2app1 
(num 1) at 192.168.72.21:7777
Sep  1 18:49:02 t2app2 kernel: [   73.701897] o2dlm: Joining domain 
34CC3A6762474F669C72F63604E47EE4 
Sep  1 18:49:02 t2app2 kernel: [   73.706853] ocfs2: Mounting device (147,0) on 
(node 2, slot 1) with ordered data mode.

note, that the last 3 lines (18:48:58 - 18:49:02) comes when I
mounted the device by hand.

Looks like the /dev/drbd0 doesn't exists at boot time, but when I
log in, it builded.

Am I missing something? Or there is some network issue, eg. slow
network...?

This config (which is _same_) works on Debian 9 without any
issue.


Any helps are welcome,


thanks,


a.

_______________________________________________
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
[email protected]
https://lists.linbit.com/mailman/listinfo/drbd-user

Reply via email to