Howdy folks, I’m having some problems getting a very basic two-node environment 
set up with DRBDv9 on Ubuntu 16.04 in Google’s cloud.

I’m using the ubuntu-1604-xenial-v20160627 image with basically this additional 
customization:

add-apt-repository ppa:linbit/linbit-drbd9-stack
apt update
apt install drbd-utils python-drbdmanage drbd-dkms

That appears to work, and the kernel module compiles.
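
One way to confirm that the DKMS-built 9.x module, rather than the 8.4-series
drbd module carried in the mainline kernel tree, is the one modinfo resolves is
to read its version field. This is only a minimal sketch, assuming modinfo is
installed; the helper name drbd_major is hypothetical and does nothing more than
parse the version string:

```shell
# Hypothetical helper: print the major version from a string like "9.0.2-1".
drbd_major() {
    printf '%s\n' "${1%%.*}"
}

# Only report a version when modinfo can actually find the module.
if ver=$(modinfo -F version drbd 2>/dev/null) && [ -n "$ver" ]; then
    echo "drbd module version: $ver (major $(drbd_major "$ver"))"
else
    echo "drbd module not found by modinfo"
fi
```

A major version of 9 here would suggest the PPA module is the one being picked
up; it says nothing about whether that build is free of bugs.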

My VMs each have a 100 GB disk attached as /dev/sdb, and I have been able to 
mostly get something working with:

On the elysium-test01 VM:

vgcreate drbdpool /dev/sdb
drbdmanage init 10.12.0.2
drbdmanage add-node elysium-test02 10.12.0.3
drbdmanage add-resource data01
drbdmanage add-volume data01 90gb
drbdmanage assign-resource data01 elysium-test01
drbdmanage assign-resource data01 elysium-test02

On the elysium-test02 VM:

vgcreate drbdpool /dev/sdb
drbdmanage join -p 6999 10.12.0.3 1 elysium-test01 10.12.0.2 0 mUEU/uPLZOAFpkZGgmlT

At this point, drbd-overview shows everything happy and connected, though 
elysium-test02 is inconsistent.

On the elysium-test01 VM:

mkfs.ext4 -F -E discard /dev/drbd100
mkdir -p /mnt/disks/data01
mount -o discard,defaults /dev/drbd100 /mnt/disks/data01

At this point everything looks okay, and logs show that elysium-test01 is now 
the primary for data01.

Then the problems start: on the elysium-test02 node, after a few seconds, the 
logs show "BUG: unable to handle kernel NULL pointer dereference at (null)".

<snip>
Jul 14 01:11:57 ubuntu kernel: [  444.132269] drbd data01/0 drbd100 
elysium-test01: received new current UUID: 096A8FD96D357D37
Jul 14 01:11:59 ubuntu kernel: [  445.934934] drbd data01/0 drbd100 
elysium-test01: Resync done (total 70 sec; paused 0 sec; 1255580 K/sec)
Jul 14 01:11:59 ubuntu kernel: [  445.934942] drbd data01/0 drbd100 
elysium-test01: updated UUIDs 
096A8FD96D357D36:0000000000000000:0000000000000000:0000000000000000
Jul 14 01:11:59 ubuntu kernel: [  445.934954] drbd data01/0 drbd100: disk( 
Inconsistent -> UpToDate )
Jul 14 01:11:59 ubuntu kernel: [  445.934957] drbd data01/0 drbd100 
elysium-test01: repl( SyncTarget -> Established )
Jul 14 01:11:59 ubuntu kernel: [  445.936014] drbd data01/0 drbd100 
elysium-test01: helper command: /sbin/drbdadm after-resync-target
Jul 14 01:11:59 ubuntu drbdadm[12829]: Don't know which config file belongs to 
resource data01, trying default ones...
Jul 14 01:11:59 ubuntu kernel: [  445.942075] drbd data01/0 drbd100 
elysium-test01: helper command: /sbin/drbdadm after-resync-target exit code 0 
(0x0)
Jul 14 01:12:01 ubuntu kernel: [  448.254494] drbd data01 elysium-test01: peer( 
Primary -> Secondary )
Jul 14 01:12:12 ubuntu kernel: [  458.528749] drbd data01 elysium-test01: 
Preparing remote state change 504089555 (primary_nodes=0, weak_nodes=0)
Jul 14 01:12:12 ubuntu kernel: [  458.530153] drbd data01 elysium-test01: 
Committing remote state change 504089555
Jul 14 01:12:12 ubuntu kernel: [  458.530168] drbd data01 elysium-test01: peer( 
Secondary -> Primary )
Jul 14 01:12:14 ubuntu kernel: [  460.832177] BUG: unable to handle kernel NULL 
pointer dereference at           (null)
Jul 14 01:12:14 ubuntu kernel: [  460.840403] IP: [<ffffffff813f91ed>] 
memcpy_orig+0x9d/0x110
Jul 14 01:12:14 ubuntu kernel: [  460.846205] PGD 0 
Jul 14 01:12:14 ubuntu kernel: [  460.848811] Oops: 0002 [#1] SMP 
Jul 14 01:12:14 ubuntu kernel: [  460.852394] Modules linked in: 
drbd_transport_tcp(OE) drbd(OE) ip6table_filter ip6_tables iptable_filter 
ip_tables x_tables ppdev serio_raw parport_pc pvpanic parport ib_iser rdma_cm 
iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi 
scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov 
async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 
multipath linear crct10dif_pclmul crc32_pclmul aesni_intel aes_x86_64 lrw 
gf128mul glue_helper ablk_helper cryptd psmouse virtio_scsi
Jul 14 01:12:14 ubuntu kernel: [  460.905902] CPU: 0 PID: 12729 Comm: 
drbd_r_data01 Tainted: G           OE   4.4.0-28-generic #47-Ubuntu
Jul 14 01:12:14 ubuntu kernel: [  460.915567] Hardware name: Google 
Google/Google, BIOS Google 01/01/2011
Jul 14 01:12:14 ubuntu kernel: [  460.922378] task: ffff8800b991e040 ti: 
ffff8800b9af8000 task.ti: ffff8800b9af8000
Jul 14 01:12:14 ubuntu kernel: [  460.930084] RIP: 0010:[<ffffffff813f91ed>]  
[<ffffffff813f91ed>] memcpy_orig+0x9d/0x110
Jul 14 01:12:14 ubuntu kernel: [  460.938329] RSP: 0018:ffff8800b9afb9a8  
EFLAGS: 00010202
Jul 14 01:12:14 ubuntu kernel: [  460.943760] RAX: 0000000000000000 RBX: 
0000000000000012 RCX: 0000000000000200
Jul 14 01:12:14 ubuntu kernel: [  460.951091] RDX: 0000000000000012 RSI: 
ffff8800b9db80ae RDI: 0000000000000000
Jul 14 01:12:14 ubuntu kernel: [  460.958348] RBP: ffff8800b9afb9e0 R08: 
0000000000000000 R09: 0000000000000000
Jul 14 01:12:14 ubuntu kernel: [  460.965591] R10: 0000000000000000 R11: 
0000000000000000 R12: ffff8800b9afbbb0
Jul 14 01:12:14 ubuntu kernel: [  460.972841] R13: 0000000000000012 R14: 
0000000000000012 R15: ffff8800b9afbb90
Jul 14 01:12:14 ubuntu kernel: [  460.980087] FS:  0000000000000000(0000) 
GS:ffff88012fc00000(0000) knlGS:0000000000000000
Jul 14 01:12:14 ubuntu kernel: [  460.988289] CS:  0010 DS: 0000 ES: 0000 CR0: 
0000000080050033
Jul 14 01:12:14 ubuntu kernel: [  460.994144] CR2: 0000000000000000 CR3: 
00000000ba48c000 CR4: 00000000001406f0
Jul 14 01:12:14 ubuntu kernel: [  461.001397] Stack:
Jul 14 01:12:14 ubuntu kernel: [  461.003631]  ffffffff813fde16 
ffff8800b9db80c0 0000000000000200 000000000000003e
Jul 14 01:12:14 ubuntu kernel: [  461.011633]  0000000000000012 
0000000000000012 000000000000002c ffff8800b9afba40
Jul 14 01:12:14 ubuntu kernel: [  461.019793]  ffffffff8170f018 
0000000000000000 ffff88012aa42580 0000000000000002
Jul 14 01:12:14 ubuntu kernel: [  461.027911] Call Trace:
Jul 14 01:12:14 ubuntu kernel: [  461.030577]  [<ffffffff813fde16>] ? 
copy_to_iter+0x1b6/0x260
Jul 14 01:12:14 ubuntu kernel: [  461.036358]  [<ffffffff8170f018>] 
skb_copy_datagram_iter+0x68/0x280
Jul 14 01:12:14 ubuntu kernel: [  461.042960]  [<ffffffff817694e3>] 
tcp_recvmsg+0x613/0xbe0
Jul 14 01:12:14 ubuntu kernel: [  461.048567]  [<ffffffff8179740e>] 
inet_recvmsg+0x7e/0xb0
Jul 14 01:12:14 ubuntu kernel: [  461.053987]  [<ffffffff816ffa3b>] 
sock_recvmsg+0x3b/0x50
Jul 14 01:12:14 ubuntu kernel: [  461.059409]  [<ffffffff816ffb91>] 
kernel_recvmsg+0x61/0x80
Jul 14 01:12:14 ubuntu kernel: [  461.065002]  [<ffffffffc02a9703>] 
dtt_recv_short+0x63/0x80 [drbd_transport_tcp]
Jul 14 01:12:14 ubuntu kernel: [  461.072666]  [<ffffffffc02a97e0>] 
dtt_recv+0xc0/0x180 [drbd_transport_tcp]
Jul 14 01:12:14 ubuntu kernel: [  461.079771]  [<ffffffffc0335f88>] 
drbd_recv+0x48/0x1f0 [drbd]
Jul 14 01:12:14 ubuntu kernel: [  461.085894]  [<ffffffff816ffa3b>] ? 
sock_recvmsg+0x3b/0x50
Jul 14 01:12:14 ubuntu kernel: [  461.091699]  [<ffffffffc033ef98>] 
read_in_block+0xa8/0x350 [drbd]
Jul 14 01:12:14 ubuntu kernel: [  461.097937]  [<ffffffffc0342140>] ? 
e_end_resync_block+0x110/0x110 [drbd]
Jul 14 01:12:14 ubuntu kernel: [  461.104848]  [<ffffffffc0342250>] 
receive_Data+0x110/0xcb0 [drbd]
Jul 14 01:12:14 ubuntu kernel: [  461.111071]  [<ffffffffc02a9803>] ? 
dtt_recv+0xe3/0x180 [drbd_transport_tcp]
Jul 14 01:12:14 ubuntu kernel: [  461.118253]  [<ffffffffc0335f88>] ? 
drbd_recv+0x48/0x1f0 [drbd]
Jul 14 01:12:14 ubuntu kernel: [  461.124304]  [<ffffffffc0342140>] ? 
e_end_resync_block+0x110/0x110 [drbd]
Jul 14 01:12:14 ubuntu kernel: [  461.131361]  [<ffffffffc0342140>] ? 
e_end_resync_block+0x110/0x110 [drbd]
Jul 14 01:12:14 ubuntu kernel: [  461.138269]  [<ffffffffc0345ee4>] 
drbd_receiver+0x3e4/0x620 [drbd]
Jul 14 01:12:14 ubuntu kernel: [  461.144573]  [<ffffffffc0350420>] ? 
idr_has_entry+0x10/0x10 [drbd]
Jul 14 01:12:14 ubuntu kernel: [  461.150873]  [<ffffffffc035047e>] 
drbd_thread_setup+0x5e/0x110 [drbd]
Jul 14 01:12:14 ubuntu kernel: [  461.157453]  [<ffffffffc0350420>] ? 
idr_has_entry+0x10/0x10 [drbd]
Jul 14 01:12:14 ubuntu kernel: [  461.163750]  [<ffffffff810a0808>] 
kthread+0xd8/0xf0
Jul 14 01:12:14 ubuntu kernel: [  461.168754]  [<ffffffff810a0730>] ? 
kthread_create_on_node+0x1e0/0x1e0
Jul 14 01:12:14 ubuntu kernel: [  461.175394]  [<ffffffff81827a4f>] 
ret_from_fork+0x3f/0x70
Jul 14 01:12:14 ubuntu kernel: [  461.180914]  [<ffffffff810a0730>] ? 
kthread_create_on_node+0x1e0/0x1e0
Jul 14 01:12:14 ubuntu kernel: [  461.187563] Code: 57 e8 4c 89 5f e0 48 8d 7f 
e0 73 d2 83 c2 20 48 29 d6 48 29 d7 83 fa 10 72 24 4c 8b 06 4c 8b 4e 08 4c 8b 
54 16 f0 4c 8b 5c 16 f8 <4c> 89 07 4c 89 4f 08 4c 89 54 17 f0 4c 89 5c 17 f8 c3 
90 83 fa 
Jul 14 01:12:14 ubuntu kernel: [  461.214595] RIP  [<ffffffff813f91ed>] 
memcpy_orig+0x9d/0x110
Jul 14 01:12:14 ubuntu kernel: [  461.220601]  RSP <ffff8800b9afb9a8>
Jul 14 01:12:14 ubuntu kernel: [  461.224205] CR2: 0000000000000000
Jul 14 01:12:14 ubuntu kernel: [  461.227643] ---[ end trace 670dbe9e8d37a576 
]---
</snip>
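
The trace above shows the receiver thread (drbd_r_data01) faulting in
memcpy_orig, reached through receive_Data, read_in_block and dtt_recv. To pull
just those key lines out of a full, unwrapped kernel log, something like the
following sketch may help. The function name oops_summary is hypothetical, and
the log path in the usage comment is an assumption about a default Ubuntu
syslog setup:

```shell
# Hypothetical helper: summarize a kernel oops from an unwrapped log file.
# Prints the BUG line, the faulting instruction pointer (RIP), and any
# call-trace frames that belong to the drbd or drbd_transport_tcp modules.
oops_summary() {
    grep -E 'BUG: |RIP: |\[drbd(_transport_tcp)?\]' "$1"
}

# Usage (path is an assumption):
#   oops_summary /var/log/kern.log
```

This only filters existing log lines; it does not decode the oops or identify
the faulting source line.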

After this, nothing seems to work properly on either VM. Attempts to 
unmount the volume hang, and other commands like drbd-overview hang too. 
Eventually I have to reboot both VMs to get back to some sort of sanity, yet 
DRBD is still basically non-functional and causing kernel errors :-(

Anyone have any idea what's wrong?

Thanks!

—jason



_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user
