The 'Extent XXX beyond end of bitmap!' error described above is consistently reproducible in our environment. It's not clear what exactly triggered it, but it happened when Pacemaker was unable to fail over properly to another node due to a DRBD timeout issue, followed by a server reset.
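Nothing in our configuration (below) overrides the network timeouts, so the module defaults were in effect. For reference, the knobs that govern when DRBD declares a peer dead would be something along these lines (illustrative values only, not what we actually run):

resource master-drbd {
    net {
        timeout      60;   # in 0.1s units, i.e. 6s per request
        ping-timeout 5;    # 0.5s to answer a keep-alive ping
        ko-count     7;    # expel the peer after this many timed-out requests
    }
}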
# drbdadm status
sg-master-drbd role:Secondary
  disk:Diskless
  peer role:Primary
    replication:Established peer-disk:UpToDate
# drbdadm up all
extent 19136507 beyond end of bitmap!
extent 21495810 beyond end of bitmap!
extent 21785161 beyond end of bitmap!
... another 50+ entries similar to above ...
../shared/drbdmeta.c:2279:apply_al: ASSERT(bm_pos - bm_on_disk_pos <= chunk - extents_size) failed.
sg-master-drbd: Failure: (102) Local address(port) already in use.
Command 'drbdsetup-84 connect sg-master-drbd ipv4:172.16.2.10:7801 ipv4:172.16.2.20:7801 --protocol=C --max-buffers=64K --sndbuf-size=1024K --after-sb-0pri=discard-younger-primary --after-sb-1pri=discard-secondary --after-sb-2pri=call-pri-lost-after-sb' terminated with exit code 10
# drbdadm attach all
extent 19136507 beyond end of bitmap!
extent 21495810 beyond end of bitmap!
extent 21785161 beyond end of bitmap!
...
../shared/drbdmeta.c:2279:apply_al: ASSERT(bm_pos - bm_on_disk_pos <= chunk - extents_size) failed.
Previously we fixed this by recreating the DRBD meta data and fully resynchronizing the nodes, which is obviously the wrong way to handle it.
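Roughly, that workaround was the following on the node with the bad meta data (a sketch from memory; the exact invocations may have differed):

# drbdadm down sg-master-drbd        # stop the resource
# drbdadm create-md sg-master-drbd   # wipe and recreate the internal meta data
# drbdadm up sg-master-drbd
# drbdadm invalidate sg-master-drbd  # discard local data, force a full resync from the peer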
The configuration is fairly standard, with internal meta data and defaults for the AL and max-peers:
resource master-drbd {
    net {
        protocol C;
        max-buffers 64K;
        sndbuf-size 1024K;
        after-sb-0pri discard-younger-primary;
        after-sb-1pri discard-secondary;
        after-sb-2pri call-pri-lost-after-sb;
    }
    disk {
        resync-rate 4000M;
        disk-barrier no;
        disk-flushes no;
        c-plan-ahead 0;
        read-balancing 1M-striping;
    }
    volume 0 {
        disk /dev/drbdpool/data;
        device /dev/drbd0;
        meta-disk internal;
    }
    on hcluster01 {
        address 172.16.2.10:7801;
    }
    on hcluster02 {
        address 172.16.2.20:7801;
    }
}
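To double-check that the defaults really are in effect, the parsed and runtime settings can be dumped (I'd expect 'drbdsetup show' to reveal the effective al-extents; I haven't pasted its output here):

# drbdadm dump all              # configuration as drbdadm parses it
# drbdsetup show sg-master-drbd # options currently in effect for the resource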
I'm not able to get a 'drbdadm dump-md'; it fails with the following error:
# drbdadm dump-md all
Found meta data is "unclean", please apply-al first
Command 'drbdmeta 0 v08 /dev/drbdpool/data internal dump-md' terminated with exit code 255
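That's a catch-22: dump-md refuses to run until the activity log has been applied, but applying the AL is exactly the step that asserts ('drbdadm up'/'attach' invoke drbdmeta's apply-al internally, which is where the ASSERT above comes from). So the direct attempt would be:

# drbdadm apply-al all

which I'd expect to die with the same 'extent ... beyond end of bitmap!' assert as the up/attach attempts above.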
The backend device for DRBD, 'dm-3', is the logical volume 'data' in volume group 'drbdpool', which combines two hardware RAID0 arrays (sda, sdc).
Reported sizes on the failed node:
# blockdev --report
RO    RA   SSZ   BSZ   StartSec             Size   Device
rw   256   512  4096          0   120009573531648   /dev/sda
rw   256   512  4096          0   100007977943040   /dev/sdc
rw   256   512  4096          0   220017543086080   /dev/dm-3

# blockdev --getsize /dev/drbd0
blockdev: cannot open /dev/drbd0: Wrong medium type
Reported sizes on the operational node:
# blockdev --report
RO    RA   SSZ   BSZ   StartSec             Size   Device
rw   256   512  4096          0   120009573531648   /dev/sda
rw   256   512  4096          0   100007977943040   /dev/sdc
rw   256   512  4096          0   220017543086080   /dev/dm-3
rw   256   512  4096          0   220010828644352   /dev/drbd0

# blockdev --getsize /dev/drbd0
429708649696
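A quick sanity check on those numbers (my own arithmetic, not tool output): the sector count matches the byte size, and the gap between the backing LV and /dev/drbd0 is roughly what internal meta data should reserve, i.e. about 1/32768 of the device for the bitmap (one bit per 4 KiB) plus the AL and superblock:

# echo $((429708649696 * 512))                 # 220010828644352 = drbd0 byte size from --report
# echo $((220017543086080 - 220010828644352))  # 6714441728 bytes reserved for meta data (~6.3 GiB)
# echo $((220017543086080 / 32768))            # 6714402560 = bitmap share; the rest is AL + superblock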
# vgdisplay
  --- Volume group ---
  VG Name               drbdpool
  System ID
  Format                lvm2
  Metadata Areas        2
  Metadata Sequence No  2
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               1
  Max PV                0
  Cur PV                2
  Act PV                2
  VG Size               200.10 TiB
  PE Size               4.00 MiB
  Total PE              52456270
  Alloc PE / Size       52456270 / 200.10 TiB
  Free  PE / Size       0 / 0
# lvdisplay
  --- Logical volume ---
  LV Path                /dev/drbdpool/data
  LV Name                data
  VG Name                drbdpool
  LV Write Access        read/write
  LV Status              available
  # open                 2
  LV Size                200.10 TiB
  Current LE             52456270
  Segments               2
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:3
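The LVM numbers agree with blockdev as well: 52456270 extents of 4 MiB each come to exactly the dm-3 size reported above.

# echo $((52456270 * 4 * 1024 * 1024))   # 220017543086080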
# dmesg | grep drbd
[    1.863088] drbd: loading out-of-tree module taints kernel.
[    1.865879] drbd: module verification failed: signature and/or required key missing - tainting kernel
[    1.894498] drbd: initialized. Version: 8.4.11-1 (api:1/proto:86-101)
[    1.894501] drbd: GIT-hash: 66145a308421e9c124ec391a7848ac20203bb03c build by mockbuild@, 2018-04-26 12:10:42
[    1.894502] drbd: registered as block device major 147
[   88.950747] drbd sg-master-drbd: Starting worker thread (from drbdsetup-84 [3242])
[   88.951999] drbd sg-master-drbd: conn( StandAlone -> Unconnected )
[   88.952532] drbd sg-master-drbd: Starting receiver thread (from drbd_w_sg-maste [3244])
[   88.952592] drbd sg-master-drbd: receiver (re)started
[   88.952656] drbd sg-master-drbd: conn( Unconnected -> WFConnection )
[   89.453261] drbd sg-master-drbd: Handshake successful: Agreed network protocol version 101
[   89.453271] drbd sg-master-drbd: Feature flags enabled on protocol level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES.
[   89.453358] drbd sg-master-drbd: conn( WFConnection -> WFReportParams )
[   89.453373] drbd sg-master-drbd: Starting ack_recv thread (from drbd_r_sg-maste [3245])
[   89.469010] block drbd0: max BIO size = 4096
[   89.469023] block drbd0: size = 200 TB (214854324848 KB)
[   89.469043] block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> UpToDate )
[49807.178096] drbd sg-master-drbd: peer( Primary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )
[49807.178116] drbd sg-master-drbd: ack_receiver terminated
[49807.178124] drbd sg-master-drbd: Terminating drbd_a_sg-maste
[49807.192386] drbd sg-master-drbd: Connection closed
[49807.192452] drbd sg-master-drbd: conn( Disconnecting -> StandAlone )
[49807.192463] drbd sg-master-drbd: receiver terminated
[49807.192470] drbd sg-master-drbd: Terminating drbd_r_sg-maste
[49807.229346] drbd sg-master-drbd: Terminating drbd_w_sg-maste
[49847.525209] drbd sg-master-drbd: Starting worker thread (from drbdsetup-84 [23082])
[49847.525490] drbd sg-master-drbd: conn( StandAlone -> Unconnected )
[49847.525542] drbd sg-master-drbd: Starting receiver thread (from drbd_w_sg-maste [23084])
[49847.525624] drbd sg-master-drbd: receiver (re)started
[49847.525687] drbd sg-master-drbd: conn( Unconnected -> WFConnection )
[49848.025725] drbd sg-master-drbd: Handshake successful: Agreed network protocol version 101
[49848.025735] drbd sg-master-drbd: Feature flags enabled on protocol level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES.
[49848.025964] drbd sg-master-drbd: conn( WFConnection -> WFReportParams )
[49848.025979] drbd sg-master-drbd: Starting ack_recv thread (from drbd_r_sg-maste [23085])
[49848.036394] block drbd0: max BIO size = 4096
[49848.036407] block drbd0: size = 200 TB (214854324848 KB)
[49848.036427] block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> UpToDate )
//OE
-----Original Message-----
From: Robert Altnoeder <[email protected]>
To: [email protected]
Subject: Re: [DRBD-user] Extent XXX beyond end of bitmap!
Date: Tue, 14 Aug 2018 13:03:40 +0200
The following information would be useful for debugging:
- Internal or external meta data?
- Any special activity log configuration, like a striped AL, different AL stripe size, etc.?
- Any manually configured number of AL extents?
- Value of max-peers
- Reported size of the DRBD device in sectors
- Reported size of the backend device for DRBD in sectors
- Ideally, a 'drbdadm dump-md' of the meta data of the affected devices
br,
Robert
On 08/14/2018 10:02 AM, Yannis Milios wrote:
Does this happen on both nodes? What’s the status of the backing device (lvm)? Can you post the exact versions for both kernel module and utils? Any clue in the logs?
On Tue, 14 Aug 2018 at 06:57, Oleksiy Evin <[email protected]> wrote:

# drbdadm attach all
extent 19136522 beyond end of bitmap!
extent 19143798 beyond end of bitmap!
extent 19151565 beyond end of bitmap!
../shared/drbdmeta.c:2279:apply_al: ASSERT(bm_pos - bm_on_disk_pos <= chunk - extents_size) failed.
_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user