Re: [DRBD-user] DRBD Delay

2019-07-23 Thread Banks, David (db2d)
Thanks Veit, that’s what I use to start it manually. I’m looking for the
automated approach so that it starts on its own reliably after system start.
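
What I have in mind is roughly a systemd ordering drop-in. A minimal
sketch, assuming the unit is actually named drbd.service (check with
'systemctl list-unit-files | grep drbd'; the file path below is
hypothetical):

# /etc/systemd/system/drbd.service.d/after-zfs.conf
[Unit]
# do not start DRBD until the ZFS mounts are done
After=zfs-mount.service
Wants=zfs-mount.service

followed by a 'systemctl daemon-reload'. But I can't find the unit that
actually starts the resources, hence the question.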

---
Dav Banks
School of Architecture
University of Virginia
Campbell Hall, Rm 138
(o) 434.243.8883
(e) dav.ba...@virginia.edu

On Jul 23, 2019, at 3:49 AM, Veit Wahlich <cru.li...@zodia.de> wrote:

Hi David,

have a look at the documentation of drbdadm; the commands up, down and
adjust might be what you are looking for.

Best regards,
// Veit


On Monday, 22.07.2019 at 16:34 +, Banks, David (db2d) wrote:
Hello,

Is there a way to delay the loading of DRBD resources until after the
underlying block system has made the devices available?

I’ve looked in systemd but didn’t see any drbd services and wanted to
ask before monkeying with that.

System: Ubuntu 18.04
DRBD: 8.9.10

After reboot DRBD starts before the ZFS volumes that it uses are
available, so I have to do a 'drbdadm adjust all' each time. I’d like
it to just wait until the zfs-mount.service is done.

Thanks!

___
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


[DRBD-user] DRBD 9: 3-node mirror error (Low.dev. smaller than requested DRBD-dev. size.)

2019-07-23 Thread Paul Clements
> One thing you could check just to be sure is whether the configuration
> files are identical on all three systems.

Yes, it's identical. md5sums check out.

> Was DRBD stopped on all three nodes before you recreated the meta data?
> (Did you do 'drbdsetup down resourcename' on all three nodes, then
> recreate the meta data on all three nodes, and then try 'drbdadm up
> resourcename'?)

I did:

# drbdadm down r0

Before I did the rest of those steps, yes.

> You said you were using DRBD 9.0.16, correct?

Yes.

> Could you please also check the exact output of "cat /proc/drbd" and
> "drbdadm --version" for me?

# cat /proc/drbd
version: 9.0.16-1 (api:2/proto:86-114)
GIT-hash: ab9777dfeaf9d619acc9a5201bfcae8103e9529c build by
mockbuild@, 2018-11-03 13:54:24
Transports (api:16): tcp (9.0.16-1)

# drbdadm --version
DRBDADM_BUILDTAG=GIT-hash:\ d458166f5f4740625e5ff215f62366aca60ca37b\
build\ by\ mockbuild@\,\ 2018-11-03\ 14:14:44
DRBDADM_API_VERSION=2
DRBD_KERNEL_VERSION_CODE=0x090010
DRBD_KERNEL_VERSION=9.0.16
DRBDADM_VERSION_CODE=0x090600
DRBDADM_VERSION=9.6.0

> With that information I may be able to recreate the situation if the
> problem persists.

Thanks for the help.

The conf file is at the bottom for reference. Does anyone have a working
three-node drbd.conf they'd be willing to post? I'm starting to think
something in my conf file is odd.
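
In case it helps, this is how one could compare sizes across the nodes
(device and resource names per my config below; dump-md assumes the
resource is down on that node):

# run on each of the three nodes:
blockdev --getsize64 /dev/dm-0   # backing device size in bytes
drbdadm dump-md r0               # the meta-data's view of the device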

Paul

--
# cat /etc/drbd.conf

# You can find an example in  /usr/share/doc/drbd.../drbd.conf.example

include "drbd.d/global_common.conf";
include "drbd.d/*.res";

resource r0 {
  on sios0 {
    node-id 0;
    device /dev/drbd0;
    disk /dev/dm-0;
    address 10.0.0.4:7788;
    meta-disk internal;
  }

  on sios1 {
    node-id 1;
    device /dev/drbd0;
    disk /dev/dm-0;
    address 10.0.0.7:7788;
    meta-disk internal;
  }

  on sios2 {
    node-id 2;
    device /dev/drbd0;
    disk /dev/dm-0;
    address 10.0.0.5:7788;
    meta-disk internal;
  }

  connection {
    host sios0 port 7788;
    host sios1 port 7788;
    net {
      protocol C;
    }
  }

  connection {
    host sios0 port 7788;
    host sios2 port 7788;
    net {
      protocol A;
    }
  }

  connection {
    host sios1 port 7788;
    host sios2 port 7788;
    net {
      protocol A;
    }
  }
}
___
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] DRBD Delay

2019-07-23 Thread Banks, David (db2d)
Thanks Robert! Any thoughts on where that service or script resides? I would 
have thought it would be in the systemd folder but no joy.

---
Dav Banks
School of Architecture
University of Virginia
Campbell Hall, Rm 138
(o) 434.243.8883
(e) dav.ba...@virginia.edu

On Jul 23, 2019, at 4:06 AM, Robert Altnoeder <robert.altnoe...@linbit.com> wrote:

On 7/22/19 6:34 PM, Banks, David (db2d) wrote:
> Is there a way to delay the loading of DRBD resources until after the
> underlying block system has made the devices available?
>
> I’ve looked in systemd but didn’t see any drbd services and wanted to ask
> before monkeying with that.

There must be some service or script somewhere that starts it, because
DRBD itself does not automatically start resources when the kernel
module is loaded.

> After reboot DRBD starts before the ZFS volumes that it uses are
> available, so I have to do a 'drbdadm adjust all' each time. I’d like
> it to just wait until the zfs-mount.service is done.

Another question is whether that will actually be good enough to
reliably start the DRBD resources. It may work sometimes, maybe even
often, and then sometimes it may fail, because many of the base OS
components such as udev, which makes various storage devices available,
have countless race conditions. In other words, I suspect that the
zfs-mount.service may finish successfully before the ZFS devices are
actually usable or visible in the /dev filesystem.
You may need to either add a delay (which still only increases the
chance that it will work, but does not guarantee it) or add some custom
programming to wait for the devices to be created in the /dev filesystem
and to actually start working (e.g., try opening one and see whether
that works).

br,
Robert

___
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


[DRBD-user] local WRITE IO error sector 21776+1016 on dm-2

2019-07-23 Thread Roland JARRY
Hello,

I have an issue mounting DRBD 8.4.11-1 resources on kernel
4.9.0-9-amd64 (Debian 9.9). I get this error message: block drbd3:
local WRITE IO error sector 21776+1016 on dm-2

Then, the resource becomes diskless.

Here are the settings of the resource:

root@srv-pg-sav-p:~# cat /etc/drbd.d/vgbackup-lv-back3.res
resource vgbackup-lv-back3 {
  net {
    allow-two-primaries;
  }
  startup {
    wfc-timeout 120;
    degr-wfc-timeout 120;
  }

  volume 0 {
    device    /dev/drbd3;
    #meta-disk internal;
    meta-disk /dev/vgbackup/lv-md-back3;
    disk      /dev/vgbackup/lv-back3;
  }

  on srv-pg-sav-p {
    address   192.168.8.221:7803;
  }
  on srv-pg-sav-s {
    address   192.168.8.222:7803;
  }
}

I've changed meta-disk from internal to an external LV device to have
more space (1 GB), but I have the same issue:

root@srv-pg-sav-p:~# lvs
  LV          VG       Attr   LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv-back1    vgbackup -wi-ao 21.00t
  lv-back2    vgbackup -wi-ao 21.00t
  lv-back3    vgbackup -wi-a- 21.00t
  lv-md-back3 vgbackup -wi-a-  1.00g
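
As a sanity check on the external meta-data size, here is a rough
estimate (formula as I understand it from the DRBD 8.4 documentation,
so treat it as an approximation only):

# backing LV size in 512-byte sectors
s=$(( $(blockdev --getsize64 /dev/vgbackup/lv-back3) / 512 ))
# estimated meta-data size in sectors: ceil(s / 2^18) * 8 + 72
md=$(( (s + 262143) / 262144 * 8 + 72 ))
echo "$md sectors = $(( md / 2048 )) MiB"

For a 21 TiB device that works out to roughly 672 MiB, so the 1 GB LV
should be large enough if the formula is right.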

I have 3 resources of the same size. Two work right now, but not the
third, and I had the same issue before with the first two resources.

I notice that the error is on the same sector for each resource, every
time. Is there a limitation somewhere?

Here is more of the log:

Jul 23 10:23:52 srv-pg-sav-p kernel: [1532462.342531] EXT4-fs (drbd3):
mounted filesystem with ordered data mode. Opts: (null)
Jul 23 10:23:53 srv-pg-sav-p kernel: [1532463.860138] block drbd3: local
WRITE IO error sector 21776+1016 on dm-2
Jul 23 10:23:53 srv-pg-sav-p kernel: [1532463.860146] block drbd3: disk(
UpToDate -> Failed )
Jul 23 10:23:53 srv-pg-sav-p kernel: [1532463.860177] block drbd3: Local
IO failed in __req_mod. Detaching...
Jul 23 10:23:53 srv-pg-sav-p kernel: [1532463.868306] block drbd3:
helper command: /sbin/drbdadm pri-on-incon-degr minor-3
Jul 23 10:23:53 srv-pg-sav-p kernel: [1532463.868356] block drbd3: IO
ERROR: neither local nor remote data, sector 21776+8
Jul 23 10:23:53 srv-pg-sav-p kernel: [1532463.876611] block drbd3: IO
ERROR: neither local nor remote data, sector 21784+8
Jul 23 10:23:53 srv-pg-sav-p kernel: [1532463.881934] block drbd3:
helper command: /sbin/drbdadm pri-on-incon-degr minor-3 exit code 0 (0x0)
Jul 23 10:23:53 srv-pg-sav-p kernel: [1532463.885020] block drbd3: IO
ERROR: neither local nor remote data, sector 21792+8
Jul 23 10:23:53 srv-pg-sav-p kernel: [1532463.894204] block drbd3: 21 TB
(5637144528 bits) marked out-of-sync by on disk bit-map.
Jul 23 10:23:53 srv-pg-sav-p kernel: [1532463.894207] block drbd3: disk(
Failed -> Diskless )
Jul 23 10:23:53 srv-pg-sav-p kernel: [1532463.894220] block drbd3: IO
ERROR: neither local nor remote data, sector 21800+8
Jul 23 10:23:53 srv-pg-sav-p kernel: [1532463.902118] block drbd3: IO
ERROR: neither local nor remote data, sector 21808+8
Jul 23 10:23:59 srv-pg-sav-p kernel: [1532469.637796] block drbd3: 122
messages suppressed in /usr/src/modules/drbd/drbd/drbd_req.c:1446.
Jul 23 10:23:59 srv-pg-sav-p kernel: [1532469.637802] block drbd3: IO
ERROR: neither local nor remote data, sector 22548840448+8
Jul 23 10:23:59 srv-pg-sav-p kernel: [1532469.648265] Buffer I/O error
on dev drbd3, logical block 2818605056, lost sync page write
Jul 23 10:23:59 srv-pg-sav-p kernel: [1532469.658056] JBD2: Error -5
detected when updating journal superblock for drbd3-8.
Jul 23 10:23:59 srv-pg-sav-p kernel: [1532469.668068] Aborting journal
on device drbd3-8.
Jul 23 10:23:59 srv-pg-sav-p kernel: [1532469.678189] Buffer I/O error
on dev drbd3, logical block 2818605056, lost sync page write
Jul 23 10:23:59 srv-pg-sav-p kernel: [1532469.688556] JBD2: Error -5
detected when updating journal superblock for drbd3-8.
Jul 23 10:45:28 srv-pg-sav-p kernel: [1533756.213355] block drbd3: 1
messages suppressed in /usr/src/modules/drbd/drbd/drbd_req.c:1446.
Jul 23 10:45:28 srv-pg-sav-p kernel: [1533756.213361] block drbd3: IO
ERROR: neither local nor remote data, sector 45097156480+8
Jul 23 10:45:28 srv-pg-sav-p kernel: [1533756.222815] block drbd3: IO
ERROR: neither local nor remote data, sector 45097156592+8
Jul 23 10:45:28 srv-pg-sav-p kernel: [1533756.232250] block drbd3: IO
ERROR: neither local nor remote data, sector 0+8
Jul 23 10:45:28 srv-pg-sav-p kernel: [1533756.241269] block drbd3: IO
ERROR: neither local nor remote data, sector 8+8
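
To rule out the backing devices themselves, the next thing I will try
is to check which LV dm-2 actually is and whether a direct write to it
succeeds (a rough test only; the dd write is destructive, so I would
only run it against the meta-data LV while the resource is down):

lvs -o lv_name,lv_dm_path vgbackup   # map LVs to their dm paths
dmsetup info /dev/dm-2
# destructive! direct write at the failing sector:
dd if=/dev/zero of=/dev/dm-2 bs=512 seek=21776 count=8 oflag=direct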

Kind regards.

___
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] DRBD 9: 3-node mirror error (Low.dev. smaller than requested DRBD-dev. size.)

2019-07-23 Thread Robert Altnoeder
On 7/22/19 9:55 PM, Paul Clements wrote:
> Is there something wrong with my conf file maybe?

One thing you could check just to be sure is whether the configuration
files are identical on all three systems.

Was DRBD stopped on all three nodes before you recreated the meta data?
(Did you do 'drbdsetup down resourcename' on all three nodes, then
recreate the meta data on all three nodes, and then try 'drbdadm up
resourcename'?)

You said you were using DRBD 9.0.16, correct?
Could you please also check the exact output of "cat /proc/drbd" and
"drbdadm --version" for me?
With that information I may be able to recreate the situation if the
problem persists.

br,
Robert

___
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] DRBD Delay

2019-07-23 Thread Robert Altnoeder
On 7/22/19 6:34 PM, Banks, David (db2d) wrote:
> Is there a way to delay the loading of DRBD resources until after the 
> underlying block system has made the devices available?
>
> I’ve looked in systemd but didn’t see any drbd services and wanted to ask 
> before monkeying with that.

There must be some service or script somewhere that starts it, because
DRBD itself does not automatically start resources when the kernel
module is loaded.
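
You can usually track down the culprit with something like this (unit
and script names vary between distributions, so these are just starting
points):

systemctl list-unit-files | grep -i drbd
systemctl cat drbd.service
ls -l /etc/init.d/drbd

On Ubuntu 18.04 I would expect an init script wrapped by a systemd
generator, but that is an assumption; check on the machine.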

> After reboot DRBD starts before the ZFS volumes that it uses are
> available, so I have to do a 'drbdadm adjust all' each time. I’d like
> it to just wait until the zfs-mount.service is done.

Another question is whether that will actually be good enough to
reliably start the DRBD resources. It may work sometimes, maybe even
often, and then sometimes it may fail, because many of the base OS
components such as udev, which makes various storage devices available,
have countless race conditions. In other words, I suspect that the
zfs-mount.service may finish successfully before the ZFS devices are
actually usable or visible in the /dev filesystem.
You may need to either add a delay (which still only increases the
chance that it will work, but does not guarantee it) or add some custom
programming to wait for the devices to be created in the /dev filesystem
and to actually start working (e.g., try opening one and see whether
that works).
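
A rough illustration of the second option (a sketch only; the device
path, timeout, and resource name are placeholders):

#!/bin/sh
# wait until the backing device node exists and can actually be read
dev=/dev/zvol/tank/drbd0   # hypothetical ZFS volume path
i=0
while [ $i -lt 60 ]; do
    if [ -b "$dev" ] && dd if="$dev" of=/dev/null bs=4096 count=1 >/dev/null 2>&1; then
        exec drbdadm adjust all
    fi
    sleep 1
    i=$(( i + 1 ))
done
echo "timed out waiting for $dev" >&2
exit 1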

br,
Robert

___
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] DRBD Delay

2019-07-23 Thread Veit Wahlich
Hi David,

have a look at the documentation of drbdadm; the commands up, down and
adjust might be what you are looking for.
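
For example (the resource name r0 is just a placeholder):

drbdadm up r0       # bring the resource up
drbdadm adjust r0   # make the kernel state match the config
drbdadm down r0     # take the resource down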

Best regards,
// Veit


On Monday, 22.07.2019 at 16:34 +, Banks, David (db2d) wrote:
> Hello,
> 
> Is there a way to delay the loading of DRBD resources until after the
> underlying block system has made the devices available?
> 
> I’ve looked in systemd but didn’t see any drbd services and wanted to
> ask before monkeying with that.
> 
> System: Ubuntu 18.04
> DRBD: 8.9.10
> 
> After reboot DRBD starts before the ZFS volumes that it uses are
> available, so I have to do a 'drbdadm adjust all' each time. I’d like
> it to just wait until the zfs-mount.service is done.
> 
> Thanks!

___
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user