Re: [DRBD-user] 9.1.14 upgrade issue

2023-04-14 Thread Lars Ellenberg
On Fri, Apr 14, 2023 at 08:54:39AM -0700, Akemi Yagi wrote: > Hi Nigel, > > kmod-drbd90-9.1.14-2.el8_7.x86_64.rpm will start syncing to our mirrors > shortly. Since we modularized the "transport", DRBD consists of (at least) two modules, "drbd" and "drbd_transport_tcp". You ship
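A quick way to confirm both modules are in place after such an upgrade (sketch only, not from the mail itself):
    modinfo -F version drbd drbd_transport_tcp   # both packaged modules should report the same version
    lsmod | grep -E '^drbd'                      # drbd and drbd_transport_tcp should both appear once loaded
    head -n1 /proc/drbd                          # version of the module that is actually running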

Re: [DRBD-user] DRBD Trim Support

2022-01-13 Thread Lars Ellenberg
On Sat, Jan 08, 2022 at 04:24:54AM +, Eric Robinson wrote: > According to the documentation, SSD TRIM/Discard support has been in DRBD > since version 8. DRBD is supposed to detect if the underlying storage > supports trim and, if so, automatically enable it. However, I am unable to > TRIM my
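One way to see whether discard support actually made it through the stack (sketch; device and mount point names are placeholders):
    lsblk -D /dev/drbd0 /dev/sdb   # non-zero DISC-GRAN/DISC-MAX means discards are advertised
    fstrim -v /mnt/data            # then try an explicit trim on the mounted filesystem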

Re: [DRBD-user] drbdadm attach - how to allow non-exclusive access ?

2021-08-26 Thread Lars Ellenberg
On Mon, Aug 16, 2021 at 07:07:41AM +0100, TJ wrote: > I've got a rather unique scenario and need to allow non-exclusive > read-only opening of the drbd device on the Secondary. > Then I thought to use a device-mapper COW snapshot - so the underlying > drbd device is never changed but the snapshot

Re: [DRBD-user] protocol C replication - unexpected behaviour

2021-08-26 Thread Lars Ellenberg
On Thu, Aug 05, 2021 at 11:53:44PM +0200, Janusz Jaskiewicz wrote: > Hello. > > I'm experimenting a bit with DRBD in a cluster managed by Pacemaker. > It's a two node, active-passive cluster and the service that I'm > trying to put in the cluster writes to the file system. > The service manages

Re: [DRBD-user] DRBD 8.0 life cycle

2021-08-26 Thread Lars Ellenberg
On Tue, Aug 03, 2021 at 03:34:58PM -0700, Paul D. O'Rorke wrote: > Hi all, > > I have been running a DRBD-8 simple 3 node disaster recovery set up of > libvirt VMs for a number of years and have been very happy with it. Our > needs are simple, 2 servers on Protocol C, each running a handful of

Re: [DRBD-user] 9.0.28 fails to build on centos-8-stream

2021-03-01 Thread Lars Ellenberg
On Fri, Feb 26, 2021 at 07:09:29AM +0100, Fabio M. Di Nitto wrote: > hey guys, > > similar to 9.0.27, log below. > > Any chance you can give me a quick and dirty fix? > CC [M] /builddir/build/BUILD/drbd-9.0.28-1/drbd/drbd_main.o > /builddir/build/BUILD/drbd-9.0.28-1/drbd/drbd_main.c: In

[DRBD-user] drbd-9.0.27 [Re: drbd-9.0.26]

2020-12-23 Thread Lars Ellenberg
> * reliably detect split brain situation on both nodes > * improve error reporting for failures during attach > * implement 'blockdev --setro' in DRBD > * following upstream changes to DRBD up to Linux 5.10 and ensure >compatibility with Linux 5.8, 5.9, and 5.10 -- : Lars El

Re: [DRBD-user] 4Kib backing stores -> virtual device sector size ?

2020-11-20 Thread Lars Ellenberg
system or other use on top of DRBD that can not tolerate a change in logical block size from one "mount" to the next, then make sure to use IO backends with identical (or similar enough) characteristics. If you have a file system that can tolerate such a change, you may get away wi
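To compare the logical block sizes the layers actually report, something like this helps (sketch; device names are placeholders):
    blockdev --getss --getpbsz /dev/drbd0           # logical / physical block size of the DRBD device
    cat /sys/block/sdb/queue/logical_block_size     # and of a backing disk, for comparison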

[DRBD-user] DRBD no longer working after RHEL 7 kernel upgrade

2020-05-15 Thread Lars Ellenberg
Well, obviously DRBD *does* still work just fine. Though not for those that only upgrade the kernel without upgrading the module, or vice versa. See below. As we get more and more reports of people having problems with their DRBD after a RHEL upgrade, let me quickly state the facts: RHEL

[DRBD-user] drbd-9.0.20-0rc3

2019-10-03 Thread Lars Ellenberg
Changes wrt RC2 (for RC2 announcement see below): 9.0.20-0rc3 (api:genl2/proto:86-115/transport:14) * fix regression related to the quorum feature, introduced by code deduplication; regression never released, happened during this .20 development/release cycle * completing aspects

Re: [DRBD-user] Impossible to get primary node.

2019-09-27 Thread Lars Ellenberg
-peer", NOT after-resync-target. That was from the times when there was no unfence-peer handler, and we overloaded/abused the after-resync-target handler for this purpose. >     fencing resource-only; > >     after-sb-0pri   discard-least-changes; You are automating data loss. That

Re: [DRBD-user] Auto-promote hangs when 3rd node is gracefully taken offline

2019-07-30 Thread Lars Ellenberg
24:15 el8-a01n01.digimer.ca kernel: drbd test_server: Preparing > cluster-wide state change 3514756670 (0->2 499/146) > Jul 27 18:24:15 el8-a01n01.digimer.ca kernel: drbd test_server: State change > 3514756670: primary_nodes=0, weak_nodes=0 > Jul 27 18:24:15 el8-a01n01.digimer.ca kern

Re: [DRBD-user] DRBD 9: 3-node mirror error (Low.dev. smaller than requested DRBD-dev. size.)

2019-07-25 Thread Lars Ellenberg
d -v -z -o $(( ${size_gb} * 2**30 - 2**20 )) -l 1M /dev/$VG/$LV Make your config file refer to disk /dev/$VG/$LV dmesg -c > /dev/null # clear dmesg before the test drbdadm -v create-md # on all nodes drbdadm -v up all # on all nodes dmesg # if it failed drbdsetu
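Re-wrapped for readability, the quoted test sequence comes down to this (sketch; the truncated first command that zeroes the tail of the LV is left out, the "all" argument to create-md and the final status command are assumptions standing in for text the archive cut off):
    dmesg -c > /dev/null       # clear dmesg before the test
    drbdadm -v create-md all   # on all nodes
    drbdadm -v up all          # on all nodes
    dmesg                      # if it failed
    drbdadm status             # stand-in for the truncated drbdsetup line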

Re: [DRBD-user] local WRITE IO error sector 21776+1016 on dm-2

2019-07-25 Thread Lars Ellenberg
te-same". > Maybe this one can also be used : > https://chris.hofstaedtler.name/blog/2016/10/kernel319plus-3par-incompat.html > finding before ATTRS{rev} property of disks. For your specific hardware, probably yes. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD --

Re: [DRBD-user] PingAck did not arrive in time.

2019-06-24 Thread Lars Ellenberg
m, performance considerations, using a single DRBD volume of that size is most likely not what you want. If you really mean it, it will likely require a number of deviations from "default" settings to work reasonably well. Do you mind sharing with us what you actually want? What d

Re: [DRBD-user] drbd local replication with remote replication at the same time

2019-06-24 Thread Lars Ellenberg
o allow "local replication", but to be used for certain "ssl socket forwarding solutions", or for use with the DRBD proxy. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRB

Re: [DRBD-user] Peer cannot deal with requests bigger than 131072

2019-06-24 Thread Lars Ellenberg
bd louie Centos63: Connection closed > > > Is there any way of working around this ? Well, now, did you even try what this message suggests? Otherwise: maybe first 8.3 -> 8.4, then 8.4 -> 9 Or forget about the "rolling" upgrade, just re-create the meta data as 9, and

[DRBD-user] drbd-9.0.19-0rc1

2019-06-14 Thread Lars Ellenberg
low-remote-read (disallow read from DR connections) * some build changes http://www.linbit.com/downloads/drbd/9.0/drbd-9.0.19-0rc1.tar.gz https://github.com/LINBIT/drbd-9.0/tree/d1e16bdf2b71 -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacema

Re: [DRBD-user] drbd-9.0.18-1 : BUG: unable to handle kernel NULL pointer dereference at 00000000000000b0

2019-06-12 Thread Lars Ellenberg
" once you actually try to use it, is ... not very friendly either. > > [525370.955135] dm-16: error: dax access failed (-95) -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are registered trademarks of LINBIT __ ple

Re: [DRBD-user] Spin_Lock timeout in DRBD during heavy load

2019-05-28 Thread Lars Ellenberg
] > [ 2643.473042] [] ? receive_Data+0x77e/0x18f0 [drbd] Supposedly fixed with 9.0.18, more specifically with 7ce7cac6 drbd: fix potential spinlock deadlock on device->al_lock -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker D

Re: [DRBD-user] 'systemctl start drbd' resets /sys/block/sdb/queue/rotational

2019-03-25 Thread Lars Ellenberg
On Fri, Mar 22, 2019 at 10:45:52AM +, Holger Kiehl wrote: > Hello, > > I have megaraid controller with only SAS SSD's attached which always > sets /sys/block/sdb/queue/rotational to 1. So, in /etc/rc.d/rc.local > I just did a 'echo -n 0 > /sys/block/sdb/queue/rotational', that fixed > it. But
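If the intent is just to make the setting stick rather than racing rc.local against service starts, a udev rule is one option (a sketch, not from the mail; the sdb match and file name are placeholders):
    # /etc/udev/rules.d/60-ssd-rotational.rules
    ACTION=="add|change", KERNEL=="sdb", ATTR{queue/rotational}="0"
    # afterwards: udevadm control --reload && udevadm trigger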

Re: [DRBD-user] Rescue a drbd partition

2019-01-24 Thread Lars Ellenberg
s DRBD in normal operation). You may need to adapt that filter to allow lvm to see the backend device directly, if you *mean* to bypass drbd in a recovery scenario. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are reg
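For the recovery case that could look like this in /etc/lvm/lvm.conf (sketch; the backing device path is a placeholder, and in normal operation the filter should keep rejecting the backend so LVM only sees it through DRBD):
    filter = [ "a|^/dev/drbd.*|", "a|^/dev/sdb1$|", "r|.*|" ]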

Re: [DRBD-user] umount /drbdpart takes >50 seconds

2018-12-14 Thread Lars Ellenberg
On Fri, Dec 14, 2018 at 02:13:50PM +0100, Harald Dunkel wrote: > Hi Lars, > > On 12/14/18 1:27 PM, Lars Ellenberg wrote: > > > > There was nothing dirty (~ 7 MB; nothing worth to mention). > > So nothing to sync. > > > > But it takes some time to in

Re: [DRBD-user] umount /drbdpart takes >50 seconds

2018-12-14 Thread Lars Ellenberg
On Fri, Dec 14, 2018 at 09:32:14AM +0100, Harald Dunkel wrote: > Hi folks, > > On 12/13/18 11:49 PM, Igor Cicimov wrote: > > On Fri, Dec 14, 2018 at 2:57 AM Lars Ellenberg wrote: > > > > > > Unlikely to have anything to do with DRBD. > > > >

Re: [DRBD-user] umount /drbdpart takes >50 seconds

2018-12-13 Thread Lars Ellenberg
uce, monitor grep -e Dirty -e Writeback /proc/meminfo and slabtop before/during/after umount. Also check sysctl settings sysctl vm | grep dirty -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are registered tradem
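Concretely, that monitoring could look like this while the umount runs (sketch):
    while sleep 1; do grep -e Dirty -e Writeback /proc/meminfo; done   # in a second terminal, during the umount
    slabtop -o | head        # snapshot before/during/after
    sysctl vm | grep dirty   # the writeback tunables in effect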

Re: [DRBD-user] Offline Resize

2018-12-07 Thread Lars Ellenberg
lusterdb > /tmp/metadata > > > > Found meta data is "unclean", please apply-al first Well, there it tells you what is wrong (meta data is "unclean"), and what you should do about it: ("apply-al"). So how about just doing what it tells you to do
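Which comes down to (sketch; "clusterdb" is read from the truncated quote above and may not be the exact resource name):
    drbdadm apply-al clusterdb
    drbdadm dump-md clusterdb > /tmp/metadata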

Re: [DRBD-user] Complete decoupling of LINSTOR from DRBD (Q1 2019)

2018-11-23 Thread Lars Ellenberg
On Fri, Nov 23, 2018 at 11:55:32AM +0100, Robert Altnoeder wrote: > On 11/23/18 11:00 AM, Michael Hierweck wrote: > > Linbit announces the complete decoupling of LINSTOR from DRBD (Q1 2019). > > [...] > > Does this mean Linbit will abandon DRBD? > > Not at all, TL;DR: Marketing:

Re: [DRBD-user] Update linstor-proxmox from drbdmanage-proxmox

2018-11-06 Thread Lars Ellenberg
I fix it? Probably by upgrading your drbd-utils. If that's not sufficient, we'll have to look into it. We could add a single line workaround into the plugin, but that would likely just mask a bug elsewhere. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Hea

Re: [DRBD-user] (DRBD 9) promote secondary to primary with primary crashed

2018-11-06 Thread Lars Ellenberg
Daniel Hertanu wrote: > Hello Yannis, > > I tried that, same result, won't switch to primary. Well, it says: > >> [root@server2-drbd ~]# drbdadm primary resource01 > >> resource01: State change failed: (-2) Need access to UpToDate data Does it have "access to UpTo
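A quick way to answer that question before retrying the promote (sketch):
    drbdadm status resource01                            # local disk and peer-disk states
    drbdsetup status --verbose --statistics resource01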

Re: [DRBD-user] Configuring a two-node cluster with redundant nics on each node?

2018-10-24 Thread Lars Ellenberg
ut for the record, if this was not only about redundancy, but also hopes to increase bandwidth while all links are operational, LACP does not increase bandwidth for a single TCP flow. "bonding round robin" is the only mode that does. Just saying. -- : Lars Ellenberg : LINBIT | Ke

Re: [DRBD-user] drbdadm status blocked:lower

2018-10-24 Thread Lars Ellenberg
On Fri, Oct 19, 2018 at 10:18:12AM +0200, VictorSanchez2 wrote: > On 10/18/2018 09:51 PM, Lars Ellenberg wrote: > > On Thu, Oct 11, 2018 at 02:06:11PM +0200, VictorSanchez2 wrote: > > > On 10/11/2018 10:59 AM, Lars Ellenberg wrote: > > > > On Wed, Oct 10, 201

Re: [DRBD-user] split brain on both nodes

2018-10-18 Thread Lars Ellenberg
e resync rate to minimize impact on > applications using the storage. As it slows itself down to "stay out of > the way", the resync time increases of course. You won't have redundancy > until the resync completes. > > -- > Digimer > Papers and Projects: https://a

Re: [DRBD-user] drbdadm status blocked:lower

2018-10-18 Thread Lars Ellenberg
On Thu, Oct 11, 2018 at 02:06:11PM +0200, VictorSanchez2 wrote: > On 10/11/2018 10:59 AM, Lars Ellenberg wrote: > > On Wed, Oct 10, 2018 at 11:52:34AM +, Garrido, Cristina wrote: > > > Hello, > > > > > > I have two drbd devices configured on my cluster. O

Re: [DRBD-user] drbdadm status blocked:lower

2018-10-11 Thread Lars Ellenberg
"congestion" for the backing device. Why it did that, and whether that was actually the case, and what that actually means is very much dependend on that backing device, and how it "felt" at the time of that status output. -- : Lars Ellenberg : LINBIT | Keeping the Digital World R

Re: [DRBD-user] drbdadm down failed (-12) - blocked by drbd_submit

2018-10-11 Thread Lars Ellenberg
good way to deal with this case, as whether some DRBD step is > missing, which leaves the process or killing the process is the right way? Again, that "process" has nothing to do with drbd being "held open", but is a kernel thread that is part of the existence of that DRBD volum

Re: [DRBD-user] Sending time expired on SyncSource node

2018-09-26 Thread Lars Ellenberg
of congestion. But read about "timeout" and "ko-count" in the users guide. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are registered trademarks of LINBIT __ please don't Cc me, but send to list --

Re: [DRBD-user] Mount and use disk while Inconsistent?

2018-09-24 Thread Lars Ellenberg
l that whatever is > being used can handle having the storage ripped out from under it. Yes. Also, when using a SyncTarget, many reads are no longer local, because there is no good local data to read, which may or may not be a serious performance hit, depending on your workload. -- : Lars

Re: [DRBD-user] notify-split-brain.sh[153967]: Environment variable $DRBD_PEER not found (this is normally passed in by drbdadm).

2018-09-21 Thread Lars Ellenberg
On Wed, Sep 19, 2018 at 04:57:08PM -0400, Daniel Ragle wrote: > On 9/18/2018 10:51 AM, Lars Ellenberg wrote: > > On Thu, Sep 13, 2018 at 04:36:54PM -0400, Daniel Ragle wrote: > > > Anybody know where I need to start looking to figure this one out: > > > > &g

Re: [DRBD-user] notify-split-brain.sh[153967]: Environment variable $DRBD_PEER not found (this is normally passed in by drbdadm).

2018-09-18 Thread Lars Ellenberg
k=DRBD_NODE_ID_${DRBD_PEER_NODE_ID}; v=${!k}; [[ $v ]] && DRBD_PEER=$v; fi -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are registered trademarks of LINBIT __ please don't Cc me, but send t
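Put back together, the quoted workaround fragment presumably sits in the handler script like this (the if-condition is an assumption; only the body is from the mail):
    if [[ -z $DRBD_PEER && -n $DRBD_PEER_NODE_ID ]]; then
        k=DRBD_NODE_ID_${DRBD_PEER_NODE_ID}
        v=${!k}
        [[ $v ]] && DRBD_PEER=$v
    fi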

Re: [DRBD-user] Max disk size with external metadata (8.4.11-1)

2018-09-18 Thread Lars Ellenberg
vailable. I'm not exactly sure, but I sure hope we have dropped the "indexed" flavor in DRBD 9. Depending on the number of (max-) peers, DRBD 9 needs more room for metadata than a "two-node only" DRBD. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRB

Re: [DRBD-user] [PATCH] xen-blkback: Switch to closed state after releasing the backing device

2018-09-10 Thread Lars Ellenberg
On Sat, Sep 08, 2018 at 09:34:32AM +0200, Valentin Vidic wrote: > On Fri, Sep 07, 2018 at 07:14:59PM +0200, Valentin Vidic wrote: > > In fact the first one is the original code path before I modified > > blkback. The problem is it gets executed async from workqueue so > > it might not always run

Re: [DRBD-user] [PATCH] xen-blkback: Switch to closed state after releasing the backing device

2018-09-07 Thread Lars Ellenberg
On Fri, Sep 07, 2018 at 02:13:48PM +0200, Valentin Vidic wrote: > On Fri, Sep 07, 2018 at 02:03:37PM +0200, Lars Ellenberg wrote: > > Very frequently it is *NOT* the "original user", that "still" holds it > > open, but udev, or something triggered-by-udev. &

Re: [DRBD-user] [PATCH] xen-blkback: Switch to closed state after releasing the backing device

2018-09-07 Thread Lars Ellenberg
On Wed, Sep 05, 2018 at 06:27:56PM +0200, Valentin Vidic wrote: > On Wed, Sep 05, 2018 at 12:36:49PM +0200, Roger Pau Monné wrote: > > On Wed, Aug 29, 2018 at 08:52:14AM +0200, Valentin Vidic wrote: > > > Switching to closed state earlier can cause the block-drbd > > > script to fail with 'Device

Re: [DRBD-user] Any way to jump over initial sync ?

2018-08-30 Thread Lars Ellenberg
On Wed, Aug 29, 2018 at 12:39:07PM -0400, David Bruzos wrote: > Hi Lars, > Thank you and the others for such a wonderful and useful system! Now, to > your comment: > > >Um, well, while it may be "your proven method" as well, it actually > >is the method documented in the drbdsetup man page and

[DRBD-user] drbd issue?

2018-08-30 Thread Lars Ellenberg
ons, then reconnects, and syncs up. > Second node: > > [Wed Aug 29 01:42:48 2018] drbd resource0: PingAck did not arrive in time. Again, time stamps do not match up. But there is your reason for this incident: "PingAck did not arrive in time". Find out why, or simply increase t

Re: [DRBD-user] Any way to jump over initial sync ?

2018-08-29 Thread Lars Ellenberg
ta and an "unsatisfactory" replication link bandwidth, you may want to look into the second typical use case of "new-current-uuid", which we coined "truck based replication", which is also documented in the drbdsetup man page. (Or, do the initial sync with an &

Re: [DRBD-user] confused with DRBD 9.0 and dual-primary, multi-primary, multi-secondary ...

2018-08-29 Thread Lars Ellenberg
gration with various virtualization solutions much easier. Still, also in that case, prepare to regularly upgrade both DRBD 9 and LINSTOR components. There will be bugs, and bug fixes, and they will be relevant for your environment. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DR

Re: [DRBD-user] drbd issue?

2018-08-29 Thread Lars Ellenberg
? Some strangeness with the new NIC drivers? A bug in the "shipped with the debian kernel" DRBD version? -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are registered trademarks of LINBIT __ please don't Cc

Re: [DRBD-user] Resource is 'Blocked: upper'

2018-08-29 Thread Lars Ellenberg
On Mon, Aug 27, 2018 at 06:15:09PM +0200, Julien Escario wrote: > Le 27/08/2018 à 17:44, Lars Ellenberg a écrit : > > On Mon, Aug 27, 2018 at 05:01:52PM +0200, Julien Escario wrote: > >> Hello, > >> We're stuck in a strange situation. One of our resources is mark

Re: [DRBD-user] Resource is 'Blocked: upper'

2018-08-27 Thread Lars Ellenberg
On Mon, Aug 27, 2018 at 05:01:52PM +0200, Julien Escario wrote: > Hello, > We're stuck in a strange situation. One of our resources is marked as : > volume 0 (/dev/drbd155): UpToDate(normal disk state) Blocked: upper > > I used drbdtop to get this info because drbdadm hangs. > > I can also see

Re: [DRBD-user] LVM logical volume create failed

2018-08-27 Thread Lars Ellenberg
On Mon, Aug 27, 2018 at 08:21:35AM +, Jaco van Niekerk wrote: > Hi > > cat /proc/drbd > version: 8.4.11-1 (api:1/proto:86-101) > GIT-hash: 66145a308421e9c124ec391a7848ac20203bb03c build by mockbuild@, > 2018-04-26 12:10:42 > 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C

Re: [DRBD-user] [Pacemaker] Pacemaker unable to start DRBD

2018-07-30 Thread Lars Ellenberg
clone-max=2 clone-node-max=1 notify=true > > I receive the following on pcs status: > * my_iscsidata_monitor_0 on node2.san.localhost 'not configured' (6): > call=9, status=complete, exitreason='meta parameter misconfigured, > expected clone-max -le 2, but found unset.', Well,

Re: [DRBD-user] drbd+lvm no bueno

2018-07-30 Thread Lars Ellenberg
log > bottleneck problem. One LV -> DRBD Volume -> Filesystem per DB instance. If the DBs are "logically related", have all volumes in one DRBD resource. If not, separate DRBD resources, one volume each. But whether or not that would help in your setup depends very much on the typic
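A sketch of what "all volumes in one DRBD resource" looks like in the config (names and minor numbers are placeholders):
    resource db {
        volume 0 { device /dev/drbd10; disk /dev/vg0/db1; meta-disk internal; }
        volume 1 { device /dev/drbd11; disk /dev/vg0/db2; meta-disk internal; }
        # ... the usual "on <host>" / connection sections follow
    }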

Re: [DRBD-user] drbd+lvm no bueno

2018-07-26 Thread Lars Ellenberg
part. it's what most people think when doing that: use a huge single DRBD as PV, and put loads of unrelated LVS inside of that. Which then all share the single DRBD "activity log" of the single DRBD volume, which then becomes a bottleneck for IOPS. -- : Lars Ellenberg : LINBIT | Keepi

Re: [DRBD-user] Pacemaker unable to start DRBD

2018-07-26 Thread Lars Ellenberg
resource create ... pcs -f tmp_cfg resource master ... pcs cluster push cib tmp_cfg if you need to get things done, don't take unknown short cuts, because, as they say, the unknown short cut is the longest route to the destination. though you may learn a lot along the way, so if you are in the position wh
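Spelled out, that workflow is roughly (sketch; resource names and options are placeholders, and current pcs spells the last step "cib-push" rather than "push cib"):
    pcs cluster cib tmp_cfg                  # dump the live CIB to a file
    pcs -f tmp_cfg resource create my_drbd ocf:linbit:drbd drbd_resource=r0
    pcs -f tmp_cfg resource master ms_my_drbd my_drbd clone-max=2 clone-node-max=1 notify=true
    pcs cluster cib-push tmp_cfg             # push everything back in one step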

Re: [DRBD-user] Content of DRBD volume is invalid during sync after disk replace

2018-07-26 Thread Lars Ellenberg
30) > Library version: 1.02.137 (2016-11-30) > Driver version: 4.37.0 > Is it bug or am I doing something wrong? Thanks for the detailed and useful report, definitely a serious and embarrassing bug, now already fixed internally. Fix will go into 9.0.15 final. We are in the progress

Re: [DRBD-user] drbd+lvm no bueno

2018-07-26 Thread Lars Ellenberg
nerate your > distro's initrd/initramfs to reflect the changes directly at startup. Yes, don't forget that step ^^^ that one is important as well. But really, most of the time, you really want LVM *below* DRBD, and NOT above it. Even though it may "appear" to be convenient, it is usually not w

Re: [DRBD-user] Cannot synchronize stacked device to backup server with DRBD9

2018-06-19 Thread Lars Ellenberg
On Tue, Jun 19, 2018 at 09:19:04AM +0200, Artur Kaszuba wrote: > Hi Lars, thx for answer > > W dniu 18.06.2018 o 17:10, Lars Ellenberg pisze: > > On Wed, Jun 13, 2018 at 01:03:53PM +0200, Artur Kaszuba wrote: > > > I know about 3 node solution and i have used it for some

Re: [DRBD-user] 125TB volume working, 130TB not working

2018-06-19 Thread Lars Ellenberg
f you try again, do those numbers change? If they change, do they still show such a pattern in hex digits? > [ma. juni 18 14:44:43 2018] drbd drbd1/0 drbd1: we had at least one MD IO > ERROR during bitmap IO > [ma. juni 18 14:44:47 2018] drbd drbd1/0 drbd1: recounting of set bits took >

Re: [DRBD-user] Cannot synchronize stacked device to backup server with DRBD9

2018-06-18 Thread Lars Ellenberg
write this post because stacked configuration is > still described in documentation and should work? Unfortunately for > now it is not possible to create such configuration or i missed > something :/ I know there are DRBD 9 users using "stacked" configurations out there. Maybe yo

Re: [DRBD-user] DRBD Issues causing high server load

2018-05-03 Thread Lars Ellenberg
should *upgrade*. > or upgrade the drbd version ? Yes, that as well. > Thanks in advance for your help. Cheers, :-) -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker : R&D, Integration, Ops, Consulting, Support DRBD® and LIN

Re: [DRBD-user] New 3-way drbd setup does not seem to take i/o

2018-05-03 Thread Lars Ellenberg
on that can take a long time, it is also (and especially) the wait_for_completion_io(). We could "make the warnings" go away by accepting only some (arbitrarily small) number of discard requests at a time, and then blocking in submit_bio(), until at least one of the pending ones completes. But t

[DRBD-user] drbd-9.0.14

2018-05-02 Thread Lars Ellenberg
up to v4.15.x > * new wire packet P_ZEROES a cousin of P_DISCARD, following the kernel >as it introduced separated BIO ops for writing zeros and discarding > * compat workaround for two RHEL 7.5 idiosyncrasies regarding refcount_t >and struct nla_policy -- : Lars Ellenber

[DRBD-user] drbd-8.4.11 released

2018-04-26 Thread Lars Ellenberg
ile IO is frozen * fix various corner cases when recovering from multiple failure cases https://www.linbit.com/downloads/drbd/8.4/drbd-8.4.11-1.tar.gz https://github.com/LINBIT/drbd-8.4/tree/drbd-8.4.11 -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Co

Re: [DRBD-user] about split-brain

2018-04-20 Thread Lars Ellenberg
ql > > my auto solve config: > > net { > after-sb-0pri discard-zero-changes; > after-sb-1pri discard-secondary; If the "good data" (by whatever metric) happens to be secondary during that handshake, and the "bad data" happens to be prima

Re: [DRBD-user] Data consistency question

2018-03-14 Thread Lars Ellenberg
ynchronous" replication here. Going online with the Secondary now will look just like a "single system crash", but like that crash would have happened a few requests earlier. It may miss the latest few updates. But it will still be consistent. -- : Lars Ellenberg : LINBIT

Re: [DRBD-user] Node failure in a tripple primary setup

2018-03-07 Thread Lars Ellenberg
ctively used, as is the case with live migrating VMs. Which would not have to be that way, it could do with single primary even, by switching roles "at the right time"; but hypervisors do not implement it that way currently. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running

Re: [DRBD-user] Managing a 3-node DRBD 9 resource in pacemker

2018-02-13 Thread Lars Ellenberg
ike it used to be". The peer-disk state of the DR node as seen by drbdsetup may have some influence on the master-score calculations. That's a feature, not a bug ;-) (I think) -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® a

Re: [DRBD-user] Interesting issue with drbd 9 and fencing

2018-02-13 Thread Lars Ellenberg
still connect to you ;-) > Note: I down'ed the dr node (node 3) and repeated the test. This time, > the fence-handler was invoked. So I assume that DRBD did route through > the third node. Impressive! Yes, "sort of". > So, is the Protocol C between 1 <-> 2 maintained, wh

Re: [DRBD-user] secundary not finish synchronizing [actually: automatic data loss by dual-primary, no-fencing, no cluster manager, and automatic after-split-brain recovery policy]

2018-02-09 Thread Lars Ellenberg
On Thu, Feb 08, 2018 at 02:52:10PM -0600, Ricky Gutierrez wrote: > 2018-02-08 7:28 GMT-06:00 Lars Ellenberg <lars.ellenb...@linbit.com>: > > And your config is? > > resource zimbradrbd { > allow-two-primaries; Why dual primary? I doubt you really need that.

Re: [DRBD-user] secundary not finish synchronizing

2018-02-08 Thread Lars Ellenberg
a loss. So if you don't mean that, don't do it. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are registered trademarks of LINBIT __ please don't Cc me, but send t

Re: [DRBD-user] Problem on drbdadm up r0

2018-02-01 Thread Lars Ellenberg
e on both server. > Anyone a idea? We have also a ticket at HGST and they tried also a lot. If you want, contact LINBIT, we should be able to help you get this all set up in a sane way. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker

Re: [DRBD-user] drbd9 default rate-limit

2017-11-02 Thread Lars Ellenberg
o tell it to try and be more or less aggressive wrt. the ongoing "application" IO that is concurrently undergoing live replication, because both obviously share the network bandwidth, as well as bandwidth and IOPS of the storage backends. These knobs, and their defaults, are documented in th

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-17 Thread Lars Ellenberg
> Most importantly: once the trimtester (or *any* "corruption detecting" > tool) claims that a certain corruption is found, you look at what supposedly is > corrupt, and double check if it in fact is. > > Before doing anything else. > I did that, but I don't know what a "good" file is supposed to

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-17 Thread Lars Ellenberg
s found, you look at what supposedly is corrupt, and double check if it in fact is. Before doing anything else. Double check if the tool would still claim corruption exists, even if you cannot see that corruption with other tools. If so, find out why that tool does that, because that'd be clearly

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-16 Thread Lars Ellenberg
I seriously doubt, but I am biased), there may be something else going on still... -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are registered trademarks of LINBIT __ please don't Cc me, but send to list

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-14 Thread Lars Ellenberg
ster-is-broken o=trimtester-is-broken/x1 echo X > $o l=$o for i in `seq 2 32`; do o=trimtester-is-broken/x$i; cat $l $l > $o ; rm -f $l; l=$o; done ./TrimTester trimtester-is-broken Wahwahwa Corrupted file found: trimteste
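Re-wrapped for readability (the command creating the trimtester-is-broken directory is cut off in the archive; mkdir is an assumption):
    mkdir trimtester-is-broken
    o=trimtester-is-broken/x1
    echo X > $o
    l=$o
    for i in `seq 2 32`; do o=trimtester-is-broken/x$i; cat $l $l > $o; rm -f $l; l=$o; done
    ./TrimTester trimtester-is-broken
    # => reports "Corrupted file found" although nothing is actually corrupt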

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-13 Thread Lars Ellenberg
evice see which request) may be changed. Maximum request size may be changed. Maximum *discard* request size *will* be changed, which may result in differently split discard requests on the backend stack. Also, we have additional memory allocations for DRBD meta data and housekeeping, so possibly dif

Re: [DRBD-user] Clarification on barriers vs flushes

2017-10-03 Thread Lars Ellenberg
the linux kernel high level block device api used the term "back then" (BIO_RW_BARRIER), do no longer exist in today's Linux kernels. That however does not mean we could drop the config keyword, nor that we can drop the functionality there yet, until all "old" kernels are really

Re: [DRBD-user] Warning: Data Corruption Issue Discovered in DRBD 8.4 and 9.0

2017-10-03 Thread Lars Ellenberg
oved wrong. To gather a few more data points, does the behavior on DRBD change, if you disk { disable-write-same; } # introduced only with drbd 8.4.10 or if you set disk { al-updates no; } # affects timing, among other things Can you reproduce with other backend devices? -- : Lars Elle
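Where those two knobs go (sketch; resource name is a placeholder):
    resource r0 {
        disk {
            disable-write-same;   # available only from drbd 8.4.10 on
            al-updates no;        # affects timing, among other things
        }
    }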

Re: [DRBD-user] ERROR: meta parameter misconfigured, expected clone-max -le 2, but found unset

2017-10-02 Thread Lars Ellenberg
ix the problem? Don't put a "primitive" DRBD definition live without the corresponding "ms" definition. If you need to, populate a "shadow" cib first, and only commit that to "live" once it is fully populated. -- : Lars Ellenberg : LINBIT | Keeping the Digital Worl

Re: [DRBD-user] Problem updating 8.3.16 to 8.4.10 -- actually problem while *downgrading* (no valid meta-data signature found)

2017-09-25 Thread Lars Ellenberg
u need to "convert" the 8.4 back to the 8.3 magic, using the 8.4 compatible drbdmeta tool, because, well, unsurprisingly the 8.3 drbdmeta tool does not know the 8.4 magics. So if you intend to downgrade to 8.3 from 8.4, while you still have the 8.4 tools installed, do: "drbdadm down al

Re: [DRBD-user] Question reg. protocol C

2017-09-11 Thread Lars Ellenberg
ach from it. Now it is no longer there. > but D1 has failed after a D2 > failure, Too bad, now we have no data anymore. > but before D2 has recovered. What is the behavior of DRBD in such > a case? Are all future disk writes blocked until both D1 and D2 are > available, and are co

Re: [DRBD-user] dead slow replication

2017-09-04 Thread Lars Ellenberg
te, c-max-rate, possibly send and receive buffer sizes. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are registered trademarks of LINBIT __ please don't Cc me, but send to list -- I'm subscribed _

Re: [DRBD-user] [DRBD] DRBD resource works until fence is used

2017-09-04 Thread Lars Ellenberg
t; directive. (Followed instructions from DRBD 9 manual). > > Any info that might help in fixing this is welcome. With DRBD 9, you want to use "fence-peer crm-fence-peer.9.sh" and "unfence-peer crm-unfence-peer.9.sh" (mind the .9.) -- : Lars Ellenberg : LINBIT |
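In the resource file that amounts to something like this (sketch; /usr/lib/drbd/ is the usual install location of those scripts from drbd-utils, adjust if yours puts them elsewhere):
    resource r0 {
        handlers {
            fence-peer   "/usr/lib/drbd/crm-fence-peer.9.sh";
            unfence-peer "/usr/lib/drbd/crm-unfence-peer.9.sh";
        }
    }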

Re: [DRBD-user] 9.0.9rc1-1 crashes system

2017-08-21 Thread Lars Ellenberg
there is anything relating to "out of memory" in there. Can you still reproduce? If so, can you capture the kernel logs during the "crash" somehow? -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are

Re: [DRBD-user] Response to Mr. Ellenberg's answer to: "Warning: If using drbd; data-loss / corruption is possible; [...]"

2017-08-17 Thread Lars Ellenberg
) them. But when using it "in spec", they don't trigger (or we'd had a lot of angry customers). That being said, again, > > If you want snapshot shipping, > > use a system designed for snapshot shipping. > > DRBD is not. Cheers, -- : Lars Ellenberg : LINBI

Re: [DRBD-user] Warning: If using drbd; data-loss / corruption is possible; PLEASE FIX IT !

2017-08-16 Thread Lars Ellenberg
nsistent version of the data before becoming inconsistent to mitigate that. Still, "constantly" cycling between Connected While not idle for long enough Ahead/Behind, SyncSource/SyncTarget is a bad idea. If you want snapshot shipping, use a system designed for snapshot

Re: [DRBD-user] drbd 8.4.6-5 oops when disconnect

2017-08-10 Thread Lars Ellenberg
.4.6-5. > > call stack: > <4>[66071017.155051] Modules linked in: softdog drbd(FN) What did you need to force the module for? Probably *that* is your problem right there. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemake

Re: [DRBD-user] 9.0.9rc1-1 crashes system

2017-08-09 Thread Lars Ellenberg
ide any additional info that is needed if you let me know > what is required. IO stack (e.g. lsblk ; lsblk -t ; lsblk -D) may be interesting, as well as the drbd configuration (drbdadm dump ), the kernel logs around the "crash" if possible, and your "best guess" as to what
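One way to collect all of that in one go (sketch; output file names are placeholders):
    { lsblk; lsblk -t; lsblk -D; drbdadm dump; } > drbd-report.txt 2>&1
    journalctl -k -b > kernel-log.txt   # kernel log of the current boot, if the node survived the "crash"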

Re: [DRBD-user] DRBD and TRIM -- Slow! -- RESOLVED

2017-08-07 Thread Lars Ellenberg
nd, which means you hit some other bottleneck much earlier (the discard bandwidth of the backing storage...) Note that DRBD 9.0.8 still has a problem with discards larger than 4 MB, though (will hit protocol error, disconnect, and reconnect). That is already fixed in git, 9.0.9rc1 has that fixed. (8.4.10 a

Re: [DRBD-user] Dual primary and LVM

2017-08-07 Thread Lars Ellenberg
On Thu, Jul 27, 2017 at 10:11:48AM +0200, Gionatan Danti wrote: > To clarify: the main reason I am asking about the feasibility of a > dual-primary DRBD setup with LVs on top of it is about cache coherency. Let > me do a step back: the given explanation for deny even read access on a > secondary

Re: [DRBD-user] Getting TRIM to Work with DRBD

2017-07-24 Thread Lars Ellenberg
an other fix (unrelated to your scenario, related to the request size verification) even post 9.0.8. So yes, it *will* help. And no, you will not have any luck with 9.0.1. Not at all. And not only for discards. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbea

Re: [DRBD-user] Show Current Resync Rate

2017-07-24 Thread Lars Ellenberg
s Or mount debugfs, and find some information there. No, that is not supposed to exist, and may change at any time, and will not be documented, so don't rely on anything you may find there. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pa
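For completeness, the debugfs route is roughly (sketch; as stressed above, the layout is undocumented and may change at any time):
    mount -t debugfs none /sys/kernel/debug 2>/dev/null   # usually already mounted
    find /sys/kernel/debug/drbd -type f | head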

Re: [DRBD-user] crm-fence-peer.sh did not place the constraint!

2017-07-18 Thread Lars Ellenberg
A cluster, why not simply stay with 8.4. In that scenario, there is currently nothing to gain from 9, and since 8.4 can optimize some situations (it "knows" there can only be one peer), it will even give better performance sometimes. -- : Lars Ellenberg : LINBIT | Keeping the D

Re: [DRBD-user] DRBD9/PVE4: udev Link Creation timed out

2017-06-21 Thread Lars Ellenberg
ically, systemd udev here) thinks that executing that helper program took too long. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are registered trademarks of LINBIT __ please don't Cc me, but send to list -- I'm subs

Re: [DRBD-user] [drbd 9] system hang with huge number of ASSERTION failure in dmesg

2017-06-12 Thread Lars Ellenberg
On Fri, Jun 09, 2017 at 11:39:05PM +0800, David Lee wrote: > Hi, > > I am experimenting with DRBD dual-primary with OCFS 2, and DRBD client as > well. > With the hope that every node can access the storage in an unified way. > But I got a > kernel call trace and huge number of ASSERTION failure

Re: [DRBD-user] getting error after changing IP address

2017-05-23 Thread Lars Ellenberg
ould drbdadm complain about? -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are registered trademarks of LINBIT __ please don't Cc me, but send to list -- I'm subscribed _

Re: [DRBD-user] getting error after changing IP address

2017-05-23 Thread Lars Ellenberg
am using. > drbd.x86_64 8.2.7-3 installed Not suggesting that this would have anything to do with what you are seeing, but this is "ancient" (you knew that, of course). -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacema

Re: [DRBD-user] Dual-Primary DRBD node fenced after other node reboots UP

2017-05-12 Thread Lars Ellenberg
en. If it did attempt to do that and failed, you will have to look into why, which, again, should be in the logs. Double check constraints, and also double check if GFS2/DLM fencing is properly integrated with pacemaker. -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running :
