Re: [DRBD-user] Stacked DRBD device hangs

2014-12-10 Thread Lars Ellenberg
kernel: [668459.542394] 881017c57fd8 > 881017c57fd8 881017c57fd8 000137c0 > Dec 8 03:37:32 node2 kernel: [668459.565602] 8810197b4500 > 88100a612e00 881017c57a90 88207fcb4080 > Dec 8 03:37:32 node2 kernel: [668459.588736] Call Trace: > > Is thi

Re: [DRBD-user] weird behavior with metadata

2015-01-26 Thread Lars Ellenberg
a=commit;h=7969b9dc6636b5543d16b7b09ae267abc64ca111 | drbdmeta: fix regression corner cases in bitmap size calculation | | Regression was introduced with the kernel/userland split. | Regression was caused by underestimating the number of bytes by | not aligning the number of bits to 64 before converti
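
The class of bug described in that commit message can be illustrated with a small sketch. This is not the drbdmeta code; the function names are hypothetical, and the only assumption taken from the snippet is that the bitmap is stored in whole 64-bit words, so the bit count has to be rounded up to a multiple of 64 before it is converted to bytes.

```python
def bitmap_bytes_aligned(bits: int) -> int:
    """Bytes needed when the bitmap is stored in whole 64-bit words."""
    words = (bits + 63) // 64      # round the bit count up to a multiple of 64
    return words * 8               # 8 bytes per 64-bit word


def bitmap_bytes_unaligned(bits: int) -> int:
    """Naive conversion: round up to whole bytes only (underestimates)."""
    return (bits + 7) // 8


if __name__ == "__main__":
    for bits in (64, 65, 1000):
        print(bits, bitmap_bytes_unaligned(bits), bitmap_bytes_aligned(bits))
```

The two results only agree when the bit count happens to be a multiple of 64, which is why such a regression shows up in corner cases rather than everywhere.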

Re: [DRBD-user] DRBD I/O problems & corrupted data

2015-01-26 Thread Lars Ellenberg
On Fri, Jan 16, 2015 at 08:53:16PM +0100, Saso Slavicic wrote: > Hello, > > We have been using DRBD for about 4 years now, so I have some > experience with it. Today it was the first time that DRBD actually > caused data loss… > > We mostly use DRBD in cases where older hardware is reused (upgr

Re: [DRBD-user] split brain issues

2015-01-26 Thread Lars Ellenberg
"Surprise" > PS: I get Eric's post where he mention: "The split brain would only happen on > dual primary. " > So i changed to Primary/Secondary and stoped the HA in Proxmox. Most "HA" in "Proxmox" I came accross over the years is very much mis

Re: [DRBD-user] DRBD I/O problems & corrupted data

2015-01-27 Thread Lars Ellenberg
from time to time. It won't be "a bit out of sync". It will be *inconsistent*. That's a polite word for "corrupt by design". Which is why you should take a snapshot of your last consistent data, from your before-resync-target handler. Yes, that's supposed to b

Re: [DRBD-user] "adjust_master_score" attribute is ignored for pacemaker drbd resource

2015-01-29 Thread Lars Ellenberg
typo in your cib. Did you check the cib, or the logs? Did you check what environment is actually passed to the agent? -- : Lars Ellenberg : http://www.LINBIT.com | Your Way to High Availability : DRBD, Linux-HA and Pacemaker support and consulting ___ dr

Re: [DRBD-user] "adjust_master_score" attribute is ignored for pacemaker drbd resource

2015-01-30 Thread Lars Ellenberg
ebug/log log should soon contain "set -x" shell traces and stderr of the resource agent invocations, which should help you figure out what exactly happens.

Re: [DRBD-user] "adjust_master_score" attribute is ignored for pacemaker drbd resource

2015-01-31 Thread Lars Ellenberg
_score attribute won't accept a > negative number, however, if the RA is altered to have > adjust_master_score_default in the RA contains a negative number, there is > no issue. > > Is there a reason the resource agent doesn't want a negative value in > adjust_master_sc

Re: [DRBD-user] DRBD module crash with KVM avec LVM

2015-02-19 Thread Lars Ellenberg
Which is very old, and may well contain bugs/races, or even simply performance bottlenecks that cause you pain. Is your overall performance still ok? Is your raid controller (and its battery) healthy? Do you even have a raid controller + cache + bbu?

Re: [DRBD-user] WFReportParams/WFConnection following reboot of secondary node

2015-02-19 Thread Lars Ellenberg
DRBD "on-board" means do not suffice or block themselves, you can always try to get DRBD out of whatever network problem it thinks it is in by REJECTing its ports in and out with iptables... (and then remove those reject rules again, of course). Then try again. Unless DRBD is deadlock

Re: [DRBD-user] Question about using DRBD to do snapshots on AWS EBS volumes

2015-02-19 Thread Lars Ellenberg
pshot >umount /mnt/snapshot > >echo Remove the snapshot > lvremove -f /dev/important_volume/$SNAPSHOT_ID > >echo Reconnect us -- the secondary >drbdadm connect r0 > >echo Umount the backup disk >umount /mnt/backup For this, technically

Re: [DRBD-user] Question about using DRBD to do snapshots on AWS EBS volumes

2015-02-27 Thread Lars Ellenberg
On Fri, Feb 27, 2015 at 04:16:00PM +, Giles Thomas wrote: > Hi Lars, > > On 19/02/15 11:53, Lars Ellenberg wrote: > >On Fri, Feb 06, 2015 at 07:45:49PM +, Giles Thomas wrote: > >>Are we right in thinking that the purpose of using LVM [on the secondary > >

Re: [DRBD-user] DRBD offline resize problem

2015-03-21 Thread Lars Ellenberg
block, top level directory, all gone!) :-( Lars Ellenberg ___ drbd-user mailing list drbd-user@lists.linbit.com http://lists.linbit.com/mailman/listinfo/drbd-user

Re: [DRBD-user] DRBD offline resize problem

2015-03-27 Thread Lars Ellenberg
reby corrupting the first part > of the data, up to the size of the old bitmap area. > (embedded disk image partition table, file system super block, > top level directory, all gone!) > > > :-( > >Lars Ellenberg

Re: [DRBD-user] A question about DRBD-8.4.6rc1

2015-04-06 Thread Lars Ellenberg
the DRBD of started Step 5 to Primary. > > When using DRBD-8.4.5, it is not possible to promote the DRBD at Step 6. > > Is this behavior correct? Depends on your DRBD configuration, specifically the fencing policy, and the fence-peer-handler.

Re: [DRBD-user] drbd and ubuntu 14.04

2015-04-06 Thread Lars Ellenberg
ing you found in some Ubuntu repo a year ago) If that was not the question, please elaborate ;-)

Re: [DRBD-user] noobie question

2015-04-06 Thread Lars Ellenberg
SI or other protocols, but again Pacemaker would decide when and where to place the services, and access is typically via some dedicated service IP, placement of which is again controlled by Pacemaker.

Re: [DRBD-user] Broken rpm dependency check in drbd-km (v8.4.6)

2015-04-15 Thread Lars Ellenberg
bd". But thanks, the old spec file apparently is broken, we can fix that.

Re: [DRBD-user] A question about drbd hang in conn_disconnect

2015-04-16 Thread Lars Ellenberg
> Should we move set_bit(BITMAP_IO, &device->flags) to the front of > drbd_queue_work()? No. That would be the wrong fix, and cause potential inconsistencies later. It may need to be fixed, but in a different way. Let me (reproduce locally ... and) think about that for a bit. Thanks,

Re: [DRBD-user] DRBD issue with Ganeti 2.12.1 on Ubuntu 14.04 with DRBD version 8.4.3

2015-05-07 Thread Lars Ellenberg
> Is there anything else I can > check to try and figure out what is going on? Check the kernel logs.

Re: [DRBD-user] Dual Primary Mode: Shared Directory blocked after node crash until reboot

2015-05-13 Thread Lars Ellenberg
y cluster file system, regardless of whether you have an actually shared or a synchronously replicated storage backend. With DRBD, you *additionally* need "fencing resource-and-stonith" on the DRBD level.

Re: [DRBD-user] Error "1 corrupt AL transactions found" after DRBD upgrade

2015-05-13 Thread Lars Ellenberg
; at the moment. This should address all your concerns: http://git.drbd.org/gitweb.cgi?p=drbd-utils.git;a=commitdiff;h=7e63e763a86ea691c5c5eda5b3e6bd747d74b186 commit 7e63e763a86ea691c5c5eda5b3e6bd747d74b186 Author: Lars Ellenberg Date: Mon Sep 9 14:53:01 2013 +0200 drbdmeta apply-al: do

Re: [DRBD-user] Error "1 corrupt AL transactions found" after DRBD upgrade

2015-05-26 Thread Lars Ellenberg
1c5c5eda5b3e6bd747d74b186 > > Author: Lars Ellenberg > > Date: Mon Sep 9 14:53:01 2013 +0200 > > > > drbdmeta apply-al: don't scare users with words like "corrupt" > > > > When converting drbd 8.3 to 8.4 on-disk meta data, > > som

Re: [DRBD-user] Multi-threaded on-line verification

2015-05-26 Thread Lars Ellenberg
On Sun, May 24, 2015 at 08:33:13PM +0200, Ben RUBSON wrote: > > On 22 Apr 2014 at 14:40, Lars Ellenberg wrote: > > > > On Mon, Apr 21, 2014 at 11:11:09AM +0200, Ben RUBSON wrote: > >> Hello, > >> > >> Let's assume that we have a

Re: [DRBD-user] Can not attach drbd device

2015-05-26 Thread Lars Ellenberg
teps you need to bring up a DRBD from unconfigured to configured:
# drbdadm -d up mydata
List steps you need to bring a DRBD from "whatever" to "as described in drbd.conf":
# drbdadm -d adjust mydata
Actually do it:
# drbdadm adjust mydata

Re: [DRBD-user] [PATCH v4 08/11] block: kill merge_bvec_fn() completely

2015-05-26 Thread Lars Ellenberg
On Tue, May 26, 2015 at 04:18:24AM +, Klint Gore wrote: > Do these things belong in the users list or should they be in the dev list? They probably should be on -dev, but in upstream linux MAINTAINERS we have both -dev and -user addresses mentioned, and apparently the (semi-)automatic tools th

Re: [DRBD-user] add volume to resource

2015-06-08 Thread Lars Ellenberg
lume but the > create-md command only specifies the complete resource? drbdadm create-md <resource>/<volume> as in drbdadm create-md stuff/7

Re: [DRBD-user] Strange problem with output of /proc/drbd

2015-06-12 Thread Lars Ellenberg
>ns:0 nr:0 dw:0 dr:25250432 al:0 bm:0 lo:64 pe:6175 ua:129 ap:0 > >ep:1 wo:f oos:0 > >[=>..] verified: 72.0% (4016/14336)M finish: > >0:01:36 speed: 42,620 (42,604) want: 57,560 K/sec

Re: [DRBD-user] LVM under DRBD resource for NFS server

2015-06-25 Thread Lars Ellenberg
Considering that i don't need > to resize /dev/sda1, can I avoid the use of LVM? You can, but that does not make a difference, only limits you in flexibility. You should probably use LVM *instead* of partitioning. Lars Ellenberg

Re: [DRBD-user] Kernel Oops on peer when [removing LVM snapshot] receiving DISCARD/TRIM command

2015-06-25 Thread Lars Ellenberg
You are using some ubuntu 3.16.whatever kernel. That should be roughly equivalent to "drbd 8.4.4". That bug (secondary bombs out when receiving TRIM) is most likely already fixed, we would receive a flood of complaints otherwise. *IF* you can still reproduce with more recent

Re: [DRBD-user] Pacemaker linbit:drbd resource agent error: Failed actions:..not configured

2015-06-26 Thread Lars Ellenberg
a "primitive". See the DRBD User's Guide section on Pacemaker integration. > Jun 25 14:26:18 [1053] trebles-postgresqlcib: info: > cib_process_request:Completed cib_modify operation for section status: OK > (rc=0, origin=trebles/crmd/19, version=0.362.2)

Re: [DRBD-user] drbd_set_status_variables() outputs `This command will ignore resource names!'

2015-06-29 Thread Lars Ellenberg
"$($DRBDADM sh-status "$DRBD_RESOURCE")"
+ if $DRBD_HAS_MULTI_VOLUME ; then
+ eval "$($DRBDSETUP sh-status "$DRBD_RESOURCE")"
+ else
+ eval "$($DRBDSETUP "$DRBD_DEVICE" sh-status)"
+ fi

Re: [DRBD-user] drbd_set_status_variables() outputs `This command will ignore resource names!'

2015-06-29 Thread Lars Ellenberg
On Mon, Jun 29, 2015 at 12:40:33PM +0200, Lars Ellenberg wrote: > On Mon, Jun 29, 2015 at 05:14:10PM +0900, Hiroshi Fujishima wrote: > > > I notice the following error message. > > > > > > Jun 26 13:41:01 sac-tkh-sv001 pacemaker_remoted[32149]: not

Re: [DRBD-user] drbd_set_status_variables() outputs `This command will ignore resource names!'

2015-06-29 Thread Lars Ellenberg
On Mon, Jun 29, 2015 at 12:50:50PM +0200, Lars Ellenberg wrote: > On Mon, Jun 29, 2015 at 12:40:33PM +0200, Lars Ellenberg wrote: > > On Mon, Jun 29, 2015 at 05:14:10PM +0900, Hiroshi Fujishima wrote: > > > > I notice the following error message. > > > > >

Re: [DRBD-user] Pacemaker resource start failure in combination of drbd-utils-8.9.3 and drbd-8.4.6

2015-06-30 Thread Lars Ellenberg
sv002-res_drbd_r0_start_0:7 [ drbdadm: Unknown command 'syncer'\n ] You happen to have more than one DRBD module available?

Re: [DRBD-user] kmod-drbd-9.0.0 conflicts CentOS 7 kernel pacakge

2015-06-30 Thread Lars Ellenberg
release 7.1.1503 (Core) > # uname -r > 3.10.0-229.7.2.el7.x86_64 > # wget http://oss.linbit.com/drbd/9.0/drbd-9.0.0.tar.gz > # tar xf drbd-9.0.0.tar.gz > # cd drbd-9.0.0 > # make kmp-rpm Works for me.

Re: [DRBD-user] Pacemaker resource start failure in combination of drbd-utils-8.9.3 and drbd-8.4.6

2015-06-30 Thread Lars Ellenberg
On Tue, Jun 30, 2015 at 10:16:42AM +0200, Lars Ellenberg wrote: > On Tue, Jun 30, 2015 at 09:46:49AM +0900, Hiroshi Fujishima wrote: > > Hello > > > > In combination of drbd-utils-8.9.3 and drbd-8.4.6, the following > > command failed to start drbd resource. > >

Re: [DRBD-user] Pacemaker resource start failure in combination of drbd-utils-8.9.3 and drbd-8.4.6

2015-06-30 Thread Lars Ellenberg
On Tue, Jun 30, 2015 at 05:57:17PM +0900, Hiroshi Fujishima wrote: > >>>>> In <20150630082043.GH7381@soda.linbit> > >>>>> Lars Ellenberg wrote: > > On Tue, Jun 30, 2015 at 10:16:42AM +0200, Lars Ellenberg wrote: > > > On Tue,

Re: [DRBD-user] kmod-drbd-9.0.0 conflicts CentOS 7 kernel pacakge

2015-06-30 Thread Lars Ellenberg
On Tue, Jun 30, 2015 at 06:00:43PM +0900, Hiroshi Fujishima wrote: > >>>>> In <20150630081801.GG7381@soda.linbit> > >>>>> Lars Ellenberg wrote: > > > # cat /etc/redhat-release > > > CentOS Linux release 7.1.1503 (Core) > >

Re: [DRBD-user] Pacemaker resource start failure in combination of drbd-utils-8.9.3 and drbd-8.4.6

2015-06-30 Thread Lars Ellenberg
On Tue, Jun 30, 2015 at 11:14:01AM +0200, Lars Ellenberg wrote: > On Tue, Jun 30, 2015 at 05:57:17PM +0900, Hiroshi Fujishima wrote: > > >>>>> In <20150630082043.GH7381@soda.linbit> > > >>>>> Lars Ellenberg wrote: > > > On

Re: [DRBD-user] Pacemaker resource start failure in combination of drbd-utils-8.9.3 and drbd-8.4.6

2015-06-30 Thread Lars Ellenberg
On 30 June 2015 15:32:17 CEST, Andreas Mock wrote: >Hi Lars, > >looking at the patch I'm pretty sure that there is another >bug lurking around. Are you, now. You did read the other thread, and look into git? >I made a diff ... Thanks, Lars

Re: [DRBD-user] Pacemaker resource start failure in combination of drbd-utils-8.9.3 and drbd-8.4.6

2015-06-30 Thread Lars Ellenberg
On Tue, Jun 30, 2015 at 03:18:51PM +, Andreas Mock wrote: > Hi Lars, > > I haven't looked at git until now. I've found your > patch. Thank you for the pointer and sorry for the > noise. No problem, thanks for reporting, I'd rather have double reports than none at all. Cheers, Lars

Re: [DRBD-user] DRBD9 and RPM Install

2015-07-14 Thread Lars Ellenberg
On Tue, Jul 14, 2015 at 02:22:11PM +, Wieck, Owen wrote: > > It appears that you already have drbd-utils installed. You might want > to use "rpm -Uvh " instead? > > --OLW 24k in ms-tnef winmail.dat for the above ~100 byte? really? -> Owen, if it is not too much to ask, please configure yo

Re: [DRBD-user] How to rate-limit device cleanup (shred, dd)

2015-07-15 Thread Lars Ellenberg
he impact on overall system performance? You really want to avoid clobbering your precious cache, or even driving "idle" data pages out into swap. Use direct IO. Or limit total memory usage, including buffer cache pages, using cgroups. And use a rate limit (again, using cgroups, if yo
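
The reply recommends direct IO and cgroups for this. As a rough illustration of the rate-limiting idea only, here is a minimal userspace sketch that swaps in a different technique: it writes zeros in chunks, calls fsync after each chunk to bound the amount of dirty data, and sleeps so the average rate stays under a limit. It does not bypass the page cache the way direct IO would, and the function name and parameters are hypothetical.

```python
import os
import time


def throttled_zero_fill(path, total_bytes, rate_bytes_per_s, chunk=1 << 20):
    """Write total_bytes of zeros to path at no more than rate_bytes_per_s on average."""
    zeros = bytes(chunk)
    written = 0
    start = time.monotonic()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT)
    try:
        while written < total_bytes:
            want = min(chunk, total_bytes - written)
            written += os.write(fd, zeros[:want])
            # Push the chunk to stable storage so dirty pages do not pile up.
            os.fsync(fd)
            # Sleep until the average rate is back under the limit.
            earliest = start + written / rate_bytes_per_s
            delay = earliest - time.monotonic()
            if delay > 0:
                time.sleep(delay)
    finally:
        os.close(fd)
    return written
```

Pointing such a loop at a real block device is destructive, so any experiment with it belongs on a scratch file first.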

Re: [DRBD-user] How to rate-limit device cleanup (shred, dd)

2015-07-17 Thread Lars Ellenberg
On Thu, Jul 16, 2015 at 12:07:32PM +0200, Helmut Wollmersdorfer wrote: > > On 15.07.2015 at 14:49, Lars Ellenberg wrote: > > > On Wed, Jul 15, 2015 at 01:01:02PM +0200, Helmut Wollmersdorfer wrote: > > […] > > >> > >> This works nice for small devi

Re: [DRBD-user] help for recovering data (magic not found, no mirror available)

2015-07-21 Thread Lars Ellenberg
a completely, or at least do a full sync later. Or, of course, if you intend to "un-deploy" DRBD, and just want to get at the data. In this case, that would be /dev/vdc (maybe double check first that vdc is in fact the correct device). Depending on how exactly your system

Re: [DRBD-user] DRBD resyncs completely after each partitial sync

2015-07-21 Thread Lars Ellenberg
/proto:86-101) > And it is repeatable. > > Any idea what's going wrong here? See if 8.4.6 gives better results. I seem to remember a squashed bug with similar symptoms.

Re: [DRBD-user] DRBD9 and RPM Install

2015-07-23 Thread Lars Ellenberg
On Wed, Jul 22, 2015 at 03:42:38PM +0100, Phil Daws wrote: > So matched a custom kernel with the driver and now it complains about > not having drbd-utils >= 9.0.0 yet the 8.9 branch is backward > compatible ? > > [root@drs01 ~]# rpm -Uvh > drbd-km-3.10.0_229.el7.centos.uxbod.x86_64-9.0.0-1.x86_6

Re: [DRBD-user] Pacemaker cluster: drbdadm: Unknown command 'syncer'

2015-08-04 Thread Lars Ellenberg
t. Please use 8.9.3-2. If you built your own packages, rebuild. If you got them from some "vendor", complain to that vendor. If you got them from LINBIT (does not look like it; also I'm pretty sure we did not put out broken packages in customer visible repos), complain to us. Ch

Re: [DRBD-user] Bad DRBD Performance

2015-08-06 Thread Lars Ellenberg
000s > sys 0m0.492s > 1+0 records in > 1+0 records out > 1073741824 bytes (1,1 GB) copied, 25,4277 s, 42,2 MB/s > > real0m25.514s > user0m0.000s > sys 0m0.504s > 1+0 records in > 1+0 records out > 1073741824 bytes (1,1 GB)

Re: [DRBD-user] DRBD 8.3.6: Primary hangs for ~120 seconds

2015-08-06 Thread Lars Ellenberg
k about things.

Re: [DRBD-user] BAD! sector after sync

2015-08-06 Thread Lars Ellenberg
afaicr, this was just some unnecessarily loud warning about broken internal refcounts, will trigger a disconnect/reconnect cycle, which should reset said refcounts. While scary, this should usually not be a problem.

Re: [DRBD-user] BAD! BarrerACK #... received, #... expected

2015-08-06 Thread Lars Ellenberg
\ phil@fat-tyre\,\ > 2013-10-11\ > 16:42:48 DRBDADM_API_VERSION=1 DRBD_KERNEL_VERSION_CODE=0x080403 DRBDADM_VERSION_CODE=0x080404 DRBDADM_VERSION=8.4.4 Use a DRBD more recent than 2013? DRBD 8.4.6 maybe? Cheers,

Re: [DRBD-user] Kernel oopses with DRBD

2015-08-18 Thread Lars Ellenberg
On Fri, Aug 14, 2015 at 12:32:48PM +0200, Ben Siemerink wrote: > Hello, > > > Lately we have experienced five kernel oopses in our DRBD setup. The > stack trace is very similar every time. > > If you need more information, please let me know. Thank you in advance. > > > Kind regards, > Ben. >

Re: [DRBD-user] <1>error creating netlink socket - can't get drbd startet on ubuntu 14.04

2015-08-18 Thread Lars Ellenberg
e with DRBD >address 192.168.57.130:7789; # IP Address and port of Node2 >meta-disk internal; > } > } > > There is little to nothing to find on the net on how to solve this. Any > help is greatly appreciated. > ___ > drbd-user mailing list > drbd-u

Re: [DRBD-user] DRBD keeps doing full resync

2015-08-18 Thread Lars Ellenberg
> But to no avail. > > Currently I'm in process of downgrading the kernel to see what happens > but as it's a production server this will take little time (outage) :/ > > I'm also fairly new to DRBD so please point out anything obvious I may > be missing. >

Re: [DRBD-user] repeatable, infrequent, loss of data with DRBD

2015-08-22 Thread Lars Ellenberg
} > } > > The only interesting bit of global_common.conf is protocol C; and > allow-two-primaries; > > Regards, > > Matthew > > [0] the actual script is a bit fiddlier, as it has to deal with > systemd-udevd sometimes holding /dev/drbd7 open a bit longer t

Re: [DRBD-user] drbd.ocf misinterpreting role status with multiple volumes

2015-08-22 Thread Lars Ellenberg
On Tue, Aug 18, 2015 at 09:31:33PM +0200, Matthias Ferdinand wrote: > On Tue, Aug 18, 2015 at 06:22:10PM +0200, Lars Ellenberg wrote: Taking this back on the list. Sorry for accidentally taking it offlist before. > > > I think there is a conceptual bug in the DRBD OCF RA with

Re: [DRBD-user] drbd.ocf misinterpreting role status with multiple volumes

2015-08-25 Thread Lars Ellenberg
On Mon, Aug 24, 2015 at 10:59:52PM +0200, Matthias Ferdinand wrote: > On Sat, Aug 22, 2015 at 12:00:01PM +0200, drbd-user-requ...@lists.linbit.com > wrote: > > Date: Sat, 22 Aug 2015 11:29:37 +0200 > > From: Lars Ellenberg > > Subject: Re: [DRBD-user] drbd.ocf misinter

Re: [DRBD-user] repeatable, infrequent, loss of data with DRBD

2015-09-11 Thread Lars Ellenberg
On Thu, Sep 03, 2015 at 04:55:51PM +0100, Matthew Vernon wrote: > Hi, > > On 22/08/15 10:07, Lars Ellenberg wrote: > > Sorry for the delay - it took a while to sort out the necessary > debugging output &c. You have some strange effects in there. In the failed run, the

Re: [DRBD-user] Questions regarding the initial syncing of a drbd partition

2015-09-11 Thread Lars Ellenberg
") run an "fstrim -v" against the DRBD mount point on the SyncSource while the resync is running, that should usually speed it up (because it won't read data -> transfer data -> write data, but simply send "discard these LBAs" and then do so...)

Re: [DRBD-user] [Drbd-dev] [PATCH 21/39] drbd: drop null test before destroy functions

2015-09-17 Thread Lars Ellenberg
pool: allow NULL `pool' pointer in mempool_destroy() 3942d29 mm/slab_common: allow NULL cache pointer in kmem_cache_destroy() (v4.3-rc1) Lars Ellenberg

Re: [DRBD-user] drbd-utils-8.9.4.tar.gz

2015-09-23 Thread Lars Ellenberg
On Sat, Sep 19, 2015 at 03:02:19PM +0200, Dietmar Maurer wrote: > > this drbd-utils release is triggered by a drbdmanage. Drbdmanage > > needs the ability to have multiple connection-mesh statements > > per resource. > > If you do not use drbdmanage, there is no pressing reason to > > upgrade to th

Re: [DRBD-user] TRIM/discard leads to secondary becoming diskless

2015-10-05 Thread Lars Ellenberg
> https://github.com/torvalds/linux/blob/v4.3-rc4/drivers/block/drbd/drbd_req.c#L651 > > > > I tried searching for this problem but couldn't find any mentions of > it. I can't be the only one running DRBD on volumes that do not > support the TRIM/discard operation

Re: [DRBD-user] Meaning of assertion in drbd?

2015-10-05 Thread Lars Ellenberg
sh/WO_drain_io case of receive_Barrier(). However, I'm > not totally clear on the implications. Could someone explain what > this means and how severe it is? It means you use a 3.5 year old slightly broken DRBD version. Don't.

Re: [DRBD-user] DRBD 8.3.12 - Cluster Interface discarded [Lantone CRM #938849]

2015-10-08 Thread Lars Ellenberg
s certainly a good idea, it won't fix your "cluster IP getting discarded" issue. The problem is with your HA setup.

Re: [DRBD-user] DRBD metadata issue + replacing a disk

2015-10-15 Thread Lars Ellenberg
tem corruption in case of > system crash. > > What is the appropriate action to fix this issue Did you even google for the message? "JBD: Spotted dirty metadata buffer" turns up several bugzillas from 2010 and 2012, suggesting kernel upgrade would help.

Re: [DRBD-user] DRBD metadata issue + replacing a disk

2015-10-15 Thread Lars Ellenberg
> > > drbd1, blocknr = 0). There's a risk of filesystem corruption in case of > > > system crash. > > > > > > What is the appropriate action to fix this issue > > > > Did you even google for the message? > > "JBD: Spotted dirty metadata

Re: [DRBD-user] Protocol A vs Protocol C performance issue?

2015-10-20 Thread Lars Ellenberg
es request size by request completion latency. dd has no concurrency; in the given example, request size is 1M, so if you get 450 MB/s, your latency is apparently on the order of 2.3 ms. If you want more throughput, you need to decrease latency, or increase concurrency.
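
The arithmetic behind that estimate can be checked with a short sketch. The assumptions here are mine: "1M" is taken as 1 MiB and "450 MB/s" as 450 * 10^6 bytes per second, and the helper names are hypothetical. With a fixed number of requests in flight, throughput is roughly in_flight * request_size / latency.

```python
def latency_s(request_bytes, throughput_bytes_per_s, in_flight=1):
    """Per-request completion latency implied by an observed throughput."""
    return in_flight * request_bytes / throughput_bytes_per_s


def throughput_bytes_per_s(request_bytes, latency_seconds, in_flight=1):
    """Throughput achievable at a given latency and concurrency."""
    return in_flight * request_bytes / latency_seconds


if __name__ == "__main__":
    lat = latency_s(1 << 20, 450e6)            # single dd, no concurrency
    print(f"latency: {lat * 1000:.2f} ms")
    # Same latency, four requests in flight: about four times the throughput.
    print(f"{throughput_bytes_per_s(1 << 20, lat, in_flight=4) / 1e6:.0f} MB/s")
```

This is why the reply offers exactly two levers: reduce the per-request latency, or keep more requests in flight.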

Re: [DRBD-user] Protocol A vs Protocol C performance issue?

2015-10-20 Thread Lars Ellenberg
than one dd, concurrently ;-) Decrease latency? Check if it's your disk or network or both, and if possible, make it complete stuff faster (without losing it in volatile caches in the event of a hard crash).

Re: [DRBD-user] Error using SDP in DRBD8.4

2015-10-21 Thread Lars Ellenberg
. Last time I checked, it "worked for me". > Oct 17 11:36:54 s2 multipathd: drbd0: remove path (uevent) Multipath on top of DRBD? really?

Re: [DRBD-user] Error using SDP in DRBD8.4

2015-10-22 Thread Lars Ellenberg
I checked, it "worked for me". > > It works with drbd 8.3. How can i continue in the debug process? Is there any > loglevel i can increase in drbd kernel module? So, it *still* works with drbd 8.3 on the upgraded kernel, and the only variable is the DRBD version? Or it *used* to wo

Re: [DRBD-user] How efficient is DRBD during Sync?

2015-10-22 Thread Lars Ellenberg
tes is total capacity minus capacity used by resync. But yes, any write during resync to not-yet-synced, "dirty" blocks will bring those blocks in sync as well. As an extreme example, if you'd just zero-out the full device during resync, you'd bring it in-sync by applicat

Re: [DRBD-user] Protocol A vs Protocol C performance issue?

2015-10-22 Thread Lars Ellenberg
s writes with protocol C over 10Gbps Ethernet > (ConnectX-3 in ethernet or IPoIB mode) with IO depth >=32 and IO > size 512k (I use disktest from LTP project). That is what 6 Intel DC > S3610 can provide in HW RAID-10 with disk cache turned on. > > And, of course, Lars is ab

Re: [DRBD-user] DRBD Dual Primary (writable/writeble) setup over VDSL WAN links

2015-10-27 Thread Lars Ellenberg
ly increase reliability or availability ;-) Depending on the exact requirements and goals, there are other methods to synchronize data between branch-offices or similar. But "cluster file systems" (as in GFS2 or OCFS2) are made for environments with reliable low-latency LAN (and r

Re: [DRBD-user] Error using SDP in DRBD8.4

2015-10-28 Thread Lars Ellenberg
On Fri, Oct 23, 2015 at 09:49:37AM +0100, Nuno Fernandes wrote: > On Thursday, 22 October 2015 16:47:53, Lars Ellenberg wrote: > > On Wed, Oct 21, 2015 at 05:00:56PM +0100, Nuno Fernandes wrote: kernel: drbd infiniband: conn( Unconnected -> WFConnection ) kernel: drb

Re: [DRBD-user] Determining whether a resource is in dual-primary mode

2015-11-02 Thread Lars Ellenberg
> > So I was unable to find a flag in /proc/drbd that indicates whether > allow-dual-primary is active or not. What's wrong with # drbdsetup XYZ show --show-defaults

Re: [DRBD-user] Determining whether a resource is in dual-primary mode

2015-11-02 Thread Lars Ellenberg
On Mon, Nov 02, 2015 at 04:01:00PM +0100, Veit Wahlich wrote: > On Monday, 02.11.2015, 15:53 +0100, Lars Ellenberg wrote: > > What's wrong with > > # drbdsetup XYZ show --show-defaults > > That is exactly what I was looking for, thank you! > > Just a little

Re: [DRBD-user] [PATCH] tree wide: Use kvfree() than conditional kfree()/vfree()

2015-11-10 Thread Lars Ellenberg
On Mon, Nov 09, 2015 at 08:56:10PM +0900, Tetsuo Handa wrote: > There are many locations that do > > if (memory_was_allocated_by_vmalloc) > vfree(ptr); > else > kfree(ptr); > > but kvfree() can handle both kmalloc()ed memory and vmalloc()ed memory > using is_vmalloc_addr(). Unless cal

Re: [DRBD-user] drbd-utils 8.9.4 checksum changed

2015-11-16 Thread Lars Ellenberg
we regenerated the uploaded tarball to this git id. The git hash is embedded in the tarball inside drbd-utils-8.9.4/user/shared/drbd_buildtag.c:
zgrep -a 'return "GIT-hash:' drbd-utils-8.9.4.tar.gz
return "GIT-hash: 337672266dabdd4bc8d9bced0aff28eb620d006d"
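
For scripted verification, the same check as the zgrep call can be sketched in Python. This is only an illustration: the function name is hypothetical, and it assumes the tarball is gzip-compressed and contains the build tag as a literal 40-character hex string, as the zgrep output in the snippet suggests.

```python
import gzip
import re

# Matches the line that zgrep prints in the snippet.
GIT_HASH_RE = re.compile(rb'return "GIT-hash: ([0-9a-f]{40})')


def git_hash_from_tarball(path):
    """Return the embedded GIT-hash string, or None if no match is found."""
    with gzip.open(path, "rb") as f:
        data = f.read()            # decompressed tar stream, searched as raw bytes
    match = GIT_HASH_RE.search(data)
    return match.group(1).decode("ascii") if match else None
```

Comparing the returned value against the git id announced on the list gives a checksum-independent way to confirm which source tree a tarball was built from.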

Re: [DRBD-user] SDP Connection problems with DRBD 8.4.6

2015-11-19 Thread Lars Ellenberg
rr = kernel_accept(ad->s_listen, &s_estab, O_NONBLOCK);
+ err = kernel_accept(ad->s_listen, &s_estab, 0);
And hope that this does not block forever, and maybe even works similar to how it used to work with drbd 8.3. At least that should give you a starting point...

Re: [DRBD-user] SDP Connection problems with DRBD 8.4.6

2015-11-19 Thread Lars Ellenberg
ust like everybody else ;-) > If IPv4 is just flat out more supported and more stable then I will > just use that and tune it the best I can. IP (v4 or 6, does not matter) is what everybody is using. So yes.

Re: [DRBD-user] DRBD and XFS filesystem on lvm corruption

2015-11-23 Thread Lars Ellenberg
ou are already desperate to get at the data, even just for "recovery": many times you still can do a read-only mount ("mount -o ro"). > please let me know if you want me to check anything. > My DRBD version is: 8.4.4 > > i have other non drbd xfs volumes

Re: [DRBD-user] DRBD terrible sync performance on 10GigE

2015-12-04 Thread Lars Ellenberg
minimum setting that still reaches that rate. And finally, while checking application request latency/responsiveness, tune c-min-rate to the maximum that still allows for acceptable responsiveness. You may need to adjust max-buffers and/or tcp send/receive buffer sizes as well.

Re: [DRBD-user] DRBD spontaneously loses connection

2015-12-10 Thread Lars Ellenberg
ly try to run dual-primary without any fencing at all. Without fencing, you get to keep the pieces. Without fencing, any replication link hiccup results in data divergence, aka a "split brain" scenario. You really want to avoid that.

Re: [DRBD-user] requests get stuck in secondary

2015-12-10 Thread Lars Ellenberg
before the upgrade this > exact same setup worked flawlessly... How can I debug this further? > > > > Current versions: > > version: 8.4.3 (api:1/proto:86-101) You may want to upgrade. I seem to remember potential false hits for the "timeout" detection. But maybe i

Re: [DRBD-user] ionice/schedulers/drdb

2015-12-10 Thread Lars Ellenberg
tuned CFQ may be beneficial. Tuning IO schedulers can be a challenge, especially if you are already maxing out your IO backend or other system resources.

Re: [DRBD-user] DRBD terrible sync performance on 10GigE

2015-12-11 Thread Lars Ellenberg
On Fri, Dec 11, 2015 at 09:04:28AM +0100, Harald Dunkel wrote: > Hi Lars, > > On 12/04/2015 05:04 PM, Lars Ellenberg wrote: > > > > You are not supposed to disable the resync controller, > > you are supposed to correctly use it. > > > > https://blogs.linbi

Re: [DRBD-user] mke2fs issues an unused block warning when creating drbd0

2015-12-11 Thread Lars Ellenberg
n 455 * 32768 blocks of 4kB each, leaving 188kB or 47 blocks unused. Try with a loop device of specific size, or with some other partition of "odd" size, and you'll get similar results. Nothing to do with DRBD. -- : Lars Ellenberg : http://www.LINBIT.com | Your Way to High Availab
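The arithmetic in that reply can be checked with a small sketch. The device size used here (455 * 32768 + 47 blocks) is an assumption inferred from the figures quoted above, and the function name is made up for illustration. The sketch only shows the division into block groups; it does not model how mke2fs decides whether to drop a short trailing group.

```python
BLOCKS_PER_GROUP = 32768   # blocks per ext block group, as in the quoted figures
BLOCK_SIZE_KB = 4          # 4 kB filesystem blocks


def split_into_groups(total_blocks, blocks_per_group=BLOCKS_PER_GROUP,
                      block_size_kb=BLOCK_SIZE_KB):
    """Return (full_groups, leftover_blocks, leftover_kb) for a device.

    Hypothetical helper: plain integer division, nothing mke2fs-specific.
    """
    full_groups, leftover_blocks = divmod(total_blocks, blocks_per_group)
    return full_groups, leftover_blocks, leftover_blocks * block_size_kb


# Assumed device size, reconstructed from the numbers in the message.
device_blocks = 455 * 32768 + 47
print(split_into_groups(device_blocks))
```

Any block device whose size is not a multiple of the group size gives a non-zero remainder here, which is why a loop device or a partition of "odd" size shows the same warning.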

Re: [DRBD-user] issue with ipv6

2015-12-14 Thread Lars Ellenberg
On Fri, Dec 11, 2015 at 06:16:48PM +, Matthew Vernon wrote: > On 04/12/15 00:45, Judit Flo Gaya wrote: > > > # drbdadm adjust all > > drbdsetup show drbd17:9: Parse error: 'TK_IPADDR6' expected, > > but got 'ipv6' (TK 315) Yes, sorry for that :-( Should be fixed now: http://git.linbit.co

Re: [DRBD-user] DRBD Verify Log Message Out Of Sync Blocks Units

2015-12-21 Thread Lars Ellenberg
aybe simply "(sectors=512 Byte) / (ext4 block size=4096 Byte)"? > > ... block drbd7: Out of sync: start=22378360, size=8 (sectors) That's linear block address sectors (512 byte), so if the first byte is byte number 0, this is the 4k block from byte offset 512 * 22378360 (e
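As a worked example of the unit conversion described above, here is a minimal sketch that turns the start/size values from the log line into byte offsets and a 4k block index. The function name is hypothetical. The block index only corresponds to a filesystem block number if the filesystem starts at byte 0 of the DRBD device, which is an assumption.

```python
SECTOR_SIZE = 512  # bytes; the log line reports start and size in 512-byte sectors


def out_of_sync_range(start_sector, size_sectors, block_size=4096):
    """Return (first_byte, last_byte, first_block, last_block).

    Bytes and blocks are counted from 0, and the block numbers are relative
    to the start of the device.
    """
    first_byte = start_sector * SECTOR_SIZE
    last_byte = first_byte + size_sectors * SECTOR_SIZE - 1
    return (first_byte, last_byte,
            first_byte // block_size, last_byte // block_size)


# Values taken from the quoted log line: start=22378360, size=8 (sectors)
print(out_of_sync_range(22378360, 8))
```

With size=8 sectors the range is exactly 4096 bytes long, so it covers a single 4k block.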

Re: [DRBD-user] DRBD Verify Log Message Out Of Sync Blocks Units

2015-12-24 Thread Lars Ellenberg
t linear block addresses? DRBD is "transparent", no offset or other transformation added. -- : Lars Ellenberg : http://www.LINBIT.com | Your Way to High Availability : DRBD, Linux-HA and Pacemaker support and consulting DRBD® and LINBIT® are registered trademarks of LINBIT, Austria. __ plea

Re: [DRBD-user] writer order on secondary site

2015-12-24 Thread Lars Ellenberg
n) to make sure anything from a previous epoch is on stable storage before starting to submit for the next epoch. -- : Lars Ellenberg : http://www.LINBIT.com | Your Way to High Availability : DRBD, Linux-HA and Pacemaker support and consulting DRBD® and LINBIT® are registered trademarks of
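The epoch rule in that reply can be illustrated with a toy model. This is a sketch only, not DRBD code: `BARRIER`, `submit` and `flush` are hypothetical stand-ins for the on-the-wire barrier and for whatever drain/flush/barrier method the receiving side is configured to use.

```python
BARRIER = object()  # hypothetical marker separating two epochs in the stream


def apply_replicated_stream(stream, submit, flush):
    """Apply writes epoch by epoch.

    Writes inside one epoch are simply submitted. When a barrier arrives,
    everything submitted so far is flushed to stable storage before any
    write of the next epoch is submitted.
    """
    writes_in_epoch = 0
    for item in stream:
        if item is BARRIER:
            if writes_in_epoch:
                flush()
                writes_in_epoch = 0
        else:
            submit(item)
            writes_in_epoch += 1


log = []
apply_replicated_stream(
    ["a", "b", BARRIER, "c"],
    submit=lambda w: log.append(("write", w)),
    flush=lambda: log.append(("flush",)),
)
print(log)
```

In this model the backend may reorder "a" and "b" relative to each other, but "c" is never submitted before both are flushed.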

Re: [DRBD-user] About promoting secondary node

2015-12-24 Thread Lars Ellenberg
','Consistent'] DRBD will allow promotion, if it is UpToDate, or has access to UpToDate data on some peer, or is Consistent, and can ensure via call to the fence-peer handler that it actually is UpToDate. Or if you --force it to consider whatever data it has access to as UpToDate. -- : La
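The promotion rule stated above can be written down as a small decision function. This is a sketch of the rule as worded in the message, not DRBD's implementation. The function and parameter names are hypothetical, and `peer_disks` is assumed to list only the disk states of peers that are currently reachable.

```python
def may_promote(local_disk, peer_disks=(), fence_peer_ok=False, force=False):
    """Return True if promotion would be allowed under the rule as described.

    local_disk    -- local disk state, e.g. "UpToDate" or "Consistent"
    peer_disks    -- disk states of reachable peers (assumption)
    fence_peer_ok -- True if the fence-peer handler confirmed the peer is fenced
    force         -- models promoting with --force
    """
    if force:
        return True                      # treat whatever data is accessible as UpToDate
    if local_disk == "UpToDate":
        return True
    if "UpToDate" in peer_disks:
        return True                      # UpToDate data is accessible via a peer
    if local_disk == "Consistent" and fence_peer_ok:
        return True                      # fencing confirms the data actually is UpToDate
    return False


print(may_promote("Consistent"))                      # no peer, no fencing
print(may_promote("Consistent", fence_peer_ok=True))  # fencing succeeded
```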

Re: [DRBD-user] writer order on secondary site

2016-01-08 Thread Lars Ellenberg
ring is allowed. > > "on the wire", epochs are separated by "DRBD barriers". > > Receiving side uses drain/flush/barrier (depending on configuration) > > to make sure anything from a previous epoch is on stable storage > > before starting to submi

Re: [DRBD-user] drbd lost lvm configuration

2016-01-19 Thread Lars Ellenberg
te of this volumegroup. > > So I did: > > drbdadm primary > vgchange -a y > > When I do a pvscan after changing the primary node, no pv's were found > and the same, by changing the primary again. stale lvm meta data cache (daemon)? try a pvscan --cache, see the man

Re: [DRBD-user] Trouble with connecting / starting drbd.

2016-01-19 Thread Lars Ellenberg
om drbd_r_r0 [3400]) > > It would reconnect, sync, and disconnect again. I stopped the node, > checked the hardware (all seems fine), rebooted and tried to start drbd > again: > > root@data2:/var/log# drbdadm connect r0 > r0: Failure: (158) Unknown resource As you probably f

Re: [DRBD-user] Automatic recovery from "unrelated data"?

2016-01-19 Thread Lars Ellenberg
s without re-initialization? And if that is > the only way out, how could we detect the situation programmatically? > (apart from parsing kernel logs) > > TIA, > > --Timo -- : Lars Ellenberg : http://www.LINBIT.com | Your Way to High Availability : DRBD, Linux-HA and Pacema

Re: [DRBD-user] Kernel panic with DRBD 9.0 on Kernel 4.2.6 "LOGIC BUG for enr=x"

2016-01-19 Thread Lars Ellenberg
40960k; # bytes/second > } > } > } > } > > Shortly after the tg3 watchdog trigger, it's probably a consequence of > the drbd kernel panic but maybe not ? > > See here: https://pastebin.synalabs.hosting/#cI5nWLuuD37_yN6ii8RLtg >

Re: [DRBD-user] drbd lost lvm configuration

2016-01-19 Thread Lars Ellenberg
c0d0p4 > Using duplicate PV /dev/drbd0 from subsystem DRBD, ignoring > /dev/cciss/c0d0p4 You should have filtered out the cciss path. also check the "global_filter" in lvm.conf, not just the filter, and double check that your initramfs knows about lvm device filters as well. > E
