Re: [DRBD-user] [linux-lvm] LVM on top of DRBD [actually: mkfs.ext4 then mount results in detach on RHEL 7 on VMWare]
Hi all,

sorry to be so stubborn- there is still no real explanation for the behaviour. I ran some tests in the meantime. I created the DRBD device and set up an LV on top:

- Using xfs instead of ext4 --> runs fine.
- On CentOS6: mkfs.ext4- no matter on which host I mount it the first time --> runs fine.
- On CentOS7: mkfs.ext4- mounted on CentOS6 --> runs fine.
- On CentOS7: mkfs.ext4- mounted on CentOS7 --> disk detached.

Then I skipped the LVM layer in between:

- On CentOS7: mkfs.ext4- mounted on CentOS7 --> runs fine (detached with LVM!).

If this is related to the lazy writes, it appears that LVM advertises different capabilities to mkfs than DRBD does.

Lars wrote:

> What really happens is that the file system code calls blkdev_issue_zeroout(),
> which will try discard, if discard is available and discard zeroes data,
> or, if discard (with discard zeroes data) is not available or returns failure,
> tries write-same with ZERO_PAGE,
> or, if write-same is not available or returns failure,
> tries __blkdev_issue_zeroout() (which uses "normal" writes).
>
> At least in "current upstream", probably very similar in your
> almost-3.10.something kernel.
>
> DRBD sits in between, sees the failure return of write-same,
> and handles it by detaching.

So blkdev_issue_zeroout() is called, which tries the different methods in turn; DRBD sees the error on write-same (after discard failed or was not available) and detaches. Sounds reasonable.

But if I skip LVM, everything is fine. That means mkfs.ext4 either succeeds in using discard, or it uses "normal" writes without first trying discard and write-same. In the first case- why does it succeed with write-same (or discard?) when there is no LVM in between? In the second case- why does it not try the faster methods first? Does DRBD not offer these capabilities? And if so, why does LVM offer them when the underlying device does not?

Greetings

Christian

On 10.01.2017 10:42, Lars Ellenberg wrote:
> On Sat, Jan 07, 2017 at 11:16:09AM +0100, Christian Völker wrote:
>> Hi all,
>>
>> I have to cross-post to the LVM as well as the DRBD mailing list as I have
>> no clue where the issue is- if it's not a bug...
>>
>> I can not get LVM working on top of DRBD- I am getting I/O errors
>> followed by "diskless" state.
>
> For some reason, (some? not only?) VMWare virtual disks tend to pretend
> to support "write same", even if they fail such requests later.
>
> DRBD treats such a failed WRITE-SAME the same way as any other backend
> error, and by default detaches.
>
> mkfs.ext4 by default uses "lazy_itable_init" and "lazy_journal_init",
> which makes it complete faster, but delays initialization of some file
> system meta data areas until first mount, where some kernel daemon will
> zero out the relevant areas in the background.
>
> Older kernels (RHEL 6) and also older DRBD (8.3) are not affected,
> because they don't know about write-same.
>
> Workarounds exist:
>
> Don't use the "lazy" mkfs.
> During normal operation, write-same is usually not used.
>
> Or tell the system that the backend does not support write-same.
> Check the setting:
>   grep ^ /sys/block/*/device/scsi_disk/*/max_write_same_blocks
> Disable:
>   echo 0 | tee /sys/block/*/device/scsi_disk/*/max_write_same_blocks
>
> You then need to re-attach DRBD (drbdadm down all; drbdadm up all)
> to make it aware of this change.
>
> Fix:
>
> Well, we need to somehow add some ugly heuristic to better detect
> whether some backend really supports write-same. [*]
>
> Or, more likely, add an option to tell DRBD to ignore any pretend-only
> write-same support.
>
> Thanks,
>
> Lars
>
> [*] No, it is not as easy as "just ignore any IO error if it was a
> write-same request", because we try to "guarantee" that during normal
> operation, all replicas are in sync (within the limits defined by the
> replication protocol). If replicas fail in different ways, we can not do
> that (at least not without going through some sort of "recovery" first).
>
>> Steps to reproduce:
>>
>> Two machines.
>>
>> A: CentOS7 x64; elrepo-provided packages
>>    kmod-drbd84-8.4.9-1.el7.elrepo.x86_64
>>    drbd84-utils-8.9.8-1.el7.elrepo.x86_64
>>
>> B: CentOS6 x64; elrepo-provided packages
>>    kmod-drbd83-8.3.16-3.el6.elrepo.x86_64
>>    drbd83-utils-8.3.16-1.el6.elrepo.x86_64
>>
>> drbd1.res:
>>
>> resource drbd1 {
>>   protocol A;
>>   startup {
>>     wfc-timeout 240;
>>     degr-wfc-timeout 120;
>>     become-primary-on backuppc;
>>   }
>>   net {
>>     max-buffers 8000;
>>     max-epoch-size 8000;
>>     sndbuf-size 128k;
>>     shared-secret "13Lue=3";
>>   }
>>   syncer {
>>     rate 500M;
>>   }
>>   on backuppc {
>>     device /dev/drbd1;
>>     disk /dev/sdc;
>>     address 192.168.0.1:7790;
>>     meta-disk internal;
>>   }
>>   on drbd {
>>     device /dev/drbd1;
>>     disk /dev/sda;
>>     address 192.168.2.16:7790;
>>     meta-disk internal;
>>   }
>> }
>>
>> I was able to create the drbd as expected (see first line of following
>> syslog), and it gets in sync.
>> So I set up LVM and create filter rules so
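For reference, Lars's two workarounds condensed into commands- a minimal sketch, assuming /dev/drbd1 (or the LV on top of it) is the device being formatted; adjust the device paths to your setup:

  # Workaround 1: disable the lazy init, so mkfs zeroes the inode tables and
  # journal up front and the first mount never triggers the background zeroout.
  mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 /dev/drbd1

  # Workaround 2: tell the SCSI disk layer the backend cannot do write-same,
  # then re-attach so DRBD re-reads the capability.
  echo 0 | tee /sys/block/*/device/scsi_disk/*/max_write_same_blocks
  drbdadm down all; drbdadm up all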
[DRBD-user] Testing new DRBD9 dedicated repo for PVE
Stupid me, now all is working- maybe the resources had gone down during the upgrade? Thanks a lot,

Michele

> On 13/01/2017 19:58, Michele Rossetti wrote:
>> root@mpve1:~# drbdsetup status
>> .drbdctrl role:Secondary
>>   volume:0 disk:UpToDate
>>   volume:1 disk:UpToDate
>>   mpve2 role:Secondary
>>     volume:0 peer-disk:UpToDate
>>     volume:1 peer-disk:UpToDate
>>   mpve3 role:Primary
>>     volume:0 peer-disk:UpToDate
>>     volume:1 peer-disk:UpToDate
>
> This means that all your resources are down, only the control volumes are
> up... Have you tried a "drbdmanage restart"?
>
> rob
Re: [DRBD-user] Testing new DRBD9 dedicated repo for PVE
On 13/01/2017 19:58, Michele Rossetti wrote:
> root@mpve1:~# drbdsetup status
> .drbdctrl role:Secondary
>   volume:0 disk:UpToDate
>   volume:1 disk:UpToDate
>   mpve2 role:Secondary
>     volume:0 peer-disk:UpToDate
>     volume:1 peer-disk:UpToDate
>   mpve3 role:Primary
>     volume:0 peer-disk:UpToDate
>     volume:1 peer-disk:UpToDate

This means that all your resources are down, only the control volumes are up...

Have you tried a "drbdmanage restart"?

rob
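A sketch of the recovery rob suggests, plus a follow-up check; "drbdmanage restart" is taken from the thread, the rest are standard drbd-utils commands:

  # restart drbdmanage so it brings the managed data resources back up
  drbdmanage restart

  # afterwards the data resources, not only .drbdctrl, should be listed,
  # and the by-res device nodes should exist again
  drbdsetup status
  ls /dev/drbd/by-res/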
[DRBD-user] Testing new DRBD9 dedicated repo for PVE
After the upgrade (dist-upgrade to PVE 4.4 and drbdmanage-proxmox install), the KVM VMs no longer start at boot; they start only if they are on the primary node, and once started they don't migrate from the primary node to the secondary, even in HA.

The error messages from PVE are:

kvm: -drive file=/dev/drbd/by-res/vm-104-disk-1/0,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on: Could not open '/dev/drbd/by-res/vm-104-disk-1/0': No such file or directory

TASK ERROR: start failed: command '/usr/bin/kvm -id 104 -chardev 'socket,id=qmp,path=/var/run/qemu-server/104.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -pidfile /var/run/qemu-server/104.pid -daemonize -smbios 'type=1,uuid=72d4bc28-b877-413a-9750-e7bf97938abb' -name php4i386 -smp '1,sockets=1,cores=1,maxcpus=1' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vga cirrus -vnc unix:/var/run/qemu-server/104.vnc,x509,password -cpu kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,enforce -m 2048 -k it -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:af80fcb2976' -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' -drive 'file=/dev/drbd/by-res/vm-104-disk-1/0,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap104i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=6E:23:50:8A:35:50,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' -machine 'type=pc-i440fx-2.7' -incoming unix:/run/qemu-server/104.migrate -S' failed: exit code 1

And trying to migrate a started VM in HA:

task started by HA resource agent
Jan 13 19:13:47 starting migration of VM 104 to node 'mpve1' (82.xx.xx.xx)
Jan 13 19:13:47 copying disk images
Jan 13 19:13:47 starting VM 104 on remote node 'mpve1'
Jan 13 19:13:50 start failed: command '/usr/bin/kvm -id 104 [same kvm command line as above] -incoming unix:/run/qemu-server/104.migrate -S' failed: exit code 1
Jan 13 19:13:50 ERROR: online migrate failure - command '/usr/bin/ssh -o 'BatchMode=yes' r...@82.xx.xx.xx qm start 104 --skiplock --migratedfrom mpve3 --migration_type secure --stateuri unix --machine pc-i440fx-2.7' failed: exit code 255
Jan 13 19:13:50 aborting phase 2 - cleanup resources
Jan 13 19:13:50 migrate_cancel
Jan 13 19:13:51 ERROR: migration finished with problems (duration 00:00:04)
TASK ERROR: migration problems

It's true that '/dev/drbd/by-res/vm-104-disk-1/0' does not exist ("No such file or directory"). Trying to locate vm-104-disk, this is the output:

root@mpve1:/dev/drbd/by-disk/drbdpool# locate vm-104-disk
/dev/drbdpool/vm-104-disk-1_00
/var/lib/drbd.d/drbdmanage_vm-104-disk-1.res.q

Checking DRBD, all seems ok:

root@mpve1:~# drbd-overview
 0:.drbdctrl/0 Connected(3*) Seco(mpve2,mpve1)/Prim(mpve3) UpTo(mpve1)/UpTo(mpve3,mpve2)
 1:.drbdctrl/1 Connected(3*) Seco(mpve2,mpve1)/Prim(mpve3)
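For reference, a diagnostic sequence for this symptom- drbd-overview listing only .drbdctrl means the data resources themselves are down. A sketch, assuming the list-* subcommands of the drbdmanage CLI of that era; the resource name is taken from this thread:

  # what drbdmanage thinks it manages
  drbdmanage list-resources
  drbdmanage list-volumes

  # bring the managed resources back up
  drbdmanage restart

  # the by-res device node the VM config references should then reappear
  ls -l /dev/drbd/by-res/vm-104-disk-1/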
Re: [DRBD-user] [linux-lvm] LVM on top of DRBD [actually: mkfs.ext4 then mount results in detach on RHEL 7 on VMWare]
On Thu, Jan 12, 2017 at 06:00:53PM +0100, Lars Ellenberg wrote:
> On Wed, Jan 11, 2017 at 06:23:08PM +0100, kn...@knebb.de wrote:
> > Hi Lars and all,
> >
> > >> I have to cross-post to the LVM as well as the DRBD mailing list as
> > >> I have no clue where the issue is- if it's not a bug...
> > >>
> > >> I can not get LVM working on top of DRBD- I am getting I/O errors
> > >> followed by "diskless" state.
> > > For some reason, (some? not only?) VMWare virtual disks tend to pretend
> > > to support "write same", even if they fail such requests later.
> > >
> > > DRBD treats such a failed WRITE-SAME the same way as any other backend
> > > error, and by default detaches.
> > Ok, it is beyond my knowledge, but I understand what the "write-same"
> > command does. But if the underlying physical disk offers the command and
> > reports an error when used, this should apply to mkfs.ext4 on the
> > device/partition as well, shouldn't it?
>
> In this case, it happens on first mount.
> Also, it is not an "EIO", but an "EOPNOTSUPP".
>
> What really happens is that the file system code calls
> blkdev_issue_zeroout(),
> which will try discard, if discard is available and discard zeroes data,
> or, if discard (with discard zeroes data) is not available or returns
> failure, tries write-same with ZERO_PAGE,
> or, if write-same is not available or returns failure,
> tries __blkdev_issue_zeroout() (which uses "normal" writes).
>
> At least in "current upstream", probably very similar in your
> almost-3.10.something kernel.
>
> DRBD sits in between, sees the failure return of write-same,
> and handles it by detaching.
>
> > drbd detaches when an error is
> > reported- but why does Linux not report an error without drbd? And why
> > does this only happen when using LVM in between? It should be the same
> > when LVM is not used.
>
> Yes. And it is, as far as I can tell.
>
> > > Older kernels (RHEL 6) and also older drbd (8.3) are not affected,
> > > because they don't know about write-same.
> > My primary host is running CentOS7 while the secondary is older
> > (CentOS6). I will try to create the ext4 on the secondary and then
> > switch to primary.
> >
> > > Or tell the system that the backend does not support write-same:
> > > Check setting:
> > > grep ^ /sys/block/*/device/scsi_disk/*/max_write_same_blocks
> > > disable:
> > > echo 0 | tee /sys/block/*/device/scsi_disk/*/max_write_same_blocks
> > >
> > A "find /sys -name "*same*"" does not report any files named
>
> Double check that, please.
> All my CentOS7 / RHEL7 boxes (and other distributions with a sufficiently
> new kernel) have that.
>
> There are both the read-only /sys/block/*/queue/write_same_max_bytes
> and the write-able
> /sys/devices/*/*/*/host*/target*/*/scsi_disk/*/max_write_same_blocks
>
> > "max_write_same_blocks". On neither of the two nodes. So I can not
> > disable it, nor verify if it's enabled. I assume not, as it does not
> > exist. So this might not be the reason.
>
> Show us lsblk -t and lsblk -D from the box that detaches
> (the "7" one).
>
> It may also be that a discard failed, in which case it could be
> devicemapper pretending discard was supported, and the backend failing
> that discard request. Or some combination there.
>
> Your original logs show
> > Jan 7 10:58:44 backuppc kernel: EXT4-fs (dm-2): mounted filesystem with
> > ordered data mode. Opts: (null)
> > Jan 7 10:58:48 backuppc kernel: block drbd1: local WRITE IO error sector
> > 5296+3960 on sdc
>
> The "+..." part is the length (number of sectors) of the request.
> We don't allow "normal" requests of that size, so this is either a > discard or write-same. > > > Jan 7 10:58:48 backuppc kernel: block drbd1: disk( UpToDate -> Failed ) > > > Jan 7 10:58:48 backuppc kernel: block drbd1: IO ERROR: neither local nor > > remote data, sector 29096+3968 > > > Jan 7 10:58:48 backuppc kernel: dm-2: WRITE SAME failed. Manually zeroing. > > And here we see that at least some WRITE SAME was issued, and returned > failure. > and device mapper, which in your case sits above DRBD, > and consumes that error, has its own fallback code for failed write-same. Correcting myself, the presence of the warning message misled me. The 3.10 kernel still has that warning message directly in blkdev_issue_zeroout(), so that's not the device mapper fallback, but simply the mechanism I described above, with additional "log that I took the fallback because of failure". Which means DISCARDS have not even been tried, or we'd have a message about that as well. > Which can no longer be services, because DRBD already detached. > > So yes, > I'm pretty sure that I did not pull my "best guess" out of thin air only > > ;-) -- : Lars Ellenberg : LINBIT | Keeping the Digital World Running : DRBD -- Heartbeat -- Corosync -- Pacemaker DRBD® and LINBIT® are registered trademarks of LINBIT __ please don't Cc me, but send to list -- I'm subscribed ___
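The checks Lars asks for, collected into a runnable sequence for the CentOS 7 node that detaches; the paths follow his message, and the wildcards simply match whatever SCSI disks are present:

  # topology and discard capabilities of every layer in the block stack
  lsblk -t
  lsblk -D

  # what the kernel currently advertises for write-same
  grep ^ /sys/block/*/queue/write_same_max_bytes
  grep ^ /sys/devices/*/*/*/host*/target*/*/scsi_disk/*/max_write_same_blocks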
Re: [DRBD-user] Got stuck while installing DRBD for HA
On 01/12/2017 12:46 PM, Shuvam Jha wrote:
[...]
> I am unable to install the pacemaker and corosync packages; this error is
> coming up. I am using CentOS 7.3:
>
> [root@ccm1 ~]# yum install -y corosync pacemaker pcs
> Loaded plugins: amazon-id, rhui-lb, search-disabled-repos
> No package corosync available.
> No package pacemaker available.
> No package pcs available.
> Error: Nothing to do

https://fedoraproject.org/wiki/EPEL

Good luck.
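A minimal sketch of the suggested route- assuming a CentOS 7 box where the epel-release package is available from the stock "extras" repository (on RHEL, fetch it from the EPEL page linked above instead):

  # enable the EPEL repository, then retry the install
  yum install -y epel-release
  yum install -y corosync pacemaker pcs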