Re: [DRBD-user] [linux-lvm] LVM on top of DRBD [actually: mkfs.ext4 then mount results in detach on RHEL 7 on VMWare]

2017-01-13 Thread knebb
Hi all,

sorry to be so stubborn- still no real explanation for the behaviour.

I did some tests in the meantime:

Created drbd device, set up LV.

When using xfs instead of ext4 --> runs fine.
On CentOS6: mkfs.ext4- no matter on which host I mount it the first time
--> runs fine.
On CentOS7: mkfs.ext4- mounted on CentOS6 --> runs fine.
On CentOS7: mkfs.ext4- mounted on CentOS7 --> disk detached.

Now I skipped LVM in-between.

On CentOS7: mkfs.ext4- mounted on CentOS7 --> runs fine (detached with LVM!)

If this is related to the lazy writes, it appears to me that LVM
advertises different capabilities to mkfs than DRBD does.

Lars wrote:

What really happens is that the file system code calls
blkdev_issue_zeroout(),
which will try discard, if discard is available and discard zeroes data,
or, if discard (with discard zeroes data) is not available or returns
failure, tries write-same with ZERO_PAGE,
or, if write-same is not available or returns failure,
tries __blkdev_issue_zeroout() (which uses "normal" writes).

At least in "current upstream", probably very similar in your
almost-3.10.something kernel.

DRBD sits in between, sees the failure return of write-same,
and handles it by detaching.

So blkdev_issue_zeroout() is called, which tries the different possibilities.
DRBD sees the error on write-same (after discard failed or is not
available) and detaches. Sounds reasonable.

If I skip LVM, everything is fine. That means mkfs.ext4 either succeeds
in using discard, or uses "normal" writes without first trying discard
and write-same.

In the first case- why does it succeed with write-same (or discard?)
when there is no LVM in between?

In the second case- why does it not try the faster methods? Does
DRBD not offer these capabilities? And if so, why does LVM offer them
if the underlying device does not?
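
(One way to compare what each layer actually advertises - a rough sketch;
sdc, drbd1 and dm-2 are the device names from the logs in this thread and
may be named differently here:

  grep ^ /sys/block/{sdc,drbd1,dm-2}/queue/{write_same_max_bytes,discard_max_bytes}
)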

Greetings

Christian






On 10.01.2017 at 10:42, Lars Ellenberg wrote:
> On Sat, Jan 07, 2017 at 11:16:09AM +0100, Christian Völker wrote:
>> Hi all,
>>
>>
>> I have to cross-post to LVM as well to DRBD mailing list as I have no
>> clue where the issue is- if it's not a bug...
>>
>> I can not get LVM working on top of drbd- I am getting I/O errors
>> followed by "diskless" state.
> For some reason, (some? not only?) VMWare virtual disks tend to pretend
> to support "write same", even if they fail such requests later.
>
> DRBD treats such failed WRITE-SAME the same way as any other backend
> error, and by default detaches.
>
> mkfs.ext4 by default uses "lazy_itable_init" and "lazy_journal_init",
> which makes it complete faster, but delays initialization of some file system
> meta data areas until first mount, where some kernel daemon will zero-out the
> relevant areas in the background.
>
> Older kernels (RHEL 6) and also older drbd (8.3) are not affected,
> because they don't know about write-same.
>
> Workarounds exist:
>
> Don't use the "lazy" mkfs.
> During normal operation, write-same is usually not used.
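
(For a non-lazy mkfs that would be something like

  mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 /dev/VG/LV

with /dev/VG/LV standing in for the actual logical volume.)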
>
> Or tell the system that the backend does not support write-same:
> Check setting:
>   grep ^ /sys/block/*/device/scsi_disk/*/max_write_same_blocks
> disable:
>   echo 0 | tee /sys/block/*/device/scsi_disk/*/max_write_same_blocks
>
> You then need to re-attach DRBD (drbdadm down all; drbdadm up all)
> to make it aware of this change.
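
(Put together, a rough sequence for the affected CentOS 7 node - a sketch
only, using the paths and commands given above:

  echo 0 | tee /sys/block/*/device/scsi_disk/*/max_write_same_blocks
  drbdadm down all; drbdadm up all
  grep ^ /sys/block/*/queue/write_same_max_bytes   # should now show 0 for the backend
)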
>
> Fix:
>
> Well, we need to somehow add some ugly heuristic to better detect
> whether some backend really supports write-same. [*]
>
> Or, more likely, add an option to tell DRBD to ignore any pretend-only
> write-same support.
>
> Thanks,
>
> Lars
>
> [*] No, it is not as easy as "just ignore any IO error if it was a write-same
> request", because we try to "guarantee" that during normal operation, all
> replicas are in sync (within the limits defined by the replication protocol).
> If replicas fail in different ways, we can not do that (at least not without
> going through some sort of "recovery" first).
>
>> Steps to reproduce:
>>
>> Two machines.
>>
>> A: CentOS7 x64; epel-provided packages
>> kmod-drbd84-8.4.9-1.el7.elrepo.x86_64
>> drbd84-utils-8.9.8-1.el7.elrepo.x86_64
>>
>> B: CentOS6 x64; epel-provided packages
>> kmod-drbd83-8.3.16-3.el6.elrepo.x86_64
>> drbd83-utils-8.3.16-1.el6.elrepo.x86_64
>>
>> drbd1.res:
>> resource drbd1 {
>>   protocol A;
>>   startup {
>> wfc-timeout 240;
>> degr-wfc-timeout 120;
>> become-primary-on backuppc;
>> }
>>   net {
>> max-buffers 8000;
>> max-epoch-size 8000;
>> sndbuf-size 128k;
>> shared-secret "13Lue=3";
>> }
>>   syncer {
>> rate 500M;
>> }
>>   on backuppc {
>> device /dev/drbd1;
>> disk /dev/sdc;
>> address 192.168.0.1:7790;
>> meta-disk internal;
>>   }
>>   on drbd {
>> device /dev/drbd1;
>> disk /dev/sda;
>> address 192.168.2.16:7790;
>> meta-disk internal;
>>   }
>> }
>>
>> I was able to create the drbd as expected (see first line of following
>> syslog), it gets in sync.
>> So I set up LVM and create filter rules so 

[DRBD-user] Testing new DRBD9 dedicated repo for PVE

2017-01-13 Thread Michele Rossetti

Stupid me, now all is working- maybe the resources had gone down during the upgrade?
Thanks a lot,
Michele



On 13/01/2017 19:58, Michele Rossetti wrote:

  root@mpve1:~# drbdsetup status
 .drbdctrl role:Secondary
   volume:0 disk:UpToDate
   volume:1 disk:UpToDate
   mpve2 role:Secondary
 volume:0 peer-disk:UpToDate
 volume:1 peer-disk:UpToDate
   mpve3 role:Primary
 volume:0 peer-disk:UpToDate
 volume:1 peer-disk:UpToDate


This means that all your resources are down, only control volumes are up ...

Have you tried a:

"drbdmanage restart"

?

rob



--
"""
MICRO srl
Informatica e Telecomunicazioni - Web services - Web sites

Michele Rossetti

sede legale: via Raffa Garzia 7   09126 Cagliari (Italy)
sede operativa: viale Marconi 222  09131 Cagliari
Ph. +39 070 400240  Fax +39 070 4526207
Web:  http://www.microsrl.com http://www.sardi.it
E-mail: micro...@microsrl.com
"""


Re: [DRBD-user] Testing new DRBD9 dedicated repo for PVE

2017-01-13 Thread Roberto Resoli
On 13/01/2017 19:58, Michele Rossetti wrote:
>  root@mpve1:~# drbdsetup status
> .drbdctrl role:Secondary
>   volume:0 disk:UpToDate
>   volume:1 disk:UpToDate
>   mpve2 role:Secondary
> volume:0 peer-disk:UpToDate
> volume:1 peer-disk:UpToDate
>   mpve3 role:Primary
> volume:0 peer-disk:UpToDate
> volume:1 peer-disk:UpToDate

This means that all your resources are down, only control volumes are up ...

Have you tried a:

"drbdmanage restart"

?

rob


[DRBD-user] Testing new DRBD9 dedicated repo for PVE

2017-01-13 Thread Michele Rossetti
After the upgrade (dist-upgrade to PVE 4.4 and drbdmanage-proxmox
install), the KVM VMs don't start anymore at boot; they start only if
they are on the primary node, and once started they don't migrate from
the primary node to the secondary, even in HA.


The error messages from PVE are:

kvm: -drive 
file=/dev/drbd/by-res/vm-104-disk-1/0,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on: 
Could not open '/dev/drbd/by-res/vm-104-disk-1/0': No such file or 
directory
TASK ERROR: start failed: command '/usr/bin/kvm -id 104 -chardev 
'socket,id=qmp,path=/var/run/qemu-server/104.qmp,server,nowait' -mon 
'chardev=qmp,mode=control' -pidfile /var/run/qemu-server/104.pid 
-daemonize -smbios 'type=1,uuid=72d4bc28-b877-413a-9750-e7bf97938abb' 
-name php4i386 -smp '1,sockets=1,cores=1,maxcpus=1' -nodefaults -boot 
'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' 
-vga cirrus -vnc unix:/var/run/qemu-server/104.vnc,x509,password -cpu 
kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,enforce -m 2048 -k it 
-device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' 
-device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' 
-device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 
'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 
'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 
'initiator-name=iqn.1993-08.org.debian:01:af80fcb2976' -drive 
'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 
'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' 
-device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' -drive 
'file=/dev/drbd/by-res/vm-104-disk-1/0,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' 
-device 
'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100' 
-netdev 
'type=tap,id=net0,ifname=tap104i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' 
-device 
'virtio-net-pci,mac=6E:23:50:8A:35:50,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' 
-machine 'type=pc-i440fx-2.7' -incoming 
unix:/run/qemu-server/104.migrate -S' failed: exit code 1


and trying to migrate a started VM in HA:

task started by HA resource agent
Jan 13 19:13:47 starting migration of VM 104 to node 'mpve1' (82.xx.xx.xx)
Jan 13 19:13:47 copying disk images
Jan 13 19:13:47 starting VM 104 on remote node 'mpve1'
Jan 13 19:13:50 start failed: command '/usr/bin/kvm -id 104 -chardev 
'socket,id=qmp,path=/var/run/qemu-server/104.qmp,server,nowait' -mon 
'chardev=qmp,mode=control' -pidfile /var/run/qemu-server/104.pid 
-daemonize -smbios 'type=1,uuid=72d4bc28-b877-413a-9750-e7bf97938abb' 
-name php4i386 -smp '1,sockets=1,cores=1,maxcpus=1' -nodefaults -boot 
'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' 
-vga cirrus -vnc unix:/var/run/qemu-server/104.vnc,x509,password -cpu 
kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,enforce -m 2048 -k it 
-device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' 
-device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' 
-device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 
'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 
'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 
'initiator-name=iqn.1993-08.org.debian:01:af80fcb2976' -drive 
'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 
'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' 
-device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' -drive 
'file=/dev/drbd/by-res/vm-104-disk-1/0,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' 
-device 
'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100' 
-netdev 
'type=tap,id=net0,ifname=tap104i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' 
-device 
'virtio-net-pci,mac=6E:23:50:8A:35:50,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' 
-machine 'type=pc-i440fx-2.7' -incoming 
unix:/run/qemu-server/104.migrate -S' failed: exit code 1
Jan 13 19:13:50 ERROR: online migrate failure - command '/usr/bin/ssh 
-o 'BatchMode=yes' r...@82.xx.xx.xx qm start 104 --skiplock 
--migratedfrom mpve3 --migration_type secure --stateuri unix 
--machine pc-i440fx-2.7' failed: exit code 255

Jan 13 19:13:50 aborting phase 2 - cleanup resources
Jan 13 19:13:50 migrate_cancel
Jan 13 19:13:51 ERROR: migration finished with problems (duration 00:00:04)
TASK ERROR: migration problems

It's true: '/dev/drbd/by-res/vm-104-disk-1/0' does not exist (No such file or directory).

Trying to locate vm-104-disk, this is the output:

root@mpve1:/dev/drbd/by-disk/drbdpool# locate vm-104-disk
/dev/drbdpool/vm-104-disk-1_00
/var/lib/drbd.d/drbdmanage_vm-104-disk-1.res.q

Checking DRBD, all seems ok.

root@mpve1:~# drbd-overview
 0:.drbdctrl/0  Connected(3*) Seco(mpve2,mpve1)/Prim(mpve3) 
UpTo(mpve1)/UpTo(mpve3,mpve2)
 1:.drbdctrl/1  Connected(3*) Seco(mpve2,mpve1)/Prim(mpve3) 

Re: [DRBD-user] [linux-lvm] LVM on top of DRBD [actually: mkfs.ext4 then mount results in detach on RHEL 7 on VMWare]

2017-01-13 Thread Lars Ellenberg
On Thu, Jan 12, 2017 at 06:00:53PM +0100, Lars Ellenberg wrote:
> On Wed, Jan 11, 2017 at 06:23:08PM +0100, kn...@knebb.de wrote:
> > Hi Lars and all,
> > 
> > 
> > >> I have to cross-post to LVM as well to DRBD mailing list as I have no
> > >> clue where the issue is- if it's not a bug...
> > >>
> > >> I can not get LVM working on top of drbd- I am getting I/O errors
> > >> followed by "diskless" state.
> > > For some reason, (some? not only?) VMWare virtual disks tend to pretend
> > > to support "write same", even if they fail such requests later.
> > >
> > > DRBD treats such failed WRITE-SAME the same way as any other backend
> > > error, and by default detaches.
> > Ok, it is beyond my knowledge, but I understand what the "write-same"
> > command does. But if the underlying physical disk offers the command and
> > reports an error when used, this should apply to mkfs.ext4 on the device/
> > partition as well, shouldn't it?
> 
> In this case, it happens on first mount.
> Also, it is not an "EIO", but an "EOPNOTSUP".
> 
> What really happens is that the file system code calls
> blkdev_issue_zeroout(),
> which will try discard, if discard is available and discard zeroes data,
> or, if discard (with discard zeroes data) is not available or returns
> failure, tries write-same with ZERO_PAGE,
> or, if write-same is not available or returns failure,
> tries __blkdev_issue_zeroout() (which uses "normal" writes).
> 
> At least in "current upstream", probably very similar in your
> almost-3.10.something kernel.
> 
> DRBD sits in between, sees the failure return of write-same,
> and handles it by detaching.
> 
> > drbd detaches when an error is
> > reported- but why does Linux not report an error without drbd? And why
> > does this only happen when using LVM in-between? Should be the same when
> > LVM is not used
> 
> Yes. And it is, as far as I can tell.
> 
> > > Older kernels (RHEL 6) and also older drbd (8.3) are not affected, 
> > > because they
> > > don't know about write-same.
> > My primary host is running CentOS7 while the secondary is older
> > (CentOS6). I will try to create the ext4 on the secondary and then
> > switch to primary.
> > 
> > > Or tell the system that the backend does not support write-same:
> > > Check setting:
> > >   grep ^ /sys/block/*/device/scsi_disk/*/max_write_same_blocks
> > > disable:
> > >   echo 0 | tee /sys/block/*/device/scsi_disk/*/max_write_same_blocks
> > >
> > A "find /sys -name "*same*"" does not report any files named
> 
> double check that, please.
> all my centos7 / RHEL 7 (and other distributions with sufficiently new
> kernel) have that.
> 
> there are both the read-only /sys/block/*/queue/write_same_max_bytes
> and the write-able 
> /sys/devices/*/*/*/host*/target*/*/scsi_disk/*/max_write_same_blocks
> 
> > "max_write_same_blocks". On none of the both nodes. So I dcan not
> > disable nor verify if it's enabled. I assume no as it does not exist. So
> > this might not be the reason.
> 
> show us lsblk -t and lsblk -D from the box that detaches.
> (the "7" one)
> 
> It may also be that a discard failed, in which case it could be
> devicemapper pretending discard was supported, and the backend failing
> that discard request. Or some combination there.
> 
> Your original logs show
> > Jan  7 10:58:44 backuppc kernel: EXT4-fs (dm-2): mounted filesystem with 
> > ordered data mode. Opts: (null)
> > Jan  7 10:58:48 backuppc kernel: block drbd1: local WRITE IO error sector 
> > 5296+3960 on sdc
> 
> The "+..." part is the length (number of sectors) of the request.
> We don't allow "normal" requests of that size, so this is either a
> discard or write-same.
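(For scale: 3960 sectors * 512 bytes comes to roughly 1.9 MiB in a single request.)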
> 
> > Jan  7 10:58:48 backuppc kernel: block drbd1: disk( UpToDate -> Failed )
> 
> > Jan  7 10:58:48 backuppc kernel: block drbd1: IO ERROR: neither local nor 
> > remote data, sector 29096+3968
> 
> > Jan  7 10:58:48 backuppc kernel: dm-2: WRITE SAME failed. Manually zeroing.
> 
> And here we see that at least some WRITE SAME was issued, and returned 
> failure.
> and device mapper, which in your case sits above DRBD,
> and consumes that error, has its own fallback code for failed write-same.

Correcting myself, the presence of the warning message misled me.

The 3.10 kernel still has that warning message directly in
blkdev_issue_zeroout(), so that's not the device mapper fallback,
but simply the mechanism I described above, with additional "log that I
took the fallback because of failure".

Which means DISCARDS have not even been tried,
or we'd have a message about that as well.

> Which can no longer be serviced, because DRBD already detached.
> 
> So yes,
> I'm pretty sure that I did not pull my "best guess" out of thin air only
> 
>   ;-)

-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker

DRBD® and LINBIT® are registered trademarks of LINBIT
__
please don't Cc me, but send to list -- I'm subscribed

Re: [DRBD-user] Got stuck while installing DRBD for HA

2017-01-13 Thread Peter Schwindt


On 01/12/2017 12:46 PM, Shuvam Jha wrote:

[...]

> I am unable to install the pacemaker/corosync packages; this error is coming up.
> I am using CentOS 7.3.
> 
> [root@ccm1 ~]# yum install -y corosync pacemaker pcs
> Loaded plugins: amazon-id, rhui-lb, search-disabled-repos
> No package corosync available.
> No package pacemaker available.
> No package pcs available.
> Error: Nothing to do

https://fedoraproject.org/wiki/EPEL
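
On a stock CentOS 7 box that usually boils down to something like
(assuming the epel-release package is reachable from your configured
repos; with the Amazon/RHUI repos in your output you may need the EPEL
release RPM linked from the page above instead):

  yum install -y epel-release
  yum install -y corosync pacemaker pcs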

Good luck.



