Re: [PVE-User] Moving disk with ZFS over iSCSI = IO error

2019-09-20 Thread Daniel Berteaud
- On 19 Sep 19, at 7:57, Daniel Berteaud wrote: 

> Forgot to mention. When moving a disk offline, from ZFS over iSCSI to
> something else (in my case to an NFS storage), I get warnings like this:

> create full clone of drive scsi0 (zfs-test:vm-132-disk-0)
> Formatting '/mnt/pve/nfs-dumps/images/132/vm-132-disk-0.qcow2', fmt=qcow2
> size=53687091200 cluster_size=65536 preallocation=metadata lazy_refcounts=off
> refcount_bits=16
> transferred: 0 bytes remaining: 53687091200 bytes total: 53687091200 bytes
> progression: 0.00 %
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 0: SENSE KEY:ILLEGAL_REQUEST(5)
> ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 4194303: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 8388606: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 12582909: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 16777212: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 20971515: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> [...]
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 83886060: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 88080363: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 92274666: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 96468969: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 100663272: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 104857575: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 0: SENSE KEY:ILLEGAL_REQUEST(5)
> ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> transferred: 536870912 bytes remaining: 53150220288 bytes total: 53687091200
> bytes progression: 1.00 %
> transferred: 1079110533 bytes remaining: 52607980667 bytes total: 53687091200
> bytes progression: 2.01 %
> transferred: 1615981445 bytes remaining: 52071109755 bytes total: 53687091200
> bytes progression: 3.01 %
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 4194303: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> transferred: 2158221066 bytes remaining: 51528870134 bytes total: 53687091200
> bytes progression: 4.02 %
> transferred: 2695091978 bytes remaining: 50991999222 bytes total: 53687091200
> bytes progression: 5.02 %
> transferred: 3231962890 bytes remaining: 50455128310 bytes total: 53687091200
> bytes progression: 6.02 %
> transferred: 3774202511 bytes remaining: 49912888689 bytes total: 53687091200
> bytes progression: 7.03 %
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 8388606: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> transferred: 4311073423 bytes remaining: 49376017777 bytes total: 53687091200
> bytes progression: 8.03 %
> transferred: 4853313044 bytes remaining: 48833778156 bytes total: 53687091200
> bytes progression: 9.04 %
> transferred: 5390183956 bytes remaining: 48296907244 bytes total: 53687091200
> bytes progression: 10.04 %
> transferred: 5927054868 bytes remaining: 47760036332 bytes total: 53687091200
> bytes progression: 11.04 %
> This might well be related to the problem (perhaps the same errors, when the
> VM is running, are reported back up the stack to the guest FS, which then
> panics?)
> When running offline, even with these error messages, the transfer is OK.
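A side note on the failing LBAs in the log above: they are all exact multiples of 4194303 (2^22 − 1), which suggests qemu-img issues one GET_LBA_STATUS probe per chunk of just under 2 GiB (assuming a 512-byte LUN block size). A quick illustrative check of the pattern, using the LBAs pasted above:

```python
# LBAs taken from the qemu-img warnings quoted above
lbas = [0, 4194303, 8388606, 12582909, 16777212, 20971515,
        83886060, 88080363, 92274666, 96468969, 100663272, 104857575]

stride = 2**22 - 1   # 4194303 blocks between consecutive probes
assert all(lba % stride == 0 for lba in lbas)

block_size = 512     # assumed LUN block size
print(stride * block_size)  # 2147483136 bytes, just under 2 GiB per probe
```

This is only an observation about the log, not a statement about qemu internals; but it fits the idea that the allocation-status query itself (rather than the data transfer) is what the target rejects.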

Another case which might be related:
https://forum.proxmox.com/threads/move-disk-to-a-different-iscsi-target-errors-warning.27313/

-- 

Daniel Berteaud 
FIREWALL-SERVICES SAS, La sécurité des réseaux 
Société de Services en Logiciels Libres 
Tél : +33.5 56 64 15 32 
Matrix: @dani:fws.fr 
https://www.firewall-services.com 
___
pve-user mailing list
pve-user@pve.proxmox.com
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user


Re: [PVE-User] Moving disk with ZFS over iSCSI = IO error

2019-09-19 Thread Daniel Berteaud
- On 17 Sep 19, at 18:27, Daniel Berteaud wrote: 

> Hi there.

> I'm working on moving my NFS setup to ZFS over iSCSI. I'm using a CentOS 7.6
> box with ZoL 0.8.1, with the LIO backend (but this shouldn't be relevant, see
> further). On the PVE side, I'm running PVE 6 with all updates applied.

> Except for a few minor issues I found in the LIO backend (for which I sent a
> patch series earlier today), most things work nicely. Except one which is
> important to me: I can't move a disk from ZFS over iSCSI to any other
> storage. The destination storage type doesn't matter, but the problem is
> 100% reproducible when the source storage is ZFS over iSCSI.

> A few seconds after I start the disk move, the guest FS will "panic". For
> example, with an el7 guest using XFS, I get:

> kernel: sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> kernel: sd 2:0:0:0: [sda] Sense Key : Aborted Command [current]
> kernel: sd 2:0:0:0: [sda] Add. Sense: I/O process terminated
> kernel: sd 2:0:0:0: [sda] CDB: Read(10) 28 00 00 79 7f a8 00 00 08 00
> kernel: blk_update_request: I/O error, dev sda, sector 7962536
> kernel: sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> kernel: sd 2:0:0:0: [sda] Sense Key : Aborted Command [current]
> kernel: sd 2:0:0:0: [sda] Add. Sense: I/O process terminated
> kernel: sd 2:0:0:0: [sda] CDB: Read(10) 28 00 00 79 7f a8 00 00 08 00
> kernel: blk_update_request: I/O error, dev sda, sector 7962536
> kernel: sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> kernel: sd 2:0:0:0: [sda] Sense Key : Aborted Command [current]
> kernel: sd 2:0:0:0: [sda] Add. Sense: I/O process terminated
> kernel: sd 2:0:0:0: [sda] CDB: Read(10) 28 00 00 bc 0e 28 00 00 08 00
> kernel: blk_update_request: I/O error, dev sda, sector 12324392
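The Read(10) CDBs in the kernel log above can be decoded by hand to confirm the failing sectors match what blk_update_request reports. A small sketch (the helper name is mine, not from the thread):

```python
def decode_read10(cdb_hex: str):
    """Decode a SCSI Read(10) CDB: opcode 0x28, 4-byte big-endian LBA
    at bytes 2-5, 2-byte transfer length at bytes 7-8."""
    b = bytes.fromhex(cdb_hex.replace(" ", ""))
    assert b[0] == 0x28, "not a Read(10) CDB"
    lba = int.from_bytes(b[2:6], "big")
    nblocks = int.from_bytes(b[7:9], "big")
    return lba, nblocks

# CDB from the log above
lba, n = decode_read10("28 00 00 79 7f a8 00 00 08 00")
print(lba, n)  # 7962536 8 -> matches "I/O error, dev sda, sector 7962536"
```

So the guest is aborting ordinary 8-sector (4 KiB) reads, consistent with the I/O errors being injected somewhere below the guest, not with a filesystem-level fault.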

> And the system completely crashes. The data itself is not impacted: I can
> restart the guest and everything appears OK. It doesn't matter whether I let
> the disk move operation terminate or cancel it.
> Moving the disk offline works as expected.

> A sparse or non-sparse zvol backend doesn't matter either.

> I searched a lot about this issue, and found at least two other people
> having the same, or a very similar, issue:

> * One using ZoL but with SCST, see
>   https://sourceforge.net/p/scst/mailman/message/35241011/
> * Another using OmniOS, so with Comstar, see
>   https://forum.proxmox.com/threads/storage-iscsi-move-results-to-io-error.38848/

> Both are likely running PVE5, so it looks like it's not a recently introduced
> regression.

> I was also able to reproduce the issue with a FreeNAS storage, so using
> ctld. As the issue is present with so many different stacks, I think we can
> eliminate an issue on the storage side. The problem is most likely in qemu,
> in its iSCSI block implementation.
> The SCST-Devel thread is interesting, but unfortunately, it's beyond my
> skills here.

> Any advice on how to debug this further? I can reproduce it whenever I want,
> on a test setup. I'm happy to provide any useful information.

> Regards, Daniel

Forgot to mention. When moving a disk offline, from ZFS over iSCSI to something 
else (in my case to an NFS storage), I get warnings like this: 

create full clone of drive scsi0 (zfs-test:vm-132-disk-0) 
Formatting '/mnt/pve/nfs-dumps/images/132/vm-132-disk-0.qcow2', fmt=qcow2 
size=53687091200 cluster_size=65536 preallocation=metadata lazy_refcounts=off 
refcount_bits=16 
transferred: 0 bytes remaining: 53687091200 bytes total: 53687091200 bytes 
progression: 0.00 % 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 0: SENSE KEY:ILLEGAL_REQUEST(5) 
ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 4194303: SENSE 
KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 8388606: SENSE 
KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 12582909: SENSE 
KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 16777212: SENSE 
KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 20971515: SENSE 
KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 25165818: SENSE 
KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 29360121: SENSE 
KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 33554424: SENSE 
KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 37748727: SENSE 
KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
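For readers decoding the warnings above: qemu prints the ASC and ASCQ as one combined 16-bit value, so "0x2400" is ASC 0x24, ASCQ 0x00, which is the standard SPC code for INVALID FIELD IN CDB under sense key 5 (ILLEGAL REQUEST). In other words, the target is rejecting the GET_LBA_STATUS command itself rather than reporting a media error, which would explain why the offline copy still completes. A minimal illustrative decoder (the lookup table covers only the codes seen in this thread):

```python
# Tiny lookup for the sense data reported by qemu-img above.
# Illustrative only: not a complete SPC sense-code table.
SENSE_KEYS = {0x5: "ILLEGAL_REQUEST"}
ASC_ASCQ = {(0x24, 0x00): "INVALID FIELD IN CDB"}

def decode_sense(key: int, combined: int) -> str:
    """Split qemu's combined ASC/ASCQ value and look both parts up."""
    asc, ascq = combined >> 8, combined & 0xFF
    return f"{SENSE_KEYS[key]}: {ASC_ASCQ[(asc, ascq)]}"

# "SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)"
print(decode_sense(5, 0x2400))  # ILLEGAL_REQUEST: INVALID FIELD IN CDB
```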