Hi there. 

I'm working on moving my NFS setup to ZFS over iSCSI. I'm using a CentOS 7.6 
box with ZoL 0.8.1, with the LIO backend (but this shouldn't be relevant, see 
below). On the PVE side, I'm running PVE6 with all updates applied. 
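
For reference, the storage definition on the PVE side looks roughly like this 
(the portal address, IQN and pool name below are placeholders for 
illustration, not my real values): 

zfs: zfs-over-iscsi 
        portal 192.0.2.10 
        target iqn.2019-07.com.example:tank 
        pool tank 
        iscsiprovider LIO 
        lio_tpg tpg1 
        content images 
        sparse 1 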

Apart from a few minor issues I found in the LIO backend (for which I sent a 
patch series earlier today), most things work nicely, except one which is 
important to me: I can't move a disk from ZFS over iSCSI to any other storage. 
The destination storage type doesn't matter, but the problem is 100% 
reproducible when the source storage is ZFS over iSCSI. 
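
For clarity, by "disk move" I mean the standard PVE operation, i.e. the "Move 
disk" button in the GUI or its CLI equivalent on the host, something like the 
following (VM ID, disk and target storage are only examples): 

qm move_disk 100 scsi0 local-lvm 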

A few seconds after I start the disk move, the guest FS "panics". For 
example, with an el7 guest using XFS, I get: 

kernel: sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE 
kernel: sd 2:0:0:0: [sda] Sense Key : Aborted Command [current] 
kernel: sd 2:0:0:0: [sda] Add. Sense: I/O process terminated 
kernel: sd 2:0:0:0: [sda] CDB: Read(10) 28 00 00 79 7f a8 00 00 08 00 
kernel: blk_update_request: I/O error, dev sda, sector 7962536 
kernel: sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE 
kernel: sd 2:0:0:0: [sda] Sense Key : Aborted Command [current] 
kernel: sd 2:0:0:0: [sda] Add. Sense: I/O process terminated 
kernel: sd 2:0:0:0: [sda] CDB: Read(10) 28 00 00 79 7f a8 00 00 08 00 
kernel: blk_update_request: I/O error, dev sda, sector 7962536 
kernel: sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE 
kernel: sd 2:0:0:0: [sda] Sense Key : Aborted Command [current] 
kernel: sd 2:0:0:0: [sda] Add. Sense: I/O process terminated 
kernel: sd 2:0:0:0: [sda] CDB: Read(10) 28 00 00 bc 0e 28 00 00 08 00 
kernel: blk_update_request: I/O error, dev sda, sector 12324392 


And the system crashes completely. The data itself is not impacted: I can 
restart the guest and everything appears OK. It doesn't matter whether I let 
the disk move operation finish or cancel it. 
Moving the disk offline works as expected. 

Whether the zvol backend is sparse or non-sparse doesn't matter either. 

I searched a lot about this issue, and found at least two other people having 
the same, or a very similar, issue: 

    * One using ZoL but with SCST, see 
https://sourceforge.net/p/scst/mailman/message/35241011/ 
    * Another one using OmniOS, so with Comstar, see 
https://forum.proxmox.com/threads/storage-iscsi-move-results-to-io-error.38848/ 

Both are likely running PVE5, so it looks like it's not a recently introduced 
regression. 

I was also able to reproduce the issue with a FreeNAS storage, so using ctld. 
As the issue is present with so many different stacks, I think we can rule out 
an issue on the storage side. The problem is most likely in QEMU, in its iSCSI 
block implementation. 
The SCST-Devel thread is interesting, but unfortunately, it's beyond my skills 
here. 
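
For context, with ZFS over iSCSI the running guest does its I/O through QEMU's 
built-in libiscsi initiator rather than a kernel iSCSI session on the host: 
the drive is attached with a URL of the form iscsi://<portal>/<target-iqn>/<lun>, 
something like 

file=iscsi://192.0.2.10:3260/iqn.2019-07.com.example:tank/0 

(address and IQN above are placeholders). The online move itself runs QEMU's 
drive-mirror on top of that, which is another reason I suspect the QEMU side 
rather than the various targets. 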

Any advice on how to debug this further? I can reproduce it whenever I want, 
on a test setup. I'm happy to provide any useful information. 

Regards, Daniel 


-- 


Daniel Berteaud 
FIREWALL-SERVICES SAS, La sécurité des réseaux 
Société de Services en Logiciels Libres 
Tél : +33.5 56 64 15 32 
Matrix: @dani:fws.fr 
https://www.firewall-services.com 