Bug#1031131: open-iscsi: Lot of iscsi/kernel errors in dmesg with Fujitsu Eternus DX100S4 connected with 2 10Gb ethernet paths with multipathd.

2023-02-27 Thread Milan Oravec
Hello Sarraf,

On Thu, Feb 23, 2023 at 2:44 PM Ritesh Raj Sarraf  wrote:

> On Thu, 2023-02-23 at 14:19 +0100, Milan Oravec wrote:
> > > >
> > >
> > > By KVM, do you mean just pure KVM ? Or the management suite too,
> > > libvirt ?
> > >
> >
> >
> > Yes, libvirt is used to manage KVM, and virt-manager on my desktop to
> > connect to it. It is simple setup without clustering.
> >
>
> Thanks for confirming.
>
> You should start with the hypervisor where libvirtd is running. Check
> if that hypervisor host is running normal.
>

No unusual errors reported.


>
> From all the symptoms you've shared, my wild arrow in the dark is that
> your hypervisor/guest network may be running into issues.
>
>
The network is all OK, no issues there.


> >
> >
> > Do you know someone who can help with this? Here is example of KVM
> > guest running configuration:
> >
> > libvirt+ 244533  1  5 Feb03 ?1-03:54:37 qemu-system-
> > x86_64 -enable-kvm -name guest=me_test,process=qemu:me_test,debug-
> > threads=on -S -object
> > secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-
> > 108-me_test/master-key.aes -machine pc-1.1,accel=kvm,usb=off,dump-
> > guest-core=off -cpu host -m 8096 -realtime mlock=off -smp
> > 4,sockets=1,cores=4,threads=1 -uuid 1591f345-96b5-4077-9d32-
> > b2991003753d -no-user-config -nodefaults -chardev
> > socket,id=charmonitor,fd=57,server,nowait -mon
> > chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-
> > shutdown -boot strict=on -device piix3-usb-
> > uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-
> > pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive if=none,id=drive-
> > ide0-1-0,readonly=on -device ide-cd,bus=ide.1,unit=0,drive=drive-
> > ide0-1-0,id=ide0-1-0,bootindex=1 -drive
> > file=/dev/mapper/me_test,format=raw,if=none,id=drive-virtio-
> > disk0,cache=none,discard=unmap,aio=native -device virtio-blk-
> > pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-
> > disk0,bootindex=2,write-cache=on -netdev
> > tap,fd=60,id=hostnet0,vhost=on,vhostfd=61 -device virtio-net-
> > pci,netdev=hostnet0,id=net0,mac=aa:bb:cc:00:10:31,bus=pci.0,addr=0x3
> > -chardev pty,id=charserial0 -device isa-
> > serial,chardev=charserial0,id=serial0 -chardev
> > spicevmc,id=charchannel0,name=vdagent -device
> > virtserialport,bus=virtio-
> > serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice
> > .0 -spice port=5929,addr=0.0.0.0,disable-ticketing,seamless-
> > migration=on -device qxl-
> > vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,v
> > gamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -device virtio-balloon-
> > pci,id=balloon0,bus=pci.0,addr=0x5 -sandbox
> > on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=de
> > ny -msg timestamp=on
> >
> > Maybe there is something undesirable for the iscsi target.
> >
>
> Seems like you are using host networking ? Have you validated that that
> part is working reliably ?
>

Yes, this is working fine too.

I've experimented with the storage parameters passed to KVM. A standard
disk created within virt-manager runs with the following parameters:

format=raw,cache=none,discard=unmap,aio=native.

I've removed the cache parameter (letting the hypervisor use its default) and
am now running the disk with:
format=raw,discard=unmap.
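
For reference, this roughly corresponds to the following in the libvirt
domain XML (editable with "virsh edit me_test"). A sketch only: the device
names come from the qemu command line above, the <target> element is
illustrative.

Original definition (explicit host-cache bypass):

    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' discard='unmap' io='native'/>
      <source dev='/dev/mapper/me_test'/>
      <target dev='vda' bus='virtio'/>
    </disk>

Test definition (cache attribute dropped, so the hypervisor default, usually
writeback, applies):

    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' discard='unmap'/>
      <source dev='/dev/mapper/me_test'/>
      <target dev='vda' bus='virtio'/>
    </disk>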

With this setup, there are no errors ("connection1:0: pdu (op 0x5 itt 0x48)
rejected. Reason code 0x9") in the logs. The virtual host was stressed with:

ns2:/home/migo# stress-ng -v -d 1
stress-ng: debug: [858] 2 processors online, 2 processors configured
stress-ng: info:  [858] defaulting to a 86400 second (1 day, 0.00 secs) run
per stressor
stress-ng: info:  [858] dispatching hogs: 1 hdd
stress-ng: debug: [858] cache allocate: default cache size: 262144K
stress-ng: debug: [858] starting stressors
stress-ng: debug: [858] 1 stressor started
stress-ng: debug: [859] stress-ng-hdd: started [859] (instance 0)
^Cstress-ng: debug: [859] stress-ng-hdd: exited [859] (instance 0)
stress-ng: debug: [858] process [859] terminated
stress-ng: info:  [858] successful run completed in 30377.75s (8 hours, 26
mins, 17.75 secs)
stress-ng: debug: [858] metrics-check: all stressor metrics validated and
sane

It seems that using the hypervisor's cache filters out the bogus commands
(the ones causing trouble for the target) before they reach the target. What
do you think about that?

But running against the iSCSI target with host caching enabled is not a good
idea, so this is no real solution. :(



>
> > >
> > > > > There will be errors in your system journal for this particular
> > > > > setup.
> > > > >
> > > > > Errors like:
> > > > >
> > > > > * connection drops
> > > > > * iscsi session drops/terminations
> > > > > * SCSI errors
> > > > > * multipath path checker errors
> > > > >
> > > > > All these will be errors which will be recovered eventually.
> > > > > That
> > > > > is
> > > > > why we have the need for close integration in between these
> > > > > layers,
> > > > > when building a storage solution on top.
> > > > >
> > > >
> > > >
> > > > 

Bug#1031131: open-iscsi: Lot of iscsi/kernel errors in dmesg with Fujitsu Eternus DX100S4 connected with 2 10Gb ethernet paths with multipathd.

2023-02-23 Thread Ritesh Raj Sarraf
Also, on my test setup, my LIO target is a standard one. No multinode
cluster. It is a single-node target.

The way I trigger path failovers is by bringing down individual network
interfaces. That serves well enough for my initiator-focused tests.
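
For example (a sketch; substitute whichever interface carries one of your
iSCSI portals, e.g. the ens1f0/ens1f1 pair from your ifconfig output):

  ip link set ens1f0 down   # one path drops; multipathd should fail it over
  multipath -ll             # the affected path shows up as failed/faulty
  ip link set ens1f0 up     # bring the link back
  multipath -ll             # path returns to 'active ready' once the checker passes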


On Thu, 2023-02-23 at 19:14 +0530, Ritesh Raj Sarraf wrote:
> > > > 
> > > > Are you running KVM virtualization atop of your SAN target?
> > > >  
> > > 
> > > My LIO target runs in a KVM guest. So does the iSCSI initiator
> > > too.
> > > 
> > 
> > 
> > Pure KVM or libvirtd too?
> > 
> 
> libvirtd it is. The same hypervisor with libvirtd running, hosts the
> initiator and the target, in different VMs.

-- 
Ritesh Raj Sarraf | http://people.debian.org/~rrs
Debian - The Universal Operating System




Bug#1031131: open-iscsi: Lot of iscsi/kernel errors in dmesg with Fujitsu Eternus DX100S4 connected with 2 10Gb ethernet paths with multipathd.

2023-02-23 Thread Ritesh Raj Sarraf
Also, what 10 GbE hardware do you run on your hypervisor? What
make/model? Is it known to work reliably?

In my experience from some years ago, 10 GbE support was still
maturing. But things may have changed lately; I just don't have
access to that hardware any more.


On Thu, 2023-02-23 at 19:14 +0530, Ritesh Raj Sarraf wrote:
> On Thu, 2023-02-23 at 14:19 +0100, Milan Oravec wrote:
> > > > 
> > > 
> > > By KVM, do you mean just pure KVM ? Or the management suite too,
> > > libvirt ?
> > > 
> > 
> > 
> > Yes, libvirt is used to manage KVM, and virt-manager on my desktop
> > to
> > connect to it. It is simple setup without clustering.  
> > 
> 
> Thanks for confirming.
> 
> You should start with the hypervisor where libvirtd is running. Check
> if that hypervisor host is running normal.
> 
> From all the symptoms you've shared, my wild arrow in the dark is
> that
> your hypervisor/guest network may be running into issues.
> 
> 

-- 
Ritesh Raj Sarraf | http://people.debian.org/~rrs
Debian - The Universal Operating System




Bug#1031131: open-iscsi: Lot of iscsi/kernel errors in dmesg with Fujitsu Eternus DX100S4 connected with 2 10Gb ethernet paths with multipathd.

2023-02-23 Thread Ritesh Raj Sarraf
On Thu, 2023-02-23 at 14:19 +0100, Milan Oravec wrote:
> > > 
> > 
> > By KVM, do you mean just pure KVM ? Or the management suite too,
> > libvirt ?
> > 
> 
> 
> Yes, libvirt is used to manage KVM, and virt-manager on my desktop to
> connect to it. It is simple setup without clustering.  
> 

Thanks for confirming.

You should start with the hypervisor where libvirtd is running. Check
whether that hypervisor host is running normally.

From all the symptoms you've shared, my wild arrow in the dark is that
your hypervisor/guest network may be running into issues.

> 
> 
> Do you know someone who can help with this? Here is example of KVM
> guest running configuration:
> 
> libvirt+ 244533      1  5 Feb03 ?        1-03:54:37 qemu-system-
> x86_64 -enable-kvm -name guest=me_test,process=qemu:me_test,debug-
> threads=on -S -object
> secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-
> 108-me_test/master-key.aes -machine pc-1.1,accel=kvm,usb=off,dump-
> guest-core=off -cpu host -m 8096 -realtime mlock=off -smp
> 4,sockets=1,cores=4,threads=1 -uuid 1591f345-96b5-4077-9d32-
> b2991003753d -no-user-config -nodefaults -chardev
> socket,id=charmonitor,fd=57,server,nowait -mon
> chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-
> shutdown -boot strict=on -device piix3-usb-
> uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-
> pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive if=none,id=drive-
> ide0-1-0,readonly=on -device ide-cd,bus=ide.1,unit=0,drive=drive-
> ide0-1-0,id=ide0-1-0,bootindex=1 -drive
> file=/dev/mapper/me_test,format=raw,if=none,id=drive-virtio-
> disk0,cache=none,discard=unmap,aio=native -device virtio-blk-
> pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-
> disk0,bootindex=2,write-cache=on -netdev
> tap,fd=60,id=hostnet0,vhost=on,vhostfd=61 -device virtio-net-
> pci,netdev=hostnet0,id=net0,mac=aa:bb:cc:00:10:31,bus=pci.0,addr=0x3
> -chardev pty,id=charserial0 -device isa-
> serial,chardev=charserial0,id=serial0 -chardev
> spicevmc,id=charchannel0,name=vdagent -device
> virtserialport,bus=virtio-
> serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice
> .0 -spice port=5929,addr=0.0.0.0,disable-ticketing,seamless-
> migration=on -device qxl-
> vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,v
> gamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -device virtio-balloon-
> pci,id=balloon0,bus=pci.0,addr=0x5 -sandbox
> on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=de
> ny -msg timestamp=on
> 
> Maybe there is something undesirable for the iscsi target. 
>  

It seems like you are using host networking? Have you validated that that
part is working reliably?

> > 
> > > > There will be errors in your system journal for this particular
> > > > setup.
> > > > 
> > > > Errors like:
> > > > 
> > > > * connection drops
> > > > * iscsi session drops/terminations
> > > > * SCSI errors
> > > > * multipath path checker errors
> > > > 
> > > > All these will be errors which will be recovered eventually.
> > > > That
> > > > is
> > > > why we have the need for close integration in between these
> > > > layers,
> > > > when building a storage solution on top.
> > > > 
> > > 
> > > 
> > > This is a very complex ecosystem. I know that error reporting is a
> > > good
> > > thing :) and helps with troubleshooting problems. But when
> > > everything is all right there should be no errors, right?
> > >  
> > 
> > In an ideal scenario, yes, there will be no errors. But on a SAN
> > setup,
> > cluster failovers are a feature of the SAN target, and as such
> > during
> > that transition some errors are expected on the initiator, which
> > are
> > eventually recovered.
> > 
> > Recovery is the critical part here. When states do not recover to
> > normal, it is an error; either the target or the initiator. Or even
> > the
> > middleman (network) at times.
> > 
> 
> 
> This part seems to work OK, so far no data loss was detected.
>  

From your initial logs, mpath reported 'io pending' on one of the
paths. I'm not sure if I'd call that fully recovered.

What if you down the other (healthy) path ? Does IO progress on the
initial (io pending) path ?
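
Something along these lines (a sketch, using the 'web3v' map and the
sdah/sdai paths from your earlier multipath -ll output; on older
multipath-tools the commands go through the multipathd -k shell instead):

  multipathd fail path sdai       # or take down the interface carrying that path
  multipath -ll web3v             # only the former 'i/o pending' path should stay active
  dd if=/dev/mapper/web3v of=/dev/null bs=1M count=100 iflag=direct   # does IO still complete?
  multipathd reinstate path sdai  # restore the path afterwards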

> > > 
> > > Are you running KVM virtualization atop of your SAN target?
> > >  
> > 
> > My LIO target runs in a KVM guest. So does the iSCSI initiator too.
> > 
> 
> 
> Pure KVM or libvirtd too?
> 

libvirtd it is. The same hypervisor, with libvirtd running, hosts the
initiator and the target in different VMs.


-- 
Ritesh Raj Sarraf | http://people.debian.org/~rrs
Debian - The Universal Operating System




Bug#1031131: open-iscsi: Lot of iscsi/kernel errors in dmesg with Fujitsu Eternus DX100S4 connected with 2 10Gb ethernet paths with multipathd.

2023-02-23 Thread Milan Oravec
Hello,

On Thu, Feb 23, 2023 at 1:35 PM Ritesh Raj Sarraf  wrote:

> Hello Milan,
>
> On Mon, 2023-02-20 at 09:51 +0100, Milan Oravec wrote:
> > > >
> > >
> > > So did you trigger a cluster takeover ? My guess is it is your
> > > target
> > > initiating the connection drops, while taking over to the other
> > > node.
> > >
> > >
> >
> >
> > There was no intentional cluster takeover. This happens during normal
> > operation.
> >
> > > How a target behaves during the transition is left to the target.
> > > The
> > > initiator will keep querying for recovery, until either it times
> > > out or
> > > recovers.
> > >
> >
> >
> > recovery seems to work fine. I've tested it by disconnecting one path
> > to target and all was OK, second path was used and when first one
> > recovered is switched to that one over.
> >
>
>
> > >
> > > > > >
> >
> > I can understand this is very complex problem, what do you suggest me
> > to debug this issues? We have 5 host connected to Fujitsu target, but
> > errors are only on those two which are running KVM guests. Other
> > servers running mail hosts and camera surveillance are not affected
> > it seems. They are Debian systems but not running multipath, so I've
> > made test:
> >
> > So far I've disabled all KVM guests on one (of those two KVM hosts
> > used fro virtualization) host and mounted  one KVM guest file system
> > locally (/mnt) and performed stress test on that mount:
> >
>
> By KVM, do you mean just pure KVM ? Or the management suite too,
> libvirt ?
>

Yes, libvirt is used to manage KVM, and virt-manager on my desktop to
connect to it. It is a simple setup without clustering.


>
> >
> > When I let run one KVM guest (Debian) on this host system following
> > errors are in dmesg:
> >
> > [Mon Feb 20 09:49:39 2023]  connection1:0: pdu (op 0x5 itt 0x53)
> > rejected. Reason code 0x9
> > [Mon Feb 20 09:49:41 2023]  connection1:0: pdu (op 0x5 itt 0x7c)
> > rejected. Reason code 0x9
> > [Mon Feb 20 09:49:41 2023]  connection1:0: pdu (op 0x5 itt 0x51)
> > rejected. Reason code 0x9
> > [Mon Feb 20 09:50:07 2023]  connection1:0: pdu (op 0x5 itt 0x33)
> > rejected. Reason code 0x9
> >
> > Guest is using Virtio disk drivers (vda) so I've switched it to
> > generic SATA (sda) for this guest, but errors in log remains.
> >
> > It seems that KVM is triggering these errors and making our NAS unstable.
> >
> > What can KVM do differently? It is using /dev/mapper/dm-XX as disk
> > devices so no direct iscsi access.
> >
> > Any ideas what should I try next?
> >
> >
>
> I'm afraid I do not have much direct hints for you here. given that
> this issue does not happen when KVM is not involved, would imply that
> the erratic behavior originates from that software's integration.
>
>
Do you know someone who can help with this? Here is an example of a running
KVM guest configuration:

libvirt+ 244533  1  5 Feb03 ?1-03:54:37 qemu-system-x86_64
-enable-kvm -name guest=me_test,process=qemu:me_test,debug-threads=on -S
-object
secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-108-me_test/master-key.aes
-machine pc-1.1,accel=kvm,usb=off,dump-guest-core=off -cpu host -m 8096
-realtime mlock=off -smp 4,sockets=1,cores=4,threads=1 -uuid
1591f345-96b5-4077-9d32-b2991003753d -no-user-config -nodefaults -chardev
socket,id=charmonitor,fd=57,server,nowait -mon
chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown
-boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive
if=none,id=drive-ide0-1-0,readonly=on -device
ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1 -drive
file=/dev/mapper/me_test,format=raw,if=none,id=drive-virtio-disk0,cache=none,discard=unmap,aio=native
-device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2,write-cache=on
-netdev tap,fd=60,id=hostnet0,vhost=on,vhostfd=61 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=aa:bb:cc:00:10:31,bus=pci.0,addr=0x3
-chardev pty,id=charserial0 -device
isa-serial,chardev=charserial0,id=serial0 -chardev
spicevmc,id=charchannel0,name=vdagent -device
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0
-spice port=5929,addr=0.0.0.0,disable-ticketing,seamless-migration=on
-device
qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -sandbox
on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny
-msg timestamp=on

Maybe there is something in it that the iSCSI target finds undesirable.


>
> > > There will be errors in your system journal for this particular
> > > setup.
> > >
> > > Errors like:
> > >
> > > * connection drops
> > > * iscsi session drops/terminations
> > > * SCSI errors
> > > * multipath path checker errors
> > >
> > > All these will be errors which will be recovered eventually. That
> > 

Bug#1031131: open-iscsi: Lot of iscsi/kernel errors in dmesg with Fujitsu Eternus DX100S4 connected with 2 10Gb ethernet paths with multipathd.

2023-02-23 Thread Ritesh Raj Sarraf
Hello Milan,

On Mon, 2023-02-20 at 09:51 +0100, Milan Oravec wrote:
> > >  
> > 
> > So did you trigger a cluster takeover ? My guess is it is your
> > target
> > initiating the connection drops, while taking over to the other
> > node.
> > 
> > 
> 
> 
> There was no intentional cluster takeover. This happens during normal
> operation. 
>  
> > How a target behaves during the transition is left to the target.
> > The
> > initiator will keep querying for recovery, until either it times
> > out or
> > recovers.
> > 
> 
> 
> recovery seems to work fine. I've tested it by disconnecting one path
> to target and all was OK, second path was used and when first one
> recovered is switched to that one over. 
>  


> > 
> > > > > 
> 
> I can understand this is very complex problem, what do you suggest me
> to debug this issues? We have 5 host connected to Fujitsu target, but
> errors are only on those two which are running KVM guests. Other
> servers running mail hosts and camera surveillance are not affected
> it seems. They are Debian systems but not running multipath, so I've
> made test:
> 
> So far I've disabled all KVM guests on one (of those two KVM hosts
> used fro virtualization) host and mounted  one KVM guest file system
> locally (/mnt) and performed stress test on that mount:
> 

By KVM, do you mean just pure KVM ? Or the management suite too,
libvirt ?


> 
> When I let run one KVM guest (Debian) on this host system following
> errors are in dmesg:
> 
> [Mon Feb 20 09:49:39 2023]  connection1:0: pdu (op 0x5 itt 0x53)
> rejected. Reason code 0x9
> [Mon Feb 20 09:49:41 2023]  connection1:0: pdu (op 0x5 itt 0x7c)
> rejected. Reason code 0x9
> [Mon Feb 20 09:49:41 2023]  connection1:0: pdu (op 0x5 itt 0x51)
> rejected. Reason code 0x9
> [Mon Feb 20 09:50:07 2023]  connection1:0: pdu (op 0x5 itt 0x33)
> rejected. Reason code 0x9
> 
> Guest is using Virtio disk drivers (vda) so I've switched it to
> generic SATA (sda) for this guest, but errors in log remains.
> 
> It seems that KVM is triggering these errors and making our NAS unstable.
> 
> What can KVM do differently? It is using /dev/mapper/dm-XX as disk
> devices so no direct iscsi access. 
> 
> Any ideas what should I try next? 
> 
> 

I'm afraid I do not have many direct hints for you here. Given that
this issue does not happen when KVM is not involved, it would imply that
the erratic behavior originates from that software's integration.


> > There will be errors in your system journal for this particular
> > setup.
> > 
> > Errors like:
> > 
> > * connection drops
> > * iscsi session drops/terminations
> > * SCSI errors
> > * multipath path checker errors
> > 
> > All these will be errors which will be recovered eventually. That
> > is
> > why we have the need for close integration in between these layers,
> > when building a storage solution on top.
> > 
> 
> 
> This is a very complex ecosystem. I know that error reporting is a good
> thing :) and helps with troubleshooting problems. But when
> everything is all right there should be no errors, right?
>  

In an ideal scenario, yes, there will be no errors. But on a SAN setup,
cluster failovers are a feature of the SAN target, and as such during
that transition some errors are expected on the initiator, which are
eventually recovered.

Recovery is the critical part here. When states do not recover to
normal, it is an error on either the target or the initiator, or even on
the middleman (the network) at times.

> > 
> > 
> > 
> > Note: These days I only have a software LIO target to test/play
> > with,
> > where I have not seen any real issues/errors. How each SAN Target
> > behaves is something highly specific to the target, in your case
> > the
> > Fujitsu target.
> > 
> > 
> 
> 
> Are you running KVM virtualization atop of your SAN target?
>  

My LIO target runs in a KVM guest. So does the iSCSI initiator too.
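
If you ever want to reproduce a comparable single-node software target
yourself, a rough targetcli sketch (the names, paths and sizes below are
made up):

  targetcli /backstores/fileio create disk0 /var/lib/iscsi/disk0.img 10G
  targetcli /iscsi create iqn.2023-02.org.example:target0
  targetcli /iscsi/iqn.2023-02.org.example:target0/tpg1/luns create /backstores/fileio/disk0
  targetcli /iscsi/iqn.2023-02.org.example:target0/tpg1/acls create iqn.2023-02.org.example:initiator0
  targetcli saveconfig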


-- 
Ritesh Raj Sarraf | http://people.debian.org/~rrs
Debian - The Universal Operating System




Bug#1031131: open-iscsi: Lot of iscsi/kernel errors in dmesg with Fujitsu Eternus DX100S4 connected with 2 10Gb ethernet paths with multipathd.

2023-02-20 Thread Milan Oravec
Hello,

On Thu, Feb 16, 2023 at 2:16 PM Ritesh Raj Sarraf  wrote:

> Hello,
>
> On Wed, 2023-02-15 at 19:21 +0100, Milan Oravec wrote:
> > > > I've lot of this error messages ind dmesg:
> > > >
> > > > Feb 09 10:56:05 virt2n kernel:  connection2:0: detected conn
> > > > error
> > > > (1015)
> > > > Feb 09 10:56:05 virt2n iscsid[2790]: connection2:0 is operational
> > > > after recovery (1 attempts)
> > > > Feb 09 10:56:05 virt2n iscsid[2790]: connection1:0 is operational
> > > > after recovery (1 attempts)
> > >
> > > Connection dropped and re-established. Looks normal.
> > >
> >
> >
> > Yes I understand that message but WHY this happens, what can be the
> > cause for this? Link is stable and no errors are reported on ifaces
> > connected to SAN switch:
> >
>
> That is mostly to analyze on the target side or on the network, to
> start with.
>
> Usually, in a multinode clustered target seutp, when a takeover
> happens, connecitons from node1 will be dropped for a moment while it
> is taken over by node2.
>
> I don't know nothing about the Fujitsu target so I cannot generalize.
>
> > ens1f0: flags=4163  mtu 9000
> > inet 10.11.12.17  netmask 255.255.255.0  broadcast
> > 10.11.12.255
> > inet6 fe80::9640:c9ff:fe5f:9172  prefixlen 64  scopeid
> > 0x20
> > ether 94:40:c9:5f:91:72  txqueuelen 1000  (Ethernet)
> > RX packets 184466309231  bytes 146589180968929 (133.3 TiB)
> > RX errors 0  dropped 0  overruns 0  frame 0
> > TX packets 86121730517  bytes 284905539017306 (259.1 TiB)
> > TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > device interrupt 102  memory 0xa582-a583
> >
> > ens1f1: flags=4163  mtu 9000
> > inet 10.11.14.17  netmask 255.255.255.0  broadcast
> > 10.11.14.255
> > inet6 fe80::9640:c9ff:fe5f:9173  prefixlen 64  scopeid
> > 0x20
> > ether 94:40:c9:5f:91:73  txqueuelen 1000  (Ethernet)
> > RX packets 114453605386  bytes 85545283479332 (77.8 TiB)
> > RX errors 0  dropped 0  overruns 0  frame 0
> > TX packets 236069263519  bytes 283187128321034 (257.5 TiB)
> > TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > device interrupt 132  memory 0xa580-a581
> >
> >
>
> Thanks for sharing this. So as you mentioned there's no issue on the
> network side.
>
> > >
> > > > Feb 09 10:56:05 virt2n iscsid[2790]: Kernel reported iSCSI
> > > > connection
> > > > 1:0 error (1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch)
> > > > state
> > > > (3)
> > > > Feb 09 10:56:05 virt2n iscsid[2790]: Kernel reported iSCSI
> > > > connection
> > > > 2:0 error (1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch)
> > > > state
> > > > (3)
> > > > Feb 09 10:56:10 virt2n iscsid[2790]: connection1:0 is operational
> > > > after recovery (1 attempts)
> > > > Feb 09 10:56:10 virt2n iscsid[2790]: connection2:0 is operational
> > > > after recovery (1 attempts)
> > >
> > > Conn 2:0 recovered but the earlier error it gave looks misleading
> > > to
> > > me. What target controller you have ? What there a failover across
> > > the
> > > nodes ? How does the target handle the transition period to
> > > initiator
> > > queries ?
> > >
> >
> >
> > yes, two paths with failover
> >
>
> So did you trigger a cluster takeover ? My guess is it is your target
> initiating the connection drops, while taking over to the other node.
>
>
There was no intentional cluster takeover. This happens during normal
operation.


> How a target behaves during the transition is left to the target. The
> initiator will keep querying for recovery, until either it times out or
> recovers.
>

Recovery seems to work fine. I've tested it by disconnecting one path to the
target and all was OK: the second path was used, and when the first one
recovered, traffic switched back over to it.


>
> > > >
> > > > When this happens iscsi opration is interupted for few seconds,
> > > > multipath -ll reports (this is only example, i/o pending is
> > > > reported
> > > > dor all 30 dm-s:
> > > >
> > >
> > > This all looks normal to me. If you want the errors to be
> > > suppressed,
> > > you can increase the timeouts. On the other hand, otherwise, you
> > > could
> > > also use dm-multipath on top, like you have shown below.
> > >
> >
> >
> > What can cause this errors? I've multipath setup, but I want to find
> > root cause of this error messages.
> >
>
> I don't know what is causing to trigger those errors on your setup.
>
> Please do note that you have different layers mixed to build that
> storage solution
>
>
> * TCP/IP
> * iSCSI
> * SCSI
> * Device Mapper Multipath
>
> Every layer has errors that it treats differently. For example, a SCSI
> error is perfectly suppressed by the DM-Multipath layer depending on
> how it is setup, to the upper layers.
>
>
I understand this is a very complex problem; what do you suggest I use to
debug these issues? We have 5 hosts connected to the Fujitsu target, but errors
are only on those two which are 

Bug#1031131: open-iscsi: Lot of iscsi/kernel errors in dmesg with Fujitsu Eternus DX100S4 connected with 2 10Gb ethernet paths with multipathd.

2023-02-16 Thread Ritesh Raj Sarraf
Hello,

On Wed, 2023-02-15 at 19:21 +0100, Milan Oravec wrote:
> > > I've lot of this error messages ind dmesg:
> > > 
> > > Feb 09 10:56:05 virt2n kernel:  connection2:0: detected conn
> > > error
> > > (1015)
> > > Feb 09 10:56:05 virt2n iscsid[2790]: connection2:0 is operational
> > > after recovery (1 attempts)
> > > Feb 09 10:56:05 virt2n iscsid[2790]: connection1:0 is operational
> > > after recovery (1 attempts)
> > 
> > Connection dropped and re-established. Looks normal.
> > 
> 
> 
> Yes I understand that message but WHY this happens, what can be the
> cause for this? Link is stable and no errors are reported on ifaces
> connected to SAN switch:
> 

That is mostly to analyze on the target side or on the network, to
start with.

Usually, in a multinode clustered target setup, when a takeover
happens, connections from node1 will be dropped for a moment while it
is taken over by node2.

I don't know anything about the Fujitsu target, so I cannot generalize.

> ens1f0: flags=4163  mtu 9000
>         inet 10.11.12.17  netmask 255.255.255.0  broadcast
> 10.11.12.255
>         inet6 fe80::9640:c9ff:fe5f:9172  prefixlen 64  scopeid
> 0x20
>         ether 94:40:c9:5f:91:72  txqueuelen 1000  (Ethernet)
>         RX packets 184466309231  bytes 146589180968929 (133.3 TiB)
>         RX errors 0  dropped 0  overruns 0  frame 0
>         TX packets 86121730517  bytes 284905539017306 (259.1 TiB)
>         TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>         device interrupt 102  memory 0xa582-a583  
> 
> ens1f1: flags=4163  mtu 9000
>         inet 10.11.14.17  netmask 255.255.255.0  broadcast
> 10.11.14.255
>         inet6 fe80::9640:c9ff:fe5f:9173  prefixlen 64  scopeid
> 0x20
>         ether 94:40:c9:5f:91:73  txqueuelen 1000  (Ethernet)
>         RX packets 114453605386  bytes 85545283479332 (77.8 TiB)
>         RX errors 0  dropped 0  overruns 0  frame 0
>         TX packets 236069263519  bytes 283187128321034 (257.5 TiB)
>         TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>         device interrupt 132  memory 0xa580-a581  
> 
>  

Thanks for sharing this. So as you mentioned there's no issue on the
network side.
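
Although, since the reported error is a data digest (CRC) mismatch, it may
still be worth capturing a window of iSCSI traffic on one path and looking
at it in wireshark, which can decode iSCSI; the interface counters alone
would not show in-flight corruption. A sketch, assuming the default iSCSI
port 3260:

  tcpdump -i ens1f0 -s 0 -w /tmp/iscsi-path1.pcap 'port 3260'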

> > 
> > > Feb 09 10:56:05 virt2n iscsid[2790]: Kernel reported iSCSI
> > > connection
> > > 1:0 error (1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch)
> > > state
> > > (3)
> > > Feb 09 10:56:05 virt2n iscsid[2790]: Kernel reported iSCSI
> > > connection
> > > 2:0 error (1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch)
> > > state
> > > (3)
> > > Feb 09 10:56:10 virt2n iscsid[2790]: connection1:0 is operational
> > > after recovery (1 attempts)
> > > Feb 09 10:56:10 virt2n iscsid[2790]: connection2:0 is operational
> > > after recovery (1 attempts)
> > 
> > Conn 2:0 recovered but the earlier error it gave looks misleading
> > to
> > me. What target controller you have ? What there a failover across
> > the
> > nodes ? How does the target handle the transition period to
> > initiator
> > queries ?
> > 
> 
> 
> yes, two paths with failover
>  

So did you trigger a cluster takeover ? My guess is it is your target
initiating the connection drops, while taking over to the other node.

How a target behaves during the transition is left to the target. The
initiator will keep querying for recovery, until either it times out or
recovers.

> > > 
> > > When this happens iscsi opration is interupted for few seconds,
> > > multipath -ll reports (this is only example, i/o pending is
> > > reported
> > > dor all 30 dm-s:
> > > 
> > 
> > This all looks normal to me. If you want the errors to be
> > suppressed,
> > you can increase the timeouts. On the other hand, otherwise, you
> > could
> > also use dm-multipath on top, like you have shown below.
> > 
> 
> 
> What can cause this errors? I've multipath setup, but I want to find
> root cause of this error messages.
>  

I don't know what is triggering those errors on your setup.

Please do note that you have several different layers combined to build that
storage solution:


* TCP/IP
* iSCSI
* SCSI
* Device Mapper Multipath

Every layer has errors that it treats differently. For example, a SCSI
error can be completely hidden from the upper layers by the DM-Multipath
layer, depending on how it is set up.
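
The knobs for that behaviour live in /etc/multipath.conf. As a sketch only
(the actual values should come from Fujitsu's recommendations for the
Eternus; vendor/product strings taken from your multipath -ll output):

  defaults {
      polling_interval   5        # seconds between path checker runs
      no_path_retry      queue    # keep queueing IO while all paths in a map are down
  }
  devices {
      device {
          vendor               "FUJITSU"
          product              "ETERNUS_DXL"
          path_grouping_policy group_by_prio
          prio                 alua       # matches the 'hwhandler=1 alua' in your output
          failback             immediate
      }
  }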


> > 
> > > 
> > > web3v (360e00d2c002c17720012) dm-52
> > > FUJITSU,ETERNUS_DXL
> > > size=500G features='3 queue_if_no_path queue_mode mq'
> > > hwhandler='1
> > > alua' wp=rw
> > > `-+- policy='service-time 0' prio=10 status=active
> > >   |- 1:0:0:16 sdah 66:16  active i/o pending running
> > >   `- 2:0:0:16 sdai 66:32  active ready running
> > > ais_app (360e00d2c002c1772001f) dm-99
> > > FUJITSU,ETERNUS_DXL
> > > size=500G features='3 queue_if_no_path queue_mode mq'
> > > hwhandler='1
> > > alua' wp=rw
> > > `-+- policy='service-time 0' prio=10 status=active
> > >   |- 1:0:0:28 sdbh 67:176 active i/o pending running
> > >   `- 2:0:0:28 sdbf 67:144 active ready running
> > > 
> > 
> > Looks 

Bug#1031131: open-iscsi: Lot of iscsi/kernel errors in dmesg with Fujitsu Eternus DX100S4 connected with 2 10Gb ethernet paths with multipathd.

2023-02-15 Thread Milan Oravec
On Tue, Feb 14, 2023 at 1:04 PM Ritesh Raj Sarraf  wrote:

> Hello Migo,
>
>
Hello Ritesh Raj Sarraf,

> On Sun, 2023-02-12 at 11:29 +0100, migo wrote:
> > Package: open-iscsi
> > Version: 2.1.3-5
> > Severity: normal
> > X-Debbugs-Cc: migo.ora...@gmail.com
> >
> > Dear Maintainer,
> >
> > I want to apologize if this is not the right place to report this
> > issue, but I need somewhere to start.
> >
> > I've lot of this error messages ind dmesg:
> >
> > Feb 09 10:56:05 virt2n kernel:  connection2:0: detected conn error
> > (1015)
> > Feb 09 10:56:05 virt2n iscsid[2790]: connection2:0 is operational
> > after recovery (1 attempts)
> > Feb 09 10:56:05 virt2n iscsid[2790]: connection1:0 is operational
> > after recovery (1 attempts)
>
> Connection dropped and re-established. Looks normal.
>

Yes, I understand that message, but WHY does this happen; what can be the
cause? The link is stable and no errors are reported on the interfaces
connected to the SAN switch:

ens1f0: flags=4163  mtu 9000
inet 10.11.12.17  netmask 255.255.255.0  broadcast 10.11.12.255
inet6 fe80::9640:c9ff:fe5f:9172  prefixlen 64  scopeid 0x20
ether 94:40:c9:5f:91:72  txqueuelen 1000  (Ethernet)
RX packets 184466309231  bytes 146589180968929 (133.3 TiB)
RX errors 0  dropped 0  overruns 0  frame 0
TX packets 86121730517  bytes 284905539017306 (259.1 TiB)
TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
device interrupt 102  memory 0xa582-a583

ens1f1: flags=4163  mtu 9000
inet 10.11.14.17  netmask 255.255.255.0  broadcast 10.11.14.255
inet6 fe80::9640:c9ff:fe5f:9173  prefixlen 64  scopeid 0x20
ether 94:40:c9:5f:91:73  txqueuelen 1000  (Ethernet)
RX packets 114453605386  bytes 85545283479332 (77.8 TiB)
RX errors 0  dropped 0  overruns 0  frame 0
TX packets 236069263519  bytes 283187128321034 (257.5 TiB)
TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
device interrupt 132  memory 0xa580-a581
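
The NICs' low-level counters can be checked as well, since CRC or discard
problems sometimes only show up there and not in the ifconfig totals; for
example:

ethtool -S ens1f0 | grep -iE 'err|drop|crc|discard'   # per-driver statistics
ethtool ens1f0 | grep -iE 'Speed|Duplex|Link'         # negotiated speed/duplex and link state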



>
> > Feb 09 10:56:05 virt2n iscsid[2790]: Kernel reported iSCSI connection
> > 1:0 error (1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch) state
> > (3)
> > Feb 09 10:56:05 virt2n iscsid[2790]: Kernel reported iSCSI connection
> > 2:0 error (1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch) state
> > (3)
> > Feb 09 10:56:10 virt2n iscsid[2790]: connection1:0 is operational
> > after recovery (1 attempts)
> > Feb 09 10:56:10 virt2n iscsid[2790]: connection2:0 is operational
> > after recovery (1 attempts)
>
> Conn 2:0 recovered but the earlier error it gave looks misleading to
> me. What target controller you have ? What there a failover across the
> nodes ? How does the target handle the transition period to initiator
> queries ?
>

yes, two paths with failover


>
> > Feb 09 10:56:28 virt2n kernel:  connection1:0: detected conn error
> > (1015)
> > Feb 09 10:56:28 virt2n kernel:  connection2:0: detected conn error
> > (1015)
> > Feb 09 10:56:28 virt2n iscsid[2790]: Kernel reported iSCSI connection
> > 1:0 error (1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch) state
> > (3)
> > Feb 09 10:56:28 virt2n iscsid[2790]: Kernel reported iSCSI connection
> > 2:0 error (1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch) state
> > (3)
> > Feb 09 10:56:33 virt2n kernel:  connection1:0: detected conn error
> > (1015)
> > Feb 09 10:56:33 virt2n kernel:  connection2:0: detected conn error
> > (1015)
> > Feb 09 10:56:33 virt2n iscsid[2790]: connection2:0 is operational
> > after recovery (1 attempts)
> > Feb 09 10:56:33 virt2n iscsid[2790]: connection1:0 is operational
> > after recovery (1 attempts)
>
> Same logs again.
>
> > ... snipped ...
>
> > or enventually this:
> >
> > Feb 10 20:39:39 virt2n kernel:  connection1:0: ping timeout of 5 secs
> > expired, recv timeout 5, last rx 19077132455, last ping 19077133712,
> > now 19077134976
> > Feb 10 20:39:39 virt2n kernel:  connection1:0: detected conn error
> > (1022)
> > Feb 10 20:39:39 virt2n kernel: scsi_io_completion_action: 11
> > callbacks suppressed
> > Feb 10 20:39:39 virt2n kernel: sd 1:0:0:5: [sdl] tag#122 FAILED
> > Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> > Feb 10 20:39:39 virt2n kernel: sd 1:0:0:5: [sdl] tag#122 CDB: Test
> > Unit Ready 00 00 00 00 00 00
> > Feb 10 20:39:39 virt2n kernel: sd 1:0:0:16: [sdah] tag#103 FAILED
> > Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> > Feb 10 20:39:39 virt2n kernel: sd 1:0:0:16: [sdah] tag#103 CDB: Test
> > Unit Ready 00 00 00 00 00 00
> > Feb 10 20:39:39 virt2n kernel: sd 1:0:0:24: [sdax] tag#115 FAILED
> > Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> > Feb 10 20:39:39 virt2n kernel: sd 1:0:0:24: [sdax] tag#115 CDB: Test
> > Unit Ready 00 00 00 00 00 00
> >
> > When this happens iscsi opration is interupted for few seconds,
> > multipath -ll reports (this is only example, i/o pending is reported
> > dor all 30 dm-s:
> >
>
> This all looks normal to me. If 

Bug#1031131: open-iscsi: Lot of iscsi/kernel errors in dmesg with Fujitsu Eternus DX100S4 connected with 2 10Gb ethernet paths with multipathd.

2023-02-14 Thread Ritesh Raj Sarraf
Hello Migo,

On Sun, 2023-02-12 at 11:29 +0100, migo wrote:
> Package: open-iscsi
> Version: 2.1.3-5
> Severity: normal
> X-Debbugs-Cc: migo.ora...@gmail.com
> 
> Dear Maintainer,
> 
> I want to apologize if this is not the right place to report this
> issue, but I need somewhere to start.
> 
> I've lot of this error messages ind dmesg:
> 
> Feb 09 10:56:05 virt2n kernel:  connection2:0: detected conn error
> (1015)
> Feb 09 10:56:05 virt2n iscsid[2790]: connection2:0 is operational
> after recovery (1 attempts)
> Feb 09 10:56:05 virt2n iscsid[2790]: connection1:0 is operational
> after recovery (1 attempts)

Connection dropped and re-established. Looks normal.

> Feb 09 10:56:05 virt2n iscsid[2790]: Kernel reported iSCSI connection
> 1:0 error (1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch) state
> (3)
> Feb 09 10:56:05 virt2n iscsid[2790]: Kernel reported iSCSI connection
> 2:0 error (1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch) state
> (3)
> Feb 09 10:56:10 virt2n iscsid[2790]: connection1:0 is operational
> after recovery (1 attempts)
> Feb 09 10:56:10 virt2n iscsid[2790]: connection2:0 is operational
> after recovery (1 attempts)

Conn 2:0 recovered, but the earlier error it gave looks misleading to
me. What target controller do you have? Was there a failover across the
nodes? How does the target handle the transition period for initiator
queries?
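
On the initiator side, you can also inspect the session state and error
counters directly; a sketch (the stats display, if your iscsiadm build
supports it, includes digest and timeout error counters):

  iscsiadm -m session -P 3          # connection state and negotiated parameters
  iscsiadm -m session -r 1 --stats  # per-session statistics; -r takes the session id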

> Feb 09 10:56:28 virt2n kernel:  connection1:0: detected conn error
> (1015)
> Feb 09 10:56:28 virt2n kernel:  connection2:0: detected conn error
> (1015)
> Feb 09 10:56:28 virt2n iscsid[2790]: Kernel reported iSCSI connection
> 1:0 error (1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch) state
> (3)
> Feb 09 10:56:28 virt2n iscsid[2790]: Kernel reported iSCSI connection
> 2:0 error (1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch) state
> (3)
> Feb 09 10:56:33 virt2n kernel:  connection1:0: detected conn error
> (1015)
> Feb 09 10:56:33 virt2n kernel:  connection2:0: detected conn error
> (1015)
> Feb 09 10:56:33 virt2n iscsid[2790]: connection2:0 is operational
> after recovery (1 attempts)
> Feb 09 10:56:33 virt2n iscsid[2790]: connection1:0 is operational
> after recovery (1 attempts)

Same logs again.

> ... snipped ...

> or enventually this:
> 
> Feb 10 20:39:39 virt2n kernel:  connection1:0: ping timeout of 5 secs
> expired, recv timeout 5, last rx 19077132455, last ping 19077133712,
> now 19077134976
> Feb 10 20:39:39 virt2n kernel:  connection1:0: detected conn error
> (1022)
> Feb 10 20:39:39 virt2n kernel: scsi_io_completion_action: 11
> callbacks suppressed
> Feb 10 20:39:39 virt2n kernel: sd 1:0:0:5: [sdl] tag#122 FAILED
> Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> Feb 10 20:39:39 virt2n kernel: sd 1:0:0:5: [sdl] tag#122 CDB: Test
> Unit Ready 00 00 00 00 00 00
> Feb 10 20:39:39 virt2n kernel: sd 1:0:0:16: [sdah] tag#103 FAILED
> Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> Feb 10 20:39:39 virt2n kernel: sd 1:0:0:16: [sdah] tag#103 CDB: Test
> Unit Ready 00 00 00 00 00 00
> Feb 10 20:39:39 virt2n kernel: sd 1:0:0:24: [sdax] tag#115 FAILED
> Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> Feb 10 20:39:39 virt2n kernel: sd 1:0:0:24: [sdax] tag#115 CDB: Test
> Unit Ready 00 00 00 00 00 00
> 
> When this happens iscsi opration is interupted for few seconds,
> multipath -ll reports (this is only example, i/o pending is reported
> dor all 30 dm-s:
> 

This all looks normal to me. If you want the errors to be suppressed,
you can increase the timeouts. Otherwise, you could also rely on
dm-multipath on top, like you have shown below.
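
The relevant knobs are in /etc/iscsi/iscsid.conf (a sketch; the values shown
are the usual defaults):

  node.conn[0].timeo.noop_out_interval = 5      # seconds between NOP-Out pings
  node.conn[0].timeo.noop_out_timeout = 5       # seconds to wait for the NOP-In reply
  node.session.timeo.replacement_timeout = 120  # how long before failed IO is reported upwards

With dm-multipath on top, though, the usual advice is to keep or even lower
replacement_timeout rather than raise it, so path failures get handed to
multipathd quickly.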

> 
> web3v (360e00d2c002c17720012) dm-52 FUJITSU,ETERNUS_DXL
> size=500G features='3 queue_if_no_path queue_mode mq' hwhandler='1
> alua' wp=rw
> `-+- policy='service-time 0' prio=10 status=active
>   |- 1:0:0:16 sdah 66:16  active i/o pending running
>   `- 2:0:0:16 sdai 66:32  active ready running
> ais_app (360e00d2c002c1772001f) dm-99 FUJITSU,ETERNUS_DXL
> size=500G features='3 queue_if_no_path queue_mode mq' hwhandler='1
> alua' wp=rw
> `-+- policy='service-time 0' prio=10 status=active
>   |- 1:0:0:28 sdbh 67:176 active i/o pending running
>   `- 2:0:0:28 sdbf 67:144 active ready running
> 

This all looks correct to me. You are already using the queue_if_no_path feature.

> SAN consists of one Fujitsu DX100 S4 storage with two controllers
> connected to LAN switch with two 10Gb fibre links (one from
> controller), each link has its own VLAN configured. Errors reported
> ocures on virtualization host that are connected mutipath with two
> 10Gb fibre links to respective VLANs. Jumbo frames are enabled across
> the way.
> 

Good. As I expected, you do have a two-node target controller setup.

> I'll add all needed info upon request.
> 
> I've consulted this issue with a Fujitsu representative and it seems we
> have it all configured right, and he advised me to contact Debian
> support. So I'm here and would kindly ask to point me the 

Bug#1031131: open-iscsi: Lot of iscsi/kernel errors in dmesg with Fujitsu Eternus DX100S4 connected with 2 10Gb ethernet paths with multipathd.

2023-02-12 Thread migo
Package: open-iscsi
Version: 2.1.3-5
Severity: normal
X-Debbugs-Cc: migo.ora...@gmail.com

Dear Maintainer,

I want to apologize if this is not the right place to report this issue, but I 
need somewhere to start.

I've got a lot of these error messages in dmesg:

Feb 09 10:56:05 virt2n kernel:  connection2:0: detected conn error (1015)
Feb 09 10:56:05 virt2n iscsid[2790]: connection2:0 is operational after 
recovery (1 attempts)
Feb 09 10:56:05 virt2n iscsid[2790]: connection1:0 is operational after 
recovery (1 attempts)
Feb 09 10:56:05 virt2n iscsid[2790]: Kernel reported iSCSI connection 1:0 error 
(1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch) state (3)
Feb 09 10:56:05 virt2n iscsid[2790]: Kernel reported iSCSI connection 2:0 error 
(1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch) state (3)
Feb 09 10:56:10 virt2n iscsid[2790]: connection1:0 is operational after 
recovery (1 attempts)
Feb 09 10:56:10 virt2n iscsid[2790]: connection2:0 is operational after 
recovery (1 attempts)
Feb 09 10:56:28 virt2n kernel:  connection1:0: detected conn error (1015)
Feb 09 10:56:28 virt2n kernel:  connection2:0: detected conn error (1015)
Feb 09 10:56:28 virt2n iscsid[2790]: Kernel reported iSCSI connection 1:0 error 
(1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch) state (3)
Feb 09 10:56:28 virt2n iscsid[2790]: Kernel reported iSCSI connection 2:0 error 
(1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch) state (3)
Feb 09 10:56:33 virt2n kernel:  connection1:0: detected conn error (1015)
Feb 09 10:56:33 virt2n kernel:  connection2:0: detected conn error (1015)
Feb 09 10:56:33 virt2n iscsid[2790]: connection2:0 is operational after 
recovery (1 attempts)
Feb 09 10:56:33 virt2n iscsid[2790]: connection1:0 is operational after 
recovery (1 attempts)
Feb 09 10:56:33 virt2n iscsid[2790]: Kernel reported iSCSI connection 1:0 error 
(1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch) state (3)
Feb 09 10:56:33 virt2n iscsid[2790]: Kernel reported iSCSI connection 2:0 error 
(1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch) state (3)
Feb 09 10:56:38 virt2n kernel:  connection2:0: detected conn error (1015)
Feb 09 10:56:38 virt2n kernel:  connection1:0: detected conn error (1015)
Feb 09 10:56:38 virt2n iscsid[2790]: connection1:0 is operational after 
recovery (1 attempts)
Feb 09 10:56:38 virt2n iscsid[2790]: connection2:0 is operational after 
recovery (1 attempts)
Feb 09 10:56:38 virt2n iscsid[2790]: Kernel reported iSCSI connection 2:0 error 
(1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch) state (3)
Feb 09 10:56:38 virt2n iscsid[2790]: Kernel reported iSCSI connection 1:0 error 
(1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch) state (3)
Feb 09 10:56:43 virt2n iscsid[2790]: connection2:0 is operational after 
recovery (1 attempts)
Feb 09 10:56:43 virt2n iscsid[2790]: connection1:0 is operational after 
recovery (1 attempts)
Feb 09 10:56:43 virt2n kernel:  connection1:0: detected conn error (1015)
Feb 09 10:56:43 virt2n kernel:  connection2:0: detected conn error (1015)
Feb 09 10:56:44 virt2n iscsid[2790]: Kernel reported iSCSI connection 1:0 error 
(1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch) state (3)
Feb 09 10:56:44 virt2n iscsid[2790]: Kernel reported iSCSI connection 2:0 error 
(1015 - ISCSI_ERR_DATA_DGST: Data digest mismatch) state (3)
Feb 09 10:56:49 virt2n iscsid[2790]: connection1:0 is operational after 
recovery (1 attempts)
Feb 09 10:56:49 virt2n iscsid[2790]: connection2:0 is operational after 
recovery (1 attempts)
Feb 09 10:57:00 virt2n kernel:  connection1:0: detected conn error (1015) 

or eventually this:

Feb 10 20:39:39 virt2n kernel:  connection1:0: ping timeout of 5 secs expired, 
recv timeout 5, last rx 19077132455, last ping 19077133712, now 19077134976
Feb 10 20:39:39 virt2n kernel:  connection1:0: detected conn error (1022)
Feb 10 20:39:39 virt2n kernel: scsi_io_completion_action: 11 callbacks 
suppressed
Feb 10 20:39:39 virt2n kernel: sd 1:0:0:5: [sdl] tag#122 FAILED Result: 
hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
Feb 10 20:39:39 virt2n kernel: sd 1:0:0:5: [sdl] tag#122 CDB: Test Unit Ready 
00 00 00 00 00 00
Feb 10 20:39:39 virt2n kernel: sd 1:0:0:16: [sdah] tag#103 FAILED Result: 
hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
Feb 10 20:39:39 virt2n kernel: sd 1:0:0:16: [sdah] tag#103 CDB: Test Unit Ready 
00 00 00 00 00 00
Feb 10 20:39:39 virt2n kernel: sd 1:0:0:24: [sdax] tag#115 FAILED Result: 
hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
Feb 10 20:39:39 virt2n kernel: sd 1:0:0:24: [sdax] tag#115 CDB: Test Unit Ready 
00 00 00 00 00 00
Feb 10 20:39:39 virt2n kernel: sd 1:0:0:29: [sdbi] tag#118 FAILED Result: 
hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
Feb 10 20:39:39 virt2n kernel: sd 1:0:0:29: [sdbi] tag#118 CDB: Test Unit Ready 
00 00 00 00 00 00
Feb 10 20:39:39 virt2n kernel: sd 1:0:0:21: [sdaq] tag#108 FAILED Result: 
hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
Feb 10 20:39:39 virt2n kernel: sd 1:0:0:21: [sdaq] tag#108 CDB: