Re: DataDigest CRC mismatches
On Wed, Jul 15, 2009 at 07:14:40PM -0700, mala...@us.ibm.com wrote:
> Seems to be a well known problem with iSCSI data digests and mirrored devices.
>
> See this (iscsi issue):
> http://thread.gmane.org/gmane.linux.iscsi.open-iscsi/2670/focus=48961
> or (dm-raid1 issue)
>
> http://thread.gmane.org/gmane.linux.kernel.device-mapper.devel/5392

That clearly looks like it. Ouch. Looks very difficult to fix without
copying the page data before transmission/checksumming. :-(

I'll do more reading.

Thanks again,

Mark.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups "open-iscsi" group.
To post to this group, send email to open-iscsi@googlegroups.com
To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/open-iscsi
-~--~~~~--~~--~--~---
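For readers of the archive, the "copy the page data before transmission/checksumming" idea can be sketched roughly as follows. This is a user-space illustration in Python, not the open-iscsi code: `send_with_private_copy` and the `writer` callback are hypothetical names, and the callback merely stands in for whatever modifies the page while the write is in flight.

```python
from typing import Callable, Optional, Tuple

CRC32C_POLY_REFLECTED = 0x82F63B78  # Castagnoli polynomial, bit-reflected


def crc32c(data: bytes) -> int:
    """Bitwise CRC32C, the checksum iSCSI uses for its digests."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def send_with_private_copy(
    page: bytearray,
    writer: Optional[Callable[[bytearray], None]] = None,
) -> Tuple[bytes, int]:
    """Hypothetical sender: snapshot the page, then digest and send the snapshot."""
    snapshot = bytes(page)        # private copy that nobody else can modify
    digest = crc32c(snapshot)     # digest is computed over the copy ...
    if writer is not None:
        writer(page)              # ... so a late write to the page is harmless
    wire = snapshot               # and the copy, not the page, goes on the wire
    return wire, digest


def scribble(page: bytearray) -> None:
    """Hypothetical late writer that touches the original page."""
    page[0] ^= 0xFF


wire, digest = send_with_private_copy(bytearray(b"B" * 512), scribble)
print("digest matches wire data:", crc32c(wire) == digest)
```

The digest and the transmitted bytes always agree because both come from the same snapshot; the price is one extra copy of every page written, which is the cost lamented above.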
Re: DataDigest CRC mismatches
Mark van Walraven [ma...@netvalue.net.nz] wrote:
>
> Do you mean a well known problem with zero-copy block devices or a well
> known problem with iscsi with data digests?

Seems to be a well known problem with iSCSI data digests and mirrored devices.

See this (iscsi issue):
http://thread.gmane.org/gmane.linux.iscsi.open-iscsi/2670/focus=48961
or (dm-raid1 issue)
http://thread.gmane.org/gmane.linux.kernel.device-mapper.devel/5392

Thanks, Malahal.
Re: DataDigest CRC mismatches
Hi Malahal,

Thanks for your response.

On Wed, Jul 15, 2009 at 09:41:58AM -0700, mala...@us.ibm.com wrote:
> > I've been playing with kvm (virtio_blk, writeback) -> dm_multipath
> > (failover, queue_if_no_path) -> open-iscsi -> gigE -> IET on a new
> > server, winding up the queue and segment lengths and I'm getting
> > frequent disconnects during heavy writes from the KVM guest. Wireshark
> > shows a PDU with an incorrect DataDigest (sample at
> > http://www.interspeed.co.nz/crcerr.pcap for a little while) just before
> > IET resets the connection (reasonably, if it gets the same CRC mis-match).
>
> What kind of application are you using to generate the write I/O? It is

The application is KVM (qemu-kvm-0.10.5), running a single Debian Lenny
instance with the iscsi device visible to the guest as a virtio disk.
I've found running this on the guest is a pretty reliable way to produce
the problem:

    find / > /dev/null ; sync

The guest filesystems are ext3, so presumably journal flushes are the
trigger ...

> possible that a file system (or some other application) can modify the
> write buffer once it is submitted to the block layer. Any modification
> done after generating CRC is going to cause CRC mismatch. This is a well
> known problem!

Do you mean a well known problem with zero-copy block devices or a well
known problem with iscsi with data digests?

I've been trawling through the code and, if I understand correctly,
iscsi_sw_tcp_xmit_segment() uses iscsi_tcp_segment_done() to calculate
the digest after each sendpage or sendmsg. Do you think the segment data
might be getting modified in between sendpage/sendmsg and packet assembly?
(sendpage looks to be sock_no_sendpage if data digests are enabled.)

If access to the segment data isn't exclusive during the execution of
iscsi_sw_tcp_xmit_segment(), then I suppose there is also the chance that
the data might be altered between sendpage/sendmsg and crypto_hash_final()
completing.

(FWIW, the same thing happens with 871, built from source.)

Thanks,

Mark.
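For readers of the archive, the window described above can be reproduced in a small single-threaded simulation. This is a sketch in Python, not the kernel code: `xmit_segment`, `writer`, and `dirty_third_chunk` are hypothetical names, the chunking is an assumption, and the concurrent writer is modelled as a callback that runs between "send" and "digest update".

```python
from typing import Callable, Optional, Tuple

CRC32C_POLY_REFLECTED = 0x82F63B78  # Castagnoli polynomial, bit-reflected


def crc32c_update(state: int, data: bytes) -> int:
    """Feed more bytes into a running (not yet finalised) CRC32C state."""
    for byte in data:
        state ^= byte
        for _ in range(8):
            state = (state >> 1) ^ (CRC32C_POLY_REFLECTED if state & 1 else 0)
    return state


def crc32c(data: bytes) -> int:
    """One-shot CRC32C over a complete buffer."""
    return crc32c_update(0xFFFFFFFF, data) ^ 0xFFFFFFFF


def xmit_segment(
    page: bytearray,
    chunk_size: int,
    writer: Optional[Callable[[bytearray, int], None]] = None,
) -> Tuple[bytes, int]:
    """Hypothetical sender: send each chunk first, fold it into the digest after."""
    wire = bytearray()
    state = 0xFFFFFFFF
    for off in range(0, len(page), chunk_size):
        wire += page[off:off + chunk_size]      # bytes as they leave the host
        if writer is not None:
            writer(page, off)                   # someone else touches the page
        state = crc32c_update(state, bytes(page[off:off + chunk_size]))
    return bytes(wire), state ^ 0xFFFFFFFF


def dirty_third_chunk(page: bytearray, off: int) -> None:
    """Hypothetical writer: flip one byte after the third chunk was sent."""
    if off == 2048:
        page[off] ^= 0xFF


wire, digest = xmit_segment(bytearray(b"A" * 4096), 1024)
print("undisturbed page, digest ok:", crc32c(wire) == digest)

wire, digest = xmit_segment(bytearray(b"A" * 4096), 1024, dirty_third_chunk)
print("page modified in flight, digest ok:", crc32c(wire) == digest)
```

In the second run the receiver sees the old bytes but a digest computed over the new ones, so the target has every reason to reject the PDU. The same mismatch would appear if the write landed before the send but after the digest update; only the direction of the disagreement changes.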
Re: DataDigest CRC mismatches
Mark van Walraven [ma...@netvalue.net.nz] wrote:
>
> Hi All,
>
> I've been playing with kvm (virtio_blk, writeback) -> dm_multipath
> (failover, queue_if_no_path) -> open-iscsi -> gigE -> IET on a new
> server, winding up the queue and segment lengths and I'm getting
> frequent disconnects during heavy writes from the KVM guest. Wireshark
> shows a PDU with an incorrect DataDigest (sample at
> http://www.interspeed.co.nz/crcerr.pcap for a little while) just before
> IET resets the connection (reasonably, if it gets the same CRC mis-match).

What kind of application are you using to generate the write I/O? It is
possible that a file system (or some other application) can modify the
write buffer once it is submitted to the block layer. Any modification
done after generating the CRC is going to cause a CRC mismatch. This is a
well known problem!

-Malahal.