Replies in line.

On Mon, Apr 15, 2013 at 9:09 AM, Ian Campbell <i...@hellion.org.uk> wrote:
> On Mon, 2013-04-15 at 08:19 -0400, Anthony Sheetz wrote:
> > > Did you ever happen to try a transfer over a non-tunnelled
> > > connection?
> >
> > Yes, tried file transfers from another machine on the local network -
> > never had a problem with those.
>
> So this issue isn't the tunnel, good.
>
> > > Were you able to successfully transfer the file to the dom0
> > > filesystem or to any other system (e.g. one not running Xen) on
> > > this end of the openswan link?
> >
> > Yes - tried that several times, and was able to do the transfer with
> > no corruption, and md5sum matched.
> >
> > > I'm not sure what error detection/correction scp/rsync or if they
> > > have any additional verification options which could be tried or
> > > perhaps it is possible to run md5sum on the stream before it hits
> > > the disk (can one rsync/scp to stdout? I doubt it).
> >
> > Tried doing 'scp file.sql | md5sum' on DomU which resulted in a
> > matching md5sum. We decided this eliminated the openswan link as the
> > culprit.
>
> This was in the domU? That would, I think, eliminate corruption in the
> network at every stage including the dom0->domU link.
>
> That would suggest that the md5sum failures you saw before were caused
> by writing the file to disk and reading it back (which does at least
> mean we only have one bug to deal with...)
>
> > > If you can transfer to dom0 OK then it might be interesting to try
> > > turning off the various offloads (GSO, SG etc) on the vif link.
> >
> > Any instructions on doing that?
>
> The above makes me suspect this isn't a worthwhile experiment but in any
> case:

I'd agree - for now, in the interest of time, we'll shelve this avenue of
investigation.

> "ethtool -k <device>" to examine and "ethtool -K <device> <offload> off"
> to turn the various things off. I'd do it both on the device inside the
> guest and the associated vifX.Y
>
> > > I wonder if the layering of crypto+lvm+xen-blkback is causing the
> > > barriers which ext3 requires to function correctly to not occur in
> > > the right places. Does something need to be manually configured to
> > > enable barriers at some layer? (or perhaps I am thinking of DISCARD
> > > support). If you were able to attempt to reproduce without the
> > > crypto bit in dom0 for the VM disk that would be really useful. It
> > > might also be interesting to try using the ext3 barrier mount
> > > option in the guest to switch barriers either off or on (I can't
> > > remember what the default was for Squeeze).
> >
> > Google led me to try mounting the file system with barriers=0, and no
> > luck.
>
> How did you do this? IIRC getting mount options to the root filesystem
> to take effect involves more than just editing fstab (rootflags= on
> command line I think? No idea how one inserts a space there)

Ah, ok. Did use fstab options. Will look into other methods of
specifying this. I'd imagine editing the boot option in pygrub might be
a good avenue?

> For experimentation it might be useful to attach an xvdb to the domain
> and use that as the write target, it'll allow easier experimentation
> with mount options, and as a bonus you won't keep hosing your root
> filesystem (which I imagine is getting pretty tedious...)

To be sure I understand: create a new lv, mount it, and use it as the
write target. That's an excellent idea. Next time I experiment I'll be
using that.

> > > I appreciate that you may have redeployed/downgraded the systems so
> > > some of the above experiments might be quite hard to try out but if
> > > you could setup a spare system or something it would be very much
> > > appreciated.
> >
> > We planned for this, and once we have some ideas to try (with some
> > detailed instructions for trying them) we'll be purchasing a spare
> > hard drive to try them out. We'd like this problem solved, and we're
> > willing to spend a little to do it.
>
> Other than the barriers thing I think the most worthwhile thing to try
> would be a Wheezy domU kernel.

Ok, will try that. If you've got instructions close to hand on
installing and using a different kernel in domU, that'd save me the
trouble of looking it up. No worries if not - my google-fu is decent.

> Ian.
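For the write-then-read-back theory discussed above, a rough sketch of a check that hashes the data in flight and again after it has been written to the target filesystem might look like the following. The function name check_roundtrip is hypothetical and not from this thread, and it assumes GNU coreutils (tee, md5sum, cut) in the domU.

```shell
#!/bin/sh
# check_roundtrip DEST
# Hypothetical helper (not from the thread): reads data on stdin, hashes it
# in flight, writes it to DEST, then hashes DEST as read back and compares.
check_roundtrip() {
    dest=$1

    # tee writes the bytes to DEST while md5sum hashes the same bytes
    # straight off the pipe, before they have been through the filesystem.
    stream_sum=$(tee "$dest" | md5sum | cut -d' ' -f1)

    # Ask the kernel to flush dirty pages to the block device.
    sync

    # NOTE: this read may still be served from the page cache. For a true
    # read-from-disk test, drop caches first (as root):
    #   echo 3 > /proc/sys/vm/drop_caches
    disk_sum=$(md5sum "$dest" | cut -d' ' -f1)

    if [ "$stream_sum" = "$disk_sum" ]; then
        echo "OK $stream_sum"
    else
        echo "MISMATCH stream=$stream_sum disk=$disk_sum"
        return 1
    fi
}
```

Usage would be something like `ssh user@remote 'cat file.sql' | check_roundtrip /mnt/scratch/file.sql`, where user@remote and /mnt/scratch are placeholders for the far end of the link and the scratch xvdb mount. A MISMATCH line would point at the disk path rather than the network, since both hashes come from the same received bytes.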