On Mon, 08 Oct 2007 17:36:16 +0200 Erez Zilber <[EMAIL PROTECTED]> wrote:
> FUJITA Tomonori wrote:
> > On Thu, 4 Oct 2007 13:20:35 -0400
> > Pete Wyckoff <[EMAIL PROTECTED]> wrote:
> >
> >> [EMAIL PROTECTED] wrote on Sun, 09 Sep 2007 14:12 -0400:
> >>
> >>> [EMAIL PROTECTED] wrote on Sun, 09 Sep 2007 11:30 -0400:
> >>>
> >>>> Summary:
> >>>>  - 2.6.21 seems to be a good kernel. 2.6.22 or newer, or RedHat's OFED 1.2
> >>>>    patched kernels all seem to have iSER bugs that make them unusable.
> >>>>  - as everything works in 2.6.21 presumably this means there's nothing
> >>>>    wrong with the iSER implementation in tgtd. well done! :)
> >>>>
> >>> Well, that's good and bad news. Nice to know that things do work at times,
> >>> but we have to figure out what happened in the initiator now. Or maybe tgt
> >>> is making some bad assumptions.
> >>>
> >> This all turned out to be a known bug in the mthca IB driver in
> >> kernels older than 2.6.21. Including the rhel5 kernel. The
> >> initiator uses FMR for memory registrations, and a certain popular
> >> chipset was prone to random scribbling on old registrations,
> >> yielding wrong data in the application or unexplainable kernel
> >> crashes. Nothing wrong in the target.
> >>
> >>>> with the 2.6.22.6 kernel and iSER I couldn't find any corruption
> >>>> issues using dd to /dev/sdc. however (as reported previously) if I put
> >>>> an ext3 filesystem on the iSER device and then dd to a file in the ext3
> >>>> filsystem then pretty much immediately I get:
> >>>>  Sep 9 21:46:22 x11 kernel: EXT3-fs error (device sdc):
> >>>>  ext3_new_block: Allocating block in system zone - blocks from 196611,
> >>>>  length 1
> >>>>  Sep 9 21:46:22 x11 kernel: EXT3-fs error (device sdc):
> >>>>  ext3_new_block: Allocating block in system zone - blocks from 196612,
> >>>>  length 1
> >>>>  Sep 9 21:46:22 x11 kernel: EXT3-fs error (device sdc):
> >>>>  ext3_new_block: Allocating block in system zone - blocks from 196613,
> >>>>  length 1
> >>>>  ...
> >>>>
> >>>> I get the same type of errors with 2.6.23-rc5 too.
> >>>>
> >>> I'm still not been able to reproduce this, at least on my
> >>> 2.6.22-rc5. One of these days we'll move to some newer kernels
> >>> here, but have been sort of waiting for the bidi approaches to
> >>> stabilize somewhat.
> >>>
> >> Maybe this is fixed. I did find one possible case where the Send
> >> result may have gone out before the final RDMA write, in the case
> >> when the target is starved for RDMA slots. But I never saw the
> >> problem myself, so can't say for sure.
> >>
> >> In fact, I hacked up the bs-sync code to calculate the result
> >> expected by the test application lmdd, rather than read it off disk,
> >> and could achieve your high throughputs but never any corruptions.
> >> It ran all night last night.
> >>
> >> Anyway, there's a new git out there with this one new patch and some
> >> kernel initiator warnings in the README.iser doc.
> >>
> >
> > Sounds promising. voltaire guys, any chance to try Pete's latest tree?
>
> We ran some tests on it and it looks ok now (still trying to make it
> crash :-) ). We will run more nasty tests soon, and if anything goes
> wrong, we will report. We will also try to get some performance numbers
> (BW, iops) from our storage.

Cool. Pete, is the iSER patchset ready for re-submission?

BTW, can you elaborate on the following commit?

http://git.osc.edu/?p=tgt.git;a=commit;h=8d9eae7acd041fc10a7cfe560c1c280dcc290fa1

What type of commands hit this bug?

Thanks,
