On Tuesday 14 December 2004 10:14 pm, Dale Manny wrote:
> David,
> 
> My response is intermingled with your reply; I began writing it while I 
> started taking action.  I would appreciate it if you scanned through 
> most of this, as I have posed a couple of questions in various places.  
> However we have major progress.  Your suggestion pertaining to further 
> restricting the mask seems to have done the trick.  I did this in -bk8.  
> I have not (yet) been able to toggle this on and off for demonstrating 
> direct control as I am remotely logged in to the colleague's machine and 
> doing some bulk work.  Therefore I don't want to reboot.  However I have 
> every reason to believe that this is a 'surgical' fix.

Hmm, OK ... I just submitted a patch making that mask a
module parameter, defaulting to "no park mode" but letting
that mode be enabled.


> I assume that this is disabling what would otherwise be 'a good thing 
> (tm)'.  I look forward to any elaboration you can provide.

Yes, "park mode" is normally a Good Thing.  When I first enabled
it, it made a significant improvement in throughput on NForce2;
but that was a long time ago, and many drivers have changed a lot
since then.  (Notably usb-storage.)

IMO the fact that it makes any difference in terms of data integrity
suggests a hardware bug somewhere, or cache setup problem ...

I just did an experiment with an NForce2 system and a Maxtor drive
(not the one you tried; USB-only, not USB+FireWire), and "hdparm -Tt".
Disabling "park" mode was about a 10% improvement in throughput!
Which is weird at several levels, since even so it's still much
lower than I'd expect on that drive (by about 6 Mbyte/sec; it's
still "slow").  I'm not sure what's up with that, but an NF3 system
seems similar.  PCI and USB bus traces here would be interesting.


> I am willing to continue to work the issue.

I don't have any more bright ideas to try here though.

You seem to have a much-better-than-usual test setup (!),
so I'll be happy if you can just confirm that disabling park
mode resolves the problem you're seeing ... and can provide slightly
better characterization of the pattern of data errors.
(Entire 512 byte USB packets?  64 byte K7 cachelines?)
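
One way to get at that characterization, as a minimal sketch: compare a
known-good copy of the data against what was read back over USB, collect
the contiguous runs of differing bytes, and check whether each run starts
on and spans a multiple of 64 or 512 bytes.  The function names here are
made up for illustration, and the two buffers are assumed to already be in
memory (for instance, the same region read from a trusted image and from
the USB drive).

```python
def diff_runs(expected: bytes, actual: bytes):
    """Return (offset, length) for each contiguous run of differing bytes."""
    runs = []
    start = None
    n = min(len(expected), len(actual))
    for i in range(n):
        if expected[i] != actual[i]:
            if start is None:
                start = i          # a new run of bad bytes begins here
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:          # run extends to the end of the buffer
        runs.append((start, n - start))
    return runs


def classify(runs, sizes=(64, 512)):
    """Tag each run with the unit sizes it is aligned to and a multiple of.

    64 and 512 are the candidate sizes from the question above
    (K7 cacheline, high-speed bulk USB packet).
    """
    report = []
    for offset, length in runs:
        units = [s for s in sizes if offset % s == 0 and length % s == 0]
        report.append({"offset": offset, "length": length, "units": units})
    return report
```

Caveat: if some bytes inside a corrupted region happen to match the
expected data, a single bad packet or cacheline shows up as several shorter
runs, so a test pattern with no repeated values nearby gives cleaner
results.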

 
> I will go to the point of trying a quick change back and forth of the 
> mask.  Is there anything to keep me from stashing a copy of the one 
> module and switching from boot to boot?  I seem to remember something 
> about a new feature of module checksum but I think the default was not 
> to use them.  For that matter can I rmmod and subsequently modprobe to 
> avoid full reboots?

If you don't enable those distro-oriented module options, then there
will be no problem with rmmod/modprobe-new/rmmod/modprobe-old style
stuff; developers do that all the time!  Alternatively, the patch
I just sent should let you "modprobe ehci-hcd park=N" (N <= 3) to
experiment using just one module.
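
A small sketch of scripting that back-and-forth, assuming the patched
driver with the "park" parameter is installed.  Only the N <= 3 upper bound
comes from the patch description above; the lower bound of 0 and the helper
names are assumptions made for this example.  It defaults to a dry run that
only prints the commands; a real run needs root, and any devices behind the
controller will disconnect while the module is unloaded.

```python
import subprocess

MODPROBE_NAME = "ehci-hcd"   # name as used with modprobe above
LOADED_NAME = "ehci_hcd"     # name as listed by lsmod, used for rmmod


def reload_commands(park: int):
    """Build the rmmod/modprobe command pair for a given park value."""
    # Upper bound of 3 is from the patch description; 0 as the lower
    # bound is an assumption of this sketch.
    if not 0 <= park <= 3:
        raise ValueError("park must be between 0 and 3")
    return [
        ["rmmod", LOADED_NAME],
        ["modprobe", MODPROBE_NAME, f"park={park}"],
    ]


def reload_module(park: int, dry_run: bool = True):
    """Print the commands (dry run) or execute them as root."""
    for cmd in reload_commands(park):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling reload_module(3) prints the two commands without touching the
system; reload_module(3, dry_run=False) actually swaps the module.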


> As far as "the raw (uncached) block device", do you mean anything other 
> than simply using /dev/sda?

I don't think that's cached; no.


> >If it's just reads, that could also be a symptom of memory corruption
> >or failures.  Does memtest86 say your memory is OK?  Does anything
> >other than USB show similar problems?
> 
> As you may have gathered, I don't think so but I keep memtest86 in grub 
> and will schedule a run soon.  It has been a while since I did it.  I 
> wish I had an ECC setup.  I very grudgingly went away from having that 
> functionality.
> 
> Of course the recent progress makes me think this is a less likely 
> possibility.  I would be interested to know if you still think this is 
> of value.  I will probably still do this; it is just a question of when.

It'd be good to confirm that's not a factor, but the "park" mode
behavior is strange enough to explain a lot ... not high priority.


> >I suppose it's probably too much to expect you to be able to just
> >capture the USB traffic (say with a CATC) and show what's happening
> >on the wire at the time the error is detected in memory ... ;)
> 
> Love to.  I am comfortable with protocol analyzers in general but do not 
> have any experience or access to any USB diag equipment.  If you have any 
> suggestions toward gaining the use of one, I would be willing to 
> consider them. 

Sorry.  That'd be a tricky setup too -- it'd require the capture
tool to synchronize with the data received more or less at the PCI
level, which is tough to do.  Dreaming is still permitted, though!  :)


> Saw some activity in the BK8 patch set in the vicinity of 
> drivers/usb/host/ehci_____.c.  Got it.  Made it.  Tried it.  Pretty 
> much the same results but the lowest count to date for a full read of 
> the 11.5GB test drive.  It came in at 407 but this is not radically 
> better.  The range had previously been 430 to 608.

That suggests this bug is unrelated to the one fixed in BK5.


> >>>>>>1) Thinking about getting one of the USB-PC to USB-PC active 
> >>>>>>connection cables and seeing if I can reproduce problem via TCP/IP
> >
> >The problem addressed by that previous patch was first uncovered
> >with such a cable, but as a rule the networking layer is a bit too
> >fault-tolerant to let a corrupted packet get in its way!  ;)
> ...
> 
> My most recent feeling about getting one of these cables is that I don't 
> know if it would prove anything.  For starters, I almost certainly would 
> have to have some sort of UDP type program.  TCP would probably just 
> deal with it.  I do not know how the same error symptom would be expressed.

UDP checksum errors might show the problem.
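
Datagrams that fail the UDP checksum are dropped by the receiving stack
and show up only in its error counters (for example in "netstat -s"), not
in the application.  As a complement, and not the same thing as the UDP
checksum itself, a test program can carry its own sequence number and
CRC32 so that the receiver sees both gaps and payloads that arrive
damaged.  A minimal sketch of the framing, with a made-up header layout:

```python
import struct
import zlib

# Hypothetical test header: 32-bit sequence number, 32-bit CRC32 of payload.
HEADER = struct.Struct("!II")


def make_datagram(seq: int, payload: bytes) -> bytes:
    """Prefix the payload with its sequence number and CRC32."""
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    return HEADER.pack(seq, crc) + payload


def check_datagram(datagram: bytes):
    """Return (seq, ok), where ok is False if the payload CRC mismatches."""
    seq, crc = HEADER.unpack_from(datagram)
    payload = datagram[HEADER.size:]
    return seq, (zlib.crc32(payload) & 0xFFFFFFFF) == crc
```

The sender would pass make_datagram() output to a UDP socket's sendto(),
and the receiver would run check_datagram() on each recvfrom() result
while tracking missing sequence numbers.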


> New points/questions:
> 
> In my thinking about this problem, I have it in my mind that the 
> motherboard BIOS really does not enter into the performance beyond the 
> point of establishing interrupts, etc.  Do you agree?  The reason that I 
> ask is that I may change CPUs and may need to upgrade the BIOS.

We don't have specs for the NForce2 chips, so it's always possible
that they have settings that affect EHCI behavior.  Hard to say what
unknown chipset tweaks would do!  But I'd expect that a BIOS update
wouldn't break things.


Many thanks for the good debugging help on your end, it makes a
big difference when tracking something like this!

- Dave



_______________________________________________
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel
