I'm forwarding this message to linux-usb-devel because that's where many
of the USB gurus hang out, not on linux-usb-users.  Also, it would help a 
lot of us readers if you tell your mail client to wrap lines of text after 
~70 columns.


On Thu, 10 Feb 2005, Samuel Colin wrote:

> Greetings list and usb gurus,
> I have a problem with one of my machines, namely an openbrick ng (hence the 
> usb controller is a VIA CLE 266), and the usb hard drives attached to it.
> Let me set up the context. There are:
> - A lacie mobile hard drive, which I'll call lacieroot
> - A lacie usb2 hard drive (250 GB), which I'll call lacie1
> - The so-called openbrick ng
> 
> First of all, everything works fine in usb1.1, i.e. with only uhci host in 
> the kernel. Moreover, both disks work on another machine with usb2 (copied 
> large files from and to them, no error messages).
> 
> I decided to install a debian system on lacieroot, and boot the openbrick 
> from it, so I set up a custom 2.6.8 debian kernel, with uhci and the 
> boot-off-usb patch. Things seemed to work pretty well, but I wanted to have 
> full speed for lacie1 (which is planned to become a debian mirror, so speed 
> for the local network is indeed a plus).
> Thus I compiled a debian 2.6.10 kernel with ehci directly into the kernel 
> (not as a module). This is where things went wrong: it didn't work at 
> first, but I quickly found the "pci=noacpi" trick, which allowed the system 
> to boot, start init, etc.
> The problem is, anything that involves writing large chunks of data 
> (apt-get update, for instance) would make lacieroot go down. Otherwise, 
> "light" operations did not seem to cause problems.
> Additionally, lacie1 refused to be mounted.
> So I took a look at the messages on this list, and decided to test some of 
> the proposed patches (with adaptation to 2.6.10 if needed), namely : 
> - The genesys udelay trick (with a delay of 100)
> - The core/hub.c patch (with overcurrent)
> - The host/ehci-hcd.c patch for bogus fatal irqs
> These patches had been proposed to another person who seemed to have the same 
> kind of problems, so I gave them a try.
> No luck: the error messages changed a bit (from error -110 to error -71), 
> but the situation did not.
> The old_scheme_first option did not help either with this kernel.
> 
> So, seeing that it was a rather sporadic problem recently, I decided to test 
> several kernels to see what happened with each one. I also made the tests 
> with and without an additional usb2 hub, as the issues seemed to be related 
> to timing problems with the usb2 controller. The kernels are kernel.org's 
> ones, with only the boot-off-usb patch, and debugging activated.
> 
> With the two disks (lacie1 and lacieroot) behind the hub, and "pci=noacpi" : 
> - 2.6.5 : looks ok
> - 2.6.6 : looks ok
> - 2.6.7 : looks ok
> - 2.6.8 : looks ok
> - 2.6.9 : looks ok
> - 2.6.10 : usb probing looks a little longer. lacie1 (sdb) is probed.
> The boot hangs after mounting lacieroot's partitions.
> I wait a little...
> usb 1-4.3: new high speed USB device using ehci_hcd and address 4,5,6
> usb 1-4.3: khubd timed out on ep0in
> usb 1-4.3: device descriptor read/64, error -110
> Which corresponds with lacie1.
> So I reboot. It seems that a message about usb is displayed at the end of 
> the shutdown, but I can't read it (the screen goes off).
> - 2.6.10-ac12: same as above
> - 2.6.11-rc3: same as above
> - 2.6.11-rc3-bk4: same as above
> - 2.6.11-rc3-mm1: same as above, but the error handling seems faster and 
> cleaner (scsi: Device offlined - not ready after recovery: host 1 channel 0 
> id 0 lun 0
> scsi1 (0:0): rejecting I/O to offline device)
> 
> Note that I did not test large file transfers for the kernels that seemed to 
> work above, as I have enough usb ports on the machine and my goal is not to 
> add a usb2 hub just to make everything work.
> 
> Then come the same tests, without the hub, and "pci=noacpi", and the output 
> is more interesting. Note that except when I say I reboot, I halted and 
> unplugged the disks so as to reset their own controller state.
> 
> - 2.6.5: looks ok (with writing a 40 MB file from lacie1 to lacieroot, but 
> the same behaviour as 2.6.7 below might be observed)
> Reboot with the same kernel, hang after mounting the root fs readonly.
> - 2.6.6: error : 
> ehci_hcd 000:00:10.3: port 3 reset error -110
> hub 1-0:1.0: hub_port_status failed (err=-32)
> It hangs when detecting partitions on lacieroot, and hangs just after 
> detecting the last partition, but before finishing the detection on lacieroot.
> I wait, no messages for at least 2 minutes, I reboot. The reboot works, but 
> "SYSTEM ERROR" : apparently the disk's usb controller went crazy.
> - 2.6.7: looks ok (see 2.6.5 above), but copying a bigger file (iso image) 
> from lacie1 to lacieroot made some random figures appear on the console (not 
> many, but it looked strange). Then, running a diff between the two produced 
> the same kind of figures, but the diff hung in the middle, and both disks 
> were no longer readable (I/O error), so I had to halt manually.
> - 2.6.8:
> boots ok, but blocks around the time it starts various daemons.
> I wait a little, and I get the error : 
> usb 1-3:
> scsi0 (0:0): rejecting I/O
> Thus apparently lacieroot went down.
> - 2.6.9:
> boots, but blocks soon after mounting /root on lacieroot.
> - 2.6.10:
> boots correctly, but mounting sdb1 does not work (see error -110 messages 
> above)
> - 2.6.11-rc3: same as 2.6.10
> - 2.6.11-rc3-mm1: same as 2.6.10, but error handling looks faster and cleaner.
> 
> So, my thoughts : the VIA CLE 266 is a little bogus (I had already read it 
> elsewhere, and the disks function well on other machines), and some code 
> introduced in the kernel regarding the delays seems to make it misbehave 
> (because a usb2 hub seems to lessen the problems). Moreover, the error 
> apparently affects the usb controllers of the disks, as a single reboot 
> causes errors.
> 
> Also, note that I surely made mistakes or was imprecise in the tests I 
> conducted above, but the general behaviour is what I described.
> 
> My question: what can I do to help you correct it? I can test patches and 
> kernels, but only you usb gurus can give me instructions so that I can set 
> up a good testbed and give you information and debugging logs.
> 
> Thanks in advance,
> Samuel Colin.

It's not clear from what you said; did any of the patches you used include 
this one:

http://marc.theaimsgroup.com/?l=linux-usb-devel&m=110797162426830&w=2

If not, you should definitely add it.

For testing purposes, it would be a lot better if you leave the ehci-hcd 
driver as a separate module instead of building it into the kernel.  Is 
there any way you can do this (you would have to boot from a non-USB 
drive)?

When reporting errors on USB mass storage devices, a simple description
("hangs after mounting") doesn't do any good.  Even an extract from the
system log usually isn't enough.  You need to turn on the
USB_STORAGE_DEBUG configuration option to get a useful amount of
information.
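
For reference, a minimal sketch of the relevant .config lines (option names
as I understand them for 2.6 kernels; check them against your own tree's
Kconfig before relying on this, and note that CONFIG_USB_STORAGE=y is only
an assumption about how your usb-storage driver is built):

```
# Build the EHCI host controller driver as a loadable module, not built in
CONFIG_USB_EHCI_HCD=m
# usb-storage itself must be enabled for the debug option to be available
CONFIG_USB_STORAGE=y
# Verbose usb-storage debugging; the output goes to the kernel log
CONFIG_USB_STORAGE_DEBUG=y
```

With ehci-hcd built as a module, it can be loaded and unloaded between
tests (modprobe ehci-hcd, rmmod ehci-hcd) without a reboot.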

Alan Stern


