I'm forwarding this message to linux-usb-devel because that's where
many of the USB gurus hang out, not on linux-usb-users.

Also, it would help a lot of us readers if you tell your mail client
to wrap lines of text after ~70 columns.

On Thu, 10 Feb 2005, Samuel Colin wrote:

> Greetings list and usb gurus,
> I have a problem with one of my machines, namely an openbrick ng (hence the
> usb controller is a VIA CLE 266), and the usb hard drives attached to it.
> Let us set up the context. There is :
> - A lacie mobile hard drive, which I'll call lacieroot
> - A lacie usb2 hard drive (250 Go), which I'll call lacie1
> - The so-called openbrick ng
>
> First of all, everything works fine in usb1.1, i.e. with only uhci host in
> the kernel. Moreover, both disks work on another machine with usb2 (copied
> large files from and to them, no error messages).
>
> I decided to install a debian system on lacieroot, and boot the openbrick
> from it, so I setup a custom 2.6.8 debian kernel, with uhci and the
> boot-off-usb patch. Things seemed to work pretty well, but I wanted to have
> full speed for lacie1 (which is planned becoming a debian mirror, so speed
> for the local network is indeed a plus).
> Thus I compiled a debian 2.6.10 kernel with ehci directly into the kernel
> (not as a module). This is where things go wrong : things didn't work at
> first, I quickly figured the "pci=noacpi" trick, which allowed the system to
> boot, start init, etc...
> The problem is, any thing that would imply writing large chunks of datas
> (apt-get update, for instance) would make lacieroot go down. Else, "light"
> operations did not seem to cause problems.
> Additionally, lacie1 refused to be mounted.
> So I took a look at the messages on this list, and decided to test some of
> the proposed patches (with adaptation to 2.6.10 if needed), namely :
> - The genesys udelay trick (with a delay of 100)
> - The core/hub.c patch (with overcurrent)
> - The host/ehci-hcd.c patch for bogus fatal irqs
> These patches had been proposed to another person who seemed to have the same
> kind of problems, so I gave it a try.
> Bad luck, the error messages changed a bit, but did not change the situation
> (from an error -110, it went to an error -71).
> The old_scheme_first did not work either with this kernel.
>
> So, seeing that it was a rather sporadic problem recently, I decided to test
> several kernels to see what happened with each one. I also made the tests
> with and without an additional usb2 hub, as the issues seemed to be related
> to timing problems with the usb2 controller. The kernels are kernel.org's
> ones, with only the boot-off-usb patch, and debugging activated.
>
> With the two disks (lacie1 and lacieroot) behind the hub, and "pci=noacpi" :
> - 2.6.5 : looks ok
> - 2.6.6 : looks ok
> - 2.6.7 : looks ok
> - 2.6.8 : looks ok
> - 2.6.9 : looks ok
> - 2.6.10 : usb probing looks a little longer. lacie1 (sdb) is probed.
> The boot hangs after mounting lacieroot's partitions.
> I wait a little...
> usb 1-4.3: new high speed USB device using ehci_hcd and address 4,5,6
> usb 1-4.3: khubd timed out on ep0in
> usb 1-4.3: device descriptor read/64, error -110
> Which corresponds with lacie1.
> So I reboot, it seems that a message about usb is displayed at the end of the
> shutdown but I can't read (screen goes off).
> - 2.6.10-ac12: same as above
> - 2.6.11-rc3: same as above
> - 2.6.11-rc3-bk4: same as above
> - 2.6.11-rc3-mm1: same as above, but the error handling seems faster and
> cleaner (scsi: Device offlined - not ready after recovery: host 1 channel 0
> id 0 lun 0
> scsi1 (0:0): rejecting I/O to offline device)
>
> Note that I did not test large file transfers for the kernels that seemed to
> work above, as I have enough usb plugs in the machine and my goal is not to
> add a usb2 hub so as everything works ok.
>
> Then, come the same tests, without the hub, and "pci=noacpi", and the output
> is more interesting. Note that except when I say I reboot, I halted and
> unplugged the disks so as to reset their own controller state.
>
> - 2.6.5: looks ok (with writing a 40 Mo file from lacie1 to lacieroot, but
> the same behaviour as 2.6.7 below might be observed)
> Reboot with the same kernel, hang after mounting the root fs readonly.
> - 2.6.6: error :
> ehci_hcd 000:00:10.3: port 3 reset error -110
> hub 1-0:1.0: hub_port_status failed (err=-32)
> It hangs when detecting partitions on lacieroot, and hangs just after
> detecting the last partition, but before finishing the detection on lacieroot.
> I wait, no messages for at least 2 minutes, I reboot. The reboot works, but
> "SYSTEM ERROR" : apparently the disk's usb controller went crazy.
> - 2.6.7: looks ok (see 2.6.5 above), but copying a bigger file (iso image)
> from lacie1 to lacieroot made some random figures appear on the console (not
> many, but it looked strange). Then, making a diff between the two produced
> the same kind of figures, but the diff hung in the middle, and both disks
> were no more readable (I/O error), so I had to halt manually.
> - 2.6.8:
> boots ok, but blocks about when starting various daemons.
> I wait a little, and I get the error :
> usb 1-3:
> scsi0 (0:0): rejecting I/O
> Thus apparently lacieroot went down.
> - 2.6.9:
> boots, but blocks soon after mounting /root on lacieroot.
> - 2.6.10:
> boots correctly, but mounting sdb1 does not work (see error -110 messages
> above)
> - 2.6.11-rc3: same as 2.6.10
> - 2.6.11-rc3-mm1: same as 2.6.10, but error handling looks faster and cleaner.
>
> So, my thoughts : the VIA CLE 266 is a little bogus (I had already read it
> elsewhere, and the disks function well on other machines), and some code
> introduced in the kernel regarding the delays seems to make it misbehave
> (because a usb2 hub seem to lessen the problems). Moreover, the error
> apparently affects the usb controllers of the disks, as a single reboot
> causes errors.
>
> Also, note that I surely made mistakes or imprecisions in the tests I
> conducted above, but the general behaviour is what I described.
>
> My question : what can I do to help you correct it ? I can test patches,
> kernels, but only you usb gurus can give me instructions so as I can set up a
> good testbed and give you informations and debugging logs.
>
> Thanks in advance,
> Samuel Colin.

It's not clear from what you said; did any of the patches you used
include this one:

http://marc.theaimsgroup.com/?l=linux-usb-devel&m=110797162426830&w=2

If not, you should definitely add it.

For testing purposes, it would be a lot better if you leave the
ehci-hcd driver as a separate module instead of building it into the
kernel. Is there any way you can do this (you would have to boot from
a non-USB drive)?

When reporting errors on USB mass storage devices, a simple
description ("hangs after mounting") doesn't do any good. Even an
extract from the system log usually isn't enough. You need to turn on
the USB_STORAGE_DEBUG configuration option to get a useful amount of
information.

Alan Stern
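
As a rough illustration of the two build settings suggested above,
here is a small Python sketch that scans the text of a kernel .config
for them. CONFIG_USB_EHCI_HCD and CONFIG_USB_STORAGE_DEBUG are the
usual Kconfig names for these options; the helper names
(parse_kconfig, check_usb_debug_config) and the sample config are made
up for this example and are not part of any kernel tooling.

```python
def parse_kconfig(text):
    """Return a dict of the CONFIG_* symbols that are set in a .config text."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks and comments such as "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            options[name] = value
    return options


def check_usb_debug_config(text):
    """List what still has to change for the suggested test setup.

    Hypothetical helper: it only looks at the two options discussed above.
    """
    options = parse_kconfig(text)
    problems = []
    if options.get("CONFIG_USB_EHCI_HCD") != "m":
        problems.append("build ehci-hcd as a module (CONFIG_USB_EHCI_HCD=m)")
    if options.get("CONFIG_USB_STORAGE_DEBUG") != "y":
        problems.append("enable CONFIG_USB_STORAGE_DEBUG=y")
    return problems


# Invented sample: EHCI built in, storage debugging switched off.
SAMPLE = """\
CONFIG_USB_EHCI_HCD=y
CONFIG_USB_STORAGE=y
# CONFIG_USB_STORAGE_DEBUG is not set
"""

for problem in check_usb_debug_config(SAMPLE):
    print(problem)
```

To check a real tree, one would pass in the contents of the .config
from the kernel build directory instead of SAMPLE.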

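
The negative numbers in the logs quoted above (-110, -71, -32) are
Linux errno values. As a reading aid only, this Python sketch maps
those three codes to their names, with short glosses on how the USB
stack generally uses them. The table is hard-coded with Linux values,
since other platforms number errno differently, and the helper name
describe_usb_error is invented here.

```python
# Linux errno values for the three codes that appear in the report.
USB_ERRORS = {
    110: ("ETIMEDOUT", "transfer timed out"),
    71: ("EPROTO", "protocol error, e.g. no response from the device"),
    32: ("EPIPE", "endpoint stalled"),
}


def describe_usb_error(code):
    """Turn a kernel-style negative error code into a readable string."""
    name, meaning = USB_ERRORS.get(abs(code), ("unknown", "not in this table"))
    return f"{code}: {name} ({meaning})"


for code in (-110, -71, -32):
    print(describe_usb_error(code))
```

Read this way, the change from -110 to -71 after patching means the
failure moved from a timeout to a protocol-level error, not that the
problem went away.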