On Mon, 30 Jun 2003, Adam Kessel wrote: > When I mount or unmount the device, I see this in syslog (and on the > console): > > Jun 28 00:32:30 joehill kernel: SCSI error: host 1 id 0 lun 0 return code = 8000002 > Jun 28 00:32:30 joehill kernel: ^ISense class 7, sense error 0, extended sense 0 > > As long as I don't read or write to the device, I can mount and unmount > (in 2.5.73) repeatedly with no problems other than this error message.
I think those particular error messages arise because your device has problems handling a command telling it to lock/unlock its door. Obviously it doesn't have any sort of door to lock or unlock; the problem is that it doesn't return an appropriate error code. In fact, it just says there's an error with no additional information at all; that's what the "sense error 0, extended sense 0" above means. Since the SCSI driver doesn't know what the meaning of these errors is, it prints a log message. You can just ignore it. > I just tried copying a bunch of data to the device. After 50M or so, it > slowed to a crawl. Initially, I thought it was crashed but eventually it > started moving again, very slowly (probably around 10K-100K / sec). The > log from this operation is here: > > http://bostoncoop.net/adam/temp/log_while_copying Interesting. What's happening is that after a while your device becomes comatose. It stops responding to commands. Linux's error recovery takes 50 seconds, at the end of which your device has been reset and it starts working again, though not for long. That's why your throughput is so lousy. > Here is the log from the point where I tried unsuccessfully to unmount > the device: > > http://bostoncoop.net/adam/temp/unmount_log > > Interestingly, while the unmount process was stuck, I ran df, and got: > > Filesystem 1K-blocks Used Available Use% Mounted on > /dev/hda2 18313076 17338432 44388 100% / > /dev/sda1 18313076 17338432 44388 100% /mnt/neuros > > /dev/sda1, the USB drive, is now returning to same size as the hard drive > (in reality, it's a 128M drive, which is the capacity it returned before > I started the unmount). I wouldn't trust that; filesystem information while an unmount is in progress is liable to be unreliable. The unmount wasn't really stuck; it was moving forward but _very_ slowly. Of course, unmount causes the filesystem to flush its buffers. All the cached data waiting to be written to the device has to be sent out before the unmount can complete. Your device would crash, after 50 seconds the driver would reset it, and then the device would accept another MB or so and crash again. You can see this happening repeatedly in the unmount_log file; just look for places where the timestamp jumps forward by more than 1 second. If you had had enough patience, eventually everything would have gotten written out and the unmount would have finished. It might have taken a few hours, though... > After several minutes where the unmount process was stuck, I finally > removed the module, which caused lots of Buffer I/O errors, a filesystem > panic, etc.. There is a call trace log as well. Here is the log from > the point of removing the module until the time it was finally done: > > http://bostoncoop.net/adam/temp/remove_module_log That looks about like I'd expect, given what you did. Of course, the filesystem shouldn't panic, but that's a separate issue. > Incidentally, if are able to isolate this to a firmware bug, my > expectation is that Neuros would be open to suggestions. This does look very much like a firmware bug. It would be especially nice if it were repeatable easily, without having to send 50 - 100 MB of data back and forth first. By the way, I noticed that all the problems in your logs occurred with writes to the device. Have you noticed any similar problems with reads? > > > Also, frequently the drive appears to be full when it's not. Although > > > it's a 128M memory card, it usually maxes out around 60-70M. > > What do you mean, appeared to be full? What did 'df' show? > > df showed 0M free... I can't replicate it now, so I don't remember what > it thought the capacity was at the time. I suspect this is related to > filesystem corruption on the drive rather than anything else... Space is > occupied by files that were copied to the device but don't actually show > up. Okay, let's not worry about that for the time being. > Hopefully the logs above will provide more insight into the situation. > Is there debugging output I should send other than what usb-storage is > putting in syslog? In general, _everything_ that shows up in the log during the time intervals in question might be useful. Don't filter just the usb-storage stuff (even though it will probably constitute the majority). One thing that could help a little would be if you also enable USB-debugging in addition to usb-storage debugging. That's another kernel configuration option. Try doing that, then capture a complete system log for the time when you get that dreadful slowdown; make sure it covers a period of at least a couple of minutes. With that information, we may be able to say definitely that this is (or is not) a firmware issue. Alan Stern ------------------------------------------------------- This SF.Net email sponsored by: Free pre-built ASP.NET sites including Data Reports, E-commerce, Portals, and Forums are available now. Download today and enter to win an XBOX or Visual Studio .NET. http://aspnet.click-url.com/go/psa00100006ave/direct;at.asp_061203_01/01 _______________________________________________ [EMAIL PROTECTED] To unsubscribe, use the last form field at: https://lists.sourceforge.net/lists/listinfo/linux-usb-devel