Hi Alan :)

 * Alan Stern <[EMAIL PROTECTED]> dixit:
> On Thu, 23 Nov 2006, DervishD wrote:
> > > Intermittent problems like this are very hard to track down. It
> > > sounds like a hardware problem of some sort, but without more
> > > information it's impossible to say if the problem lies in your
> > > computer, the USB cable, the USB-storage adapters, or the hard disk
> > > drives. Have you tried using different cables?
> >
> > Yes, and the error disappears (or at least it hasn't been
> > produced yet) when using a very short cable (less than 0.5m), while a
> > USB memory stick works OK with a 1m long cable at the same speed! The
> > cables causing problems are of different brands, and their
> > only common "feature" is their length: about 1m. The same cables work
> > OK (no detectable problems) in other computers; I tested that this
> > morning.
>
> It could be some sort of electromagnetic interference phenomenon.
> Someone once reported that merely turning on the fluorescent lights
> in the room with his computer was enough to cause USB errors.

With one of the USB cards I've tested, the number of differences
rose a lot just by plugging a long USB cable into one of the ports.
Please note that I say "cable", not "device", because the cable wasn't
connected to anything! Probably the cable was picking up electrical
noise and that upset the card or whatever :??? I've seen an ADSL line
fail to synchronize because of noise, so noise may be the cause here
too.

> On the other hand, variations in cable length don't appear to
> relate to the error code mentioned below. So who knows...

Cable length seems related to the problem in file contents. The
noise is probably corrupting some bits on their way from the adapter to
the hard disk (that is, the adapter changes them). As for the other
problem... I don't know. It is probably related to speed or something
like that. It looks like the adapter can't wait until a sector is
retrieved, or doesn't perform retries, etc...

[IO error messages]
> > > No, the messages are not false. They definitely indicate a
> > > problem; you mustn't dismiss them so easily. With borderline
> > > hardware it's entirely possible that an operation can fail at
> > > one moment and then succeed a few moments later.
> >
> > I've tested the hard disk with a destructive badblocks run and
> > with a diagnostic tool from Seagate, and all the disks are OK. In
> > fact they work reliably (and SMART doesn't show any problem) if used
> > directly. That leaves us with the usb-storage adapters as the cause
> > of those failures, but: why should they fail on a sector that can
> > be read from the hard disk a while later?
>
> I agree that a likely source is the adapters. Remember, these
> things have to communicate with both the hard disk and the USB
> subsystem. So even though the disk may be working fine, a problem
> at the USB level could cause errors to show up. Or a problem in
> the connection between the adapter and the drive.
> Or there could
> be some internal error in the adapter itself, unrelated to either
> the drive or the USB connection.

These adapters are cheap, but I know lots of people using them
without problems. In fact they seem to work under Windows (don't ask me
why). I'm going to reboot the problematic box with a Windows hard disk
just to test whether everything else works.

> > > Well, you could start by posting some of the error messages!
> >
> > Yep, sorry O:)) Here they are:
> >
> > kernel: SCSI disk error : host 0 channel 0 id 0 lun 0 return code = 8000002
> > kernel: Current sd08:01: sns = 70 4
> > kernel: ASC=4b ASCQ= 0
> > kernel: Raw sense data:0x70 0x00 0x04 0x00 0x00 0x00 0x00 0x0a 0x00
> > 0x00 0x00 0x00 0x4b 0x00 0x00 0x00 0x00 0x00
> > kernel: I/O error: dev 08:01, sector 1804512
>
> Sense key = 0x04 means Hardware error. ASC = 0x4b means Data Phase
> error, which is probably a fancy way of saying that the adapter had
> some unspecified difficulty communicating with the drive (although
> manufacturers aren't very careful about the error codes used in
> their hardware, so it could easily mean something else).

Which seems to confirm my suspicions: the adapter is trying to get
a sector while the drive is, e.g., dumping its internal cache or
something. The drive is not responding as fast as the adapter wants and
then it all goes bad.

The most peculiar thing is that these IO errors NEVER happen
during writes! They happen only when reading, and that's even stranger
:???

> > Thanks a lot for the advice :) I'll try to provide more and
> > better data, and more tests. I'm going to perform some kind of
> > differential analysis to discover the exact combination (if any)
> > or conditions (again, if any) that lead to the error. Anyway this
> > is going to be slow, because each test takes half an hour or even
> > more.
>
> Sometimes differential analysis doesn't help. Here's an example:
>
> I've got a USB hard drive adapter. Plugged in to my home computer, it
> doesn't work.
> But if I use a different USB cable, then it does work.
> Alternatively, I can use the old USB cable with a USB flash drive, and
> that works. Or, I can move both the adapter/hard drive and the cable to a
> different computer, and again they work. In short, switching any one of
> the three components (computer, cable, device) is enough to get things
> working again -- so which component is at fault?

That's the reason I included "if any" in my message: I don't
really have any hope of finding the culprit, because it can be a
combination. For me, it's enough just to be able to copy data to the
disk and know it is really there without having to cmp it. The IO
errors may be caused by a speed problem or whatever, and they don't
worry me so much (for now). Anyway, I'll investigate the issue just in
case.

Again, thanks a lot for your answer, especially for explaining the
SCSI error message I was getting :)) If I discover anything, I'll post
here (I mean the USB list). Until I'm able to reboot the machine under
Windows (I have a spare disk on which to install that thing for testing)
I won't be able to know if the problem lies in the kernel, the mobo,
the USB card, the cable, the adapter, etc...

Raúl Núñez de Arenas Coronado

-- 
Linux Registered User 88736 | http://www.dervishd.net
It's my PC and I'll cry if I want to... RAmen!
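
P.S. For anyone following the thread who wants to decode sense data
like the dump quoted above by hand, here is a minimal Python sketch. It
is not taken from the kernel. It assumes the 18 logged bytes are
fixed-format sense data (response code 0x70 or 0x71), where the sense
key is the low nibble of byte 2 and ASC/ASCQ are bytes 12 and 13. The
names parse_hex_dump and decode_sense are made up for illustration, and
the lookup tables cover only a few sense keys and the single ASC/ASCQ
pair seen in this thread.

```python
# Illustrative sketch only: decode fixed-format SCSI sense data.
# Helper names and the (deliberately tiny) tables are assumptions.

SENSE_KEYS = {
    0x0: "No Sense",
    0x1: "Recovered Error",
    0x2: "Not Ready",
    0x3: "Medium Error",
    0x4: "Hardware Error",
    0x5: "Illegal Request",
    0x6: "Unit Attention",
    0x7: "Data Protect",
    0xB: "Aborted Command",
}

# Only the ASC/ASCQ pair discussed in this thread.
ASC_TEXT = {(0x4B, 0x00): "Data Phase Error"}


def parse_hex_dump(text):
    """Turn a string like '0x70 0x00 0x04' into bytes."""
    return bytes(int(token, 16) for token in text.split())


def decode_sense(raw):
    """Extract sense key, ASC and ASCQ from fixed-format sense data."""
    if len(raw) < 14:
        raise ValueError("need at least 14 bytes of sense data")
    response_code = raw[0] & 0x7F      # bit 7 is the VALID flag
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    key = raw[2] & 0x0F                # low nibble of byte 2
    asc, ascq = raw[12], raw[13]
    return {
        "sense_key": key,
        "sense_key_name": SENSE_KEYS.get(key, "unknown"),
        "asc": asc,
        "ascq": ascq,
        "asc_text": ASC_TEXT.get((asc, ascq), "not in this small table"),
    }


# The raw sense data from the kernel log quoted in this message.
raw = parse_hex_dump(
    "0x70 0x00 0x04 0x00 0x00 0x00 0x00 0x0a 0x00 "
    "0x00 0x00 0x00 0x4b 0x00 0x00 0x00 0x00 0x00"
)
info = decode_sense(raw)
print(
    f"sense key 0x{info['sense_key']:02x} ({info['sense_key_name']}), "
    f"ASC 0x{info['asc']:02x}, ASCQ 0x{info['ascq']:02x} "
    f"({info['asc_text']})"
)
```

As Alan says, adapters are not always careful about the codes they
report, so the decoded text is a hint about where to look, not a
diagnosis.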