An update on this (see long quote below) --- maybe it helps someone with a similar problem:
After a total disk failure seemed imminent --- the disconnected disk didn't come back after turning the computer off and back on --- I got two new disks to replace the Maxtor 7V300F0 disks. I made a software RAID-1 from the new disks and managed to copy all my data to them. That allowed me to play around with the Maxtor disks.

I found out that there is a firmware update available for them[1] which is supposed to solve the problem with the disks disconnecting. I updated the firmware today. The disks seem to be working so far; I'm using them for making backups, and a backup on one of the disks made before the firmware update is still readable after updating the firmware. I didn't check the other disk, but mdadm still recognized it as being part of a RAID-1 and started an md device for it, which would indicate that everything on that disk is also still readable. Time will tell if the problem with disconnecting is finally solved. For reference on the firmware, see [2].

If you need to boot some DOS version from a USB stick or a USB disk:

Install the unetbootin package. Download a FreeDOS 2.88MB floppy image[3] (1.44MB is too small) and the firmware archive. Unzip the firmware archive. Mount the floppy image file as a loop device (mount -o loop imagefile mountpoint) and copy the files from the firmware archive into the mounted image. Unmount the image file. Use unetbootin to make a bootable USB stick from the floppy image (select "Floppy Image" instead of "ISO" in unetbootin when selecting the file to write onto the stick).

Disconnect all hard disks and DVD/CD drives except for the Maxtor disk whose firmware you're going to update. Turn off AHCI mode in the BIOS. Boot from the USB stick, but press F8 while booting (after the boot manager) and do NOT load highmem, emm386 and especially not some pciusb.sys (or whatever it was called).
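The image preparation steps above could be scripted roughly like this. It is only a sketch: the file names (fdos.img, firmware.zip), the firmware directory and the mount point are placeholders, not the real names, and the helper names run and prepare_image are made up here. By default it only prints the commands instead of running them, since mounting needs root. Writing the image onto the stick is still done in unetbootin afterwards.

```shell
#!/bin/sh
# Sketch only: file names, directory and mount point are placeholders.
# With DRY_RUN=1 (the default) the commands are printed, not executed.

run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# prepare_image IMAGE FIRMWARE_ZIP MOUNTPOINT
prepare_image() {
    img=$1
    zip=$2
    mnt=$3
    run unzip -o "$zip" -d firmware     # unpack the firmware archive
    run mkdir -p "$mnt"
    run mount -o loop "$img" "$mnt"     # floppy image as loop device (needs root)
    run cp -r firmware/. "$mnt"/        # copy the firmware files into the image
    run umount "$mnt"                   # write everything back and detach
}

prepare_image fdos.img firmware.zip /mnt/floppy
```

Set DRY_RUN=0 and run it as root to actually execute the commands, after replacing the placeholder names with your own.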
Run dload.exe, choose "no power control", "first disk found" and an option called something like "transfer in one part"; then select the firmware file. It takes a few seconds to update the firmware; the update program will tell you when it was successful. When it was successful, exit the update program and start it again to verify the firmware version. It should show firmware version VA111680.

It worked for me, but ymmv, so take all precautions, like making backups before you start ...

On a side note, it took me awfully long to figure out how to make a "DOS bootable USB stick under Linux". Try to google for that, you just don't find it ... Your BIOS must be able to boot from such devices, but if it can, it seems you could even use a card reader (with a card inserted, of course) instead of a stick, and it doesn't matter if the stick says that it supports booting or not. Unetbootin is awesome ... This one might also be interesting: http://www.ultimatebootcd.com/

Using a Debian kernel (2.6.24) --- one of the things I tried --- did not solve the problem with disconnecting.

[1]: http://www.eserviceinfo.com/downloadsm/24514/_.html
[2]: http://forums.storagereview.net/index.php?showtopic=22435&st=0
[3]: http://www.fdos.org/bootdisks/ --- I think I used another one, but I don't remember where I got it. If you need the image file I used, let me know and I can mail it to you.

On Wed, Dec 10, 2008 at 03:15:56PM -0600, lee wrote:
> Hi,
>
> what's the difference between a standard kernel and a kernel that
> comes as a Debian package?
>
> I'm using a standard kernel, but I'm having problems with one of my
> disks (see below). The disk "gets lost" every now and then, i. e. it
> seems to take a couple days or weeks now (I've seen it taking as long
> as about two months with the old board) before it happens. The disk
> remains unavailable until I turn the power off and back on.
> Once the disk is back, I can re-add the partitions on the failed disk
> to the md devices, and they are being rebuilt just fine, and it works
> for some time until the disk "gets lost" again.
>
> This problem isn't new; it has been there with another board/CPU/RAM,
> cables and power supply ever since I got the two SATA disks new. It's
> been there with every standard kernel I tried over the years, with
> i386, and now it's the same with amd64. I've been thinking it was a
> problem of the board I had, but as it's there with another board etc.,
> it must be either the disk itself or the SATA driver.
>
> Googling revealed that this isn't a rare problem. There are people
> reporting it with all kinds of different disks and boards and
> different distributions. Some suggest that it's a problem with the PSU
> or the SATA cables, but imho that's unlikely. Interestingly, it seems
> to be more common for this problem to show up in RAID setups.
>
> Also interestingly, mdadm did *not* detect the disk failure for
> /dev/md2 which is mounted read only.
>
> And even more interestingly, the problem is and has always been with
> /dev/sdb, never with /dev/sda. I can't tell if the disks have been
> swapped when I connected them to the new board, though. But I'd rule
> out a problem with the firmware of the disk as well since both disks
> use the same firmware version.
>
> So is there a difference between Debian and standard kernels so that I
> might not have this problem if I'd use a Debian kernel? Has this
> problem been solved in some way yet?
>
> I might get another two disks, but I'm afraid that the same problem
> would come up with other disks as well ...
>
>
> Info:
>
> cat:/home/lee# uname -a
> Linux cat 2.6.27.7-cat-smp #4 SMP Thu Dec 4 16:03:29 CST 2008 x86_64 GNU/Linux
> cat:/home/lee# smartctl -i /dev/sda
> smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> === START OF INFORMATION SECTION ===
> Model Family:     Maxtor MaXLine III family (SATA/300)
> Device Model:     Maxtor 7V300F0
> Serial Number:    V604E3FG
> Firmware Version: VA111630
> User Capacity:    300,090,728,448 bytes
> Device is:        In smartctl database [for details use: -P show]
> ATA Version is:   7
> ATA Standard is:  ATA/ATAPI-7 T13 1532D revision 0
> Local Time is:    Wed Dec 10 15:00:04 2008 CST
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
>
> cat:/home/lee# smartctl -i /dev/sdb
> smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> === START OF INFORMATION SECTION ===
> Model Family:     Maxtor MaXLine III family (SATA/300)
> Device Model:     Maxtor 7V300F0
> Serial Number:    V601T7VG
> Firmware Version: VA111630
> User Capacity:    300,090,728,448 bytes
> Device is:        In smartctl database [for details use: -P show]
> ATA Version is:   7
> ATA Standard is:  ATA/ATAPI-7 T13 1532D revision 0
> Local Time is:    Wed Dec 10 15:00:42 2008 CST
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
>
> cat:/home/lee# lspci
> [...]
> 00:1f.2 SATA controller: Intel Corporation 82801IB (ICH9) 4 port SATA AHCI Controller (rev 02)
>
>
> syslog:
>
>
> Dec 10 00:09:10 cat kernel: ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> Dec 10 00:09:10 cat kernel: ata5.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
> Dec 10 00:09:10 cat kernel: res 40/00:00:00:4f:c2/00:00:00:c2:00/00 Emask 0x4 (timeout)
> Dec 10 00:09:10 cat kernel: ata5.00: status: { DRDY }
> Dec 10 00:09:10 cat kernel: ata5: hard resetting link
> Dec 10 00:09:10 cat kernel: ata5: SATA link down (SStatus 0 SControl 300)
> Dec 10 00:09:15 cat kernel: ata5: hard resetting link
> Dec 10 00:09:16 cat kernel: ata5: SATA link down (SStatus 0 SControl 300)
> Dec 10 00:09:21 cat kernel: ata5: hard resetting link
> Dec 10 00:09:21 cat kernel: ata5: SATA link down (SStatus 0 SControl 300)
> Dec 10 00:09:21 cat kernel: ata5.00: disabled
> Dec 10 00:09:21 cat kernel: end_request: I/O error, dev sdb, sector 478543967
> Dec 10 00:09:21 cat kernel: md: super_written gets error=-5, uptodate=0
> Dec 10 00:09:21 cat kernel: raid1: Disk failure on sdb2, disabling device.
> Dec 10 00:09:21 cat kernel: raid1: Operation continuing on 1 devices.
> Dec 10 00:09:21 cat kernel: ata5: EH complete
> Dec 10 00:09:21 cat kernel: ata5.00: detaching (SCSI 4:0:0:0)
> Dec 10 00:09:21 cat kernel: sd 4:0:0:0: [sdb] Synchronizing SCSI cache
> Dec 10 00:09:21 cat kernel: sd 4:0:0:0: [sdb] Result: hostbyte=0x04 driverbyte=0x00
> Dec 10 00:09:21 cat kernel: sd 4:0:0:0: [sdb] Stopping disk
> Dec 10 00:09:21 cat kernel: sd 4:0:0:0: [sdb] START_STOP FAILED
> Dec 10 00:09:21 cat kernel: sd 4:0:0:0: [sdb] Result: hostbyte=0x04 driverbyte=0x00
> Dec 10 00:09:21 cat kernel: RAID1 conf printout:
> Dec 10 00:09:21 cat kernel:  --- wd:1 rd:2
> Dec 10 00:09:21 cat kernel:  disk 0, wo:0, o:1, dev:sda2
> Dec 10 00:09:21 cat kernel:  disk 1, wo:1, o:0, dev:sdb2
> Dec 10 00:09:21 cat kernel: RAID1 conf printout:
> Dec 10 00:09:21 cat kernel:  --- wd:1 rd:2
> Dec 10 00:09:21 cat kernel:  disk 0, wo:0, o:1, dev:sda2
> Dec 10 00:09:21 cat kernel: scsi 4:0:0:0: rejecting I/O to dead device
> Dec 10 00:09:21 cat kernel: scsi 4:0:0:0: rejecting I/O to dead device
> Dec 10 00:09:21 cat kernel: end_request: I/O error, dev sdb, sector 146496512
> Dec 10 00:09:21 cat kernel: md: super_written gets error=-5, uptodate=0
> Dec 10 00:09:21 cat kernel: raid1: Disk failure on sdb1, disabling device.
> Dec 10 00:09:21 cat kernel: raid1: Operation continuing on 1 devices.
> Dec 10 00:09:21 cat mdadm[1995]: Fail event detected on md device /dev/md1, component device /dev/sdb2
> Dec 10 00:09:21 cat kernel: RAID1 conf printout:
> Dec 10 00:09:21 cat kernel:  --- wd:1 rd:2
> Dec 10 00:09:21 cat kernel:  disk 0, wo:0, o:1, dev:sda1
> Dec 10 00:09:21 cat kernel:  disk 1, wo:1, o:0, dev:sdb1
> Dec 10 00:09:21 cat kernel: RAID1 conf printout:
> Dec 10 00:09:21 cat kernel:  --- wd:1 rd:2
> Dec 10 00:09:21 cat kernel:  disk 0, wo:0, o:1, dev:sda1
> Dec 10 00:10:21 cat mdadm[1995]: Fail event detected on md device /dev/md0
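
Back on Linux, the firmware version can presumably also be read from the "Firmware Version" line of smartctl -i, in the format quoted above. A small sketch; the helper name firmware_version is made up here, the sample text is canned from the quoted output, and VA111680 is the version the update should leave you with:

```shell
#!/bin/sh
# firmware_version: read `smartctl -i` output on stdin and print the value
# of its "Firmware Version:" line. The helper name is an assumption.
firmware_version() {
    sed -n 's/^Firmware Version:[[:space:]]*//p'
}

# Canned sample in the format quoted above; on a real system you would
# run (as root):  smartctl -i /dev/sdb | firmware_version
sample='Device Model: Maxtor 7V300F0
Firmware Version: VA111630'

ver=$(printf '%s\n' "$sample" | firmware_version)
if [ "$ver" = "VA111680" ]; then
    echo "firmware is up to date ($ver)"
else
    echo "firmware is $ver, expected VA111680"
fi
```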