** Description changed:

  I appear to have stumbled upon a bug in the kernel that can, in certain
  circumstances, both cause the kernel boot to get stuck in an endless loop
  and possibly damage the IDE drives over time (based on experience).

  Using Edgy Eft Desktop Live CD, preparing to install to an existing
  Windows system. This probably occurs during an installed system boot
  too, but I've not got that far as yet.

  Scenario:

- PC with a Promise FastTrak TX2000 SoftRAID controller and 4x 60GB IDE parallel ATA drives configured as RAID 10 (Mirror + Stripe) to provide one logical 120GB drive.
- The PC already has Windows 2003 Server installed and booting from the RAID 10, with 2 NTFS partitions.
+ PC with a Promise FastTrak TX2000 SoftRAID controller and 4x 60GB IDE parallel ATA drives configured as RAID 1+0 (Mirror + Stripe) to provide one logical 120GB drive.
+ The PC already has Windows 2003 Server installed and booting from the RAID 1+0, with 2 NTFS partitions.

  I wanted to shrink the 2nd partition to make room to install Ubuntu 6.10
  from the Live CD.

  See my Ubuntu forums article for a detailed explanation of my
  experience:

  http://www.ubuntuforums.org/showthread.php?p=1958918

  Bug:

  When booting Edgy from the CD the kernel loads the Promise FastTrak
  controller module "pdc202xx" and then probes each of the connected IDE
  hard drives (for a partition table?). dmraid is not loaded at this
  point, so it is not dealing with the logical drive.
+ 
+ The RAID 1+0 120GB logical drive consists of hde+hdf mirrored to
+ hdg+hdh, with the partition table on hde and hdg.

  Large drives use LBA addressing to overcome the CHS limitations of
  partition tables.

  If the probe finds a partition table on any drive, it then tries to seek
  to the starting sector of each partition (presumably to read its
  boot-sector system-id byte?), and also tries to seek into the last few
  sectors of the partition (looking for a superblock?).

  On a RAID 0 array where the striping causes the partition table to
  represent a larger logical drive, the starting and ending sector numbers
  of some partitions are beyond the end of the physical drive the
  partition table is written on.

  This causes the Disk Read Errors reported here.

  The fix would be for the probe to compare the physical number of
  cylinders reported by the drive (as seen by e.g. fdisk /dev/hde or fdisk
  /dev/hdg) to the starting/ending sector numbers for the LBA device.

  If the entries in the partition table are beyond the end of the physical
  disk, the probe should handle the situation gracefully. (This could
  potentially be used as a cue to auto-load dmraid.)

  Once dmraid is loaded, "fdisk /dev/mapper/raidarrayname" shows the
  correct total number of logical sectors.

  -------- Short extract of repetitive disk errors - usually there are hundreds or thousands ------

  PDC202XX: Primary channel reset.
  ide2: reset: success
  hde: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
  hde: task_in_intr: error=0x04 { DriveStatusError }
  ide: failed opcode was: unknown
  end_request: I/O error, dev hde, sector 238276076
  printk: 8 messages suppressed.
  Buffer I/O error on device hde2, logical block 47279294
  hde: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
  hde: task_in_intr: error=0x04 { DriveStatusError }
  ide: failed opcode was: unknown
  hde: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
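
As a rough illustration of the bounds check proposed in the description, here is a minimal Python sketch. It is not kernel code and not the probe's actual logic. The function name `partitions_beyond_disk`, the 512-byte sector size, the 60 GB physical capacity and the partition start/length figures are all assumptions made up for the example; only the idea of comparing partition table entries against the physical sector count comes from the report.

```python
SECTOR_SIZE = 512  # assumed bytes per sector


def partitions_beyond_disk(physical_sectors, partitions):
    """Return names of partitions whose LBA range does not fit on the disk.

    physical_sectors: sector count reported by the drive itself.
    partitions: iterable of (name, start_lba, sector_count) tuples.
    """
    out_of_range = []
    for name, start_lba, sector_count in partitions:
        end_lba = start_lba + sector_count - 1  # last sector of the partition
        if start_lba >= physical_sectors or end_lba >= physical_sectors:
            out_of_range.append(name)
    return out_of_range


# Hypothetical figures: one 60 GB member drive carrying a partition table
# that describes the 120 GB logical array.
physical_sectors = 60_000_000_000 // SECTOR_SIZE
table = [
    ("hde1", 63, 100_000_000),           # fits on the physical drive
    ("hde2", 100_000_063, 134_000_000),  # runs past the physical end
]

suspect = partitions_beyond_disk(physical_sectors, table)
if suspect:
    # A real probe could skip the seeks here instead of retrying the reads.
    print("partition table exceeds physical disk:", ", ".join(suspect))
```

Under the same assumed 60 GB capacity, the failing sector in the log extract (238276076) is also well past the last physical sector, which is consistent with the explanation given in the description.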

--
Disk Read Errors during boot-time probe of physical softRAID drives
https://launchpad.net/bugs/77734

--
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
