Bug#422217: linux-image-2.6.20-1-686: SCSI disks initialised too late for mdadm
Hi,

just to add one more data point: I ran into the same issue (well, 2.6.20 had other problems, so I turned to http://www.lrz-muenchen.de/~tobiasnadler/linux/kernelcompile.html for some help with 2.6.21), but: adding a delay is sooo bad a solution, it even has serious support (see rootdelay in man initramfs-tools).

What's more, if the root device is not found, the initramfs checks for it ten times a second for 3 minutes before dropping into the debug shell. But the premount scripts (mdadm, lvm, multipath, etc.) are only run once, so this doesn't help in such cases.

I contemplated inventing some udev rules to make the premount stuff (multipath and lvm in my case) happen automatically in the background. That would make the root device appear and trigger a new attempt to mount it. I didn't get to that yet...

--
Cheers,
Feri.
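The gap described above (the root device is polled repeatedly, but the premount step runs only once) can be illustrated with a small retry loop that re-runs a preparation command before every check. This is only a sketch under assumptions: `retry_until_present` and its arguments are hypothetical names, not part of initramfs-tools, and a fractional `sleep` interval depends on the `sleep` implementation available in the initramfs.

```shell
#!/bin/sh
# Sketch only: retry_until_present is a hypothetical helper, not an
# initramfs-tools function.
#
# retry_until_present DEVICE TRIES INTERVAL PREPARE_CMD [ARGS...]
# Runs PREPARE_CMD before each check, so that a disk which shows up late
# still gets assembled. Returns 0 once DEVICE exists, 1 after TRIES
# unsuccessful attempts.
retry_until_present() {
    dev=$1
    tries=$2
    interval=$3
    shift 3
    while [ "$tries" -gt 0 ]; do
        # The prepare step may fail while member disks are still missing.
        "$@" || true
        [ -e "$dev" ] && return 0
        tries=$((tries - 1))
        sleep "$interval"
    done
    return 1
}

# Possible use (device name and timings are assumptions):
#   retry_until_present /dev/md0 180 1 mdadm --assemble --scan
```

Re-running the preparation command on each pass is the design point here: polling for the root device alone cannot succeed if the array or volume group was never assembled.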
Bug#422217: linux-image-2.6.20-1-686: SCSI disks initialised too late for mdadm
I've done my experiment with initramfs-tools - putting a 'sleep 10' before mount_root makes my machine boot the kernel, as I suspected in my original email:

# diff -u /usr/share/initramfs-tools/init{.orig,}
--- /usr/share/initramfs-tools/init.orig 2007-03-07 22:30:42.0 +
+++ /usr/share/initramfs-tools/init 2007-05-11 14:33:55.0 +0100
@@ -145,6 +145,12 @@
 run_scripts /scripts/init-premount
 [ "$quiet" != "y" ] && log_end_msg
 
+#SAB
+log_begin_msg "SAB: slow SCSI disk discovery workaround: sleeping for 10 seconds"
+/bin/sleep 10
+log_end_msg
+#SAB
+
 maybe_break mount
 log_begin_msg "Mounting root file system..."
 . /scripts/${BOOT}

# update-initramfs -k 2.6.20-1-686 -d
# update-initramfs -k 2.6.20-1-686 -c
# update-grub
# shutdown -r now

Boot log captured from the serial-over-LAN console (hence excuse the strange chars):

Begin: Running /scripts/init-premount
ACPI: Processor [CPU2] (supports 8 throttling states)
usbcore: registered newdriver 3.04.03
Copyright (c) 1999-2007 LSI Logic Corporation
e1000: :04:04.0: e1000_probe: (PCI-X:100MHz:64-bit) 00:04:23:c5:10:d6
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
e1000: eth0: e1hdd: Slimtype COMBO SOSC-2483K, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
ACPioc0: 53C1030: Capabilities={Initiator}
scsi0 : ioc0: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=222, IRQ=24
ACPI: PCI Interrupt :02:05.1[B] - GSI 25 (level, low) - IRQ 25
mptbase: Initiating ioc1 bringup
ioc1: 53C1030: Capabilities={Initiator}
scsi 0:0:0:0: Direct-Access SEAGATE ST336754LC 0005 PQ: 0 ANSI: 3
target0:0:0: Beginning Domain Validation
scsi1 : ioc1: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=222, IRQ=25
target0:0:0: Ending Domain Validation
target0:0:0: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RTI WRFLOW PCOMP (6.25 ns, offset 63)
scsi 0:0:1:0: Direct-Access SEAGATE ST336754LC 0005 PQ: 0 ANSI: 3
target0:0:1: Beginning Domain Validation
target0:0:1: Ending Domain Validation
target0:0:1: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RTI WRFLOW PCOMP (6.25 ns, offset 63)
scsi 0:0:2:0: Direct-Acces:2: Beginning Domain Validation
target0:0:2: Ending Domain Validation
target0:0:2: FAST-160 SEAGATE ST336807LC 0C01 PQ: 0 ANSI: 3
target0:0:3: Beginning Domain Validation
ACPI: PCI Interrupt :03:04.0[A] - GSI 24 (level, low) - IRQ 26
e100: eth4: e100_probe: addr 0xdecfe000, irq 26, MAC addr 00:02:B3:B4:3C:15
ACPI: PCI Interrupt :03:05.0[A] - GSI 27 (level, low) - IRQ 27
e100: eth5: e100_probe: addr 0xdecff000, irq 27, MAC addr 00:02:B3:B4:3C:1rive, 2048kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
Done.
Begin: SAB: slow SCSIvery workaround: sleeping for 10 seconds ...
target0:0:3: Ending Domain Validation
target0:0:3: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RTI WRFLOW PCOMP (6.25 ns, offset 63)
scsi 0:0:4:0: Direct-Access SEAGATE ST336754LC 0005 PQ: 0 ANSI: 3
target0:0:4: Beginning Domain Validation
target0:0:4: Ending Domain Validation
target0:0:4: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RTI WRFLOW PCOMP (6.25 ns, offset 63)
scsi 0:0:5:0: Direct-Access SEAGATE ST336754LC 0005 PQ: 0 ANSI: 3
target0:0:5: Beginning Domain Validation
target0:0:5: Ending Domain Validation
target0:0:5: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RTI WRocessor ESG-SHV SCA HSBP M29 1.06 PQ: 0 ANSI: 2
target0:0:6: Beginning Domain ValSCSI device sda: 71687372 512-byte hdwr sectors (36704 MB)
sda: Write Protect is off
SCSI device sda: write cache: enabled, read cache: enabled, supports DPO and FUA
SCSI device sda: 71687372 512-byte hdwr sectors (36704 MB)
sda: Write Protect is off
SCSI device sda: write cache: enabled, read cache: enabled, supports DPO and FUA
sda: sda1 sda2 sda3 sda5 sda6 sda7
sd 0:0:0:0: Attached scsi disk sda
SCSI device sdb: 71687372 512-byte hdwr sectors (36704 MB)
sdb: Write Protect is off
SCSI device sdb: write cache: enabled, read cache: enabled, supports DPO and FUA
SCSIis off
SCSI device sdb: write cache: enabled, read cache: enabled, supports DPO and FUA
sdb: 71687372 512-byte hdwr sectors (36704 MB)
sdc: Write Protect is off
SCSI device sdc: write ca device sdc: write cache: enabled, read cache: enabled, supports DPO and FUA
sdc: unknown part Write Protect is off
SCSI device sdd: write cache: enabled, read cache: enabled, supports DPO ed, supports DPO and FUA
sdd: sdd4
sdd4: bsd:bad subpartition - ignored
bad subpartition -0:3:0: Attached scsi disk sdd
SCSI device sde: 71687372 512-byte hdwr sectors (36704 MB)
sde: 12-byte hdwr sectors (36704 MB)
sde: Write Protect is off
SCSI device sde: write cache: enable2 512-byte hdwr sectors (36704 MB)
sdf: Write Protect is off
SCSI device sdf: write cache: enate cache: enabled, read cache: enabled, supports DPO and FUA
sdf: unknown partition table sd
Done.
Begin: Mounting root file system... ...
Begin: Running /scripts/local-top ...
Begin: Loading Mmd:
Bug#422217: linux-image-2.6.20-1-686: SCSI disks initialised too late for mdadm
This one time, at band camp, Simon A. Boggis said:
> I've done my experiment with initramfs-tools - putting a 'sleep 10' before
> mount_root makes my machine boot the kernel, as I suspected in my original
> email:
>
> # diff -u /usr/share/initramfs-tools/init{.orig,}
> --- /usr/share/initramfs-tools/init.orig 2007-03-07 22:30:42.0 +
> +++ /usr/share/initramfs-tools/init 2007-05-11 14:33:55.0 +0100
> @@ -145,6 +145,12 @@
>  run_scripts /scripts/init-premount
>  [ "$quiet" != "y" ] && log_end_msg
>  
> +#SAB
> +log_begin_msg "SAB: slow SCSI disk discovery workaround: sleeping for 10 seconds"
> +/bin/sleep 10
> +log_end_msg
> +#SAB
> +
>  maybe_break mount
>  log_begin_msg "Mounting root file system..."
>  . /scripts/${BOOT}

Not that I'm involved in this in any real way, but things like hardcoded sleep timeouts always make me uncomfortable - they introduce delays for people who don't need them, and they are racy at best and can still fail for the people who do need them.

Is there some way to use udevsettle or something instead? If not, some method of 'sleep until $disk' seems better than hardcoding it, to me at least.

--
Stephen Gran
Debian user, admin, and developer
http://www.debian.org
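A minimal sketch of the 'sleep until $disk' idea, assuming a plain polling loop with an upper bound is acceptable. `wait_for_path` is a hypothetical helper name and the device paths in the usage comment are placeholders; nothing here is taken from initramfs-tools itself.

```shell
#!/bin/sh
# Sketch only: wait_for_path is a hypothetical helper.
#
# wait_for_path PATH TIMEOUT_SECONDS
# Polls once per second until PATH exists. Returns 0 as soon as PATH
# appears, or 1 if TIMEOUT_SECONDS elapse first.
wait_for_path() {
    path=$1
    remaining=$2
    until [ -e "$path" ]; do
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Possible use in place of a fixed '/bin/sleep 10' (placeholder devices):
#   wait_for_path /dev/sda 30 && wait_for_path /dev/sdb 30
```

Unlike a fixed sleep, this returns immediately on machines whose disks are already present, and the timeout only bounds the worst case.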
Bug#422217: linux-image-2.6.20-1-686: SCSI disks initialised too late for mdadm
Stephen Gran wrote:
> This one time, at band camp, Simon A. Boggis said:
>> I've done my experiment with initramfs-tools - putting a 'sleep 10' before
>> mount_root makes my machine boot the kernel, as I suspected in my original
>> email:
>>
>> # diff -u /usr/share/initramfs-tools/init{.orig,}
>> --- /usr/share/initramfs-tools/init.orig 2007-03-07 22:30:42.0 +
>> +++ /usr/share/initramfs-tools/init 2007-05-11 14:33:55.0 +0100
>> @@ -145,6 +145,12 @@
>>  run_scripts /scripts/init-premount
>>  [ "$quiet" != "y" ] && log_end_msg
>>  
>> +#SAB
>> +log_begin_msg "SAB: slow SCSI disk discovery workaround: sleeping for 10 seconds"
>> +/bin/sleep 10
>> +log_end_msg
>> +#SAB
>> +
>>  maybe_break mount
>>  log_begin_msg "Mounting root file system..."
>>  . /scripts/${BOOT}
>
> Not that I'm involved in this in any real way, but things like hardcoded
> sleep timeouts always make me uncomfortable - they introduce delays for
> people who don't need them, and they are racy at best and can still fail
> for the people who do need them.
>
> Is there some way to use udevsettle or something instead? If not, some
> method of 'sleep until $disk' seems better than hardcoding it, to me at
> least.

I would completely agree with you - it's totally the wrong thing to do - another SCSI card (or more, or slower, devices) could take even longer. The only reason I did it was to prove (as opposed to guess) that the problem really is a race between SCSI becoming ready and mount_root.

This has now been shown to be the case, so the next questions are: what is the cause, and can it be fixed properly? Ideally one would like something like (in pseudo-code):

    if has_scsi:
        start_scsi_in_blocking_mode
    mount_root

or, if it won't block, then:

    if has_scsi:
        start_scsi_in_non-blocking_mode
        wait_until_scsi_ready
    mount_root

It is interesting that the behaviour is different between 2.6.18 and 2.6.20 - this implies either that SCSI blocked in 2.6.18 or that we were just lucky and SCSI initialisation won the race. I haven't had time to work out what might have changed in 2.6.20 yet.

Best wishes,
Simon
Bug#422217: linux-image-2.6.20-1-686: SCSI disks initialised too late for mdadm
Package: linux-image-2.6.20-1-686
Version: 2.6.20-3
Severity: critical
Justification: breaks the whole system

Hi,

I've filed this bug under this package since, although one could argue that initramfs-tools has the problem, the difference appears to be the kernel version.

My machine is configured with a software RAID 0 (mdadm) root filesystem, composed of two SCSI drives. If I run the stock Debian etch linux-image-2.6.18-4-686 (2.6.18.dfsg.1-12etch1) kernel, everything works as expected. If I attempt to boot the linux-image-2.6.20-1-686 (2.6.20-3) kernel from unstable, my system hangs on boot.

Examining captures of the boot process shows that on 2.6.18-4-686 we see (excuse slight hiccups in formatting - imperfect capture from the serial-over-LAN console):

Begin: Running /scripts/init-premount ...
ACPI: Processor [usbcore: registered new driver usbfs
usbcore: registered new driver hub
SCSI subsystem initiale1000: :04:04.0: e1000_probe: (PCI-X:100MHz:64-bit) 00:04:23:c5:10:d6
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; overrid 1999-2005 LSI Logic Corporation
USB Universal Host Controller Interface driver v3.0
e1000: ett :00:1d.0[A] - GSI 16 (level, low) - IRQ 169
uhci_hcd :00:1d.0: UHCI Host Controllerhdd: Slimtype COMBO SOSC-2483K, ATAPI CD/DVD-ROM drive
scsi0 : ioc0: LSI53C1030, FwRev=01032700Vendor: SEAGATE Model: ST336754LC Rev: 0005
Type: Direct-Access ANSI SCSI revision: 03
target0:0:0: Beginning Domain Validation
target0:0:0: Ending Domain Validation
target0:0:0: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RTI WRFLOW PCOMP (6.25 ns, offset 63)
Vendor: SEAGATE Model: ST336754LCRev: 0005
Type: Direct-Access ANSI SCSI revision: 03
target0:0:1: Beginning Domain Validation
target0:0:1: Ending Domain Validation
target0:0:1: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RTI WRFLOW PCOMP (6.25 ns, offset 63)
Vendor: SEAGATE Model: ST336807LCRev: 0C01
Type: Direct-AccessGW9 ANSI SCSI revision: 03
target0:0:3: Beginning Domain Validation
target0:0:3: Ending DomaiFLOW PCOMP (6.25 ns, offset 63)
Vendor: ESG-SHV Model: SCA HSBP M29 Rev: 1.06
Type: Processor ANSI SCSI revision: 02
target0:0:6: Beginning Domain Validation
target0:0:6: Ending Domain Validation
target0:0:6: asynchronous
ACPI: PCI Interrupt :02:05.1[B] - GSI 25 (level, low) - IRQ 66
mptbase: Initiating ioc1 bringup
ioc1: 53C1030: Capabilities={Initiator}
scsi1 : ioc1: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=222, IRQ=66
SCSI device sda: 71687372 512-byte hdwr sectors (36704 MB)
hdd: ATAPI5sda: Write Protect is off
SCSI device sda: drive cache: write back w/ FUA
SCSI device sda: 71687372 512-byte hdwr sectors (36704 MB)
sda: Write Protect is off
SCSI device sda: drive cache: write back w/ FUA
sda: sda1 sda2 sda3 sda5 sda6 sda7
24X5sd 0:0:0:0: Attached scsi disk sda
SCSI device sdb: 71687372 512-byte hdwr sectors (36704 MB)
sdb: Write Protect is off
SCSI device sdb: drive cache: write back w/ FUA
SCSI device sdb: 71687372 512-byte hdwr sectors (36704 MB)
sdb: Write Protect is off
SCSI device sdb: drive c Write Protect is off
SCSI device sdc: drive cache: write back w/ FUA
SCSI device sdc: 7168737vision: 3.20
sdc: sdc4
sdc4: bsd:bad subpartition - ignored
bad subpartition - ignored
bac
Done.
Begin: Mounting root file system... ...
[BOOT CONTINUES]

whereas on linux-image-2.6.20-1-686 we see:

Begin: Running /scripts/init-premount ...
ACPIling states)
ACPI: Processor [CPU2] (supports 8 throttling states)
usbcore: registered new intporation.
ACPI: PCI Interrupt :04:04.0[A] - GSI 54 (level, low) - IRQ 17
ICH5: IDE contrbe irqs later
ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:pio, hdb:pio
ide1: BM-ystem initialized
USB Universal Host Controller Interface driver v3.0
e1000: eth0: e1000_probee COMBO SOSC-2483K, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
ACPI: PCI Interscsi0 : ioc0: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=222, IRQ=25
ACPI: PCI Interrupt :02:05.1[B] - GSI 25 (level, low) - IRQ 27
mptbase: Initiating ioc1 bringup
ioc1: 53C1030: Capabilities={Initiator}
scsi 0:0:0:0: Direct-Access SEAGATE ST336754LC 0005 PQ: 0 ANSI: 3
target0:0:0: Beginning Domain Validation
scsi1 : ioc1: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=222, IRQ=27
target0:0:0: Ending Domain Validation
target0:0:0: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RTI WRFLOW PCOMP (6.25 ns, offset 63)
scsi 0:0:1:0: Direct-Access SEAGATE ST336754LC 0005 PQ: 0 ANSI: 3
target0:0:1: Beginning Domain Validation
target0:0:1: Ending Domain Validation
target0:0:1: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RTI WRFLOW PCOMP (6.25 ns, offset 63)
hdd: ATAPI 24X DVD-ROM CD-R/RW drive, 2048kB Cache, UDMA(33)
Uniform CD-ROM