I have tested kernels 3.16.0-031600rc1-generic and 3.2.60-030260-generic. On the former the problem does not appear; on the latter the bug is replicated, with symptoms similar to those on 3.2.0-64. I used a flash drive with a vanilla Ubuntu 12.04 desktop install for all tests.

To summarize the kernels tested so far:

Good kernels: 3.2.0-63, 3.16.0-031600rc1
Bad kernels: 3.2.0-64, 3.2.60-030260

I also tested this issue on three additional machines, and the results were the same. So I now have five different hardware configurations (including one from bug 1330530) that are affected by this problem and show very similar symptoms. In fact, I was not able to find a computer that would not replicate this regression. If we also take into account Bard Hemmer's hardware, we can reasonably conclude that the issue is not related to motherboard/chipset/CPU/BIOS. It is, however, related to the HighPoint RocketU 1144C add-in adapter that I used in all my tests.

I would like to note that the symptoms are similar on various hardware, but not identical. The errors are generally similar (xhci, udev, modprobe), but it appears that timing differences cause the issue to occur at different parts of the boot process, depending on the hardware. So far I have seen:

1. Dropping to the initramfs shell in the middle of the boot ("Gave up waiting for root device. ... ALERT! [boot drive] does not exist! Dropping to shell!")

2. An error loop preventing the system from booting (as described in this report). In this case I am not sure whether this is an infinite loop, or whether the system would boot after a long delay.

3. Boot is delayed by 18 minutes, during which time numerous errors are thrown. After 18 minutes, the OS boots fine.

4. The system boots to a text console rather than the graphical login screen. It is possible to log on to the console. Within seconds, xhci and/or udev errors start appearing in the syslog. After two minutes, the screen goes blank, and the console seems unresponsive for another 16 minutes. Following that, the graphical login screen appears, and from this point the system behaves fine.

5. As in 4, but after two minutes in the text console, an incomplete graphical login screen appears. The password box is missing and the background is not fully loaded. After another 16 minutes, the login screen loads the missing parts, and the system behaves OK. In this case it is possible to switch between the text and graphical consoles during these 16 minutes, but the graphical console becomes an empty purple screen after the switch.

It is also worth noting that the symptoms are highly dependent on the external device(s) attached to the RocketU's ports. Here is a summary:

1. No device connected to the RocketU adapter - no problems during boot
2. USB3 flash drives (tested two models) - no problems during boot
3. Areca ARC-5040 enclosure - bug is triggered
4. WD MyPassport 2TB USB 3.0 drive - bug is triggered
5. Transcend USB 3.0 SD card reader (TS-RDF5K) - bug is triggered with different symptoms: only a small delay (~15 seconds) and a small number of xhci errors occur during boot, but the device does not work once the OS is fully booted.

All of the above devices work fine with "good" kernels.

Note that I tested three RocketU controllers and five Areca enclosures, to rule out the possibility of a hardware problem on these devices. With a variety of hardware reliably triggering the bug on "bad" kernels while working fine with "good" kernels, I think it is well substantiated that this regression is not hardware-dependent (apart from the RocketU controller).

I am changing tags as Christopher requested in comment #13, but I would like to ask that this bug be marked as a duplicate of bug 1330530. That would allow me to debug the issue on my test machines, which would be substantially easier than doing it on production servers. I would prefer not to touch these servers until the fix is released and verified on test computers.

** Tags added: kernel-fixed-upstream kernel-fixed-upstream-3.16

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1328984

Title:
  [Dell PowerEdge R510] Regression: Kernel 3.2.0-64 fails to boot with USB3 controller card

Status in “linux” package in Ubuntu:
  Incomplete

Bug description:

A routine system update of Ubuntu 12.04 LTS to kernel 3.2.0-64 resulted in an unbootable system on two machines. Further testing revealed that the kernel fails while initializing the HighPoint RocketU 1144C USB 3.0 controller. This is a PCIe x4 add-in card that contains four USB 3.0 ports, each equipped with its own controller. The card did and does work without any problems with kernel 3.2.0-63 and earlier. Prior to installing kernel 3.2.0-64 there were neither hardware nor software problems with either of the machines.

Steps to reproduce:
  apt-get dist-upgrade
  sync
  reboot

Result: the system fails to boot. The workaround is to revert to kernel 3.2.0-63 or to remove the RocketU card.

Hardware description (same on both machines):
  Dell PowerEdge R510
  PERC6/i RAID controller
  64GB RAM DDR3 ECC registered
  Dual CPU: Intel Xeon X5660 2.80GHz
  HighPoint RocketU 1144C 4-Port USB 3.0 PCIe 2.0 x4 HBA

Operating system (identical on both machines):
  Ubuntu 12.04.4 LTS
  Linux 3.2.0-64-generic x86_64

Drives:
  sda - logical drive on PERC6/i, OS
  sdb - logical drive on PERC6/i, data
  sdc - Areca 5040 external RAID connected by USB3 to RocketU card
  sdd - Areca 5040 external RAID connected by USB3 to RocketU card
  sde - Areca 5040 external RAID connected by USB3 to RocketU card

Symptoms:

The system boots normally until initialization of the Areca drives connected to the RocketU card. The following messages are displayed on screen when booting without the quiet option and with the debug option. These are the last messages of the "typical" part of the boot sequence; they are followed by a ~2 minute lag during which no messages are displayed.

[Please note that no trace of the boot progress gets recorded in system logs, and messages on screen scroll very fast. I had to record the boot progress with a high framerate camera, and even so some messages scrolled too fast and were not recorded. The following is a manual transcript of fragments of these videos; please forgive inevitable typos.]

[5.621523] scsi 5:0:0:0: Direct-Access Areca Areca5 PQ: 0 ANSI: 5
[5.622896] sd 5:0:0:0: Attached scsi generic sg4 type 0
[5.623230] sd 5:0:0:0: [sdc] Very big device. Trying to use READ CAPACITY(16).
[5.623668] sd 5:0:0:0: [sdc] 41015622144 512-byte logical blocks: (20.9 TB/19.0 TiB)
[5.741152] scsi 6:0:0:0: Direct-Access Areca Areca3 PQ: 0 ANSI: 5
[5.744003] sd 6:0:0:0: Attached scsi generic sg5 type 0
[5.744545] sd 6:0:0:0: [sdd] Very big device. Trying to use READ CAPACITY(16).
[5.744980] sd 6:0:0:0: [sdd] 41015622144 512-byte logical blocks: (20.9 TB/19.0 TiB)
[6.004526] scsi 7:0:0:0: Direct-Access Areca Areca7 PQ: 0 ANSI: 5
[6.006121] sd 7:0:0:0: Attached scsi generic sg6 type 0
[6.006488] sd 7:0:0:0: [sde] Very big device. Trying to use READ CAPACITY(16).
[6.006834] sd 7:0:0:0: [sde] 35156217552 512-byte logical blocks: (17.9 TB/16.3 TiB)
[7.133091] Adding 46874620k swap on /dev/sda3. Priority: -1 extents:1 across 46874620k

After a two minute delay, the following messages appear in an infinite loop. Please note that these messages appear in a somewhat random sequence, and not all messages appear on every boot. The only thing that works at this point is Ctrl-Alt-Delete.

udevd[632]: timeout: killing '/sbin/modprobe -bv acpi:ACPI000D:PMP0C01:' [774]
udevd[703]: timeout: killing '/sbin/modprobe -bv acpi:PMP0C014:' [776]
udevd[529]: timeout: killing '/sbin/modprobe -bv input:b0003v0557p2261e0110-e0,1,2,3,4,k110,111,112,r8,a0,1,m4,lsfw' [1642]
udevd[630]: timeout: killing '/sbin/modprobe -bv serio:ty06pr00id00ex00' [655]
udevd[508]: timeout: killing '/sbin/modprobe -bv pci:v0000808640000342Esv00000000sd00000000bc00sc00i00' [512]
udevd[494]: timeout: killing '/sbin/modprobe -bv input:b0019v0000p0001e0000-r0,1,k74,ramlsfw' [771]
udevd[699]: timeout: killing '/sbin/modprobe -bv dmi:bvnDellInc.:bvr1.12.0:bd07/26/2013:svnDellInc.:pnPowerEdgeR510:pvr:rvnDellInc.:rm00HDP0:rvr002:cvnDellInc.:ct23:cvr:' [708]
udevd[529]: timeout: killing '/sbin/modprobe -bv input:b0003v0557p2261e0110-e0,1,2,3,4,k71,72,73,74,77,80,82,83,85,86,87,88,89,8A,8B,8C,8E,8F,90,96,98,9B,9C,9E,9F,A1,A3,A4,A5,A6,A7,A8,A9,AB,AC,AD,AE,B1,B2,B5,CE,CF,D0,D1,D2,D4,D8,D9,DB,E4,EA,EB,F1,100,161,162,166,16A,16E,172,174,176,178,179,17A,17B,17C,17D,17F,180,182,182,185,188,189,18C,18D,18E,18F,190,191,192,193,195,198,199,19A,1A9,1A1,1A2,1A3,1A4,1A5,1A6,1A7,1A8,1A9,1AA,1AB,1AC,1AD,1AE,1B0,1B1,1B7,1BA,r6,a20,m4,lsfw' [1678]

After pressing Ctrl-Alt-Delete, the above messages continue to appear for a few seconds, and after that the following messages are displayed:

An error occurred while mounting /mnt/sdb.
mountall: mount /mnt/sdb [1785] killed by KILL signal
mountall: Filesystem could not be mounted: /mnt/sdb
* Killing all remaining processes... [Press S to skip mounting or M for manual recovery fail]
rpcbind: rpcbind terminating on signal. Restart with "rpcbind -w"
* Deconfiguring network interfaces [ OK ]
* Deactivating swap... [ OK ]
* Unmounting local filesystems... [ OK ]
* Will now restart
[184.341144] hub 4-0:1.0: hub_port_status failed (err = -110)
[184.341222] hub 4-0:1.0: hub_port_status failed (err = -110)
[201.324536] usb 16-1: device not accepting address 2, error -62
[201.380907] sd 7:0:0:0: [sde] Asking for cache data failed
[201.380980] sd 7:0:0:0: [sde] Assuming drive cache: write through
[201.381767] sd 7:0:0:0: [sde] Asking for cache data failed
[201.381840] sd 7:0:0:0: [sde] Assuming drive cache: write through
[201.382457] sd 7:0:0:0: [sde] Asking for cache data failed
[201.382530] sd 7:0:0:0: [sde] Assuming drive cache: write through
[211.880194] usb 12-1: device not accepting address 2, error -62
[211.936396] sd 6:0:0:0: [sdd] Asking for cache data failed
[211.936466] sd 6:0:0:0: [sdd] Assuming drive cache: write through
[222.435967] usb 10-1: device not accepting address 2, error -62

After the last message the screen goes blank and the machine reboots.

Additional note:

Not sure if this is related, but while looking for existing bug reports, I have found several posts about kernel 3.2.0-64 regressing in USB 3.0 support:

https://bugs.launchpad.net/software-center/+bug/1328883
http://www.linuxquestions.org/questions/linux-software-2/sudden-loss-of-usb-3-0-on-ubuntu-12-04-64-bit-kernel-3-2-0-64-generic-4175507335/

Note about attachments:

Due to kernel 3.2.0-64 not being able to boot, the attached command output was obtained using kernel 3.2.0-63.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1328984/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
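
As a reading aid for the hub and usb messages transcribed in the bug description: the kernel reports these errors as negated Linux errno values, and on Linux 110 is ETIMEDOUT and 62 is ETIME, so both codes are timeouts. The sketch below pulls the timestamp, device and code out of such lines. The names `decode`, `LINE_RE` and `LINUX_ERRNO` are illustrative only and are not part of any existing tool, and the regular expression is an assumption fitted to the two message shapes quoted in this report.

```python
import re

# Linux errno numbers for the two codes seen in the transcript.
# Hard-coded on purpose: other platforms number these errors differently.
LINUX_ERRNO = {
    62: ("ETIME", "Timer expired"),
    110: ("ETIMEDOUT", "Connection timed out"),
}

# Assumed to cover only the two message shapes quoted in the report, e.g.
#   [184.341144] hub 4-0:1.0: hub_port_status failed (err = -110)
#   [201.324536] usb 16-1: device not accepting address 2, error -62
LINE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<dev>(?:hub|usb) \S+?):\s.*?"
    r"(?:err = |error )-(?P<code>\d+)"
)


def decode(line):
    """Return (timestamp, device, code, errno name, description) or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None  # not a hub/usb error line, e.g. the "sd ..." messages
    code = int(m.group("code"))
    name, text = LINUX_ERRNO.get(code, ("unknown", "not in table"))
    return float(m.group("ts")), m.group("dev"), code, name, text


if __name__ == "__main__":
    sample = [
        "[184.341144] hub 4-0:1.0: hub_port_status failed (err = -110)",
        "[201.324536] usb 16-1: device not accepting address 2, error -62",
        "[201.380907] sd 7:0:0:0: [sde] Asking for cache data failed",
    ]
    for entry in sample:
        print(decode(entry))
```

Lines that do not match, such as the "Asking for cache data failed" messages, are returned as None rather than guessed at.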