Bug#801925: Linux Null Pointer Dereference on boot

2015-12-22 Thread Alexandre Rossi
Hi,

Same here; the last working Debian kernel is linux-image-4.1.0-2-amd64.
The kernel hangs in early boot; after a while it times out and drops to
an initramfs shell for lack of a root device. I managed to capture the
oops using the handy netconsole module (attached).
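
A note on the capture, as a sketch and not necessarily the setup used
here: netconsole sends kernel log lines as UDP datagrams, so any UDP
listener on the receiving machine can record them. The Python below
assumes the target port is 6666 (netconsole's documented default); the
file name netconsole.log and both function names are hypothetical.

```python
import socket


def record_datagrams(sock, out, max_packets=None):
    """Copy UDP datagrams from a bound socket into a binary file object.

    Runs until max_packets datagrams were written, or forever if it is None.
    Returns the number of datagrams written.
    """
    count = 0
    while max_packets is None or count < max_packets:
        data, _addr = sock.recvfrom(65535)  # one datagram per call
        out.write(data)
        out.flush()  # keep the file current in case the sender dies mid-oops
        count += 1
    return count


def main(port=6666, path="netconsole.log"):
    """Listen on all interfaces and append everything received to path."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock, \
            open(path, "ab") as out:
        sock.bind(("0.0.0.0", port))
        record_datagrams(sock, out)
```

Calling main() on the receiving machine before booting the affected
one should leave the oops in netconsole.log; stop it with Ctrl-C.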

> On second thought, it seems more likely that this issue probably was
> _caused_ by that commit.  The fix can be found in these two emails:
>
> http://marc.info/?l=linux-scsi&m=144185206825609&w=2
> http://marc.info/?l=linux-scsi&m=144185208525611&w=2
>
> which have not been merged yet as far as I know even though they were
> submitted back in September.

Those patches do not fix the problem for me. I tested using
debian/bin/test-patches from the Debian Linux source.

Thanks,

Alex
[2.561223] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[2.561719] ata2.00: ACPI cmd 00/00:00:00:00:00:a0 (NOP) rejected by device (Stat=0x51 Err=0x04)
[2.561775] ata2.00: supports DRM functions and may not be fully accessible
[2.565750] ata2.00: disabling queued TRIM support
[2.565758] ata2.00: ATA-9: Crucial_CT256M550SSD1, MU01, max UDMA/133
[2.565762] ata2.00: 500118192 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
[2.570661] ata2.00: supports DRM functions and may not be fully accessible
[2.579029] ata2.00: configured for UDMA/133
[2.905416] ata4: SATA link down (SStatus 0 SControl 300)
[2.910964] BUG: unable to handle kernel NULL pointer dereference at 0008
[2.911135] PGD 0 
[2.911215] Modules linked in: sd_mod(+) crc32c_intel ahci libahci libata xhci_pci scsi_mod xhci_hcd i915 ehci_pci ehci_hcd sdhci_pci sdhci mmc_core i2c_algo_bit drm_kms_helper usbcore usb_common e1000e ptp drm pps_core wmi thermal video button
[2.911552] Hardware name: Hewlett-Packard HP ProBook 6470b/179C, BIOS 68ICE Ver. F.45 10/07/2013
[2.911665] RIP: 0010:[]  [] sd_resume+0xd/0x70 [sd_mod]
[2.911731] RSP: 0018:8801390a3a60  EFLAGS: 00010246
[2.911818] RDX: 0001 RSI: 880139398168 RDI: 880139398168
[2.911918] R10: 81827b86 R11: 000e R12: a0019220
[2.912017] FS:  7fc6c2e168c0() GS:88013fac() knlGS:
[2.912113] CR2: 0008 CR3: 0001390c7000 CR4: 001406e0
[2.912161] Stack:
[2.912247]   81403aae 880139398168 a03f2a20
[2.912382] Call Trace:
[2.912470]  [] ? __rpm_callback+0x2e/0x70
[2.912519]  [] ? scsi_autopm_put_device+0x20/0x20 [scsi_mod]
[2.912619]  [] ? scsi_autopm_put_device+0x20/0x20 [scsi_mod]
[2.912744]  [] ? __pm_runtime_resume+0x47/0x70
[2.912848]  [] ? sd_probe+0x35/0x340 [sd_mod]
[2.912959]  [] ? __driver_attach+0x7b/0x80
[2.913064]  [] ? bus_for_each_dev+0x67/0xb0
[2.913152]  [] ? 0xa0024000
[2.913190]  [] ? driver_register+0x57/0xc0
[2.913312]  [] ? do_one_initcall+0xb2/0x200
[2.913417]  [] ? load_module+0x2173/0x2780
[2.913504]  [] ? kernel_read+0x4b/0x70
[2.913623]  [] ? system_call_fast_compare_end+0xc/0x67
[2.914161] RIP  [] sd_resume+0xd/0x70 [sd_mod]
[2.914212]  RSP 
[2.914239] CR2: 0008
[2.916695] ---[ end trace 6a25c092cd6e126d ]---
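
For reading the trace above: each frame has the form
symbol+offset/size, optionally followed by [module], and a leading "?"
marks an entry the kernel found on the stack but could not confirm as
part of the call chain. A minimal Python sketch for pulling those
frames out of a saved log follows; the names FRAME_RE, Frame and
parse_frames are hypothetical, and the pattern is only assumed to
cover the line shapes seen in this log.

```python
import re
from typing import NamedTuple, Optional

# Matches e.g. "? sd_probe+0x35/0x340 [sd_mod]" anywhere in a log line.
FRAME_RE = re.compile(
    r"(?P<guess>\? )?"
    r"(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+(?P<offset>0x[0-9a-fA-F]+)/(?P<size>0x[0-9a-fA-F]+)"
    r"(?: \[(?P<module>\w+)\])?"
)


class Frame(NamedTuple):
    symbol: str            # function name
    offset: int            # byte offset into the function
    size: int              # total size of the function in bytes
    module: Optional[str]  # owning module, None if not printed
    unreliable: bool       # True when the entry was prefixed with "?"


def parse_frames(log_text):
    """Return one Frame per log line that contains symbol+offset/size."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m is None:
            continue  # not a frame line, or a raw address without a symbol
        frames.append(Frame(
            m["symbol"],
            int(m["offset"], 16),
            int(m["size"], 16),
            m["module"],
            m["guess"] is not None,
        ))
    return frames
```

Run on this log, the frames that come back without the "?" flag are
the sd_resume lines from the RIP entries, which is where the
dereference happened.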


Bug#801925: Linux Null Pointer Dereference on boot

2015-11-10 Thread Erich Schubert
Hi,
A colleague and I are both hit by the same (or a similar) bug.
The NULL pointer dereference here is in sd_resume during early boot,
but it looks very much related to this issue.

The last kernel that boots for me is 4.1.0-2-amd64.
Anything later than that (including linux-image-4.3.0-trunk-amd64
4.3-1~exp1) no longer boots.

So yes, it appears to be broken *since*
49718f0fb8c9 ("SCSI: Fix NULL pointer dereference in runtime PM")
and the patches from Ken Xue may be helpful (untested).

Regards,
Erich