Re: [WARNING: A/V UNSCANNABLE][Merge tag 'media/v4.11-1' of git] ff58d005cd: BUG: unable to handle kernel NULL pointer dereference at 0000039c

2017-02-28 Thread Thomas Gleixner
On Mon, 27 Feb 2017, Linus Torvalds wrote: > So I don't disagree that in a perfect world all drivers should just > handle it. It's just that it's not realistic. > > The fact that we have now *twice* gotten an oops report or a "this > machine doesn't boot" report etc within a week or so of merging

Re: [WARNING: A/V UNSCANNABLE][Merge tag 'media/v4.11-1' of git] ff58d005cd: BUG: unable to handle kernel NULL pointer dereference at 0000039c

2017-02-27 Thread Ingo Molnar
* Linus Torvalds wrote: > In other words: what will happen is that distros start getting bootup problem > reports six months or a year after we've done it, and *if* they figure out > it's > the irq enabling, they'll disable it, because they have no way to solve

Re: [WARNING: A/V UNSCANNABLE][Merge tag 'media/v4.11-1' of git] ff58d005cd: BUG: unable to handle kernel NULL pointer dereference at 0000039c

2017-02-27 Thread Linus Torvalds
On Mon, Feb 27, 2017 at 7:41 AM, Ingo Molnar wrote: > > BTW., instead of trying to avoid the scenario, how about moving in the other > direction: making CONFIG_DEBUG_SHIRQ=y unconditional property in the IRQ core > code > starting from v4.12 or so The problem is that it's

Re: [WARNING: A/V UNSCANNABLE][Merge tag 'media/v4.11-1' of git] ff58d005cd: BUG: unable to handle kernel NULL pointer dereference at 0000039c

2017-02-27 Thread Thomas Gleixner
On Mon, 27 Feb 2017, Tony Lindgren wrote: > * Ingo Molnar [170227 07:44]: > > Because it's not the requirement that hurts primarily, but the resulting > > non-determinism and the sporadic crashes. Which can be solved by making the > > race > > deterministic via the debug

Re: [WARNING: A/V UNSCANNABLE][Merge tag 'media/v4.11-1' of git] ff58d005cd: BUG: unable to handle kernel NULL pointer dereference at 0000039c

2017-02-27 Thread Thomas Gleixner
On Mon, 27 Feb 2017, Ingo Molnar wrote: > * Thomas Gleixner wrote: > > > The pending interrupt issue happens, at least on my test boxen, mostly on > > the 'legacy' interrupts (0 - 15). But even the IOAPIC interrupts >=16 > > happen occasionally. > > > > > > - Spurious

Re: [WARNING: A/V UNSCANNABLE][Merge tag 'media/v4.11-1' of git] ff58d005cd: BUG: unable to handle kernel NULL pointer dereference at 0000039c

2017-02-27 Thread Tony Lindgren
* Thomas Gleixner [170227 08:20]: > On Mon, 27 Feb 2017, Tony Lindgren wrote: > > * Ingo Molnar [170227 07:44]: > > > Because it's not the requirement that hurts primarily, but the resulting > > > non-determinism and the sporadic crashes. Which can be

Re: [WARNING: A/V UNSCANNABLE][Merge tag 'media/v4.11-1' of git] ff58d005cd: BUG: unable to handle kernel NULL pointer dereference at 0000039c

2017-02-27 Thread Tony Lindgren
* Ingo Molnar [170227 07:44]: > > * Thomas Gleixner wrote: > > > The pending interrupt issue happens, at least on my test boxen, mostly on > > the 'legacy' interrupts (0 - 15). But even the IOAPIC interrupts >=16 > > happen occasionally. > > > > > > -

Re: [WARNING: A/V UNSCANNABLE][Merge tag 'media/v4.11-1' of git] ff58d005cd: BUG: unable to handle kernel NULL pointer dereference at 0000039c

2017-02-27 Thread Ingo Molnar
* Thomas Gleixner wrote: > The pending interrupt issue happens, at least on my test boxen, mostly on > the 'legacy' interrupts (0 - 15). But even the IOAPIC interrupts >=16 > happen occasionally. > > > - Spurious interrupts on IRQ7, which are triggered by IRQ 0 (PIT/HPET).

Re: [WARNING: A/V UNSCANNABLE][Merge tag 'media/v4.11-1' of git] ff58d005cd: BUG: unable to handle kernel NULL pointer dereference at 0000039c

2017-02-27 Thread Thomas Gleixner
On Mon, 27 Feb 2017, Thomas Gleixner wrote: > On Sat, 25 Feb 2017, Linus Torvalds wrote: > > There are several things that set IRQS_PENDING, ranging from "try to > > test mis-routed interrupts while irqd was working", to "prepare for > > suspend losing the irq for us", to "irq auto-probing uses it

Re: [WARNING: A/V UNSCANNABLE][Merge tag 'media/v4.11-1' of git] ff58d005cd: BUG: unable to handle kernel NULL pointer dereference at 0000039c

2017-02-27 Thread Thomas Gleixner
On Sat, 25 Feb 2017, Linus Torvalds wrote: > On Sat, Feb 25, 2017 at 1:07 AM, Ingo Molnar wrote: > > > > So, should we revert the hw-retrigger change: > > > > a9b4f08770b4 x86/ioapic: Restore IO-APIC irq_chip retrigger callback > > > > ... until we managed to fix

Re: [WARNING: A/V UNSCANNABLE][Merge tag 'media/v4.11-1' of git] ff58d005cd: BUG: unable to handle kernel NULL pointer dereference at 0000039c

2017-02-25 Thread Linus Torvalds
On Sat, Feb 25, 2017 at 1:07 AM, Ingo Molnar wrote: > > So, should we revert the hw-retrigger change: > > a9b4f08770b4 x86/ioapic: Restore IO-APIC irq_chip retrigger callback > > ... until we managed to fix CONFIG_DEBUG_SHIRQ=y? If you'd like to revert it > upstream straight

Re: [WARNING: A/V UNSCANNABLE][Merge tag 'media/v4.11-1' of git] ff58d005cd: BUG: unable to handle kernel NULL pointer dereference at 0000039c

2017-02-25 Thread Sean Young
On Fri, Feb 24, 2017 at 11:15:51AM -0800, Linus Torvalds wrote: > Added more relevant people. I've debugged the immediate problem below, > but I think there's another problem that actually triggered this. > > On Fri, Feb 24, 2017 at 10:28 AM, kernel test robot > wrote: >

Re: [WARNING: A/V UNSCANNABLE][Merge tag 'media/v4.11-1' of git] ff58d005cd: BUG: unable to handle kernel NULL pointer dereference at 0000039c

2017-02-25 Thread Ingo Molnar
* Linus Torvalds wrote: > I'm pretty sure that the thing that triggered this is once more commit > a9b4f08770b4 ("x86/ioapic: Restore IO-APIC irq_chip retrigger > callback") which seems to retrigger stale irqs that simply should not > be retriggered. > > They

Re: [WARNING: A/V UNSCANNABLE][Merge tag 'media/v4.11-1' of git] ff58d005cd: BUG: unable to handle kernel NULL pointer dereference at 0000039c

2017-02-24 Thread Linus Torvalds
Added more relevant people. I've debugged the immediate problem below, but I think there's another problem that actually triggered this. On Fri, Feb 24, 2017 at 10:28 AM, kernel test robot wrote: > > 0day kernel testing robot got the below dmesg and the first bad commit