Hi Thomas

While running a CPU hotplug stress test, I found this message in the log on my machine.

[  267.161043] do_IRQ: 7.33 No irq handler for vector

I added a dump_stack() right after that message and got the following log:

[  267.161043] do_IRQ: 7.33 No irq handler for vector
[  267.161045] CPU: 7 PID: 52 Comm: migration/7 Not tainted 4.15.0-rc7+ #27
[  267.161045] Hardware name: LENOVO 10MLS0E339/3106, BIOS M1AKT22A 06/27/2017
[  267.161046] Call Trace:
[  267.161047]  <IRQ>
[  267.161052]  dump_stack+0x7c/0xb5
[  267.161054]  do_IRQ+0xb9/0xf0
[  267.161056]  common_interrupt+0xa2/0xa2
[  267.161057]  </IRQ>
[  267.161059] RIP: 0010:multi_cpu_stop+0xb0/0x120
[  267.161060] RSP: 0018:ffffbb6c81af7e70 EFLAGS: 00000202 ORIG_RAX: ffffffffffffffde
[  267.161061] RAX: 0000000000000001 RBX: 0000000000000004 RCX: 0000000000000000
[  267.161062] RDX: 0000000000000006 RSI: ffffffff898c4591 RDI: 0000000000000202
[  267.161063] RBP: ffffbb6c826e7c88 R08: ffff991abc1256bc R09: 0000000000000005
[  267.161063] R10: ffffbb6c81af7db8 R11: ffffffff89c91d20 R12: 0000000000000001
[  267.161064] R13: ffffbb6c826e7cac R14: 0000000000000003 R15: 0000000000000000
[  267.161067]  ? cpu_stop_queue_work+0x90/0x90
[  267.161068]  cpu_stopper_thread+0x83/0x100
[  267.161070]  smpboot_thread_fn+0x161/0x220
[  267.161072]  kthread+0xf5/0x130
[  267.161073]  ? sort_range+0x20/0x20
[  267.161074]  ? kthread_associate_blkcg+0xe0/0xe0
[  267.161076]  ret_from_fork+0x24/0x30

The interrupt arrived right after interrupts were re-enabled in multi_cpu_stop.

0xffffffff8112d655 is in multi_cpu_stop (/home/will/u04/source_code/linux-block/kernel/stop_machine.c:223).
218                              */
219                             touch_nmi_watchdog();
220                     }
221             } while (curstate != MULTI_STOP_EXIT);
222     
223             local_irq_restore(flags);
224             return err;
225     }

Vector 33 here is used by an NVMe card.

 124:     616993          0          0          0          0          0          0          0  IR-PCI-MSI 1048576-edge      nvme0q0, nvme0q1
 125:         44          0          0          0          0          0          0          0  IR-PCI-MSI 327680-edge      xhci_hcd
 126:          0     620871          0          0          0          0          0          0  IR-PCI-MSI 1048577-edge      nvme0q2
 127:          0          0     641541          0          0          0          0          0  IR-PCI-MSI 1048578-edge      nvme0q3
 128:          0          0          0     577836          0          0          0          0  IR-PCI-MSI 1048579-edge      nvme0q4
 129:          0          0          0          0     554206          0          0          0  IR-PCI-MSI 1048580-edge      nvme0q5
 130:          0          0          0          0          0     455021          0          0  IR-PCI-MSI 1048581-edge      nvme0q6
 131:          0          0          0          0          0          0     273111          0  IR-PCI-MSI 1048582-edge      nvme0q7
 132:          0          0          0          0          0          0          0     120987  IR-PCI-MSI 1048583-edge      nvme0q8

Here is the irq debugfs output for this interrupt:

handler:  handle_edge_irq
device:   0000:02:00.0
status:   0x00004000
istate:   0x00000000
ddepth:   0
wdepth:   0
dstate:   0x01608200
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_SINGLE_TARGET
            IRQD_MOVE_PCNTXT
            IRQD_AFFINITY_MANAGED
node:     0
affinity: 7
effectiv: 7
pending:  
domain:  INTEL-IR-MSI-1-2
 hwirq:   0x100007
 chip:    IR-PCI-MSI
  flags:   0x10
             IRQCHIP_SKIP_SET_WAKE
 parent:
    domain:  INTEL-IR-1
     hwirq:   0x1a0000
     chip:    INTEL-IR
      flags:   0x0
     parent:
        domain:  VECTOR
         hwirq:   0x84
         chip:    APIC
          flags:   0x0
         Vector:    33
         Target:     7
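
As a cross-check, the "7.33" in the do_IRQ message reads as the CPU number followed by the vector number, which matches "Target: 7" and "Vector: 33" in the debugfs output. Below is a minimal userspace sketch that pulls both fields out of such a log line. The helper name parse_no_handler_msg is hypothetical, and the "do_IRQ: <cpu>.<vector>" layout is assumed from the log line quoted at the top of this mail.

```c
#include <stdio.h>
#include <string.h>

/*
 * Extract the CPU and vector numbers from a log line that contains
 * "do_IRQ: <cpu>.<vector> ...". A leading dmesg timestamp is skipped
 * by searching for the "do_IRQ: " tag first. Text following the two
 * numbers is not checked.
 *
 * Returns 0 on success, -1 if the tag or the two numbers are missing.
 * (Hypothetical helper, for illustration only.)
 */
static int parse_no_handler_msg(const char *line, int *cpu, int *vector)
{
    const char *p = strstr(line, "do_IRQ: ");

    if (p == NULL)
        return -1;

    /* Both conversions must succeed for the line to count as a match. */
    return sscanf(p, "do_IRQ: %d.%d", cpu, vector) == 2 ? 0 : -1;
}
```

Feeding it the line "[  267.161043] do_IRQ: 7.33 No irq handler for vector" should give cpu 7 and vector 33. Separately, the hwirq 0x100007 in the debugfs output is 1048583 in decimal, which is the number shown for nvme0q8 in the interrupt table.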

Thanks
Jianchao
