Joe,

On Mon, 23 Nov 2015, Joe Lawrence wrote:

Nice detective work!

> The irq_desc is in R15: ffff88102a8196f8
>
> This irq_desc is no longer allocated, it's been filled with the slub debug
> poison pattern (hence the spinlock is stuck):
> 
> The irq vector is in RBX: 0000000000000091
>
> But from a custom crash gdb script, the freed irq_desc can still be found
> in CPU 1's vector_irq[145]:
> 
>   cpu[1] vector_irq[145] irq_desc @ 0xffff88102a8196f8
> 
> Sifting through a heavily grepped and abbreviated trace log:
> 
>   - The irq moved from CPU 1, vector 145 to CPU 44, vector 134 
>   - Both CPUs skip cleaning up their vector_irq[] entries for this irq
>     because data->move_in_progress is set (is this normal?)

Yes. The cleanup interrupt is not targetting a particular vector. 

>   - The irq moves again to CPU 34, vector 123
>   - The underlying device is removed @ 593.106514 jiffies
>   - The irq_desc is freed @ 593.239879 jiffies
>   - CPU 1 is last heard from @ 1022.830083 jiffies
> 
>   [001]    22.936764: __assign_irq_vector : cpu 44 : vector=134 -> 
> 0xffff88102a8196f8
>   [044]    35.209945: smp_irq_move_cleanup_interrupt : data->move_in_progress 
> : vector=134 0xffff88102a8196f8
>   [001]    35.212370: smp_irq_move_cleanup_interrupt : data->move_in_progress 
> : vector=145 0xffff88102a8196f8
>   [044]    35.674114: smp_irq_move_cleanup_interrupt : data->move_in_progress 
> : vector=134 0xffff88102a8196f8
>   [001]    35.675395: smp_irq_move_cleanup_interrupt : data->move_in_progress 
> : vector=145 0xffff88102a8196f8
>   [044]    36.265716: smp_irq_move_cleanup_interrupt : data->move_in_progress 
> : vector=134 0xffff88102a8196f8
>   [044]    36.517785: smp_irq_move_cleanup_interrupt : data->move_in_progress 
> : vector=134 0xffff88102a8196f8
>   [001]    36.520115: smp_irq_move_cleanup_interrupt : data->move_in_progress 
> : vector=145 0xffff88102a8196f8
>   [001]    54.636651: smp_irq_move_cleanup_interrupt : data->move_in_progress 
> : vector=145 0xffff88102a8196f8
>   [001]    54.636722: smp_irq_move_cleanup_interrupt : data->move_in_progress 
> : vector=145 0xffff88102a8196f8

So I assume, that no interrupt happened so far.

>   [044]    61.670267: __assign_irq_vector : cpu 34 : vector=123 -> 
> 0xffff88102a8196f8

Now it's moved again.

>   [001]    61.670979: smp_irq_move_cleanup_interrupt : data->move_in_progress 
> : vector=145 0xffff88102a8196f8
>   [044]    61.696120: smp_irq_move_cleanup_interrupt : cpu (this) : 
> vector=134 0xffff88102a8196f8 -> (nil)
                                        
>   [034]    69.086107: smp_irq_move_cleanup_interrupt : vector == 
> data->cfg.vector && ... : vector=123 0xffff88102a8196f8
>   [000]   593.239811: clear_irq_vector : 1 : cpu 34 : vector=123 
> 0xffff88102a8196f8 -> (nil)
>   [000]   593.239879: free_desc : desc @ 0xffff88102a8196f8
>   [001]  1022.830083: smp_irq_move_cleanup_interrupt : IS_ERR_OR_NULL : 
> vector=144 (nil)

Ok. Here is issue. So I assume the following happens:

CPU0         CPU1                  CPU34                CPU44           CPUxx
                                                                        
send_IPI(CLEANUP)
             smp_irq_move_cleanup_interrupt
               sees data->move_in_progress
                                                
                                  device_irq happens
                                  data->move_in_progress = 0
                                  send_IPI(CLEANUP)

             Does not receive another
             cleanup IPI for whatever
             reason .... See below.
                                                        
smp_irq_move_cleanup_interrupt
                                                        vector = NULL   

free_vector on CPU34

Does not clean CPU1 because
data->move_in_progress is not set.

free_desc

                                                                        
send_IPI(CLEANUP)
             smp_irq_move_cleanup_interrupt
               Accesses random memory
 
> Prior to a782a7e46bb5 "x86/irq: Store irq descriptor in vector array", the
> vector_irq array held irq values, those interested in the irq_desc would
> have to perform their own irq_to_desc() (ie, a radix_tree_lookup of the
> irq_desc_tree).  The radix tree is updated in free_desc(), so any
> subsequent lookups would fail.

Yes, that above race has existed forever and the irq_to_desc() check
papered over it. In case that the irq number was reassigned the thing
operated on the wrong descriptor. Pretty bad as well as it fiddles
with the wrong bits. Though it clears the vector ...

The problem is actually in the vector assignment code.

>   [001]    22.936764: __assign_irq_vector : cpu 44 : vector=134 -> 
> 0xffff88102a8196f8

No interrupt happened so far. So nothing cleans up the vector on cpu 1

>   [044]    61.670267: __assign_irq_vector : cpu 34 : vector=123 -> 
> 0xffff88102a8196f8
 
Now that moves it from 44 to 34 and ignores that cpu 1 still has the
vector assigned. __assign_irq_vector unconditionally overwrites
data->old_domain, so the bit of cpu 1 is lost ....

I'm staring into the code to figure out a fix ....

Thanks,

        tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to