On Wed, 2015-02-18 at 15:45 -0400, Julian Margetson wrote:
On 2/15/2015 8:18 PM, Michael Ellerman wrote:

> On Sun, 2015-02-15 at 08:16 -0400, Julian Margetson wrote:
> > Hi
> >
> > I am unable to get any kernel beyond the 3.16 branch working on an
> > Acube Sam460ex (AMCC 460ex based motherboard). Kernels up to
> > 3.16.7-ckt6 work.
> Does reverting b0345bbc6d09 change anything?
>
> > [    6.364350] snd_hda_intel 0001:81:00.1: enabling device (0000 -> 0002)
> > [    6.453794] snd_hda_intel 0001:81:00.1: ppc4xx_setup_msi_irqs: fail mapping irq
> > [    6.487530] Unable to handle kernel paging request for data at address 0x0fa06c7c
> > [    6.495055] Faulting instruction address: 0xc032202c
> > [    6.500033] Vector: 300 (Data Access) at [efa31cf0]
> > [    6.504922]     pc: c032202c: __reg_op+0xe8/0x100
> > [    6.509697]     lr: c0014f88: msi_bitmap_free_hwirqs+0x50/0x94
> > [    6.515600]     sp: efa31da0
> > [    6.518491]    msr: 21000
> > [    6.521112]    dar: fa06c7c
> > [    6.523915]  dsisr: 0
> > [    6.526190]   current = 0xef8bab00
> > [    6.529603]     pid   = 115, comm = kworker/0:1
> > [    6.534163] enter ? for help
> > [    6.537054] [link register   ] c0014f88 msi_bitmap_free_hwirqs+0x50/0x94
> > [    6.543811] [efa31da0] c0014f78 msi_bitmap_free_hwirqs+0x40/0x94 (unreliable)
> > [    6.551001] [efa31dc0] c001aee8 ppc4xx_setup_msi_irqs+0xac/0xf4
> > [    6.556973] [efa31e00] c03503a4 pci_enable_msi_range+0x1e0/0x280
> > [    6.563032] [efa31e40] f92c2f74 azx_probe_work+0xe0/0x57c [snd_hda_intel]
> > [    6.569906] [efa31e80] c0036344 process_one_work+0x1e8/0x2f0
> > [    6.575627] [efa31eb0] c003677c worker_thread+0x2f4/0x438
> > [    6.581079] [efa31ef0] c003a3e4 kthread+0xc8/0xcc
> > [    6.585844] [efa31f40] c000aec4 ret_from_kernel_thread+0x5c/0x64
> > [    6.591910] mon>  <no input ...>

Managed to do a third git bisect, with the following results.

Great work.

git bisect bad
9279d3286e10736766edcaf815ae10e00856e448 is the first bad commit
commit 9279d3286e10736766edcaf815ae10e00856e448
Author: Rasmus Villemoes <li...@rasmusvillemoes.dk>
Date:   Wed Aug 6 16:10:16 2014 -0700

    lib: bitmap: change parameter of bitmap_*_region to unsigned

    Changing the pos parameter of __reg_op to unsigned allows the compiler
    to generate slightly smaller and simpler code.  Also update its callers
    bitmap_*_region to receive and pass unsigned int.  The return types of
    bitmap_find_free_region and bitmap_allocate_region are still int to
    allow a negative error code to be returned.  An int is certainly capable
    of representing any realistic return value.

So that looks feasible as the culprit.

Looking at the 4xx MSI code, it just looks wrong:

static int ppc4xx_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
{
        ...

        list_for_each_entry(entry, &dev->msi_list, list) {
                int_no = msi_bitmap_alloc_hwirqs(&msi_data->bitmap, 1);
                if (int_no >= 0)
                        break;

That's backward: a *negative* return indicates an error.

                if (int_no < 0) {
                        pr_debug("%s: fail allocating msi interrupt\n",
                                        __func__);
                }

This is the correct check, but it just prints a warning and then continues,
which is not going to work.

                virq = irq_of_parse_and_map(msi_data->msi_dev, int_no);

This will fail if int_no is negative.

                if (virq == NO_IRQ) {
                        dev_err(&dev->dev, "%s: fail mapping irq\n", __func__);
                        msi_bitmap_free_hwirqs(&msi_data->bitmap, int_no, 1);

And so here we can pass a negative int_no to the free routine, which then oopses.

                        return -ENOSPC;
                }


So the bug is in the 4xx MSI code and has always been there. In fact, I don't
see how that code has *ever* worked. The commit you bisected to just turned the
existing bug into an oops.

Can you try this?

diff --git a/arch/powerpc/sysdev/ppc4xx_msi.c b/arch/powerpc/sysdev/ppc4xx_msi.c
index 6e2e6aa378bb..effb5b878a78 100644
--- a/arch/powerpc/sysdev/ppc4xx_msi.c
+++ b/arch/powerpc/sysdev/ppc4xx_msi.c
@@ -95,11 +95,9 @@ static int ppc4xx_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
        list_for_each_entry(entry, &dev->msi_list, list) {
                int_no = msi_bitmap_alloc_hwirqs(&msi_data->bitmap, 1);
-               if (int_no >= 0)
-                       break;
                if (int_no < 0) {
-                       pr_debug("%s: fail allocating msi interrupt\n",
-                                       __func__);
+                       pr_warn("%s: fail allocating msi interrupt\n", __func__);
+                       return -ENOSPC;
                }
                virq = irq_of_parse_and_map(msi_data->msi_dev, int_no);
                if (virq == NO_IRQ) {

cheers


Can also confirm the patch works with kernel 3.18.7.



