On Sat, Nov 12, 2011 at 7:15 AM, Lukas Razik <[email protected]> wrote:
> But I also get this BUG message from the kernel:
> ---
> [ 9305.698663] swap_free: Bad swap file entry 100005e000061800
> [ 9305.698791] BUG: Bad page map in process ibv_devinfo  pte:bc0000c300104848 pmd:00f38054
> [ 9305.698908] addr:fffff80100114000 vm_flags:000844fa anon_vma:          (null) mapping:fffff807f313a410 index:6180082
> [ 9305.699087] vma->vm_file->f_op->mmap: ib_uverbs_mmap+0x8/0x38 [ib_uverbs]
> [ 9305.699135] Call Trace:
> [ 9305.699174]  [00000000004cd558] unmap_vmas+0x514/0x7f4
> [ 9305.699302]  [00000000004d1274] unmap_region+0xb4/0x164
> [ 9305.699383]  [00000000004d22c0] do_munmap+0x2a8/0x31c
> [ 9305.699467]  [000000000042d350] SyS_64_munmap+0x88/0xa8
> [ 9305.699550]  [0000000000406154] linux_sparc_syscall+0x34/0x44

Hi Sparc Linux hackers!

This message is coming from someone using a ConnectX (mlx4) adapter on
a Sun T5120 (SPARC64).

I'm wondering if the bug is in the mlx4 code or in the sparc
io_remap_pfn_range() code.  From the "Bad page map" pte value, if I
understand sparc mm correctly, the PTE is seen as not present.
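
For anyone who wants to check that reading of the pte value, here is a
rough userspace sketch that tests individual bits in the logged PTE.
The two masks are assumptions on my part (what I believe the sun4v
"present" and "valid" bits to be); please verify them against
arch/sparc/include/asm/pgtable_64.h before trusting the result.  The
helper name pte_has_bits is made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * ASSUMPTION: these masks are what I believe the sun4v software
 * "present" bit and the hardware "valid" bit to be.  They are not
 * taken from the report; check pgtable_64.h for the real values.
 */
#define ASSUMED_PAGE_PRESENT_4V 0x0000000000000010ULL
#define ASSUMED_PAGE_VALID      0x8000000000000000ULL

/* PTE value copied from the "Bad page map" line in the log above. */
#define LOGGED_PTE              0xbc0000c300104848ULL

/* Return true only if every bit in mask is set in pte. */
static bool pte_has_bits(uint64_t pte, uint64_t mask)
{
        return (pte & mask) == mask;
}
```

If those masks are right, pte_has_bits(LOGGED_PTE, ASSUMED_PAGE_PRESENT_4V)
is false while the valid bit is set, which would match the PTE being
treated as not present.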

The mapping is coming from the code in drivers/infiniband/hw/mlx4/main.c:

static int mlx4_ib_mmap(struct ib_ucontext *context, struct vm_area_struct *vma)
{
        struct mlx4_ib_dev *dev = to_mdev(context->device);

        if (vma->vm_end - vma->vm_start != PAGE_SIZE)
                return -EINVAL;

        if (vma->vm_pgoff == 0) {
                vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

                if (io_remap_pfn_range(vma, vma->vm_start,
                                       to_mucontext(context)->uar.pfn,
                                       PAGE_SIZE, vma->vm_page_prot))
                        return -EAGAIN;
        } else if (vma->vm_pgoff == 1 && dev->dev->caps.bf_reg_size != 0) {
                vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);

                if (io_remap_pfn_range(vma, vma->vm_start,
                                       to_mucontext(context)->uar.pfn +
                                       dev->dev->caps.num_uars,
                                       PAGE_SIZE, vma->vm_page_prot))
                        return -EAGAIN;
        } else
                return -EINVAL;

        return 0;
}

which is just taking one page -- the pfn comes from the PCI BAR
resource value in

int mlx4_uar_alloc(struct mlx4_dev *dev, struct mlx4_uar *uar)
{
        uar->index = mlx4_bitmap_alloc(&mlx4_priv(dev)->uar_table.bitmap);
        if (uar->index == -1)
                return -ENOMEM;

        uar->pfn = (pci_resource_start(dev->pdev, 2) >> PAGE_SHIFT) + uar->index;
        uar->map = NULL;

        return 0;
}

and calling io_remap_pfn_range on it.
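
To make the pfn arithmetic concrete, here is a small standalone sketch
of the same computation done outside the kernel.  The function name
uar_pfn and the numbers in the usage note are hypothetical; only the
formula (BAR start shifted right by the page shift, plus the UAR
index) comes from mlx4_uar_alloc() above.

```c
#include <stdint.h>

/*
 * Mirror of the pfn computation in mlx4_uar_alloc(): convert the
 * BAR's starting bus address to a page frame number, then step
 * forward by one page per UAR index.  page_shift is passed in
 * because it differs between architectures.
 */
static uint64_t uar_pfn(uint64_t bar_start, unsigned int page_shift,
                        uint32_t uar_index)
{
        return (bar_start >> page_shift) + uar_index;
}
```

For example, with a made-up BAR start of 0x100000, a page shift of 13
(8K pages, which I am assuming for sparc64) and index 3, this gives
0x80 + 3 = 0x83.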

Is there some VMA flag we're forgetting to set that only affects sparc?
Or could this possibly be a bug in the sparc handling of PFN maps?

Thanks!
  Roland