Re: [ofa-general] iommu dma mapping alignment requirements

2007-12-20 Thread Steve Wise

Benjamin Herrenschmidt wrote:
> On Thu, 2007-12-20 at 13:29 -0600, Steve Wise wrote:
>
>> Or based on the alignment of vaddr actually...
>
> The latter wouldn't be realistic. What I think might be necessary, though
> it would definitely cause us problems with running out of iommu space
> (which is the reason we did the switch down to 4K), is to provide
> alignment to the real page size, and alignment to the allocation order
> for dma_map_consistent.
>
> It might be possible to -tweak- and only provide alignment to the page
> size for allocations that are larger than IOMMU_PAGE_SIZE. That would
> solve the problem with small network packets eating up too much iommu
> space though.
>
> What do you think?

That might work.

If you gimme a patch, I'll try it out!

Steve.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ofa-general] iommu dma mapping alignment requirements

2007-12-20 Thread Benjamin Herrenschmidt

On Thu, 2007-12-20 at 13:29 -0600, Steve Wise wrote:

> Or based on the alignment of vaddr actually...

The latter wouldn't be realistic. What I think might be necessary, though
it would definitely cause us problems with running out of iommu space
(which is the reason we did the switch down to 4K), is to provide
alignment to the real page size, and alignment to the allocation order
for dma_map_consistent.

It might be possible to -tweak- and only provide alignment to the page
size for allocations that are larger than IOMMU_PAGE_SIZE. That would
solve the problem with small network packets eating up too much iommu
space though.

What do you think ?

Ben.
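
A minimal sketch of the tweak proposed above, assuming 4KB IOMMU pages on a 64KB-page kernel. The helper name iommu_align_order() is hypothetical and is not an existing kernel function; it only illustrates the policy of keeping order 0 for small mappings and aligning anything larger than IOMMU_PAGE_SIZE to the host page size.

```c
#include <stddef.h>

/* Assumed values for illustration: 4KB IOMMU pages, 64KB host pages. */
#define IOMMU_PAGE_SHIFT 12
#define IOMMU_PAGE_SIZE  (1UL << IOMMU_PAGE_SHIFT)
#define PAGE_SHIFT       16

/*
 * Hypothetical helper: choose the alignment order, counted in IOMMU
 * pages, for a mapping of len bytes.
 */
static unsigned int iommu_align_order(size_t len)
{
	/* Small mappings (e.g. network packets) stay at order 0. */
	if (len <= IOMMU_PAGE_SIZE)
		return 0;

	/* Larger mappings are aligned to the host page size. */
	return PAGE_SHIFT - IOMMU_PAGE_SHIFT;
}
```

With these assumed values a 1500-byte packet would get order 0, while a full 64KB page would get order 4, that is, alignment to 16 IOMMU pages.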




Re: [ofa-general] iommu dma mapping alignment requirements

2007-12-20 Thread Steve Wise

Steve Wise wrote:
> Roland Dreier wrote:
>>  > It appears that my problem boils down to a single host page of memory
>>  > that is mapped for dma, and the dma address returned by dma_map_sg()
>>  > is _not_ 64KB aligned.  Here is an example:
>>
>>  > My first question is: Is there an assumption or requirement in linux
>>  > that dma_addresses should have the same alignment as the host address
>>  > they are mapped to?  IE the rdma core is mapping the entire 64KB page,
>>  > but the mapping doesn't begin on a 64KB page boundary.
>>
>> I don't think this is explicitly documented anywhere, but it certainly
>> seems that we want the bus address to be page-aligned in this case.
>> For mthca/mlx4 at least, we tell the adapter what the host page size
>> is (so that it knows how to align doorbell pages etc) and I think this
>> sort of thing would confuse the HW.
>>
>>  - R.
>
> In arch/powerpc/kernel/iommu.c:iommu_map_sg() I see that it calls
> iommu_range_alloc() with an alignment_order of 0:
>
>     vaddr = (unsigned long)page_address(s->page) + s->offset;
>     npages = iommu_num_pages(vaddr, slen);
>     entry = iommu_range_alloc(tbl, npages, &handle,
>                               mask >> IOMMU_PAGE_SHIFT, 0);
>
> But perhaps the alignment order needs to be based on the host page size?

Or based on the alignment of vaddr actually...



Re: [ofa-general] iommu dma mapping alignment requirements

2007-12-20 Thread Steve Wise

Roland Dreier wrote:
>  > It appears that my problem boils down to a single host page of memory
>  > that is mapped for dma, and the dma address returned by dma_map_sg()
>  > is _not_ 64KB aligned.  Here is an example:
>
>  > My first question is: Is there an assumption or requirement in linux
>  > that dma_addresses should have the same alignment as the host address
>  > they are mapped to?  IE the rdma core is mapping the entire 64KB page,
>  > but the mapping doesn't begin on a 64KB page boundary.
>
> I don't think this is explicitly documented anywhere, but it certainly
> seems that we want the bus address to be page-aligned in this case.
> For mthca/mlx4 at least, we tell the adapter what the host page size
> is (so that it knows how to align doorbell pages etc) and I think this
> sort of thing would confuse the HW.
>
>  - R.

In arch/powerpc/kernel/iommu.c:iommu_map_sg() I see that it calls
iommu_range_alloc() with an alignment_order of 0:

    vaddr = (unsigned long)page_address(s->page) + s->offset;
    npages = iommu_num_pages(vaddr, slen);
    entry = iommu_range_alloc(tbl, npages, &handle,
                              mask >> IOMMU_PAGE_SHIFT, 0);

But perhaps the alignment order needs to be based on the host page size?


Steve.
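
To illustrate what a non-zero alignment order would change, here is a small sketch. It is not the kernel's allocator: align_entry() and entry_to_dma() are hypothetical helpers, the page shifts are assumed (4KB IOMMU pages, 64KB host pages), and the IOMMU table offset is ignored so that entry n maps directly to bus address n << IOMMU_PAGE_SHIFT.

```c
#include <stdint.h>

#define IOMMU_PAGE_SHIFT 12  /* assumed: 4KB IOMMU pages */
#define PAGE_SHIFT       16  /* assumed: 64KB host pages */

/* Round IOMMU table entry index n up to a multiple of 2^align_order. */
static uint64_t align_entry(uint64_t n, unsigned int align_order)
{
	uint64_t align_mask = (UINT64_C(1) << align_order) - 1;

	return (n + align_mask) & ~align_mask;
}

/* Bus address of table entry n, ignoring any table offset. */
static uint64_t entry_to_dma(uint64_t n)
{
	return n << IOMMU_PAGE_SHIFT;
}
```

Under these assumptions, order 0 leaves a candidate entry such as 0x5bff4 untouched (bus address 0x5bff4000, only 4KB aligned), whereas order PAGE_SHIFT - IOMMU_PAGE_SHIFT = 4 moves it up to 0x5c000, giving the 64KB-aligned bus address 0x5c000000.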


Re: [ofa-general] iommu dma mapping alignment requirements

2007-12-20 Thread Roland Dreier
 > It appears that my problem boils down to a single host page of memory
 > that is mapped for dma, and the dma address returned by dma_map_sg()
 > is _not_ 64KB aligned.  Here is an example:

 > My first question is: Is there an assumption or requirement in linux
 > that dma_addresses should have the same alignment as the host address
 > they are mapped to?  IE the rdma core is mapping the entire 64KB page,
 > but the mapping doesn't begin on a 64KB page boundary.

I don't think this is explicitly documented anywhere, but it certainly
seems that we want the bus address to be page-aligned in this case.
For mthca/mlx4 at least, we tell the adapter what the host page size
is (so that it knows how to align doorbell pages etc) and I think this
sort of thing would confuse the HW.

 - R.


Re: [ofa-general] iommu dma mapping alignment requirements

2007-12-20 Thread Tom Tucker

On Thu, 2007-12-20 at 11:14 -0600, Steve Wise wrote:
> Hey Roland (and any iommu/ppc/dma experts out there):
> 
> I'm debugging a data corruption issue that happens on PPC64 systems 
> running rdma on kernels where the iommu page size is 4KB yet the host 
> page size is 64KB.  This "feature" was added to the PPC64 code recently, 
> and is in kernel.org from 2.6.23.  So if the kernel is built with a 4KB 
> page size, no problems.  If the kernel is prior to 2.6.23 then 64KB page 
> configs work too.  It's just a problem when the iommu page size != host 
> page size.
> 
> It appears that my problem boils down to a single host page of memory 
> that is mapped for dma, and the dma address returned by dma_map_sg() is 
> _not_ 64KB aligned.  Here is an example:
> 
> app registers va 0x2d9a3000 len 12288
> ib_umem_get() creates and maps a umem and chunk that looks like (dumping 
> state from a registered user memory region):
> 
> > umem len 12288 off 12288 pgsz 65536 shift 16
> > chunk 0: nmap 1 nents 1
> > sglist[0] page 0xc0930b08 off 0 len 65536 dma_addr 5bff4000 dma_len 65536
> > 
> 
> So the kernel maps 1 full page for this MR.  But note that the dma 
> address is 5bff4000 which is 4KB aligned, not 64KB aligned.  I 
> think this is causing grief to the RDMA HW.
> 
> My first question is: Is there an assumption or requirement in linux 
> that dma_addresses should have the same alignment as the host address 
> they are mapped to?  IE the rdma core is mapping the entire 64KB page, 
> but the mapping doesn't begin on a 64KB page boundary.
> 
> If this mapping is considered valid, then perhaps the rdma hw is at 
> fault here.  But I'm wondering if this is a PPC/iommu bug.
> 
> BTW:  Here is what the Memory Region looks like to the HW:
> 
> > TPT entry:  stag idx 0x2e800 key 0xff state VAL type NSMR pdid 0x2
> > perms RW rem_inv_dis 0 addr_type VATO
> > bind_enable 1 pg_size 65536 qpid 0x0 pbl_addr 0x003c67c0
> > len 12288 va 2d9a3000 bind_cnt 0
> > PBL: 5bff4000
> 
> 
> 
> Any thoughts?

The Ammasso certainly works this way. If you tell it the page size is
64KB, it will ignore bits in the page address that encode 0-65535.
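
As a rough illustration of why that matters for the trace quoted above, the sketch below checks the alignment of the reported dma address and shows what is left if the in-page bits are dropped. The helper names are hypothetical, and whether the adapter in the original report masks addresses exactly like this is an assumption; only the numbers come from the trace.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if addr is a multiple of page_size (page_size is a power of two). */
static bool is_aligned(uint64_t addr, uint64_t page_size)
{
	return (addr & (page_size - 1)) == 0;
}

/* Page base seen by a device that ignores the in-page address bits. */
static uint64_t device_page_base(uint64_t dma_addr, uint64_t page_size)
{
	return dma_addr & ~(page_size - 1);
}
```

For dma_addr 0x5bff4000, is_aligned() holds for a 4096-byte page but not for a 65536-byte page, and device_page_base() with a 65536-byte page yields 0x5bff0000, which is 0x4000 bytes below the address the IOMMU actually mapped.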

> 
> Steve.
> 
> 
> ___
> general mailing list
> [EMAIL PROTECTED]
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> 
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


