Re: [PATCH] swiotlb: set IO TLB segment size via cmdline

2021-09-21 Thread Roman Skakun
Hi Robin,

>> I use Xen PV display. In my case, the PV display backend (Dom0) allocates
>> a contiguous buffer via the DMA API to implement zero-copy between
>> Dom0 and DomU.
>>
> Well, something's gone badly wrong there - if you have to shadow the
> entire thing in a bounce buffer to import it then it's hardly zero-copy,
> is it? If you want to do buffer sharing the buffer really needs to be
> allocated appropriately to begin with, such that all relevant devices
> can access it directly. That might be something which needs fixing in Xen.
>

Right, to use a zero-copy approach we need to avoid the swiotlb bounce
buffer for every device that potentially uses this buffer.
The root of the problem is that this buffer is mapped to foreign pages,
and when I tried to retrieve the dma_addr for this buffer I got a foreign
MFN larger than 32 bits, so swiotlb falls back to the bounce buffer.
I understand that I need to find a way to avoid swiotlb in this case,
but at the moment it's unclear how to do that properly.
That is another story, though...

I guess we can still hit the situation where a device like rcar-du needs
a buffer larger than 256 KB (128 (IO_TLB_SEGSIZE) * 2048 (1 << IO_TLB_SHIFT)
= 262144 bytes) and this parameter has to be adjusted at boot time rather
than at compile time.
That is the case this patch was created for.

Thanks,
Roman

On Fri, Sep 17, 2021 at 12:44, Robin Murphy wrote:
>
> On 2021-09-17 10:36, Roman Skakun wrote:
> > Hi, Christoph
> >
> > I use Xen PV display. In my case, the PV display backend (Dom0) allocates
> > a contiguous buffer via the DMA API to implement zero-copy between
> > Dom0 and DomU.
>
> Well, something's gone badly wrong there - if you have to shadow the
> entire thing in a bounce buffer to import it then it's hardly zero-copy,
> is it? If you want to do buffer sharing the buffer really needs to be
> allocated appropriately to begin with, such that all relevant devices
> can access it directly. That might be something which needs fixing in Xen.
>
> Robin.
>
> > When I start Weston under DomU, I get the following log in Dom0:
> > ```
> > [ 112.554471] CPU: 0 PID: 367 Comm: weston Tainted: G O
> > 5.10.0-yocto-standard+ #312
> > [ 112.575149] Call trace:
> > [ 112.577666] dump_backtrace+0x0/0x1b0
> > [ 112.581373] show_stack+0x18/0x70
> > [ 112.584746] dump_stack+0xd0/0x12c
> > [ 112.588200] swiotlb_tbl_map_single+0x234/0x360
> > [ 112.592781] xen_swiotlb_map_page+0xe4/0x4c0
> > [ 112.597095] xen_swiotlb_map_sg+0x84/0x12c
> > [ 112.601249] dma_map_sg_attrs+0x54/0x60
> > [ 112.605138] vsp1_du_map_sg+0x30/0x60
> > [ 112.608851] rcar_du_vsp_map_fb+0x134/0x170
> > [ 112.613082] rcar_du_vsp_plane_prepare_fb+0x44/0x64
> > [ 112.618007] drm_atomic_helper_prepare_planes+0xac/0x160
> > [ 112.623362] drm_atomic_helper_commit+0x88/0x390
> > [ 112.628029] drm_atomic_nonblocking_commit+0x4c/0x60
> > [ 112.633043] drm_mode_atomic_ioctl+0x9a8/0xb0c
> > [ 112.637532] drm_ioctl_kernel+0xc4/0x11c
> > [ 112.641506] drm_ioctl+0x21c/0x460
> > [ 112.644967] __arm64_sys_ioctl+0xa8/0xf0
> > [ 112.648939] el0_svc_common.constprop.0+0x78/0x1a0
> > [ 112.653775] do_el0_svc+0x24/0x90
> > [ 112.657148] el0_svc+0x14/0x20
> > [ 112.660254] el0_sync_handler+0x1a4/0x1b0
> > [ 112.664315] el0_sync+0x174/0x180
> > [ 112.668145] rcar-fcp fea2f000.fcp: swiotlb buffer is full (sz:
> > 3686400 bytes), total 65536 (slots), used 112 (slots)
> > ```
> > The problem happens here:
> > https://elixir.bootlin.com/linux/v5.14.4/source/drivers/gpu/drm/rcar-du/rcar_du_vsp.c#L202
> >
> > The sgt was created in dma_get_sgtable() by dma_common_get_sgtable()
> > and includes a single-page chunk, as shown here:
> > https://elixir.bootlin.com/linux/v5.14.5/source/kernel/dma/ops_helpers.c#L18
> >
> > After creating the new sgt, we tried to map it through vsp1_du_map_sg().
> > Internally, vsp1_du_map_sg() uses ops->map_sg (e.g. xen_swiotlb_map_sg)
> > to perform the mapping.
> >
> > I realized that the required segment is too big to fit into the default
> > swiotlb segment, so the condition at
> > https://elixir.bootlin.com/linux/latest/source/kernel/dma/swiotlb.c#L474
> > is always false.
> >
> > I know that I use a large buffer, but why can't I map this buffer in
> > one chunk?
> >
> > Thanks!
> >
On Wed, Sep 15, 2021 at 16:53, Christoph Hellwig wrote:
> >>
> >> On Wed, Sep 15, 2021 at 03:49:52PM +0200, Jan Beulich wrote:
> >>> But the question remains: Why does the framebuffer need to be mapped
> >>> in a single giant chunk?
> >>
> >> More importantly: if you use dynamic dma mappings for your framebuffer
> >> you're doing something wrong.
> >
> >
> >



-- 
Best Regards, Roman.

Re: [PATCH] swiotlb: set IO TLB segment size via cmdline

2021-09-17 Thread Roman Skakun
Hi Jan,

>>>  In order to be sure to catch all uses like this one (including ones
>>>  which make it upstream in parallel to yours), I think you will want
>>>  to rename the original IO_TLB_SEGSIZE to e.g. IO_TLB_DEFAULT_SEGSIZE.
>>
>> I don't understand your point. Can you clarify this?
>
> There's a concrete present example: I have a patch pending adding
> another use of IO_TLB_SEGSIZE. This use would need to be replaced
> like you do here in several places. The need for the additional
> replacement would be quite obvious (from a build failure) if you
> renamed the manifest constant. Without renaming, it'll take
> someone running into an issue on a live system, which I consider
> far worse. This is because a simple re-basing of one of the
> patches on top of the other will not point out the need for the
> extra replacement, nor would a test build (with both patches in
> place).

That's reasonable.
I will rename the original IO_TLB_SEGSIZE to IO_TLB_DEFAULT_SEGSIZE in the
next patch series.
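
Something along these lines, as I understand the suggestion (an
illustrative sketch, not the final patch):

```c
/* Keep the old constant only as the compile-time default; any remaining
 * user of the IO_TLB_SEGSIZE name then fails to build and gets audited. */
#define IO_TLB_DEFAULT_SEGSIZE 128

static unsigned long io_tlb_seg_size = IO_TLB_DEFAULT_SEGSIZE;

/* Accessor for call sites that previously read IO_TLB_SEGSIZE directly. */
unsigned long swiotlb_io_seg_size(void)
{
	return io_tlb_seg_size;
}
```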

Thanks.

On Wed, Sep 15, 2021 at 16:50, Jan Beulich wrote:
>
> On 15.09.2021 15:37, Roman Skakun wrote:
> >>> From: Roman Skakun 
> >>>
> >>> It is possible that the default IO TLB size is not
> >>> enough to fit long buffers, as described here [1].
> >>>
> >>> This patch adds a way to set this parameter
> >>> via cmdline instead of recompiling the kernel.
> >>>
> >>> [1] https://www.xilinx.com/support/answers/72694.html
> >>
> >>  I'm not convinced the swiotlb use described there falls under "intended
> >>  use" - mapping a 1280x720 framebuffer in a single chunk?
> >
> > I had the same issue while mapping a ~4MB DMA chunk for the gem fb when
> > using xen vdispl.
> > I got the following log:
> > [ 142.030421] rcar-fcp fea2f000.fcp: swiotlb buffer is full (sz:
> > 3686400 bytes), total 32768 (slots), used 32 (slots)
> >
> > It happened when I tried to map a bounce buffer with a large size.
> > The default cap is 128 (IO_TLB_SEGSIZE) * 2048 (1 << IO_TLB_SHIFT) =
> > 262144 bytes, but we requested 3686400 bytes.
> > When I change IO_TLB_SEGSIZE to 2048 (2048 * 2048 = 4194304 bytes),
> > it becomes possible to retrieve a bounce buffer of the requested size.
> > After changing this value, the problem is gone.
>
> But the question remains: Why does the framebuffer need to be mapped
> in a single giant chunk?
>
> >>  In order to be sure to catch all uses like this one (including ones
> >>  which make it upstream in parallel to yours), I think you will want
> >>  to rename the original IO_TLB_SEGSIZE to e.g. IO_TLB_DEFAULT_SEGSIZE.
> >
> > I don't understand your point. Can you clarify this?
>
> There's a concrete present example: I have a patch pending adding
> another use of IO_TLB_SEGSIZE. This use would need to be replaced
> like you do here in several places. The need for the additional
> replacement would be quite obvious (from a build failure) if you
> renamed the manifest constant. Without renaming, it'll take
> someone running into an issue on a live system, which I consider
> far worse. This is because a simple re-basing of one of the
> patches on top of the other will not point out the need for the
> extra replacement, nor would a test build (with both patches in
> place).
>
> Jan
>


-- 
Best Regards, Roman.

Re: [PATCH] swiotlb: set IO TLB segment size via cmdline

2021-09-17 Thread Roman Skakun
Hi, Christoph

I use Xen PV display. In my case, the PV display backend (Dom0) allocates
a contiguous buffer via the DMA API to implement zero-copy between
Dom0 and DomU.

When I start Weston under DomU, I get the following log in Dom0:
```
[ 112.554471] CPU: 0 PID: 367 Comm: weston Tainted: G O
5.10.0-yocto-standard+ #312
[ 112.575149] Call trace:
[ 112.577666] dump_backtrace+0x0/0x1b0
[ 112.581373] show_stack+0x18/0x70
[ 112.584746] dump_stack+0xd0/0x12c
[ 112.588200] swiotlb_tbl_map_single+0x234/0x360
[ 112.592781] xen_swiotlb_map_page+0xe4/0x4c0
[ 112.597095] xen_swiotlb_map_sg+0x84/0x12c
[ 112.601249] dma_map_sg_attrs+0x54/0x60
[ 112.605138] vsp1_du_map_sg+0x30/0x60
[ 112.608851] rcar_du_vsp_map_fb+0x134/0x170
[ 112.613082] rcar_du_vsp_plane_prepare_fb+0x44/0x64
[ 112.618007] drm_atomic_helper_prepare_planes+0xac/0x160
[ 112.623362] drm_atomic_helper_commit+0x88/0x390
[ 112.628029] drm_atomic_nonblocking_commit+0x4c/0x60
[ 112.633043] drm_mode_atomic_ioctl+0x9a8/0xb0c
[ 112.637532] drm_ioctl_kernel+0xc4/0x11c
[ 112.641506] drm_ioctl+0x21c/0x460
[ 112.644967] __arm64_sys_ioctl+0xa8/0xf0
[ 112.648939] el0_svc_common.constprop.0+0x78/0x1a0
[ 112.653775] do_el0_svc+0x24/0x90
[ 112.657148] el0_svc+0x14/0x20
[ 112.660254] el0_sync_handler+0x1a4/0x1b0
[ 112.664315] el0_sync+0x174/0x180
[ 112.668145] rcar-fcp fea2f000.fcp: swiotlb buffer is full (sz:
3686400 bytes), total 65536 (slots), used 112 (slots)
```
The problem happens here:
https://elixir.bootlin.com/linux/v5.14.4/source/drivers/gpu/drm/rcar-du/rcar_du_vsp.c#L202

The sgt was created in dma_get_sgtable() by dma_common_get_sgtable()
and includes a single-page chunk, as shown here:
https://elixir.bootlin.com/linux/v5.14.5/source/kernel/dma/ops_helpers.c#L18
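
In other words, the helper describes the whole buffer with a single
scatterlist entry, roughly (excerpted from the ops_helpers.c code linked
above):

```c
/* One entry for the entire coherent buffer, so the DMA layer must later
 * map it as a single contiguous segment. */
ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
if (!ret)
	sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
```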

After creating the new sgt, we tried to map it through vsp1_du_map_sg().
Internally, vsp1_du_map_sg() uses ops->map_sg (e.g. xen_swiotlb_map_sg)
to perform the mapping.

I realized that the required segment is too big to fit into the default
swiotlb segment, so the condition at
https://elixir.bootlin.com/linux/latest/source/kernel/dma/swiotlb.c#L474
is always false.
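
Concretely, a back-of-the-envelope check of the numbers (a standalone
sketch assuming the v5.14 constants, where IO_TLB_SHIFT is 11, i.e.
2048-byte slots):

```c
#include <stdio.h>

#define IO_TLB_SHIFT   11    /* one swiotlb slot = 1 << 11 = 2048 bytes */
#define IO_TLB_SEGSIZE 128   /* max contiguous slots per single mapping */

int main(void)
{
	unsigned long sz = 3686400; /* framebuffer size from the log above */
	unsigned long nslots = (sz + (1 << IO_TLB_SHIFT) - 1) >> IO_TLB_SHIFT;

	/* 1800 slots needed vs. a hard cap of 128, so the lookup must fail. */
	printf("need %lu slots, per-mapping cap is %d\n", nslots, IO_TLB_SEGSIZE);
	return 0;
}
```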

I know that I use a large buffer, but why can't I map this buffer in one chunk?

Thanks!

On Wed, Sep 15, 2021 at 16:53, Christoph Hellwig wrote:
>
> On Wed, Sep 15, 2021 at 03:49:52PM +0200, Jan Beulich wrote:
> > But the question remains: Why does the framebuffer need to be mapped
> > in a single giant chunk?
>
> More importantly: if you use dynamic dma mappings for your framebuffer
> you're doing something wrong.



-- 
Best Regards, Roman.

Re: [PATCH] swiotlb: set IO TLB segment size via cmdline

2021-09-16 Thread Roman Skakun
Hi Stefano,

> Also, Option 1 listed in the webpage seems to be a lot better. Any
> reason you can't do that? Because that option both solves the problem
> and increases performance.

Yes, Option 1 is probably more efficient.
But I use another platform under Xen without the DMA adjustment functionality.
I linked that webpage only as an example of the similar problem I had.

Cheers,
Roman

On Wed, Sep 15, 2021 at 04:51, Stefano Stabellini wrote:

>
> On Tue, 14 Sep 2021, Christoph Hellwig wrote:
> > On Tue, Sep 14, 2021 at 05:29:07PM +0200, Jan Beulich wrote:
> > > I'm not convinced the swiotlb use described there falls under "intended
> > > use" - mapping a 1280x720 framebuffer in a single chunk? (As an aside,
> > > the bottom of this page is also confusing, as following "Then we can
> > > confirm the modified swiotlb size in the boot log:" there is a log
> > > fragment showing the same original size of 64Mb.)
> >
> > It doesn't.  We also do not add hacks to the kernel for whacky out
> > of tree modules.
>
> Also, Option 1 listed in the webpage seems to be a lot better. Any
> reason you can't do that? Because that option both solves the problem
> and increases performance.



--
Best Regards, Roman.

Re: [PATCH] swiotlb: set IO TLB segment size via cmdline

2021-09-15 Thread Roman Skakun
Hi Jan,

Thanks for the answer.

>> From: Roman Skakun 
>>
>> It is possible that the default IO TLB size is not
>> enough to fit long buffers, as described here [1].
>>
>> This patch adds a way to set this parameter
>> via cmdline instead of recompiling the kernel.
>>
>> [1] https://www.xilinx.com/support/answers/72694.html
>
>  I'm not convinced the swiotlb use described there falls under "intended
>  use" - mapping a 1280x720 framebuffer in a single chunk?

I had the same issue while mapping a ~4MB DMA chunk for the gem fb when
using xen vdispl.
I got the following log:
[ 142.030421] rcar-fcp fea2f000.fcp: swiotlb buffer is full (sz:
3686400 bytes), total 32768 (slots), used 32 (slots)

It happened when I tried to map a bounce buffer with a large size.
The default cap is 128 (IO_TLB_SEGSIZE) * 2048 (1 << IO_TLB_SHIFT) =
262144 bytes, but we requested 3686400 bytes.
When I change IO_TLB_SEGSIZE to 2048 (2048 * 2048 = 4194304 bytes),
it becomes possible to retrieve a bounce buffer of the requested size.
After changing this value, the problem is gone.

>  the bottom of this page is also confusing, as following "Then we can
>  confirm the modified swiotlb size in the boot log:" there is a log
>  fragment showing the same original size of 64Mb.

I suspect this is a mistake in the article.
According to
https://elixir.bootlin.com/linux/v5.14.4/source/kernel/dma/swiotlb.c#L214
and
https://elixir.bootlin.com/linux/v5.15-rc1/source/kernel/dma/swiotlb.c#L182
IO_TLB_SEGSIZE is not used to calculate the total size of the swiotlb area.
This explains why we see the same total size before and after changing the
TLB segment size.
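
To make that concrete (again assuming the v5.14 constants; only the slab
count sets the pool size, while the segment size merely caps one mapping):

```c
#include <stdio.h>

#define IO_TLB_SHIFT   11
#define IO_TLB_SEGSIZE 128

int main(void)
{
	unsigned long nslabs = 32768; /* slot count from the log above */

	printf("total pool:  %lu bytes\n", nslabs << IO_TLB_SHIFT);        /* 64 MB */
	printf("per mapping: %d bytes\n", IO_TLB_SEGSIZE << IO_TLB_SHIFT); /* 256 KB */
	return 0;
}
```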

>  In order to be sure to catch all uses like this one (including ones
>  which make it upstream in parallel to yours), I think you will want
>  to rename the original IO_TLB_SEGSIZE to e.g. IO_TLB_DEFAULT_SEGSIZE.

I don't understand your point. Can you clarify this?

>> + /* get max IO TLB segment size */
>> + if (isdigit(*str)) {
>> + tmp = simple_strtoul(str, &str, 0);
>> + if (tmp)
>> + io_tlb_seg_size = ALIGN(tmp, IO_TLB_SEGSIZE);
>
> From all I can tell io_tlb_seg_size wants to be a power of 2. Merely
> aligning to a multiple of IO_TLB_SEGSIZE isn't going to be enough.

Yes, right, thanks!

Cheers,
Roman.

On Tue, Sep 14, 2021 at 18:29, Jan Beulich wrote:
>
> On 14.09.2021 17:10, Roman Skakun wrote:
> > From: Roman Skakun 
> >
> > It is possible that the default IO TLB size is not
> > enough to fit long buffers, as described here [1].
> >
> > This patch adds a way to set this parameter
> > via cmdline instead of recompiling the kernel.
> >
> > [1] https://www.xilinx.com/support/answers/72694.html
>
> I'm not convinced the swiotlb use described there falls under "intended
> use" - mapping a 1280x720 framebuffer in a single chunk? (As an aside,
> the bottom of this page is also confusing, as following "Then we can
> confirm the modified swiotlb size in the boot log:" there is a log
> fragment showing the same original size of 64Mb.)
>
> > --- a/arch/mips/cavium-octeon/dma-octeon.c
> > +++ b/arch/mips/cavium-octeon/dma-octeon.c
> > @@ -237,7 +237,7 @@ void __init plat_swiotlb_setup(void)
> >   swiotlbsize = 64 * (1<<20);
> >  #endif
> >   swiotlb_nslabs = swiotlbsize >> IO_TLB_SHIFT;
> > - swiotlb_nslabs = ALIGN(swiotlb_nslabs, IO_TLB_SEGSIZE);
> > + swiotlb_nslabs = ALIGN(swiotlb_nslabs, swiotlb_io_seg_size());
>
> In order to be sure to catch all uses like this one (including ones
> which make it upstream in parallel to yours), I think you will want
> to rename the original IO_TLB_SEGSIZE to e.g. IO_TLB_DEFAULT_SEGSIZE.
>
> > @@ -81,15 +86,30 @@ static unsigned int max_segment;
> >  static unsigned long default_nslabs = IO_TLB_DEFAULT_SIZE >> IO_TLB_SHIFT;
> >
> >  static int __init
> > -setup_io_tlb_npages(char *str)
> > +setup_io_tlb_params(char *str)
> >  {
> > + unsigned long tmp;
> > +
> >   if (isdigit(*str)) {
> > - /* avoid tail segment of size < IO_TLB_SEGSIZE */
> > - default_nslabs =
> > - ALIGN(simple_strtoul(str, &str, 0), IO_TLB_SEGSIZE);
> > + default_nslabs = simple_strtoul(str, &str, 0);
> >   }
> >   if (*str == ',')
> >   ++str;
> > +
> > + /* get max IO TLB segment size */
> > + if (isdigit(*str)) {
> > + tmp = simple_strtoul(str, &str, 0);
> > + if (tmp)
> > + io_tlb_seg_size = ALIGN(tmp, IO_TLB_SEGSIZE);
>
> From all I can tell io_tlb_seg_size wants to be a power of 2. Merely
> aligning to a multiple of IO_TLB_SEGSIZE isn't going to be enough.
>
> Jan
>


-- 
Best Regards, Roman.

[PATCH] swiotlb: set IO TLB segment size via cmdline

2021-09-14 Thread Roman Skakun
From: Roman Skakun 

It is possible that the default IO TLB size is not
enough to fit long buffers, as described here [1].

This patch adds a way to set this parameter
via cmdline instead of recompiling the kernel.

[1] https://www.xilinx.com/support/answers/72694.html

Signed-off-by: Roman Skakun 
---
 .../admin-guide/kernel-parameters.txt |  5 +-
 arch/mips/cavium-octeon/dma-octeon.c  |  2 +-
 arch/powerpc/platforms/pseries/svm.c  |  2 +-
 drivers/xen/swiotlb-xen.c |  7 +--
 include/linux/swiotlb.h   |  1 +
 kernel/dma/swiotlb.c  | 51 ++-
 6 files changed, 48 insertions(+), 20 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 91ba391f9b32..f842a523a485 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5558,8 +5558,9 @@
 			it if 0 is given (See Documentation/admin-guide/cgroup-v1/memory.rst)
 
 	swiotlb=	[ARM,IA-64,PPC,MIPS,X86]
-			Format: { <int> | force | noforce }
-			<int> -- Number of I/O TLB slabs
+			Format: { <int> [,<int>] [,force | noforce] }
+			<int> -- Number of I/O TLB slabs
+			<int> -- Max IO TLB segment size
 			force -- force using of bounce buffers even if they
 				 wouldn't be automatically used by the kernel
 			noforce -- Never use bounce buffers (for debugging)
diff --git a/arch/mips/cavium-octeon/dma-octeon.c b/arch/mips/cavium-octeon/dma-octeon.c
index df70308db0e6..446c73bc936e 100644
--- a/arch/mips/cavium-octeon/dma-octeon.c
+++ b/arch/mips/cavium-octeon/dma-octeon.c
@@ -237,7 +237,7 @@ void __init plat_swiotlb_setup(void)
swiotlbsize = 64 * (1<<20);
 #endif
swiotlb_nslabs = swiotlbsize >> IO_TLB_SHIFT;
-   swiotlb_nslabs = ALIGN(swiotlb_nslabs, IO_TLB_SEGSIZE);
+   swiotlb_nslabs = ALIGN(swiotlb_nslabs, swiotlb_io_seg_size());
swiotlbsize = swiotlb_nslabs << IO_TLB_SHIFT;
 
octeon_swiotlb = memblock_alloc_low(swiotlbsize, PAGE_SIZE);
diff --git a/arch/powerpc/platforms/pseries/svm.c b/arch/powerpc/platforms/pseries/svm.c
index 87f001b4c4e4..2a1f09c722ac 100644
--- a/arch/powerpc/platforms/pseries/svm.c
+++ b/arch/powerpc/platforms/pseries/svm.c
@@ -47,7 +47,7 @@ void __init svm_swiotlb_init(void)
unsigned long bytes, io_tlb_nslabs;
 
io_tlb_nslabs = (swiotlb_size_or_default() >> IO_TLB_SHIFT);
-   io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
+   io_tlb_nslabs = ALIGN(io_tlb_nslabs, swiotlb_io_seg_size());
 
bytes = io_tlb_nslabs << IO_TLB_SHIFT;
 
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 643fe440c46e..0fc9c6cb6815 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -110,12 +110,13 @@ static int xen_swiotlb_fixup(void *buf, unsigned long nslabs)
int dma_bits;
dma_addr_t dma_handle;
phys_addr_t p = virt_to_phys(buf);
+   unsigned long tlb_segment_size = swiotlb_io_seg_size();
 
-   dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT;
+   dma_bits = get_order(tlb_segment_size << IO_TLB_SHIFT) + PAGE_SHIFT;
 
i = 0;
do {
-   int slabs = min(nslabs - i, (unsigned long)IO_TLB_SEGSIZE);
+   int slabs = min(nslabs - i, (unsigned long)tlb_segment_size);
 
do {
rc = xen_create_contiguous_region(
@@ -153,7 +154,7 @@ static const char *xen_swiotlb_error(enum xen_swiotlb_err err)
return "";
 }
 
-#define DEFAULT_NSLABS ALIGN(SZ_64M >> IO_TLB_SHIFT, IO_TLB_SEGSIZE)
+#define DEFAULT_NSLABS ALIGN(SZ_64M >> IO_TLB_SHIFT, swiotlb_io_seg_size())
 
 int __ref xen_swiotlb_init(void)
 {
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index b0cb2a9973f4..35c3ffeda9fa 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -59,6 +59,7 @@ void swiotlb_sync_single_for_cpu(struct device *dev, phys_addr_t tlb_addr,
size_t size, enum dma_data_direction dir);
 dma_addr_t swiotlb_map(struct device *dev, phys_addr_t phys,
size_t size, enum dma_data_direction dir, unsigned long attrs);
+unsigned long swiotlb_io_seg_size(void);
 
 #ifdef CONFIG_SWIOTLB
 extern enum swiotlb_force swiotlb_force;
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 87c40517e822..6b505206fc13 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -72,6 +72,11 @@ enum swiotlb_force swiotlb_force;
 
 struct io_tlb_mem io_tlb_default_mem;
 
+/*
+ * Maximum IO TLB segment size.
+ */
+static unsigned long io_tlb_seg_size = IO_TLB_SEGSIZE;
+
 /*
  * Max segment that we can
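
With this applied, the intended usage would presumably look like the
following (the values are illustrative, matching the parsing added above):

```
swiotlb=65536,2048
```

i.e. 65536 I/O TLB slabs (a 128 MB pool) and a max segment size of 2048
slabs, which raises the per-mapping cap from 256 KB to 4 MB.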

Re: [GIT PULL] dma-mapping fix for Linux 5.14

2021-07-29 Thread Roman Skakun
Hi, Stefano!

> I don't know on which platform Roman Skakun (CC'ed) found the problem.
> But if we look at arch/arm/mm/dma-mapping.c:__dma_alloc, one of the
> possible options is the "remap_allocator", which calls
> __alloc_remap_buffer, which calls dma_common_contiguous_remap, which
> calls vmap.

I am using the Renesas R-Car H3 platform.
I have tested this case on a 1:1 mapped Dom0 with < 4GB memory, and the
issue still exists.
I'm still wondering why xen-swiotlb maps vmalloc'ed addresses for
low-memory DMA addresses.

On Mon, Jul 26, 2021 at 23:03, Stefano Stabellini wrote:
>
> On Mon, 26 Jul 2021, Boris Ostrovsky wrote:
> > On 7/25/21 12:50 PM, Linus Torvalds wrote:
> > > On Sat, Jul 24, 2021 at 11:03 PM Christoph Hellwig wrote:
> > >
> > >>   - handle vmalloc addresses in dma_common_{mmap,get_sgtable}
> > >> (Roman Skakun)
> > > I've pulled this, but my reaction is that we've tried to avoid this in
> > > the past. Why is Xen using vmalloc'ed addresses and passing those in
> > > to the dma mapping routines?
> > >
> > > It *smells* to me like a Xen-swiotlb bug, and it would have been
> > > better to try to fix it there. Was that just too painful?
> >
> >
> > Stefano will probably know better but this appears to have something to do 
> > with how Pi (and possibly more ARM systems?) manage DMA memory: 
> > https://lore.kernel.org/xen-devel/CADz_WD5Ln7Pe1WAFp73d2Mz9wxspzTE3WgAJusp5S8LX4=8...@mail.gmail.com/.
>
> The original issue was found on the Raspberry Pi 4, and the fix was in
> swiotlb-xen.c, commit 8b1e868f6. More recently, Roman realized that
> dma_common_mmap might also end up calling virt_to_page on a vmalloc
> address. This is the fix for that.
>
>
> Why is Xen using vmalloc'ed addresses with dma routines at all?
>
> Xen is actually just calling the regular dma_direct_alloc to allocate
> pages (xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages ->
> dma_direct_alloc). dma_direct_alloc is the generic implementation. Back
> when the original issue was found, dma_direct_alloc returned a vmalloc
> address on RPi4.
>
> The original analysis was "xen_alloc_coherent_pages() eventually calls
> arch_dma_alloc() in remap.c which successfully allocates pages from
> atomic pool." See https://marc.info/?l=xen-devel&m=158878173207775.
>
>
> I don't know on which platform Roman Skakun (CC'ed) found the problem.
> But if we look at arch/arm/mm/dma-mapping.c:__dma_alloc, one of the
> possible options is the "remap_allocator", which calls
> __alloc_remap_buffer, which calls dma_common_contiguous_remap, which
> calls vmap.
>
> So unfortunately it seems that on certain arch/platforms
> dma_alloc_coherent can return a vmap'ed address. So I would imagine this
> issue could also happen on native (without Xen), at least in theory.



-- 
Best Regards, Roman.

Re: [PATCH v2] dma-mapping: use vmalloc_to_page for vmalloc addresses

2021-07-21 Thread Roman Skakun
> Fine with.  I've queued up the modified patch.

Good. Thanks!

>
> On Sat, Jul 17, 2021 at 11:39:21AM +0300, Roman Skakun wrote:
> > > We can merge this patch and create a new one for
> > > xen_swiotlb_free_coherent() later.
> > > Yeah, no worries, I didn't know that exposing dma_common_vaddr_to_page
> > > was problematic.
> > >
> > > This patch is fine by me.
> >
> > Good. I agree too. Waiting for Christoph.
>
> Fine with.  I've queued up the modified patch.



--
Best Regards, Roman.


Re: [PATCH v2] dma-mapping: use vmalloc_to_page for vmalloc addresses

2021-07-17 Thread Roman Skakun
> We can merge this patch and create a new one for
> xen_swiotlb_free_coherent() later.
> Yeah, no worries, I didn't know that exposing dma_common_vaddr_to_page
> was problematic.
>
> This patch is fine by me.

Good. I agree too. Waiting for Christoph.

On Fri, Jul 16, 2021 at 18:29, Stefano Stabellini wrote:
>
> On Fri, 16 Jul 2021, Roman Skakun wrote:
> > > Technically this looks good.  But given that exposing a helper
> > > that does either vmalloc_to_page or virt_to_page is one of the
> > > never ending MM discussions I don't want to get into that discussion
> > > and just keep it local in the DMA code.
> > >
> > > Are you fine with me applying this version?
> >
> > Looks good to me, thanks!
> > But Stefano asked me about using the new helper in
> > xen_swiotlb_free_coherent(),
> > and I created a patch according to that suggestion.
> >
> > We can merge this patch and create a new one for
> > xen_swiotlb_free_coherent() later.
>
> Yeah, no worries, I didn't know that exposing dma_common_vaddr_to_page
> was problematic.
>
> This patch is fine by me.
>
>
> > On Fri, Jul 16, 2021 at 12:35, Christoph Hellwig wrote:
> > >
> > > Technically this looks good.  But given that exposing a helper
> > > that does either vmalloc_to_page or virt_to_page is one of the
> > > never ending MM discussions I don't want to get into that discussion
> > > and just keep it local in the DMA code.
> > >
> > > Are you fine with me applying this version?
> > >
> > > ---
> > > From 40ac971eab89330d6153e7721e88acd2d98833f9 Mon Sep 17 00:00:00 2001
> > > From: Roman Skakun 
> > > Date: Fri, 16 Jul 2021 11:39:34 +0300
> > > Subject: dma-mapping: handle vmalloc addresses in
> > >  dma_common_{mmap,get_sgtable}
> > >
> > > xen-swiotlb can use vmalloc backed addresses for dma coherent allocations
> > > and uses the common helpers.  Properly handle them to unbreak Xen on
> > > ARM platforms.
> > >
> > > Fixes: 1b65c4e5a9af ("swiotlb-xen: use xen_alloc/free_coherent_pages")
> > > Signed-off-by: Roman Skakun 
> > > Reviewed-by: Andrii Anisov 
> > > [hch: split the patch, renamed the helpers]
> > > Signed-off-by: Christoph Hellwig 
> > > ---
> > >  kernel/dma/ops_helpers.c | 12 ++--
> > >  1 file changed, 10 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/kernel/dma/ops_helpers.c b/kernel/dma/ops_helpers.c
> > > index 910ae69cae77..af4a6ef48ce0 100644
> > > --- a/kernel/dma/ops_helpers.c
> > > +++ b/kernel/dma/ops_helpers.c
> > > @@ -5,6 +5,13 @@
> > >   */
> > >  #include 
> > >
> > > +static struct page *dma_common_vaddr_to_page(void *cpu_addr)
> > > +{
> > > +   if (is_vmalloc_addr(cpu_addr))
> > > +   return vmalloc_to_page(cpu_addr);
> > > +   return virt_to_page(cpu_addr);
> > > +}
> > > +
> > >  /*
> > >   * Create scatter-list for the already allocated DMA buffer.
> > >   */
> > > @@ -12,7 +19,7 @@ int dma_common_get_sgtable(struct device *dev, struct sg_table *sgt,
> > >  void *cpu_addr, dma_addr_t dma_addr, size_t size,
> > >  unsigned long attrs)
> > >  {
> > > -   struct page *page = virt_to_page(cpu_addr);
> > > +   struct page *page = dma_common_vaddr_to_page(cpu_addr);
> > > int ret;
> > >
> > > ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
> > > @@ -32,6 +39,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
> > > unsigned long user_count = vma_pages(vma);
> > > unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> > > unsigned long off = vma->vm_pgoff;
> > > +   struct page *page = dma_common_vaddr_to_page(cpu_addr);
> > > int ret = -ENXIO;
> > >
> > > vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs);
> > > @@ -43,7 +51,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
> > > return -ENXIO;
> > >
> > > return remap_pfn_range(vma, vma->vm_start,
> > > -   page_to_pfn(virt_to_page(cpu_addr)) + vma->vm_pgoff,
> > > +   page_to_pfn(page) + vma->vm_pgoff,
> > > user_count << PAGE_SHIFT, vma->vm_page_prot);
> > >  #else
> > > return -ENXIO;
> > > --
> > > 2.30.2
> > >
> >
> >
> > --
> > Best Regards, Roman.
> >



-- 
Best Regards, Roman.

Re: [PATCH v2] dma-mapping: use vmalloc_to_page for vmalloc addresses

2021-07-16 Thread Roman Skakun
> Technically this looks good.  But given that exposing a helper
> that does either vmalloc_to_page or virt_to_page is one of the
> never ending MM discussions I don't want to get into that discussion
> and just keep it local in the DMA code.
>
> Are you fine with me applying this version?

Looks good to me, thanks!
But Stefano asked me about using the new helper in
xen_swiotlb_free_coherent(),
and I created a patch according to that suggestion.

We can merge this patch and create a new one for
xen_swiotlb_free_coherent() later.

On Fri, Jul 16, 2021 at 12:35, Christoph Hellwig wrote:
>
> Technically this looks good.  But given that exposing a helper
> that does either vmalloc_to_page or virt_to_page is one of the
> never ending MM discussions I don't want to get into that discussion
> and just keep it local in the DMA code.
>
> Are you fine with me applying this version?
>
> ---
> From 40ac971eab89330d6153e7721e88acd2d98833f9 Mon Sep 17 00:00:00 2001
> From: Roman Skakun 
> Date: Fri, 16 Jul 2021 11:39:34 +0300
> Subject: dma-mapping: handle vmalloc addresses in
>  dma_common_{mmap,get_sgtable}
>
> xen-swiotlb can use vmalloc backed addresses for dma coherent allocations
> and uses the common helpers.  Properly handle them to unbreak Xen on
> ARM platforms.
>
> Fixes: 1b65c4e5a9af ("swiotlb-xen: use xen_alloc/free_coherent_pages")
> Signed-off-by: Roman Skakun 
> Reviewed-by: Andrii Anisov 
> [hch: split the patch, renamed the helpers]
> Signed-off-by: Christoph Hellwig 
> ---
>  kernel/dma/ops_helpers.c | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/dma/ops_helpers.c b/kernel/dma/ops_helpers.c
> index 910ae69cae77..af4a6ef48ce0 100644
> --- a/kernel/dma/ops_helpers.c
> +++ b/kernel/dma/ops_helpers.c
> @@ -5,6 +5,13 @@
>   */
>  #include 
>
> +static struct page *dma_common_vaddr_to_page(void *cpu_addr)
> +{
> +   if (is_vmalloc_addr(cpu_addr))
> +   return vmalloc_to_page(cpu_addr);
> +   return virt_to_page(cpu_addr);
> +}
> +
>  /*
>   * Create scatter-list for the already allocated DMA buffer.
>   */
> @@ -12,7 +19,7 @@ int dma_common_get_sgtable(struct device *dev, struct sg_table *sgt,
>  void *cpu_addr, dma_addr_t dma_addr, size_t size,
>  unsigned long attrs)
>  {
> -   struct page *page = virt_to_page(cpu_addr);
> +   struct page *page = dma_common_vaddr_to_page(cpu_addr);
> int ret;
>
> ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
> @@ -32,6 +39,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
> unsigned long user_count = vma_pages(vma);
> unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> unsigned long off = vma->vm_pgoff;
> +   struct page *page = dma_common_vaddr_to_page(cpu_addr);
> int ret = -ENXIO;
>
> vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs);
> @@ -43,7 +51,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
> return -ENXIO;
>
> return remap_pfn_range(vma, vma->vm_start,
> -   page_to_pfn(virt_to_page(cpu_addr)) + vma->vm_pgoff,
> +   page_to_pfn(page) + vma->vm_pgoff,
> user_count << PAGE_SHIFT, vma->vm_page_prot);
>  #else
> return -ENXIO;
> --
> 2.30.2
>


-- 
Best Regards, Roman.

[PATCH v2] dma-mapping: use vmalloc_to_page for vmalloc addresses

2021-07-16 Thread Roman Skakun
From: Roman Skakun 

This commit fixes an incorrect conversion from
cpu_addr to a page address in cases when we get a virtual
address allocated in the vmalloc range.
As a result, virt_to_page() cannot convert this address
properly and returns an incorrect page address.

We need to detect such cases and obtain the page address
using vmalloc_to_page() instead.

Signed-off-by: Roman Skakun 
Reviewed-by: Andrii Anisov 
---
Hi, Christoph!
Here is the updated patch, in accordance with your and Stefano's
suggestions.

 drivers/xen/swiotlb-xen.c   |  7 +--
 include/linux/dma-map-ops.h |  2 ++
 kernel/dma/ops_helpers.c| 16 ++--
 3 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 92ee6eea30cd..c2f612a10a95 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -337,7 +337,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
int order = get_order(size);
phys_addr_t phys;
u64 dma_mask = DMA_BIT_MASK(32);
-   struct page *page;
+   struct page *page = cpu_addr_to_page(vaddr);
 
if (hwdev && hwdev->coherent_dma_mask)
dma_mask = hwdev->coherent_dma_mask;
@@ -349,11 +349,6 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
/* Convert the size to actually allocated. */
size = 1UL << (order + XEN_PAGE_SHIFT);
 
-   if (is_vmalloc_addr(vaddr))
-   page = vmalloc_to_page(vaddr);
-   else
-   page = virt_to_page(vaddr);
-
if (!WARN_ON((dev_addr + size - 1 > dma_mask) ||
 range_straddles_page_boundary(phys, size)) &&
TestClearPageXenRemapped(page))
diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index a5f89fc4d6df..ce0edb0bb603 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -226,6 +226,8 @@ struct page *dma_alloc_from_pool(struct device *dev, size_t size,
bool (*phys_addr_ok)(struct device *, phys_addr_t, size_t));
 bool dma_free_from_pool(struct device *dev, void *start, size_t size);
 
+struct page *cpu_addr_to_page(void *cpu_addr);
+
 #ifdef CONFIG_ARCH_HAS_DMA_COHERENCE_H
 #include 
 #elif defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) || \
diff --git a/kernel/dma/ops_helpers.c b/kernel/dma/ops_helpers.c
index 910ae69cae77..472e861750d3 100644
--- a/kernel/dma/ops_helpers.c
+++ b/kernel/dma/ops_helpers.c
@@ -5,6 +5,17 @@
  */
 #include 
 
+/*
+ * This helper converts virtual address to page address.
+ */
+struct page *cpu_addr_to_page(void *cpu_addr)
+{
+   if (is_vmalloc_addr(cpu_addr))
+   return vmalloc_to_page(cpu_addr);
+   else
+   return virt_to_page(cpu_addr);
+}
+
 /*
  * Create scatter-list for the already allocated DMA buffer.
  */
@@ -12,7 +23,7 @@ int dma_common_get_sgtable(struct device *dev, struct sg_table *sgt,
 void *cpu_addr, dma_addr_t dma_addr, size_t size,
 unsigned long attrs)
 {
-   struct page *page = virt_to_page(cpu_addr);
+   struct page *page = cpu_addr_to_page(cpu_addr);
int ret;
 
ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
@@ -32,6 +43,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
unsigned long user_count = vma_pages(vma);
unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
unsigned long off = vma->vm_pgoff;
+   struct page *page = cpu_addr_to_page(cpu_addr);
int ret = -ENXIO;
 
vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs);
@@ -43,7 +55,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
return -ENXIO;
 
return remap_pfn_range(vma, vma->vm_start,
-   page_to_pfn(virt_to_page(cpu_addr)) + vma->vm_pgoff,
+   page_to_pfn(page) + vma->vm_pgoff,
user_count << PAGE_SHIFT, vma->vm_page_prot);
 #else
return -ENXIO;
-- 
2.27.0



Re: [PATCH v2] dma-mapping: use vmalloc_to_page for vmalloc addresses

2021-07-15 Thread Roman Skakun
> This looks like it wasn't picked up? Should it go in rc1?

Hi, Konrad!

This looks like an unambiguous bug, and should be in rc1.

Cheers!

On Wed, Jul 14, 2021 at 03:15, Konrad Rzeszutek Wilk wrote:
>
> On Tue, Jun 22, 2021 at 04:34:14PM +0300, Roman Skakun wrote:
> > This commit fixes an incorrect conversion from
> > cpu_addr to a page address in cases when we get a virtual
> > address allocated in the vmalloc range.
> > As a result, virt_to_page() cannot convert this address
> > properly and returns an incorrect page address.
> >
> > We need to detect such cases and obtain the page address
> > using vmalloc_to_page() instead.
> >
> > Signed-off-by: Roman Skakun 
> > Reviewed-by: Andrii Anisov 
> > ---
> > Hey!
> > Thanks for the suggestions, Christoph!
> > I updated the patch according to your advice.
> > But I'm surprised that nobody caught this problem
> > in the common code before. It looks a bit strange to me.
>
> This looks like it wasn't picked up? Should it go in rc1?
> >
> >
> >  kernel/dma/ops_helpers.c | 12 ++--
> >  1 file changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/dma/ops_helpers.c b/kernel/dma/ops_helpers.c
> > index 910ae69cae77..782728d8a393 100644
> > --- a/kernel/dma/ops_helpers.c
> > +++ b/kernel/dma/ops_helpers.c
> > @@ -5,6 +5,14 @@
> >   */
> >  #include 
> >
> > +static struct page *cpu_addr_to_page(void *cpu_addr)
> > +{
> > + if (is_vmalloc_addr(cpu_addr))
> > + return vmalloc_to_page(cpu_addr);
> > + else
> > + return virt_to_page(cpu_addr);
> > +}
> > +
> >  /*
> >   * Create scatter-list for the already allocated DMA buffer.
> >   */
> > @@ -12,7 +20,7 @@ int dma_common_get_sgtable(struct device *dev, struct sg_table *sgt,
> >void *cpu_addr, dma_addr_t dma_addr, size_t size,
> >unsigned long attrs)
> >  {
> > - struct page *page = virt_to_page(cpu_addr);
> > + struct page *page = cpu_addr_to_page(cpu_addr);
> >   int ret;
> >
> >   ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
> > @@ -43,7 +51,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
> >   return -ENXIO;
> >
> >   return remap_pfn_range(vma, vma->vm_start,
> > - page_to_pfn(virt_to_page(cpu_addr)) + vma->vm_pgoff,
> > + page_to_pfn(cpu_addr_to_page(cpu_addr)) + vma->vm_pgoff,
> >   user_count << PAGE_SHIFT, vma->vm_page_prot);
> >  #else
> >   return -ENXIO;
> > --
> > 2.25.1
> >



-- 
Best Regards, Roman.

Re: [PATCH v2] dma-mapping: use vmalloc_to_page for vmalloc addresses

2021-07-15 Thread Roman Skakun
Hi, Stefano!

> We have the same thing in xen_swiotlb_free_coherent. Can we find a way
> to call cpu_addr_to_page() from xen_swiotlb_free_coherent?
> Maybe we can make cpu_addr_to_page non-static?

Yes, right. If we want to reuse this helper instead of the same code
block in xen_swiotlb_free_coherent(), we need to make cpu_addr_to_page()
global and add a declaration for it in include/linux/dma-map-ops.h.
What do you think?
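
Concretely, something like this in the header (sketch only):

```c
/* include/linux/dma-map-ops.h: proposed declaration of the shared helper */
struct page *cpu_addr_to_page(void *cpu_addr);
```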

Cheers!

On Wed, Jul 14, 2021 at 04:23, Stefano Stabellini wrote:
>
> On Tue, 22 Jun 2021, Roman Skakun wrote:
> > This commit fixes an incorrect conversion from
> > cpu_addr to a page address in cases when we get a virtual
> > address allocated in the vmalloc range.
> > As a result, virt_to_page() cannot convert this address
> > properly and returns an incorrect page address.
> >
> > We need to detect such cases and obtain the page address
> > using vmalloc_to_page() instead.
> >
> > Signed-off-by: Roman Skakun 
> > Reviewed-by: Andrii Anisov 
> > ---
> > Hey!
> > Thanks for the suggestions, Christoph!
> > I updated the patch according to your advice.
> > But I'm surprised that nobody caught this problem
> > in the common code before. It looks a bit strange to me.
> >
> >
> >  kernel/dma/ops_helpers.c | 12 ++--
> >  1 file changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/dma/ops_helpers.c b/kernel/dma/ops_helpers.c
> > index 910ae69cae77..782728d8a393 100644
> > --- a/kernel/dma/ops_helpers.c
> > +++ b/kernel/dma/ops_helpers.c
> > @@ -5,6 +5,14 @@
> >   */
> >  #include 
> >
> > +static struct page *cpu_addr_to_page(void *cpu_addr)
> > +{
> > + if (is_vmalloc_addr(cpu_addr))
> > + return vmalloc_to_page(cpu_addr);
> > + else
> > + return virt_to_page(cpu_addr);
> > +}
>
> We have the same thing in xen_swiotlb_free_coherent. Can we find a way
> to call cpu_addr_to_page() from xen_swiotlb_free_coherent? Maybe we can
> make cpu_addr_to_page non-static? No problem if it is too much trouble.
>
>
> >  /*
> >   * Create scatter-list for the already allocated DMA buffer.
> >   */
> > @@ -12,7 +20,7 @@ int dma_common_get_sgtable(struct device *dev, struct sg_table *sgt,
> >void *cpu_addr, dma_addr_t dma_addr, size_t size,
> >unsigned long attrs)
> >  {
> > - struct page *page = virt_to_page(cpu_addr);
> > + struct page *page = cpu_addr_to_page(cpu_addr);
> >   int ret;
> >
> >   ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
> > @@ -43,7 +51,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
> >   return -ENXIO;
> >
> >   return remap_pfn_range(vma, vma->vm_start,
> > - page_to_pfn(virt_to_page(cpu_addr)) + vma->vm_pgoff,
> > + page_to_pfn(cpu_addr_to_page(cpu_addr)) + vma->vm_pgoff,
> >   user_count << PAGE_SHIFT, vma->vm_page_prot);
> >  #else
> >   return -ENXIO;
> > --
> > 2.25.1
> >



-- 
Best Regards, Roman.

[PATCH v2] dma-mapping: use vmalloc_to_page for vmalloc addresses

2021-06-22 Thread Roman Skakun
This commit fixes an incorrect conversion from
cpu_addr to a page address in cases when we get a virtual
address allocated in the vmalloc range.
As a result, virt_to_page() cannot convert this address
properly and returns an incorrect page address.

We need to detect such cases and obtain the page address
using vmalloc_to_page() instead.

Signed-off-by: Roman Skakun 
Reviewed-by: Andrii Anisov 
---
Hey!
Thanks for the suggestions, Christoph!
I updated the patch according to your advice.
But I'm surprised that nobody caught this problem
in the common code before. It looks a bit strange to me.


 kernel/dma/ops_helpers.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/kernel/dma/ops_helpers.c b/kernel/dma/ops_helpers.c
index 910ae69cae77..782728d8a393 100644
--- a/kernel/dma/ops_helpers.c
+++ b/kernel/dma/ops_helpers.c
@@ -5,6 +5,14 @@
  */
 #include 
 
+static struct page *cpu_addr_to_page(void *cpu_addr)
+{
+   if (is_vmalloc_addr(cpu_addr))
+   return vmalloc_to_page(cpu_addr);
+   else
+   return virt_to_page(cpu_addr);
+}
+
 /*
  * Create scatter-list for the already allocated DMA buffer.
  */
@@ -12,7 +20,7 @@ int dma_common_get_sgtable(struct device *dev, struct sg_table *sgt,
 void *cpu_addr, dma_addr_t dma_addr, size_t size,
 unsigned long attrs)
 {
-   struct page *page = virt_to_page(cpu_addr);
+   struct page *page = cpu_addr_to_page(cpu_addr);
int ret;
 
ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
@@ -43,7 +51,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
return -ENXIO;
 
return remap_pfn_range(vma, vma->vm_start,
-   page_to_pfn(virt_to_page(cpu_addr)) + vma->vm_pgoff,
+   page_to_pfn(cpu_addr_to_page(cpu_addr)) + vma->vm_pgoff,
user_count << PAGE_SHIFT, vma->vm_page_prot);
 #else
return -ENXIO;
-- 
2.25.1



Re: [PATCH] swiotlb-xen: override common mmap and get_sgtable dma ops

2021-06-16 Thread Roman Skakun
> We make sure that we allocate contiguous memory in 
> xen_swiotlb_alloc_coherent().

I understood.
Thanks!

-- 
Best Regards, Roman.


[PATCH 2/2] swiotlb-xen: override common mmap and get_sgtable dma ops

2021-06-16 Thread Roman Skakun
This commit fixes an incorrect conversion from
cpu_addr to a page address in cases when we get a virtual
address which was allocated through xen_swiotlb_alloc_coherent()
and can be mapped in the vmalloc range.
As a result, virt_to_page() cannot convert this address
properly and returns an incorrect page address.

We need to detect such cases and obtain the page address
using vmalloc_to_page() instead.

The reference code for mmap() and get_sgtable() was copied
from kernel/dma/ops_helpers.c and modified to provide
the additional detection described above.

To simplify the code, a new
cpu_addr_to_page() helper was added.

Signed-off-by: Roman Skakun 
Reviewed-by: Andrii Anisov 
---
 drivers/xen/swiotlb-xen.c | 42 +++
 1 file changed, 34 insertions(+), 8 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 90bc5fc321bc..9331a8500547 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -118,6 +118,14 @@ static int is_xen_swiotlb_buffer(struct device *dev, dma_addr_t dma_addr)
return 0;
 }
 
+static struct page *cpu_addr_to_page(void *cpu_addr)
+{
+   if (is_vmalloc_addr(cpu_addr))
+   return vmalloc_to_page(cpu_addr);
+   else
+   return virt_to_page(cpu_addr);
+}
+
 static int
 xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
 {
@@ -337,7 +345,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
int order = get_order(size);
phys_addr_t phys;
u64 dma_mask = DMA_BIT_MASK(32);
-   struct page *page;
+   struct page *page = cpu_addr_to_page(vaddr);
 
if (hwdev && hwdev->coherent_dma_mask)
dma_mask = hwdev->coherent_dma_mask;
@@ -349,11 +357,6 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
/* Convert the size to actually allocated. */
size = 1UL << (order + XEN_PAGE_SHIFT);
 
-   if (is_vmalloc_addr(vaddr))
-   page = vmalloc_to_page(vaddr);
-   else
-   page = virt_to_page(vaddr);
-
if (!WARN_ON((dev_addr + size - 1 > dma_mask) ||
 range_straddles_page_boundary(phys, size)) &&
TestClearPageXenRemapped(page))
@@ -573,7 +576,23 @@ xen_swiotlb_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 void *cpu_addr, dma_addr_t dma_addr, size_t size,
 unsigned long attrs)
 {
-   return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size, attrs);
+   unsigned long user_count = vma_pages(vma);
+   unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+   unsigned long off = vma->vm_pgoff;
+   struct page *page = cpu_addr_to_page(cpu_addr);
+   int ret;
+
+   vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs);
+
+   if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
+   return ret;
+
+   if (off >= count || user_count > count - off)
+   return -ENXIO;
+
+   return remap_pfn_range(vma, vma->vm_start,
+   page_to_pfn(page) + vma->vm_pgoff,
+   user_count << PAGE_SHIFT, vma->vm_page_prot);
 }
 
 /*
@@ -585,7 +604,14 @@ xen_swiotlb_get_sgtable(struct device *dev, struct sg_table *sgt,
void *cpu_addr, dma_addr_t handle, size_t size,
unsigned long attrs)
 {
-   return dma_common_get_sgtable(dev, sgt, cpu_addr, handle, size, attrs);
+   struct page *page = cpu_addr_to_page(cpu_addr);
+   int ret;
+
+   ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
+   if (!ret)
+   sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
+
+   return ret;
 }
 
 const struct dma_map_ops xen_swiotlb_dma_ops = {
-- 
2.25.1



[PATCH 1/2] Revert "swiotlb-xen: remove xen_swiotlb_dma_mmap and xen_swiotlb_dma_get_sgtable"

2021-06-16 Thread Roman Skakun
This reverts commit 922659ea771b3fd728149262c5ea15608fab9719.

Signed-off-by: Roman Skakun 
---
 drivers/xen/swiotlb-xen.c | 29 +++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 2b385c1b4a99..90bc5fc321bc 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -563,6 +563,31 @@ xen_swiotlb_dma_supported(struct device *hwdev, u64 mask)
return xen_virt_to_bus(hwdev, xen_io_tlb_end - 1) <= mask;
 }
 
+/*
+ * Create userspace mapping for the DMA-coherent memory.
+ * This function should be called with the pages from the current domain only,
+ * passing pages mapped from other domains would lead to memory corruption.
+ */
+static int
+xen_swiotlb_dma_mmap(struct device *dev, struct vm_area_struct *vma,
+void *cpu_addr, dma_addr_t dma_addr, size_t size,
+unsigned long attrs)
+{
+   return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size, attrs);
+}
+
+/*
+ * This function should be called with the pages from the current domain only,
+ * passing pages mapped from other domains would lead to memory corruption.
+ */
+static int
+xen_swiotlb_get_sgtable(struct device *dev, struct sg_table *sgt,
+   void *cpu_addr, dma_addr_t handle, size_t size,
+   unsigned long attrs)
+{
+   return dma_common_get_sgtable(dev, sgt, cpu_addr, handle, size, attrs);
+}
+
 const struct dma_map_ops xen_swiotlb_dma_ops = {
.alloc = xen_swiotlb_alloc_coherent,
.free = xen_swiotlb_free_coherent,
@@ -575,8 +600,8 @@ const struct dma_map_ops xen_swiotlb_dma_ops = {
.map_page = xen_swiotlb_map_page,
.unmap_page = xen_swiotlb_unmap_page,
.dma_supported = xen_swiotlb_dma_supported,
-   .mmap = dma_common_mmap,
-   .get_sgtable = dma_common_get_sgtable,
+   .mmap = xen_swiotlb_dma_mmap,
+   .get_sgtable = xen_swiotlb_get_sgtable,
.alloc_pages = dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
 };
-- 
2.25.1



Re: [PATCH] swiotlb-xen: override common mmap and get_sgtable dma ops

2021-06-14 Thread Roman Skakun
Hey, Boris!
Thanks for the review.

I have an additional question about the current implementation that
bothers me.
I think we can have cases when the mapped memory is not
physically contiguous.
To handle this correctly, some additional steps need to be applied to
the current implementation as well.

In mmap() :
1. Is the memory region physically contiguous?
2. Remap multiple ranges if it is not.

In get_sgtable() :
1. Is the memory region physically contiguous?
2. Create an sgt that includes multiple contiguous ranges if it is not.

What do you think about it?
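
For the get_sgtable() side, one possible shape (a rough, untested sketch;
it assumes a vmalloc-backed buffer and leans on sg_alloc_table_from_pages(),
which merges physically contiguous pages into fewer entries):

```c
static int sgtable_from_vmalloc_buf(struct sg_table *sgt, void *cpu_addr,
				    size_t size)
{
	unsigned int n_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
	struct page **pages;
	unsigned int i;
	int ret;

	pages = kmalloc_array(n_pages, sizeof(*pages), GFP_KERNEL);
	if (!pages)
		return -ENOMEM;

	/* collect the (possibly non-contiguous) pages backing the buffer */
	for (i = 0; i < n_pages; i++)
		pages[i] = vmalloc_to_page(cpu_addr + i * PAGE_SIZE);

	ret = sg_alloc_table_from_pages(sgt, pages, n_pages, 0, size,
					GFP_KERNEL);
	kfree(pages);
	return ret;
}
```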

Cheers!
Roman


On Fri, Jun 11, 2021 at 18:20, Boris Ostrovsky wrote:
>
>
> On 6/11/21 5:55 AM, Roman Skakun wrote:
> >
> > +static int
> > +xen_swiotlb_dma_mmap(struct device *dev, struct vm_area_struct *vma,
> > + void *cpu_addr, dma_addr_t dma_addr, size_t size,
> > + unsigned long attrs)
> > +{
> > + unsigned long user_count = vma_pages(vma);
> > + unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> > + unsigned long off = vma->vm_pgoff;
> > + struct page *page;
> > +
> > + if (is_vmalloc_addr(cpu_addr))
> > + page = vmalloc_to_page(cpu_addr);
> > + else
> > + page = virt_to_page(cpu_addr);
> > +
> > + vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs);
> > +
> > + if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
> > + return -ENXIO;
> > +
> > + if (off >= count || user_count > count - off)
> > + return -ENXIO;
> > +
> > + return remap_pfn_range(vma, vma->vm_start,
> > + page_to_pfn(page) + vma->vm_pgoff,
> > + user_count << PAGE_SHIFT, vma->vm_page_prot);
> > +}
>
>
> I suggest you create a helper for computing page value and then revert 
> 922659ea771b3fd728149262c5ea15608fab9719 and pass result of the helper 
> instead of cpu_addr. Here and in xen_swiotlb_dma_get_sgtable().
>
>
> And use this new helper in xen_swiotlb_free_coherent() too. I am curious 
> though why this was not a problem when Stefano was looking at the problem 
> that introduced this vmalloc check (i.e. 
> 8b1e868f66076490189a36d984fcce286cdd6295). Stefano?
>
>
> -boris



-- 
Best Regards, Roman.

[PATCH] swiotlb-xen: override common mmap and get_sgtable dma ops

2021-06-11 Thread Roman Skakun
This commit fixes an incorrect conversion from
cpu_addr to a page address in cases when we get a virtual
address which was allocated through xen_swiotlb_alloc_coherent()
and can be mapped in the vmalloc range.
As a result, virt_to_page() cannot convert this address
properly and returns an incorrect page address.

We need to detect such cases and obtain the page address
using vmalloc_to_page() instead.

The reference code was copied from kernel/dma/ops_helpers.c
and modified to provide the additional detection described
above.

Signed-off-by: Roman Skakun 
Reviewed-by: Andrii Anisov 

---
Also, I have observed that the original common code doesn't
make additional checks about the contiguity of the memory region
represented by cpu_addr and size.

Maybe this means that these functions can only take physically
contiguous memory.
Is this correct?

Cheers!

---
 drivers/xen/swiotlb-xen.c | 51 +--
 1 file changed, 49 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 2b385c1b4a99..f99c98472927 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -563,6 +563,53 @@ xen_swiotlb_dma_supported(struct device *hwdev, u64 mask)
return xen_virt_to_bus(hwdev, xen_io_tlb_end - 1) <= mask;
 }
 
+static int
+xen_swiotlb_dma_mmap(struct device *dev, struct vm_area_struct *vma,
+   void *cpu_addr, dma_addr_t dma_addr, size_t size,
+   unsigned long attrs)
+{
+   unsigned long user_count = vma_pages(vma);
+   unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+   unsigned long off = vma->vm_pgoff;
+   struct page *page;
+
+   if (is_vmalloc_addr(cpu_addr))
+   page = vmalloc_to_page(cpu_addr);
+   else
+   page = virt_to_page(cpu_addr);
+
+   vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs);
+
+   if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
+   return -ENXIO;
+
+   if (off >= count || user_count > count - off)
+   return -ENXIO;
+
+   return remap_pfn_range(vma, vma->vm_start,
+   page_to_pfn(page) + vma->vm_pgoff,
+   user_count << PAGE_SHIFT, vma->vm_page_prot);
+}
+
+static int
+xen_swiotlb_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
+void *cpu_addr, dma_addr_t dma_addr, size_t size,
+unsigned long attrs)
+{
+   struct page *page;
+   int ret;
+
+   if (is_vmalloc_addr(cpu_addr))
+   page = vmalloc_to_page(cpu_addr);
+   else
+   page = virt_to_page(cpu_addr);
+
+   ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
+   if (!ret)
+   sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
+   return ret;
+}
+
 const struct dma_map_ops xen_swiotlb_dma_ops = {
.alloc = xen_swiotlb_alloc_coherent,
.free = xen_swiotlb_free_coherent,
@@ -575,8 +622,8 @@ const struct dma_map_ops xen_swiotlb_dma_ops = {
.map_page = xen_swiotlb_map_page,
.unmap_page = xen_swiotlb_unmap_page,
.dma_supported = xen_swiotlb_dma_supported,
-   .mmap = dma_common_mmap,
-   .get_sgtable = dma_common_get_sgtable,
+   .mmap = xen_swiotlb_dma_mmap,
+   .get_sgtable = xen_swiotlb_dma_get_sgtable,
.alloc_pages = dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
 };
-- 
2.27.0
