Re: [PATCH v4 13/24] iommu/mediatek: Add device link for smi-common and m4u

2020-11-11 Thread Nicolas Boichat
On Wed, Nov 11, 2020 at 8:40 PM Yong Wu  wrote:
>
> In the lastest SoC, M4U has its special power domain. thus, If the engine
> begin to work, it should help enable the power for M4U firstly.
> Currently if the engine work, it always enable the power/clocks for
> smi-larbs/smi-common. This patch adds device_link for smi-common and M4U.
> then, if smi-common power is enabled, the M4U power also is powered on
> automatically.
>
> Normally M4U connect with several smi-larbs and their smi-common always
> are the same, In this patch it get smi-common dev from the first smi-larb
> device(i==0), then add the device_link only while m4u has power-domain.
>
> Signed-off-by: Yong Wu 
> ---
>  drivers/iommu/mtk_iommu.c | 36 +---
>  drivers/iommu/mtk_iommu.h |  1 +
>  2 files changed, 34 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index cfdf5ce696fd..4ce7e0883e4d 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -20,6 +20,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -705,7 +706,7 @@ static int mtk_iommu_probe(struct platform_device *pdev)
> return larb_nr;
>
> for (i = 0; i < larb_nr; i++) {
> -   struct device_node *larbnode;
> +   struct device_node *larbnode, *smicomm_node;
> struct platform_device *plarbdev;
> u32 id;
>
> @@ -731,6 +732,26 @@ static int mtk_iommu_probe(struct platform_device *pdev)
>
> component_match_add_release(dev, , release_of,
> compare_of, larbnode);
> +   if (!i) {

Maybe more of a style preference, but since you are actually comparing
an integer, I prefer seeing i == 0.

Also, might be nicer to do

if (i != 0)
   continue;

And de-indent the rest.

> +   smicomm_node = of_parse_phandle(larbnode, 
> "mediatek,smi", 0);
> +   if (!smicomm_node)
> +   return -EINVAL;
> +
> +   plarbdev = of_find_device_by_node(smicomm_node);
> +   of_node_put(smicomm_node);
> +   data->smicomm_dev = >dev;
> +   }
> +   }
> +
> +   if (dev->pm_domain) {
> +   struct device_link *link;
> +
> +   link = device_link_add(data->smicomm_dev, dev,
> +  DL_FLAG_STATELESS | 
> DL_FLAG_PM_RUNTIME);
> +   if (!link) {
> +   dev_err(dev, "Unable link %s.\n", 
> dev_name(data->smicomm_dev));
> +   return -EINVAL;
> +   }
> }
>
> platform_set_drvdata(pdev, data);
> @@ -738,14 +759,14 @@ static int mtk_iommu_probe(struct platform_device *pdev)
> ret = iommu_device_sysfs_add(>iommu, dev, NULL,
>  "mtk-iommu.%pa", );
> if (ret)
> -   return ret;
> +   goto out_link_remove;
>
> iommu_device_set_ops(>iommu, _iommu_ops);
> iommu_device_set_fwnode(>iommu, >dev.of_node->fwnode);
>
> ret = iommu_device_register(>iommu);
> if (ret)
> -   return ret;
> +   goto out_sysfs_remove;

Technically, this change is unrelated.

>
> spin_lock_init(>tlb_lock);
> list_add_tail(>list, );
> @@ -754,6 +775,13 @@ static int mtk_iommu_probe(struct platform_device *pdev)
> bus_set_iommu(_bus_type, _iommu_ops);
>
> return component_master_add_with_match(dev, _iommu_com_ops, 
> match);
> +
> +out_sysfs_remove:
> +   iommu_device_sysfs_remove(>iommu);
> +out_link_remove:
> +   if (dev->pm_domain)
> +   device_link_remove(data->smicomm_dev, dev);
> +   return ret;
>  }
>
>  static int mtk_iommu_remove(struct platform_device *pdev)
> @@ -767,6 +795,8 @@ static int mtk_iommu_remove(struct platform_device *pdev)
> bus_set_iommu(_bus_type, NULL);
>
> clk_disable_unprepare(data->bclk);
> +   if (pdev->dev.pm_domain)
> +   device_link_remove(data->smicomm_dev, >dev);
> devm_free_irq(>dev, data->irq, data);
> component_master_del(>dev, _iommu_com_ops);
> return 0;
> diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> index d0c93652bdbe..5e03a029c4dc 100644
> --- a/drivers/iommu/mtk_iommu.h
> +++ b/drivers/iommu/mtk_iommu.h
> @@ -68,6 +68,7 @@ struct mtk_iommu_data {
>
> struct iommu_device iommu;
> const struct mtk_iommu_plat_data *plat_data;
> +   struct device   *smicomm_dev;
>
> struct dma_iommu_mapping*mapping; /* For mtk_iommu_v1.c */
>
> --
> 2.18.0
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 06/21] iommu/io-pgtable-arm-v7s: Use ias to check the valid iova in unmap

2020-07-12 Thread Nicolas Boichat
On Sat, Jul 11, 2020 at 2:50 PM Yong Wu  wrote:
>
> As title.
>
> Signed-off-by: Yong Wu 
> ---
>  drivers/iommu/io-pgtable-arm-v7s.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
> b/drivers/iommu/io-pgtable-arm-v7s.c
> index 4272fe4e17f4..01f2a8876808 100644
> --- a/drivers/iommu/io-pgtable-arm-v7s.c
> +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> @@ -717,7 +717,7 @@ static size_t arm_v7s_unmap(struct io_pgtable_ops *ops, 
> unsigned long iova,
>  {
> struct arm_v7s_io_pgtable *data = io_pgtable_ops_to_data(ops);
>
> -   if (WARN_ON(upper_32_bits(iova)))
> +   if (WARN_ON(iova >= (1ULL << data->iop.cfg.ias)))

This is a little odd as iova is unsigned long and 1ULL is unsigned long long.

Would it be better to keep the spirit of the previous test and do
something like:
 if (WARN_ON(iova >> data->iop.cfg.ias)) ?

> return 0;
>
> return __arm_v7s_unmap(data, gather, iova, size, 1, data->pgd);
> --
> 2.18.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 03/14] iommu/mediatek: Add device_link between the consumer and the larb devices

2020-03-04 Thread Nicolas Boichat
On Tue, Sep 3, 2019 at 5:38 PM Yong Wu  wrote:
>
> MediaTek IOMMU don't have its power-domain. all the consumer connect
> with smi-larb, then connect with smi-common.
>
> M4U
>  |
> smi-common
>  |
>   -
>   | |...
>   | |
> larb1 larb2
>   | |
> vdec   venc
>
> When the consumer works, it should enable the smi-larb's power which
> also need enable the smi-common's power firstly.
>
> Thus, First of all, use the device link connect the consumer and the
> smi-larbs. then add device link between the smi-larb and smi-common.
>
> This patch adds device_link between the consumer and the larbs.
>
> When device_link_add, I add the flag DL_FLAG_STATELESS to avoid calling
> pm_runtime_xx to keep the original status of clocks. It can avoid two
> issues:
> 1) Display HW show fastlogo abnormally reported in [1]. At the beggining,
> all the clocks are enabled before entering kernel, but the clocks for
> display HW(always in larb0) will be gated after clk_enable and clk_disable
> called from device_link_add(->pm_runtime_resume) and rpm_idle. The clock
> operation happened before display driver probe. At that time, the display
> HW will be abnormal.
>
> 2) A deadlock issue reported in [2]. Use DL_FLAG_STATELESS to skip
> pm_runtime_xx to avoid the deadlock.
>
> Corresponding, DL_FLAG_AUTOREMOVE_CONSUMER can't be added, then
> device_link_removed should be added explicitly.
>
> [1] http://lists.infradead.org/pipermail/linux-mediatek/2019-July/
> 021500.html
> [2] https://lore.kernel.org/patchwork/patch/1086569/
>
> Suggested-by: Tomasz Figa 
> Signed-off-by: Yong Wu 
> ---
>  drivers/iommu/mtk_iommu.c| 17 +
>  drivers/iommu/mtk_iommu_v1.c | 18 +-
>  2 files changed, 34 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index b138b94..2511b3c 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -450,6 +450,9 @@ static int mtk_iommu_add_device(struct device *dev)
> struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> struct mtk_iommu_data *data;
> struct iommu_group *group;
> +   struct device_link *link;
> +   struct device *larbdev;
> +   unsigned int larbid;
>
> if (!fwspec || fwspec->ops != _iommu_ops)
> return -ENODEV; /* Not a iommu client device */
> @@ -461,6 +464,14 @@ static int mtk_iommu_add_device(struct device *dev)
> if (IS_ERR(group))
> return PTR_ERR(group);
>
> +   /* Link the consumer device with the smi-larb device(supplier) */
> +   larbid = MTK_M4U_TO_LARB(fwspec->ids[0]);

I'll mirror the comment I made on gerrit
(https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/1361013):
Maybe I'm missing something here, but for example, on MT8173,
vcodec_enc: vcodec@18002000 needs to use both larb3 and larb5, isn't
the code below just adding a link for larb3?

Do we need to iterate over all fwspecs->ids to figure out which larbs
we need to add links to each of them?

> +   larbdev = data->larb_imu[larbid].dev;
> +   link = device_link_add(dev, larbdev,
> +  DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
> +   if (!link)
> +   dev_err(dev, "Unable to link %s\n", dev_name(larbdev));
> +
> iommu_group_put(group);
> return 0;
>  }
> @@ -469,6 +480,8 @@ static void mtk_iommu_remove_device(struct device *dev)
>  {
> struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> struct mtk_iommu_data *data;
> +   struct device *larbdev;
> +   unsigned int larbid;
>
> if (!fwspec || fwspec->ops != _iommu_ops)
> return;
> @@ -476,6 +489,10 @@ static void mtk_iommu_remove_device(struct device *dev)
> data = fwspec->iommu_priv;
> iommu_device_unlink(>iommu, dev);
>
> +   larbid = MTK_M4U_TO_LARB(fwspec->ids[0]);
> +   larbdev = data->larb_imu[larbid].dev;
> +   device_link_remove(dev, larbdev);
> +
> iommu_group_remove_device(dev);
> iommu_fwspec_free(dev);
>  }
> diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
> index 2034d72..a7f22a2 100644
> --- a/drivers/iommu/mtk_iommu_v1.c
> +++ b/drivers/iommu/mtk_iommu_v1.c
> @@ -423,7 +423,9 @@ static int mtk_iommu_add_device(struct device *dev)
> struct of_phandle_iterator it;
> struct mtk_iommu_data *data;
> struct iommu_group *group;
> -   int err;
> +   struct device_link *link;
> +   struct device *larbdev;
> +   int err, larbid;
>
> of_for_each_phandle(, err, dev->of_node, "iommus",
> "#iommu-cells", 0) {
> @@ -466,6 +468,14 @@ static int mtk_iommu_add_device(struct device *dev)
> return err;
> }
>
> +   /* Link the consumer device with the smi-larb device(supplier) */
> +   larbid = 

Re: [V3, 2/2] media: i2c: Add Omnivision OV02A10 camera sensor driver

2019-09-05 Thread Nicolas Boichat
On Fri, Sep 6, 2019 at 12:05 AM Sakari Ailus
 wrote:
>
> On Thu, Sep 05, 2019 at 07:53:37PM +0900, Tomasz Figa wrote:
> > On Thu, Sep 5, 2019 at 7:45 PM Sakari Ailus
> >  wrote:
> > >
> > > Hi Dongchun,
> > >
> > > On Thu, Sep 05, 2019 at 05:41:05PM +0800, Dongchun Zhu wrote:
> > >
> > > ...
> > >
> > > > > > + ret = regulator_bulk_enable(OV02A10_NUM_SUPPLIES, 
> > > > > > ov02a10->supplies);
> > > > > > + if (ret < 0) {
> > > > > > + dev_err(dev, "Failed to enable regulators\n");
> > > > > > + goto disable_clk;
> > > > > > + }
> > > > > > + msleep_range(7);
> > > > >
> > > > > This has some potential of clashing with more generic functions in the
> > > > > future. Please use usleep_range directly, or msleep.
> > > > >
> > > >
> > > > Did you mean using usleep_range(7*1000, 8*1000), as used in patch v1?
> > > > https://patchwork.kernel.org/patch/10957225/
> > >
> > > Yes, please.
> >
> > Why not just msleep()?
>
> msleep() is usually less accurate. I'm not sure it makes a big different in
> this case. Perhaps, if someone wants that the sensor is powered on and
> streaming as soon as possible.

https://elixir.bootlin.com/linux/latest/source/Documentation/timers/timers-howto.txt#L70

Use usleep_range for delays up to 20ms (at least that's what the
documentation (still) says?)

> --
> Sakari Ailus
> sakari.ai...@linux.intel.com


Re: [PATCH v6 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables

2019-03-19 Thread Nicolas Boichat
On Wed, Mar 20, 2019 at 1:56 AM Andrew Morton  wrote:
>
> On Tue, 19 Mar 2019 15:41:43 +0800 Nicolas Boichat  
> wrote:
>
> > On Mon, Feb 25, 2019 at 8:23 AM Nicolas Boichat  
> > wrote:
> > >
> > > On Thu, Feb 14, 2019 at 1:12 AM Vlastimil Babka  wrote:
> > > >
> > > > On 1/22/19 11:51 PM, Nicolas Boichat wrote:
> > > > > Hi Andrew,
> > > > >
> > > > > On Fri, Jan 11, 2019 at 6:21 PM Joerg Roedel  wrote:
> > > > >>
> > > > >> On Wed, Jan 02, 2019 at 01:51:45PM +0800, Nicolas Boichat wrote:
> > > > >> > Does anyone have any further comment on this series? If not, which
> > > > >> > maintainer is going to pick this up? I assume Andrew Morton?
> > > > >>
> > > > >> Probably, yes. I don't like to carry the mm-changes in iommu-tree, so
> > > > >> this should go through mm.
> > > > >
> > > > > Gentle ping on this series, it seems like it's better if it goes
> > > > > through your tree.
> > > > >
> > > > > Series still applies cleanly on linux-next, but I'm happy to resend if
> > > > > that helps.
> > > >
> > > > Ping, Andrew?
> > >
> > > Another gentle ping, I still don't see these patches in mmot[ms]. Thanks.
> >
> > Andrew: AFAICT this still applies cleanly on linux-next/master, so I
> > don't plan to resend... is there any other issues with this series?
> >
> > This is a regression, so it'd be nice to have it fixed in mainline, 
> > eventually.
>
> Sorry, seeing "iommu" and "arm" made these escape my gimlet eye.

Thanks for picking them up!

> I'm only seeing acks on [1/3].  What's the review status of [2/3] and [3/3]?

Replied on the notification, [2/3] had a Ack, [3/3] is somewhat controversial.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables

2019-03-19 Thread Nicolas Boichat
On Mon, Feb 25, 2019 at 8:23 AM Nicolas Boichat  wrote:
>
> On Thu, Feb 14, 2019 at 1:12 AM Vlastimil Babka  wrote:
> >
> > On 1/22/19 11:51 PM, Nicolas Boichat wrote:
> > > Hi Andrew,
> > >
> > > On Fri, Jan 11, 2019 at 6:21 PM Joerg Roedel  wrote:
> > >>
> > >> On Wed, Jan 02, 2019 at 01:51:45PM +0800, Nicolas Boichat wrote:
> > >> > Does anyone have any further comment on this series? If not, which
> > >> > maintainer is going to pick this up? I assume Andrew Morton?
> > >>
> > >> Probably, yes. I don't like to carry the mm-changes in iommu-tree, so
> > >> this should go through mm.
> > >
> > > Gentle ping on this series, it seems like it's better if it goes
> > > through your tree.
> > >
> > > Series still applies cleanly on linux-next, but I'm happy to resend if
> > > that helps.
> >
> > Ping, Andrew?
>
> Another gentle ping, I still don't see these patches in mmot[ms]. Thanks.

Andrew: AFAICT this still applies cleanly on linux-next/master, so I
don't plan to resend... is there any other issues with this series?

This is a regression, so it'd be nice to have it fixed in mainline, eventually.

Thanks,

> > > Thanks!
> > >
> > >> Regards,
> > >>
> > >> Joerg
> > >
> >
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables

2019-02-24 Thread Nicolas Boichat
On Thu, Feb 14, 2019 at 1:12 AM Vlastimil Babka  wrote:
>
> On 1/22/19 11:51 PM, Nicolas Boichat wrote:
> > Hi Andrew,
> >
> > On Fri, Jan 11, 2019 at 6:21 PM Joerg Roedel  wrote:
> >>
> >> On Wed, Jan 02, 2019 at 01:51:45PM +0800, Nicolas Boichat wrote:
> >> > Does anyone have any further comment on this series? If not, which
> >> > maintainer is going to pick this up? I assume Andrew Morton?
> >>
> >> Probably, yes. I don't like to carry the mm-changes in iommu-tree, so
> >> this should go through mm.
> >
> > Gentle ping on this series, it seems like it's better if it goes
> > through your tree.
> >
> > Series still applies cleanly on linux-next, but I'm happy to resend if
> > that helps.
>
> Ping, Andrew?

Another gentle ping, I still don't see these patches in mmot[ms]. Thanks.

> > Thanks!
> >
> >> Regards,
> >>
> >> Joerg
> >
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/io-pgtable-arm-v7s: only kmemleak_ignore L2 tables

2019-02-24 Thread Nicolas Boichat
Joerg: Just to make sure, is this patch in your queue? Thanks.

On Thu, Jan 31, 2019 at 2:21 AM Will Deacon  wrote:
>
> On Mon, Jan 28, 2019 at 05:43:01PM +0800, Nicolas Boichat wrote:
> > L1 tables are allocated with __get_dma_pages, and therefore already
> > ignored by kmemleak.
> >
> > Without this, the kernel would print this error message on boot,
> > when the first L1 table is allocated:
> >
> > [2.810533] kmemleak: Trying to color unknown object at 
> > 0xffd652388000 as Black
> > [2.818190] CPU: 5 PID: 39 Comm: kworker/5:0 Tainted: G S
> > 4.19.16 #8
> > [2.831227] Workqueue: events deferred_probe_work_func
> > [2.836353] Call trace:
> > ...
> > [2.852532]  paint_ptr+0xa0/0xa8
> > [2.855750]  kmemleak_ignore+0x38/0x6c
> > [2.859490]  __arm_v7s_alloc_table+0x168/0x1f4
> > [2.863922]  arm_v7s_alloc_pgtable+0x114/0x17c
> > [2.868354]  alloc_io_pgtable_ops+0x3c/0x78
> > ...
> >
> > Fixes: e5fc9753b1a8314 ("iommu/io-pgtable: Add ARMv7 short descriptor 
> > support")
> > Signed-off-by: Nicolas Boichat 
> > ---
> >
> > I only tested this on top of my other series
> > (https://patchwork.kernel.org/patch/10720495/), but I think the same fix
> > applies. I'm still a bit confused as to why this only shows up now, as IIUC,
> > the kmemleak_ignore call was always wrong with L1 tables.
>
> Yes, I managed to reproduce this on top of -rc4 (see below). I suspect you
> /are/ the intersection of people using v7s w/ kmemleak, so this has just
> lingered and never been hit until now.
>
> For the patch (assuming this is going via Joerg):
>
> Acked-by: Will Deacon 
>
> Will
>
> --->8
>
> [0.124473] kmemleak: Trying to color unknown object at 0x842d8000 
> as Black
> [0.125312] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
> 5.0.0-rc4-00012-g40b114779944 #1
> [0.126181] Hardware name: linux,dummy-virt (DT)
> [0.126680] Call trace:
> [0.126950]  dump_backtrace+0x0/0x140
> [0.127346]  show_stack+0x14/0x20
> [0.127706]  dump_stack+0x90/0xb4
> [0.128066]  paint_ptr+0x94/0xa8
> [0.128417]  kmemleak_ignore+0x54/0x60
> [0.128991]  __arm_v7s_alloc_table+0x6c/0x240
> [0.129661]  arm_v7s_alloc_pgtable+0x10c/0x188
> [0.130359]  alloc_io_pgtable_ops+0x44/0xb0
> [0.131006]  arm_v7s_do_selftests+0x84/0x4bc
> [0.131663]  do_one_initcall+0x74/0x178
> [0.132253]  kernel_init_freeable+0x188/0x220
> [0.132923]  kernel_init+0x10/0x100
> [0.133460]  ret_from_fork+0x10/0x18
> [0.142102] arm-v7s io-pgtable: self test ok
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] iommu/io-pgtable-arm-v7s: only kmemleak_ignore L2 tables

2019-01-28 Thread Nicolas Boichat
L1 tables are allocated with __get_dma_pages, and therefore already
ignored by kmemleak.

Without this, the kernel would print this error message on boot,
when the first L1 table is allocated:

[2.810533] kmemleak: Trying to color unknown object at 0xffd652388000 
as Black
[2.818190] CPU: 5 PID: 39 Comm: kworker/5:0 Tainted: G S
4.19.16 #8
[2.831227] Workqueue: events deferred_probe_work_func
[2.836353] Call trace:
...
[2.852532]  paint_ptr+0xa0/0xa8
[2.855750]  kmemleak_ignore+0x38/0x6c
[2.859490]  __arm_v7s_alloc_table+0x168/0x1f4
[2.863922]  arm_v7s_alloc_pgtable+0x114/0x17c
[2.868354]  alloc_io_pgtable_ops+0x3c/0x78
...

Fixes: e5fc9753b1a8314 ("iommu/io-pgtable: Add ARMv7 short descriptor support")
Signed-off-by: Nicolas Boichat 
---

I only tested this on top of my other series
(https://patchwork.kernel.org/patch/10720495/), but I think the same fix
applies. I'm still a bit confused as to why this only shows up now, as IIUC,
the kmemleak_ignore call was always wrong with L1 tables.

 drivers/iommu/io-pgtable-arm-v7s.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index cec29bf45c9bd7..98a4a4a0dfb0c1 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -217,7 +217,8 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
if (dma != phys)
goto out_unmap;
}
-   kmemleak_ignore(table);
+   if (lvl == 2)
+   kmemleak_ignore(table);
return table;
 
 out_unmap:
-- 
2.20.1.495.gaa96b0ce6b-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables

2019-01-22 Thread Nicolas Boichat
Hi Andrew,

On Fri, Jan 11, 2019 at 6:21 PM Joerg Roedel  wrote:
>
> On Wed, Jan 02, 2019 at 01:51:45PM +0800, Nicolas Boichat wrote:
> > Does anyone have any further comment on this series? If not, which
> > maintainer is going to pick this up? I assume Andrew Morton?
>
> Probably, yes. I don't like to carry the mm-changes in iommu-tree, so
> this should go through mm.

Gentle ping on this series, it seems like it's better if it goes
through your tree.

Series still applies cleanly on linux-next, but I'm happy to resend if
that helps.

Thanks!

> Regards,
>
> Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 18/20] iommu/mediatek: Fix VLD_PA_RANGE register backup when suspend

2019-01-01 Thread Nicolas Boichat
On Tue, Jan 1, 2019 at 11:59 AM Yong Wu  wrote:
>
> The register VLD_PA_RNG(0x118) was forgot to backup while adding 4GB
> mode support for mt2712. this patch add it.
>
> Fixes: 30e2fccf9512 ("iommu/mediatek: Enlarge the validate PA range
> for 4GB mode")
> Signed-off-by: Yong Wu 
> ---
>  drivers/iommu/mtk_iommu.c | 2 ++
>  drivers/iommu/mtk_iommu.h | 1 +
>  2 files changed, 3 insertions(+)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 7fcef19..ddf1969 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -716,6 +716,7 @@ static int __maybe_unused mtk_iommu_suspend(struct device 
> *dev)
> reg->int_control0 = readl_relaxed(base + REG_MMU_INT_CONTROL0);
> reg->int_main_control = readl_relaxed(base + 
> REG_MMU_INT_MAIN_CONTROL);
> reg->ivrp_paddr = readl_relaxed(base + REG_MMU_IVRP_PADDR);
> +   reg->vld_pa_range = readl_relaxed(base + REG_MMU_VLD_PA_RNG);

Don't we want to add:
if (data->plat_data->vld_pa_rng)
before this save/restore operation? Or it doesn't matter?

> clk_disable_unprepare(data->bclk);
> return 0;
>  }
> @@ -740,6 +741,7 @@ static int __maybe_unused mtk_iommu_resume(struct device 
> *dev)
> writel_relaxed(reg->int_control0, base + REG_MMU_INT_CONTROL0);
> writel_relaxed(reg->int_main_control, base + 
> REG_MMU_INT_MAIN_CONTROL);
> writel_relaxed(reg->ivrp_paddr, base + REG_MMU_IVRP_PADDR);
> +   writel_relaxed(reg->vld_pa_range, base + REG_MMU_VLD_PA_RNG);
> if (m4u_dom)
> writel(m4u_dom->cfg.arm_v7s_cfg.ttbr[0] & MMU_PT_ADDR_MASK,
>base + REG_MMU_PT_BASE_ADDR);
> diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> index 0a7c463..c500bfd 100644
> --- a/drivers/iommu/mtk_iommu.h
> +++ b/drivers/iommu/mtk_iommu.h
> @@ -33,6 +33,7 @@ struct mtk_iommu_suspend_reg {
> u32 int_control0;
> u32 int_main_control;
> u32 ivrp_paddr;
> +   u32 vld_pa_range;

Well, please be consistent ,-) Either vld_pa_rng, or valid_pa_range ,-)

>  };
>
>  enum mtk_iommu_plat {
> --
> 1.9.1
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 11/20] iommu/mediatek: Move vld_pa_rng into plat_data

2019-01-01 Thread Nicolas Boichat
On Tue, Jan 1, 2019 at 11:58 AM Yong Wu  wrote:
>
> Both mt8173 and mt8183 don't have this vld_pa_rng(valid physical address
> range) register while mt2712 have. Move it into the plat_data.
>
> Signed-off-by: Yong Wu 
> ---
>  drivers/iommu/mtk_iommu.c | 3 ++-
>  drivers/iommu/mtk_iommu.h | 1 +
>  2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 8d8ab21..2913ddb 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -548,7 +548,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data 
> *data)
>  upper_32_bits(data->protect_base);
> writel_relaxed(regval, data->base + REG_MMU_IVRP_PADDR);
>
> -   if (data->enable_4GB && data->plat_data->m4u_plat != M4U_MT8173) {
> +   if (data->enable_4GB && data->plat_data->vld_pa_rng) {
> /*
>  * If 4GB mode is enabled, the validate PA range is from
>  * 0x1__ to 0x1__. here record bit[32:30].
> @@ -741,6 +741,7 @@ static int __maybe_unused mtk_iommu_resume(struct device 
> *dev)
> .m4u_plat = M4U_MT2712,
> .has_4gb_mode = true,
> .has_bclk = true,
> +   .vld_pa_rng   = true,
> .larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
>  };
>
> diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> index b46aeaa..a8c5d1e 100644
> --- a/drivers/iommu/mtk_iommu.h
> +++ b/drivers/iommu/mtk_iommu.h
> @@ -48,6 +48,7 @@ struct mtk_iommu_plat_data {
> /* HW will use the EMI clock if there isn't the "bclk". */
> boolhas_bclk;
> boolreset_axi;
> +   boolvld_pa_rng;

Since this is not a register name, maybe we can use something more
readable, like valid_pa_range?

(or at the very least describe it in a comment in the struct?)

> unsigned char   larbid_remap[MTK_LARB_NR_MAX];
>  };
>
> --
> 1.9.1
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 10/20] iommu/mediatek: Move reset_axi into plat_data

2019-01-01 Thread Nicolas Boichat
On Tue, Jan 1, 2019 at 11:58 AM Yong Wu  wrote:
>
> In mt8173 and mt8183, 0x48 is REG_MMU_STANDARD_AXI_MODE while
> it is extended to REG_MMU_CTRL which contains _STANDARD_AXI_MODE in
> the other SoCs. I move this property to plat_data since both mt8173
> and mt8183 use this property.
>
> It is a preparing patch for mt8183.
>
> Signed-off-by: Yong Wu 

Reviewed-by: Nicolas Boichat 

> ---
>  drivers/iommu/mtk_iommu.c | 4 ++--
>  drivers/iommu/mtk_iommu.h | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 35a1263..8d8ab21 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -558,8 +558,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data 
> *data)
> }
> writel_relaxed(0, data->base + REG_MMU_DCM_DIS);
>
> -   /* It's MISC control register whose default value is ok except 
> mt8173.*/
> -   if (data->plat_data->m4u_plat == M4U_MT8173)
> +   if (data->plat_data->reset_axi)
> writel_relaxed(0, data->base + REG_MMU_STANDARD_AXI_MODE);
>
> if (devm_request_irq(data->dev, data->irq, mtk_iommu_isr, 0,
> @@ -749,6 +748,7 @@ static int __maybe_unused mtk_iommu_resume(struct device 
> *dev)
> .m4u_plat = M4U_MT8173,
> .has_4gb_mode = true,
> .has_bclk = true,
> +   .reset_axi= true,
> .larbid_remap = {0, 1, 2, 3, 4, 5}, /* Linear mapping. */
>  };
>
> diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> index eec19a6..b46aeaa 100644
> --- a/drivers/iommu/mtk_iommu.h
> +++ b/drivers/iommu/mtk_iommu.h
> @@ -47,7 +47,7 @@ struct mtk_iommu_plat_data {
>
> /* HW will use the EMI clock if there isn't the "bclk". */
> boolhas_bclk;
> -
> +   boolreset_axi;
> unsigned char   larbid_remap[MTK_LARB_NR_MAX];
>  };
>
> --
> 1.9.1
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 09/20] iommu/mediatek: Refine protect memory definition

2019-01-01 Thread Nicolas Boichat
On Tue, Jan 1, 2019 at 11:58 AM Yong Wu  wrote:
>
> The protect memory setting is a little different in the different SoCs.
> In the register REG_MMU_CTRL_REG(0x110), the TF_PROT(translation fault
> protect) shift bit is normally 4 while it shift 5 bits only in the
> mt8173. This patch delete the complex MACRO and use a common if-else
> instead.
>
> Also, use "F_MMU_TF_PROT_TO_PROGRAM_ADDR" instead of the hard code(2)
> which means the M4U will output the dirty data to the programmed
> address that we allocated dynamically when translation fault occurs.
>
> Signed-off-by: Yong Wu 
> ---
> @Nicalos, I don't put it in the plat_data since only the previous mt8173
> shift 5. As I know, the latest SoC always use the new setting like mt2712
> and mt8183. Thus, I think it is unnecessary to put it in plat_data and
> let all the latest SoC set it. Hence, I still keep "== mt8173" for this
> like the reg REG_MMU_CTRL_REG.

Should be ok this way. But maybe one way to avoid hard-coding 4/5
below is to have 2 macros:

#define F_MMU_TF_PROT_TO_PROGRAM_ADDR (2 << 4)
#define F_MMU_TF_PROT_TO_PROGRAM_ADDR_MT8173 (2 << 5)

And still use the if below?

> ---
>  drivers/iommu/mtk_iommu.c | 12 +---
>  1 file changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index eca1536..35a1263 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -53,11 +53,7 @@
>
>  #define REG_MMU_CTRL_REG   0x110
>  #define F_MMU_PREFETCH_RT_REPLACE_MOD  BIT(4)
> -#define F_MMU_TF_PROTECT_SEL_SHIFT(data) \
> -   ((data)->plat_data->m4u_plat == M4U_MT2712 ? 4 : 5)
> -/* It's named by F_MMU_TF_PROT_SEL in mt2712. */
> -#define F_MMU_TF_PROTECT_SEL(prot, data) \
> -   (((prot) & 0x3) << F_MMU_TF_PROTECT_SEL_SHIFT(data))
> +#define F_MMU_TF_PROT_TO_PROGRAM_ADDR  2
>
>  #define REG_MMU_IVRP_PADDR 0x114
>
> @@ -521,9 +517,11 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data 
> *data)
> return ret;
> }
>
> -   regval = F_MMU_TF_PROTECT_SEL(2, data);
> if (data->plat_data->m4u_plat == M4U_MT8173)
> -   regval |= F_MMU_PREFETCH_RT_REPLACE_MOD;
> +   regval = F_MMU_PREFETCH_RT_REPLACE_MOD |
> + (F_MMU_TF_PROT_TO_PROGRAM_ADDR << 5);
> +   else
> +   regval = F_MMU_TF_PROT_TO_PROGRAM_ADDR << 4;
> writel_relaxed(regval, data->base + REG_MMU_CTRL_REG);
>
> regval = F_L2_MULIT_HIT_EN |
> --
> 1.9.1
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 08/20] iommu/mediatek: Add larb-id remapped support

2019-01-01 Thread Nicolas Boichat
On Tue, Jan 1, 2019 at 11:58 AM Yong Wu  wrote:
>
> The larb-id may be remapped in the smi-common, this means the
> larb-id reported in the mtk_iommu_isr isn't the real larb-id,
>
> Take mt8183 as a example:
>M4U
> |
> -
> |   SMI common  |
> -0-7-5-6-1-2--3-4- <- Id remapped
>  | | | | | |  | |
> larb0 larb1 IPU0  IPU1 larb4 larb5  larb6  CCU
> disp  vdec  img   cam   venc  imgcam
> As above, larb0 connects with the id 0 in smi-common.
>   larb1 connects with the id 7 in smi-common.
>   ...
> If the larb-id reported in the isr is 7, actually it's larb1(vdec).
> In order to output the right larb-id in the isr, we add a larb-id
> remapping relationship in this patch.
>
> If there is no this larb-id remapping in some SoCs, use the linear
> mapping array instead.
>
> This also is a preparing patch for mt8183.
>
> Signed-off-by: Yong Wu 

I think it's a little cleaner this way, thanks.

Reviewed-by: Nicolas Boichat 

> ---
>  drivers/iommu/mtk_iommu.c | 4 
>  drivers/iommu/mtk_iommu.h | 2 ++
>  2 files changed, 6 insertions(+)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 847082c..eca1536 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -220,6 +220,8 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
> fault_larb = F_MMU0_INT_ID_LARB_ID(regval);
> fault_port = F_MMU0_INT_ID_PORT_ID(regval);
>
> +   fault_larb = data->plat_data->larbid_remap[fault_larb];
> +
> if (report_iommu_fault(>domain, data->dev, fault_iova,
>write ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ)) 
> {
> dev_err_ratelimited(
> @@ -742,12 +744,14 @@ static int __maybe_unused mtk_iommu_resume(struct 
> device *dev)
> .m4u_plat = M4U_MT2712,
> .has_4gb_mode = true,
> .has_bclk = true,
> +   .larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
>  };
>
>  static const struct mtk_iommu_plat_data mt8173_data = {
> .m4u_plat = M4U_MT8173,
> .has_4gb_mode = true,
> .has_bclk = true,
> +   .larbid_remap = {0, 1, 2, 3, 4, 5}, /* Linear mapping. */
>  };
>
>  static const struct of_device_id mtk_iommu_of_ids[] = {
> diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> index b8749ac..eec19a6 100644
> --- a/drivers/iommu/mtk_iommu.h
> +++ b/drivers/iommu/mtk_iommu.h
> @@ -47,6 +47,8 @@ struct mtk_iommu_plat_data {
>
> /* HW will use the EMI clock if there isn't the "bclk". */
> boolhas_bclk;
> +
> +   unsigned char   larbid_remap[MTK_LARB_NR_MAX];
>  };
>
>  struct mtk_iommu_domain;
> --
> 1.9.1
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 10/18] iommu/mediatek: Add mt8183 IOMMU support

2018-12-21 Thread Nicolas Boichat
On Fri, Dec 21, 2018 at 4:02 PM Yong Wu  wrote:
>
> On Fri, 2018-12-21 at 12:43 +0800, Nicolas Boichat wrote:
> > On Sat, Dec 8, 2018 at 4:42 PM Yong Wu  wrote:
> > >
> > > The M4U IP blocks in mt8183 is MediaTek's generation2 M4U which use
> > > the ARM Short-descriptor like mt8173, and most of the HW registers
> > > are the same.
> > >
> > > Here list main differences between mt8183 and mt8173/mt2712:
> > > 1) mt8183 has only one M4U HW like mt8173 while mt2712 has two.
> > > 2) mt8183 don't have the "bclk" clock, it use the EMI clock instead.
> > > 3) mt8183 can support the dram over 4GB, but it doesn't call this "4GB
> > > mode".
> > > 4) mt8183 pgtable base register(0x0) extend bit[1:0] which represent
> > > the bit[33:32] in the physical address of the pgtable base, But the
> > > standard ttbr0[1] means the S bit which is enabled defaultly, Hence,
> > > we add a mask.
> > > 5) mt8183 HW has a GALS modules, SMI should enable "has_gals" support.
> > > 6) the larb-id in smi-common is remapped. M4U should enable
> > > larbid_remapped support.
> > >
> > > Signed-off-by: Yong Wu 
> > > ---
> > >  drivers/iommu/mtk_iommu.c | 31 ++-
> > >  drivers/iommu/mtk_iommu.h |  1 +
> > >  drivers/memory/mtk-smi.c  | 19 +++
> > >  3 files changed, 42 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > > index 8ab3b69..d91a554 100644
> > > --- a/drivers/iommu/mtk_iommu.c
> > > +++ b/drivers/iommu/mtk_iommu.c
> > > @@ -36,6 +36,7 @@
> > >  #include "mtk_iommu.h"
> > >
> > >  #define REG_MMU_PT_BASE_ADDR   0x000
> > > +#define MMU_PT_ADDR_MASK   GENMASK(31, 7)
> > >
> > >  #define REG_MMU_INVALIDATE 0x020
> > >  #define F_ALL_INVLD0x2
> > > @@ -54,7 +55,7 @@
> > >  #define REG_MMU_CTRL_REG   0x110
> > >  #define F_MMU_PREFETCH_RT_REPLACE_MOD  BIT(4)
> > >  #define F_MMU_TF_PROTECT_SEL_SHIFT(data) \
> > > -   ((data)->plat_data->m4u_plat == M4U_MT2712 ? 4 : 5)
> > > +   ((data)->plat_data->m4u_plat == M4U_MT8173 ? 5 : 4)
> >
> > Should the shift value be a member of plat_data instead?
>
> It's also ok.
> This TF_PROTECT_SEL MACRO looks a bit complex. I can use a new patch to
> refactor it.

SGTM.

> >
> > >  /* It's named by F_MMU_TF_PROT_SEL in mt2712. */
> > >  #define F_MMU_TF_PROTECT_SEL(prot, data) \
> > > (((prot) & 0x3) << F_MMU_TF_PROTECT_SEL_SHIFT(data))
> > > @@ -347,7 +348,7 @@ static int mtk_iommu_attach_device(struct 
> > > iommu_domain *domain,
> > > /* Update the pgtable base address register of the M4U HW */
> > > if (!data->m4u_dom) {
> > > data->m4u_dom = dom;
> > > -   writel(dom->cfg.arm_v7s_cfg.ttbr[0],
> > > +   writel(dom->cfg.arm_v7s_cfg.ttbr[0] & MMU_PT_ADDR_MASK,
> > >data->base + REG_MMU_PT_BASE_ADDR);
> > > }
> > >
> > > @@ -510,6 +511,7 @@ static int mtk_iommu_of_xlate(struct device *dev, 
> > > struct of_phandle_args *args)
> > >
> > >  static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
> > >  {
> > > +   enum mtk_iommu_plat m4u_plat = data->plat_data->m4u_plat;
> > > u32 regval;
> > > int ret;
> > >
> > > @@ -520,7 +522,7 @@ static int mtk_iommu_hw_init(const struct 
> > > mtk_iommu_data *data)
> > > }
> > >
> > > regval = F_MMU_TF_PROTECT_SEL(2, data);
> > > -   if (data->plat_data->m4u_plat == M4U_MT8173)
> > > +   if (m4u_plat == M4U_MT8173)
> > > regval |= F_MMU_PREFETCH_RT_REPLACE_MOD;
> > > writel_relaxed(regval, data->base + REG_MMU_CTRL_REG);
> > >
> > > @@ -541,14 +543,14 @@ static int mtk_iommu_hw_init(const struct 
> > > mtk_iommu_data *data)
> > > F_INT_PRETETCH_TRANSATION_FIFO_FAULT;
> > > writel_relaxed(regval, data->base + REG_MMU_INT_MAIN_CONTROL);
> > >
> > > -   if (data->plat_data->m4u_plat == M4U_MT8173)
> > > +   if (m4u_plat == M4U_MT8173)
> > >   

Re: [PATCH v4 13/18] memory: mtk-smi: Add bus_sel for mt8183

2018-12-20 Thread Nicolas Boichat
On Sat, Dec 8, 2018 at 4:43 PM Yong Wu  wrote:
>
> There are 2 mmu cells in a M4U HW. we could adjust some larbs entering
> mmu0 or mmu1 to balance the bandwidth via the smi-common register
> SMI_BUS_SEL(0x220)(Each larb occupy 2 bits).
>
> In mt8183, For better performance, we switch larb1/2/5/7 to enter
> mmu1 while the others still keep enter mmu0.
>
> In mt8173 and mt2712, we don't get the performance issue,
> Keep its default value(0x0), that means all the larbs enter mmu0.
>
> Signed-off-by: Yong Wu 
> ---
>  drivers/memory/mtk-smi.c | 22 --
>  1 file changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
> index ee6165e..88eb61a 100644
> --- a/drivers/memory/mtk-smi.c
> +++ b/drivers/memory/mtk-smi.c
> @@ -49,6 +49,12 @@
>  #define SMI_LARB_NONSEC_CON(id)(0x380 + ((id) * 4))
>  #define F_MMU_EN   BIT(0)
>
> +/* SMI COMMON */
> +#define SMI_BUS_SEL0x220
> +#define SMI_BUS_LARB_SHIFT(larbid) ((larbid) << 1)
> +/* All are MMU0 defaultly. Only specialize mmu1 here. */
> +#define F_MMU1_LARB(larbid)(0x1 << SMI_BUS_LARB_SHIFT(larbid))
> +
>  enum mtk_smi_gen {
> MTK_SMI_GEN1,
> MTK_SMI_GEN2
> @@ -57,6 +63,7 @@ enum mtk_smi_gen {
>  struct mtk_smi_common_plat {
> enum mtk_smi_gen gen;
> bool has_gals;
> +   u32  bus_sel; /* Balance some larbs to enter mmu0 or mmu1 
> */
>  };
>
>  struct mtk_smi_larb_gen {
> @@ -72,8 +79,8 @@ struct mtk_smi {
> struct clk  *clk_apb, *clk_smi;
> struct clk  *clk_gals0, *clk_gals1;
> struct clk  *clk_async; /*only needed by mt2701*/
> -   void __iomem*smi_ao_base;
> -
> +   void __iomem*smi_ao_base; /* only for gen1 */
> +   void __iomem*base;/* only for gen2 */
> const struct mtk_smi_common_plat *plat;
>  };
>
> @@ -409,6 +416,8 @@ static int __maybe_unused mtk_smi_larb_suspend(struct 
> device *dev)
>  static const struct mtk_smi_common_plat mtk_smi_common_mt8183 = {
> .gen  = MTK_SMI_GEN2,
> .has_gals = true,
> +   .bus_sel  = F_MMU1_LARB(1) | F_MMU1_LARB(2) | F_MMU1_LARB(5) |
> +   F_MMU1_LARB(7),

Maybe it's ok for now, but I wonder if this is something that should
be specified in device tree? Maybe different applications will want
different larb split between MMU0 and MMU1?

>  };
>
>  static const struct of_device_id mtk_smi_common_of_ids[] = {
> @@ -481,6 +490,11 @@ static int mtk_smi_common_probe(struct platform_device 
> *pdev)
> ret = clk_prepare_enable(common->clk_async);
> if (ret)
> return ret;
> +   } else {
> +   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +   common->base = devm_ioremap_resource(dev, res);
> +   if (IS_ERR(common->base))
> +   return PTR_ERR(common->base);
> }
> pm_runtime_enable(dev);
> platform_set_drvdata(pdev, common);
> @@ -496,6 +510,7 @@ static int mtk_smi_common_remove(struct platform_device 
> *pdev)
>  static int __maybe_unused mtk_smi_common_resume(struct device *dev)
>  {
> struct mtk_smi *common = dev_get_drvdata(dev);
> +   u32 bus_sel = common->plat->bus_sel;
> int ret;
>
> ret = mtk_smi_clk_enable(common);
> @@ -503,6 +518,9 @@ static int __maybe_unused mtk_smi_common_resume(struct 
> device *dev)
> dev_err(common->dev, "Failed to enable clock(%d).\n", ret);
> return ret;
> }
> +
> +   if (common->plat->gen == MTK_SMI_GEN2 && bus_sel)
> +   writel(bus_sel, common->base + SMI_BUS_SEL);
> return 0;
>  }
>
> --
> 1.9.1
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 10/18] iommu/mediatek: Add mt8183 IOMMU support

2018-12-20 Thread Nicolas Boichat
On Sat, Dec 8, 2018 at 4:42 PM Yong Wu  wrote:
>
> The M4U IP blocks in mt8183 is MediaTek's generation2 M4U which use
> the ARM Short-descriptor like mt8173, and most of the HW registers
> are the same.
>
> Here list main differences between mt8183 and mt8173/mt2712:
> 1) mt8183 has only one M4U HW like mt8173 while mt2712 has two.
> 2) mt8183 don't have the "bclk" clock, it use the EMI clock instead.
> 3) mt8183 can support the dram over 4GB, but it doesn't call this "4GB
> mode".
> 4) mt8183 pgtable base register(0x0) extend bit[1:0] which represent
> the bit[33:32] in the physical address of the pgtable base, But the
> standard ttbr0[1] means the S bit which is enabled defaultly, Hence,
> we add a mask.
> 5) mt8183 HW has a GALS modules, SMI should enable "has_gals" support.
> 6) the larb-id in smi-common is remapped. M4U should enable
> larbid_remapped support.
>
> Signed-off-by: Yong Wu 
> ---
>  drivers/iommu/mtk_iommu.c | 31 ++-
>  drivers/iommu/mtk_iommu.h |  1 +
>  drivers/memory/mtk-smi.c  | 19 +++
>  3 files changed, 42 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 8ab3b69..d91a554 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -36,6 +36,7 @@
>  #include "mtk_iommu.h"
>
>  #define REG_MMU_PT_BASE_ADDR   0x000
> +#define MMU_PT_ADDR_MASK   GENMASK(31, 7)
>
>  #define REG_MMU_INVALIDATE 0x020
>  #define F_ALL_INVLD0x2
> @@ -54,7 +55,7 @@
>  #define REG_MMU_CTRL_REG   0x110
>  #define F_MMU_PREFETCH_RT_REPLACE_MOD  BIT(4)
>  #define F_MMU_TF_PROTECT_SEL_SHIFT(data) \
> -   ((data)->plat_data->m4u_plat == M4U_MT2712 ? 4 : 5)
> +   ((data)->plat_data->m4u_plat == M4U_MT8173 ? 5 : 4)

Should the shift value be a member of plat_data instead?

>  /* It's named by F_MMU_TF_PROT_SEL in mt2712. */
>  #define F_MMU_TF_PROTECT_SEL(prot, data) \
> (((prot) & 0x3) << F_MMU_TF_PROTECT_SEL_SHIFT(data))
> @@ -347,7 +348,7 @@ static int mtk_iommu_attach_device(struct iommu_domain 
> *domain,
> /* Update the pgtable base address register of the M4U HW */
> if (!data->m4u_dom) {
> data->m4u_dom = dom;
> -   writel(dom->cfg.arm_v7s_cfg.ttbr[0],
> +   writel(dom->cfg.arm_v7s_cfg.ttbr[0] & MMU_PT_ADDR_MASK,
>data->base + REG_MMU_PT_BASE_ADDR);
> }
>
> @@ -510,6 +511,7 @@ static int mtk_iommu_of_xlate(struct device *dev, struct 
> of_phandle_args *args)
>
>  static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
>  {
> +   enum mtk_iommu_plat m4u_plat = data->plat_data->m4u_plat;
> u32 regval;
> int ret;
>
> @@ -520,7 +522,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data 
> *data)
> }
>
> regval = F_MMU_TF_PROTECT_SEL(2, data);
> -   if (data->plat_data->m4u_plat == M4U_MT8173)
> +   if (m4u_plat == M4U_MT8173)
> regval |= F_MMU_PREFETCH_RT_REPLACE_MOD;
> writel_relaxed(regval, data->base + REG_MMU_CTRL_REG);
>
> @@ -541,14 +543,14 @@ static int mtk_iommu_hw_init(const struct 
> mtk_iommu_data *data)
> F_INT_PRETETCH_TRANSATION_FIFO_FAULT;
> writel_relaxed(regval, data->base + REG_MMU_INT_MAIN_CONTROL);
>
> -   if (data->plat_data->m4u_plat == M4U_MT8173)
> +   if (m4u_plat == M4U_MT8173)
> regval = (data->protect_base >> 1) | (data->enable_4GB << 31);
> else
> regval = lower_32_bits(data->protect_base) |
>  upper_32_bits(data->protect_base);
> writel_relaxed(regval, data->base + REG_MMU_IVRP_PADDR);
>
> -   if (data->enable_4GB && data->plat_data->m4u_plat != M4U_MT8173) {
> +   if (data->enable_4GB && m4u_plat == M4U_MT2712) {
> /*
>  * If 4GB mode is enabled, the validate PA range is from
>  * 0x1__ to 0x1__. here record bit[32:30].
> @@ -558,8 +560,11 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data 
> *data)
> }
> writel_relaxed(0, data->base + REG_MMU_DCM_DIS);
>
> -   /* It's MISC control register whose default value is ok except 
> mt8173.*/
> -   if (data->plat_data->m4u_plat == M4U_MT8173)
> +   /*
> +* It's MISC control register whose default value is ok
> +* except mt8173 and mt8183.
> +*/
> +   if (m4u_plat == M4U_MT8173 || m4u_plat == M4U_MT8183)

Again, should this be a field in plat_data?

> writel_relaxed(0, data->base + REG_MMU_STANDARD_AXI_MODE);
>
> if (devm_request_irq(data->dev, data->irq, mtk_iommu_isr, 0,
> @@ -713,6 +718,7 @@ static int __maybe_unused mtk_iommu_resume(struct device 
> *dev)
>  {
> struct mtk_iommu_data *data = dev_get_drvdata(dev);
> struct mtk_iommu_suspend_reg 

Re: [PATCH v4 08/18] iommu/mediatek: Add larb-id remapped support

2018-12-20 Thread Nicolas Boichat
On Sat, Dec 8, 2018 at 4:42 PM Yong Wu  wrote:
>
> The larb-id may be remapped in the smi-common, this means the
> larb-id reported in the mtk_iommu_isr isn't the real larb-id,
>
> Take mt8183 as a example:
>M4U
> |
> -
> |   SMI common  |
> -0-7-5-6-1-2--3-4- <- Id remapped
>  | | | | | |  | |
> larb0 larb1 IPU0  IPU1 larb4 larb5  larb6  CCU
> disp  vdec  img   cam   venc  imgcam
> As above, larb0 connects with the id 0 in smi-common.
>   larb1 connects with the id 7 in smi-common.
>   ...
> If the larb-id reported in the isr is 7, actually it's larb1(vdec).
> In order to output the right larb-id in the isr, we add a larb-id
> remapping relationship in this patch.
>
> This also is a preparing patch for mt8183.
>
> Signed-off-by: Yong Wu 
> ---
>  drivers/iommu/mtk_iommu.c | 3 +++
>  drivers/iommu/mtk_iommu.h | 4 
>  2 files changed, 7 insertions(+)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index eda062a..8ab3b69 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -220,6 +220,9 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
> fault_larb = F_MMU0_INT_ID_LARB_ID(regval);
> fault_port = F_MMU0_INT_ID_PORT_ID(regval);
>
> +   if (data->plat_data->larbid_remap_enable)
> +   fault_larb = data->plat_data->larbid_remapped[fault_larb];
> +
> if (report_iommu_fault(>domain, data->dev, fault_iova,
>write ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ)) 
> {
> dev_err_ratelimited(
> diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> index b8749ac..3877050 100644
> --- a/drivers/iommu/mtk_iommu.h
> +++ b/drivers/iommu/mtk_iommu.h
> @@ -47,6 +47,10 @@ struct mtk_iommu_plat_data {
>
> /* HW will use the EMI clock if there isn't the "bclk". */
> boolhas_bclk;
> +
> +   /* The larb-id may be remapped in the smi-common. */
> +   boollarbid_remap_enable;
> +   unsigned intlarbid_remapped[MTK_LARB_NR_MAX];

Wouldn't it be a little simpler if you just had
larbid_remap[MTK_LARB_NR_MAX] (no larbid_remap_enable), and just set
it to {0, 1, 2, 3, 4, 5, 6, 7} in platforms that don't need
complicated remapping?

Also, unsigned char/u8 array would be enough.

>  };
>
>  struct mtk_iommu_domain;
> --
> 1.9.1
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v6 2/3] iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging

2018-12-09 Thread Nicolas Boichat
IOMMUs using ARMv7 short-descriptor format require page tables
(level 1 and 2) to be allocated within the first 4GB of RAM, even
on 64-bit systems.

For level 1/2 pages, ensure GFP_DMA32 is used if CONFIG_ZONE_DMA32
is defined (e.g. on arm64 platforms).

For level 2 pages, allocate a slab cache in SLAB_CACHE_DMA32. Note
that we do not explicitly pass GFP_DMA[32] to kmem_cache_zalloc,
as this is not strictly necessary, and would cause a warning
in mm/sl*b.c, as we did not update GFP_SLAB_BUG_MASK.

Also, print an error when the physical address does not fit in
32-bit, to make debugging easier in the future.

Cc: sta...@vger.kernel.org
Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
Signed-off-by: Nicolas Boichat 
---

Changes since v2:
 - Commit message

(v3 used the page_frag approach)

Changes since v4:
 - Do not pass ARM_V7S_TABLE_GFP_DMA to kmem_cache_zalloc, as this
   is unnecessary, and would trigger a warning.

Changes since v5:
 - Rename ARM_V7S_TABLE_SLAB_CACHE to ARM_V7S_TABLE_SLAB_FLAGS.

 drivers/iommu/io-pgtable-arm-v7s.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 445c3bde04800c..d2fdb192f7610f 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -161,6 +161,14 @@
 
 #define ARM_V7S_TCR_PD1BIT(5)
 
+#ifdef CONFIG_ZONE_DMA32
+#define ARM_V7S_TABLE_GFP_DMA GFP_DMA32
+#define ARM_V7S_TABLE_SLAB_FLAGS SLAB_CACHE_DMA32
+#else
+#define ARM_V7S_TABLE_GFP_DMA GFP_DMA
+#define ARM_V7S_TABLE_SLAB_FLAGS SLAB_CACHE_DMA
+#endif
+
 typedef u32 arm_v7s_iopte;
 
 static bool selftest_running;
@@ -198,13 +206,16 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
void *table = NULL;
 
if (lvl == 1)
-   table = (void *)__get_dma_pages(__GFP_ZERO, get_order(size));
+   table = (void *)__get_free_pages(
+   __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA, get_order(size));
else if (lvl == 2)
-   table = kmem_cache_zalloc(data->l2_tables, gfp | GFP_DMA);
+   table = kmem_cache_zalloc(data->l2_tables, gfp);
phys = virt_to_phys(table);
-   if (phys != (arm_v7s_iopte)phys)
+   if (phys != (arm_v7s_iopte)phys) {
/* Doesn't fit in PTE */
+   dev_err(dev, "Page table does not fit in PTE: %pa", );
goto out_free;
+   }
if (table && !(cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA)) {
dma = dma_map_single(dev, table, size, DMA_TO_DEVICE);
if (dma_mapping_error(dev, dma))
@@ -737,7 +748,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct 
io_pgtable_cfg *cfg,
data->l2_tables = kmem_cache_create("io-pgtable_armv7s_l2",
ARM_V7S_TABLE_SIZE(2),
ARM_V7S_TABLE_SIZE(2),
-   SLAB_CACHE_DMA, NULL);
+   ARM_V7S_TABLE_SLAB_FLAGS, NULL);
if (!data->l2_tables)
goto out_free_data;
 
-- 
2.20.0.rc2.403.gdbc3b29805-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v6 1/3] mm: Add support for kmem caches in DMA32 zone

2018-12-09 Thread Nicolas Boichat
IOMMUs using ARMv7 short-descriptor format require page tables
to be allocated within the first 4GB of RAM, even on 64-bit systems.
On arm64, this is done by passing GFP_DMA32 flag to memory allocation
functions.

For IOMMU L2 tables that only take 1KB, it would be a waste to allocate
a full page using get_free_pages, so we considered 3 approaches:
 1. This patch, adding support for GFP_DMA32 slab caches.
 2. genalloc, which requires pre-allocating the maximum number of L2
page tables (4096, so 4MB of memory).
 3. page_frag, which is not very memory-efficient as it is unable
to reuse freed fragments until the whole page is freed.

This change makes it possible to create a custom cache in DMA32 zone
using kmem_cache_create, then allocate memory using kmem_cache_alloc.

We do not create a DMA32 kmalloc cache array, as there are currently
no users of kmalloc(..., GFP_DMA32). These calls will continue to
trigger a warning, as we keep GFP_DMA32 in GFP_SLAB_BUG_MASK.

This implies that calls to kmem_cache_*alloc on a SLAB_CACHE_DMA32
kmem_cache must _not_ use GFP_DMA32 (it is anyway redundant and
unnecessary).

Cc: sta...@vger.kernel.org
Signed-off-by: Nicolas Boichat 
Acked-by: Vlastimil Babka 
---

Changes since v2:
 - Clarified commit message
 - Add entry in sysfs-kernel-slab to document the new sysfs file

(v3 used the page_frag approach)

Changes since v4:
 - Added details to commit message
 - Dropped change that removed GFP_DMA32 from GFP_SLAB_BUG_MASK:
   instead we can just call kmem_cache_*alloc without GFP_DMA32
   parameter. This also means that we can drop PATCH 1/3, as we
   do not make any changes in GFP flag verification.
 - Dropped hunks that added cache_dma32 sysfs file, and moved
   the hunks to PATCH 3/3, so that maintainer can decide whether
   to pick the change independently.

(no change since v5)

 include/linux/slab.h | 2 ++
 mm/slab.c| 2 ++
 mm/slab.h| 3 ++-
 mm/slab_common.c | 2 +-
 mm/slub.c| 5 +
 5 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 11b45f7ae4057c..9449b19c5f107a 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -32,6 +32,8 @@
 #define SLAB_HWCACHE_ALIGN ((slab_flags_t __force)0x2000U)
 /* Use GFP_DMA memory */
 #define SLAB_CACHE_DMA ((slab_flags_t __force)0x4000U)
+/* Use GFP_DMA32 memory */
+#define SLAB_CACHE_DMA32   ((slab_flags_t __force)0x8000U)
 /* DEBUG: Store the last owner for bug hunting */
 #define SLAB_STORE_USER((slab_flags_t __force)0x0001U)
 /* Panic if kmem_cache_create() fails */
diff --git a/mm/slab.c b/mm/slab.c
index 73fe23e649c91a..124f8c556d27fb 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2109,6 +2109,8 @@ int __kmem_cache_create(struct kmem_cache *cachep, 
slab_flags_t flags)
cachep->allocflags = __GFP_COMP;
if (flags & SLAB_CACHE_DMA)
cachep->allocflags |= GFP_DMA;
+   if (flags & SLAB_CACHE_DMA32)
+   cachep->allocflags |= GFP_DMA32;
if (flags & SLAB_RECLAIM_ACCOUNT)
cachep->allocflags |= __GFP_RECLAIMABLE;
cachep->size = size;
diff --git a/mm/slab.h b/mm/slab.h
index 4190c24ef0e9df..fcf717e12f0a86 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -127,7 +127,8 @@ static inline slab_flags_t kmem_cache_flags(unsigned int 
object_size,
 
 
 /* Legal flag mask for kmem_cache_create(), for various configurations */
-#define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | SLAB_PANIC | \
+#define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | \
+SLAB_CACHE_DMA32 | SLAB_PANIC | \
 SLAB_TYPESAFE_BY_RCU | SLAB_DEBUG_OBJECTS )
 
 #if defined(CONFIG_DEBUG_SLAB)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 70b0cc85db67f8..18b7b809c8d064 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -53,7 +53,7 @@ static DECLARE_WORK(slab_caches_to_rcu_destroy_work,
SLAB_FAILSLAB | SLAB_KASAN)
 
 #define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | \
-SLAB_ACCOUNT)
+SLAB_CACHE_DMA32 | SLAB_ACCOUNT)
 
 /*
  * Merge control. If this is set then no merging of slab caches will occur.
diff --git a/mm/slub.c b/mm/slub.c
index c229a9b7dd5448..4caadb926838ef 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3583,6 +3583,9 @@ static int calculate_sizes(struct kmem_cache *s, int 
forced_order)
if (s->flags & SLAB_CACHE_DMA)
s->allocflags |= GFP_DMA;
 
+   if (s->flags & SLAB_CACHE_DMA32)
+   s->allocflags |= GFP_DMA32;
+
if (s->flags & SLAB_RECLAIM_ACCOUNT)
s->allocflags |= __GFP_RECLAIMABLE;
 
@@ -5671,6 +5674,8 @@ static char *create_unique_id(struct kmem_cache *s)
 */
if (s->flags & SLAB_CACHE_DMA)
*p++ = 'd';
+   if (s->

[PATCH v6 3/3] mm: Add /sys/kernel/slab/cache/cache_dma32

2018-12-09 Thread Nicolas Boichat
A previous patch in this series adds support for SLAB_CACHE_DMA32
kmem caches. This adds the corresponding
/sys/kernel/slab/cache/cache_dma32 entries, and fixes slabinfo
tool.

Cc: sta...@vger.kernel.org
Signed-off-by: Nicolas Boichat 
---

There were different opinions on whether this sysfs entry should
be added, so I'll leave it up to the mm/slub maintainers to decide
whether they want to pick this up, or drop it.

No change since v5.

 Documentation/ABI/testing/sysfs-kernel-slab |  9 +
 mm/slub.c   | 11 +++
 tools/vm/slabinfo.c |  7 ++-
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-kernel-slab 
b/Documentation/ABI/testing/sysfs-kernel-slab
index 29601d93a1c2ea..d742c6cfdffbe9 100644
--- a/Documentation/ABI/testing/sysfs-kernel-slab
+++ b/Documentation/ABI/testing/sysfs-kernel-slab
@@ -106,6 +106,15 @@ Description:
are from ZONE_DMA.
Available when CONFIG_ZONE_DMA is enabled.
 
+What:  /sys/kernel/slab/cache/cache_dma32
+Date:  December 2018
+KernelVersion: 4.21
+Contact:   Nicolas Boichat 
+Description:
+   The cache_dma32 file is read-only and specifies whether objects
+   are from ZONE_DMA32.
+   Available when CONFIG_ZONE_DMA32 is enabled.
+
 What:  /sys/kernel/slab/cache/cpu_slabs
 Date:  May 2007
 KernelVersion: 2.6.22
diff --git a/mm/slub.c b/mm/slub.c
index 4caadb926838ef..840f3719d9d543 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5104,6 +5104,14 @@ static ssize_t cache_dma_show(struct kmem_cache *s, char 
*buf)
 SLAB_ATTR_RO(cache_dma);
 #endif
 
+#ifdef CONFIG_ZONE_DMA32
+static ssize_t cache_dma32_show(struct kmem_cache *s, char *buf)
+{
+   return sprintf(buf, "%d\n", !!(s->flags & SLAB_CACHE_DMA32));
+}
+SLAB_ATTR_RO(cache_dma32);
+#endif
+
 static ssize_t usersize_show(struct kmem_cache *s, char *buf)
 {
return sprintf(buf, "%u\n", s->usersize);
@@ -5444,6 +5452,9 @@ static struct attribute *slab_attrs[] = {
 #ifdef CONFIG_ZONE_DMA
_dma_attr.attr,
 #endif
+#ifdef CONFIG_ZONE_DMA32
+   _dma32_attr.attr,
+#endif
 #ifdef CONFIG_NUMA
_node_defrag_ratio_attr.attr,
 #endif
diff --git a/tools/vm/slabinfo.c b/tools/vm/slabinfo.c
index 334b16db0ebbe9..4ee1bf6e498dfa 100644
--- a/tools/vm/slabinfo.c
+++ b/tools/vm/slabinfo.c
@@ -29,7 +29,7 @@ struct slabinfo {
char *name;
int alias;
int refs;
-   int aliases, align, cache_dma, cpu_slabs, destroy_by_rcu;
+   int aliases, align, cache_dma, cache_dma32, cpu_slabs, destroy_by_rcu;
unsigned int hwcache_align, object_size, objs_per_slab;
unsigned int sanity_checks, slab_size, store_user, trace;
int order, poison, reclaim_account, red_zone;
@@ -531,6 +531,8 @@ static void report(struct slabinfo *s)
printf("** Hardware cacheline aligned\n");
if (s->cache_dma)
printf("** Memory is allocated in a special DMA zone\n");
+   if (s->cache_dma32)
+   printf("** Memory is allocated in a special DMA32 zone\n");
if (s->destroy_by_rcu)
printf("** Slabs are destroyed via RCU\n");
if (s->reclaim_account)
@@ -599,6 +601,8 @@ static void slabcache(struct slabinfo *s)
*p++ = '*';
if (s->cache_dma)
*p++ = 'd';
+   if (s->cache_dma32)
+   *p++ = 'D';
if (s->hwcache_align)
*p++ = 'A';
if (s->poison)
@@ -1205,6 +1209,7 @@ static void read_slab_dir(void)
slab->aliases = get_obj("aliases");
slab->align = get_obj("align");
slab->cache_dma = get_obj("cache_dma");
+   slab->cache_dma32 = get_obj("cache_dma32");
slab->cpu_slabs = get_obj("cpu_slabs");
slab->destroy_by_rcu = get_obj("destroy_by_rcu");
slab->hwcache_align = get_obj("hwcache_align");
-- 
2.20.0.rc2.403.gdbc3b29805-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v6 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables

2018-12-09 Thread Nicolas Boichat
This is a follow-up to the discussion in [1], [2].

IOMMUs using ARMv7 short-descriptor format require page tables
(level 1 and 2) to be allocated within the first 4GB of RAM, even
on 64-bit systems.

For L1 tables that are bigger than a page, we can just use __get_free_pages
with GFP_DMA32 (on arm64 systems only, arm would still use GFP_DMA).

For L2 tables that only take 1KB, it would be a waste to allocate a full
page, so we considered 3 approaches:
 1. This series, adding support for GFP_DMA32 slab caches.
 2. genalloc, which requires pre-allocating the maximum number of L2 page
tables (4096, so 4MB of memory).
 3. page_frag, which is not very memory-efficient as it is unable to reuse
freed fragments until the whole page is freed. [3]

This series is the most memory-efficient approach.

stable@ note:
  We confirmed that this is a regression, and IOMMU errors happen on 4.19
  and linux-next/master on MT8173 (elm, Acer Chromebook R13). The issue
  most likely starts from commit ad67f5a6545f ("arm64: replace ZONE_DMA
  with ZONE_DMA32"), i.e. 4.15, and presumably breaks a number of Mediatek
  platforms (and maybe others?).

[1] https://lists.linuxfoundation.org/pipermail/iommu/2018-November/030876.html
[2] https://lists.linuxfoundation.org/pipermail/iommu/2018-December/031696.html
[3] https://patchwork.codeaurora.org/patch/671639/

Changes since v1:
 - Add support for SLAB_CACHE_DMA32 in slab and slub (patches 1/2)
 - iommu/io-pgtable-arm-v7s (patch 3):
   - Changed approach to use SLAB_CACHE_DMA32 added by the previous
 commit.
   - Use DMA or DMA32 depending on the architecture (DMA for arm,
 DMA32 for arm64).

Changes since v2:
 - Reworded and expanded commit messages
 - Added cache_dma32 documentation in PATCH 2/3.

v3 used the page_frag approach, see [3].

Changes since v4:
 - Dropped change that removed GFP_DMA32 from GFP_SLAB_BUG_MASK:
   instead we can just call kmem_cache_*alloc without GFP_DMA32
   parameter. This also means that we can drop PATCH v4 1/3, as we
   do not make any changes in GFP flag verification.
 - Dropped hunks that added cache_dma32 sysfs file, and moved
   the hunks to PATCH v5 3/3, so that maintainer can decide whether
   to pick the change independently.

Changes since v5:
 - Rename ARM_V7S_TABLE_SLAB_CACHE to ARM_V7S_TABLE_SLAB_FLAGS.
 - Add stable@ to cc.

Nicolas Boichat (3):
  mm: Add support for kmem caches in DMA32 zone
  iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging
  mm: Add /sys/kernel/slab/cache/cache_dma32

 Documentation/ABI/testing/sysfs-kernel-slab |  9 +
 drivers/iommu/io-pgtable-arm-v7s.c  | 19 +++
 include/linux/slab.h|  2 ++
 mm/slab.c   |  2 ++
 mm/slab.h   |  3 ++-
 mm/slab_common.c|  2 +-
 mm/slub.c   | 16 
 tools/vm/slabinfo.c |  7 ++-
 8 files changed, 53 insertions(+), 7 deletions(-)

-- 
2.20.0.rc2.403.gdbc3b29805-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 2/3] iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging

2018-12-07 Thread Nicolas Boichat
On Fri, Dec 7, 2018 at 4:05 PM Matthew Wilcox  wrote:
>
> On Fri, Dec 07, 2018 at 02:16:19PM +0800, Nicolas Boichat wrote:
> > +#ifdef CONFIG_ZONE_DMA32
> > +#define ARM_V7S_TABLE_GFP_DMA GFP_DMA32
> > +#define ARM_V7S_TABLE_SLAB_CACHE SLAB_CACHE_DMA32
>
> This name doesn't make any sense.  Why not ARM_V7S_TABLE_SLAB_FLAGS ?

Sure, will fix in v6 then.

> > +#else
> > +#define ARM_V7S_TABLE_GFP_DMA GFP_DMA
> > +#define ARM_V7S_TABLE_SLAB_CACHE SLAB_CACHE_DMA
>
> Can you remind me again why it is, on machines which don't support
> ZONE_DMA32, why we have to allocate from ZONE_DMA?  My understanding
> is that 64-bit machines have ZONE_DMA32 and 32-bit machines don't.
> So shouldn't this rather be GFP_KERNEL?

Sorry I mean to reply on the v4 thread, Christoph raised the same
question (https://patchwork.kernel.org/patch/10713025/).

I don't know, and I don't have all the hardware needed to test this
,-( Robin and Will both didn't seem sure.

I'd rather not introduce a new regression, this patch series tries to
fix a known arm64 regression, where we _need_ tables to be in DMA32.
If we want to change 32-bit hardware to use GFP_KERNEL, and we're sure
it works, that's fine by me, but it should be in another patch set.

Hope this makes sense,

Thanks,

> Actually, maybe we could centralise this in gfp.h:
>
> #ifdef CONFIG_64BIT
> # ifdef CONFIG_ZONE_DMA32
> #define GFP_32BIT   GFP_DMA32
> # else
> #define GFP_32BIT   GFP_DMA
> #else /* 32-bit */
> #define GFP_32BIT   GFP_KERNEL
> #endif
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v5 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables

2018-12-06 Thread Nicolas Boichat
This is a follow-up to the discussion in [1], [2].

IOMMUs using ARMv7 short-descriptor format require page tables
(level 1 and 2) to be allocated within the first 4GB of RAM, even
on 64-bit systems.

For L1 tables that are bigger than a page, we can just use __get_free_pages
with GFP_DMA32 (on arm64 systems only, arm would still use GFP_DMA).

For L2 tables that only take 1KB, it would be a waste to allocate a full
page, so we considered 3 approaches:
 1. This series, adding support for GFP_DMA32 slab caches.
 2. genalloc, which requires pre-allocating the maximum number of L2 page
tables (4096, so 4MB of memory).
 3. page_frag, which is not very memory-efficient as it is unable to reuse
freed fragments until the whole page is freed. [3]

This series is the most memory-efficient approach.

[1] https://lists.linuxfoundation.org/pipermail/iommu/2018-November/030876.html
[2] https://lists.linuxfoundation.org/pipermail/iommu/2018-December/031696.html
[3] https://patchwork.codeaurora.org/patch/671639/

Changes since v1:
 - Add support for SLAB_CACHE_DMA32 in slab and slub (patches 1/2)
 - iommu/io-pgtable-arm-v7s (patch 3):
   - Changed approach to use SLAB_CACHE_DMA32 added by the previous
 commit.
   - Use DMA or DMA32 depending on the architecture (DMA for arm,
 DMA32 for arm64).

Changes since v2:
 - Reworded and expanded commit messages
 - Added cache_dma32 documentation in PATCH 2/3.

v3 used the page_frag approach, see [3].

Changes since v4:
 - Dropped change that removed GFP_DMA32 from GFP_SLAB_BUG_MASK:
   instead we can just call kmem_cache_*alloc without GFP_DMA32
   parameter. This also means that we can drop PATCH v4 1/3, as we
   do not make any changes in GFP flag verification.
 - Dropped hunks that added cache_dma32 sysfs file, and moved
   the hunks to PATCH v5 3/3, so that maintainer can decide whether
   to pick the change independently.

Nicolas Boichat (3):
  mm: Add support for kmem caches in DMA32 zone
  iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging
  mm: Add /sys/kernel/slab/cache/cache_dma32

 Documentation/ABI/testing/sysfs-kernel-slab |  9 +
 drivers/iommu/io-pgtable-arm-v7s.c  | 19 +++
 include/linux/slab.h|  2 ++
 mm/slab.c   |  2 ++
 mm/slab.h   |  3 ++-
 mm/slab_common.c|  2 +-
 mm/slub.c   | 16 
 tools/vm/slabinfo.c |  7 ++-
 8 files changed, 53 insertions(+), 7 deletions(-)

-- 
2.20.0.rc2.403.gdbc3b29805-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v5 2/3] iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging

2018-12-06 Thread Nicolas Boichat
IOMMUs using ARMv7 short-descriptor format require page tables
(level 1 and 2) to be allocated within the first 4GB of RAM, even
on 64-bit systems.

For level 1/2 pages, ensure GFP_DMA32 is used if CONFIG_ZONE_DMA32
is defined (e.g. on arm64 platforms).

For level 2 pages, allocate a slab cache in SLAB_CACHE_DMA32. Note
that we do not explicitly pass GFP_DMA[32] to kmem_cache_zalloc,
as this is not strictly necessary, and would cause a warning
in mm/sl*b.c, as we did not update GFP_SLAB_BUG_MASK.

Also, print an error when the physical address does not fit in
32-bit, to make debugging easier in the future.

Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
Signed-off-by: Nicolas Boichat 
---

Changes since v2:
 - Commit message

(v3 used the page_frag approach)

Changes since v4:
 - Do not pass ARM_V7S_TABLE_GFP_DMA to kmem_cache_zalloc, as this
   is unnecessary, and would trigger a warning.

 drivers/iommu/io-pgtable-arm-v7s.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 445c3bde04800c..f311fe1dfa47b7 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -161,6 +161,14 @@
 
 #define ARM_V7S_TCR_PD1BIT(5)
 
+#ifdef CONFIG_ZONE_DMA32
+#define ARM_V7S_TABLE_GFP_DMA GFP_DMA32
+#define ARM_V7S_TABLE_SLAB_CACHE SLAB_CACHE_DMA32
+#else
+#define ARM_V7S_TABLE_GFP_DMA GFP_DMA
+#define ARM_V7S_TABLE_SLAB_CACHE SLAB_CACHE_DMA
+#endif
+
 typedef u32 arm_v7s_iopte;
 
 static bool selftest_running;
@@ -198,13 +206,16 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
void *table = NULL;
 
if (lvl == 1)
-   table = (void *)__get_dma_pages(__GFP_ZERO, get_order(size));
+   table = (void *)__get_free_pages(
+   __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA, get_order(size));
else if (lvl == 2)
-   table = kmem_cache_zalloc(data->l2_tables, gfp | GFP_DMA);
+   table = kmem_cache_zalloc(data->l2_tables, gfp);
phys = virt_to_phys(table);
-   if (phys != (arm_v7s_iopte)phys)
+   if (phys != (arm_v7s_iopte)phys) {
/* Doesn't fit in PTE */
+   dev_err(dev, "Page table does not fit in PTE: %pa", );
goto out_free;
+   }
if (table && !(cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA)) {
dma = dma_map_single(dev, table, size, DMA_TO_DEVICE);
if (dma_mapping_error(dev, dma))
@@ -737,7 +748,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct 
io_pgtable_cfg *cfg,
data->l2_tables = kmem_cache_create("io-pgtable_armv7s_l2",
ARM_V7S_TABLE_SIZE(2),
ARM_V7S_TABLE_SIZE(2),
-   SLAB_CACHE_DMA, NULL);
+   ARM_V7S_TABLE_SLAB_CACHE, NULL);
if (!data->l2_tables)
goto out_free_data;
 
-- 
2.20.0.rc2.403.gdbc3b29805-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v5 1/3] mm: Add support for kmem caches in DMA32 zone

2018-12-06 Thread Nicolas Boichat
IOMMUs using ARMv7 short-descriptor format require page tables
to be allocated within the first 4GB of RAM, even on 64-bit systems.
On arm64, this is done by passing GFP_DMA32 flag to memory allocation
functions.

For IOMMU L2 tables that only take 1KB, it would be a waste to allocate
a full page using get_free_pages, so we considered 3 approaches:
 1. This patch, adding support for GFP_DMA32 slab caches.
 2. genalloc, which requires pre-allocating the maximum number of L2
page tables (4096, so 4MB of memory).
 3. page_frag, which is not very memory-efficient as it is unable
to reuse freed fragments until the whole page is freed.

This change makes it possible to create a custom cache in DMA32 zone
using kmem_cache_create, then allocate memory using kmem_cache_alloc.

We do not create a DMA32 kmalloc cache array, as there are currently
no users of kmalloc(..., GFP_DMA32). These calls will continue to
trigger a warning, as we keep GFP_DMA32 in GFP_SLAB_BUG_MASK.

This implies that calls to kmem_cache_*alloc on a SLAB_CACHE_DMA32
kmem_cache must _not_ use GFP_DMA32 (it is anyway redundant and
unnecessary).

Signed-off-by: Nicolas Boichat 
---

Changes since v2:
 - Clarified commit message
 - Add entry in sysfs-kernel-slab to document the new sysfs file

(v3 used the page_frag approach)

Changes since v4:
 - Added details to commit message
 - Dropped change that removed GFP_DMA32 from GFP_SLAB_BUG_MASK:
   instead we can just call kmem_cache_*alloc without GFP_DMA32
   parameter. This also means that we can drop PATCH 1/3, as we
   do not make any changes in GFP flag verification.
 - Dropped hunks that added cache_dma32 sysfs file, and moved
   the hunks to PATCH 3/3, so that maintainer can decide whether
   to pick the change independently.

 include/linux/slab.h | 2 ++
 mm/slab.c| 2 ++
 mm/slab.h| 3 ++-
 mm/slab_common.c | 2 +-
 mm/slub.c| 5 +
 5 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 11b45f7ae4057c..9449b19c5f107a 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -32,6 +32,8 @@
 #define SLAB_HWCACHE_ALIGN ((slab_flags_t __force)0x2000U)
 /* Use GFP_DMA memory */
 #define SLAB_CACHE_DMA ((slab_flags_t __force)0x4000U)
+/* Use GFP_DMA32 memory */
+#define SLAB_CACHE_DMA32   ((slab_flags_t __force)0x8000U)
 /* DEBUG: Store the last owner for bug hunting */
 #define SLAB_STORE_USER((slab_flags_t __force)0x0001U)
 /* Panic if kmem_cache_create() fails */
diff --git a/mm/slab.c b/mm/slab.c
index 73fe23e649c91a..124f8c556d27fb 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2109,6 +2109,8 @@ int __kmem_cache_create(struct kmem_cache *cachep, 
slab_flags_t flags)
cachep->allocflags = __GFP_COMP;
if (flags & SLAB_CACHE_DMA)
cachep->allocflags |= GFP_DMA;
+   if (flags & SLAB_CACHE_DMA32)
+   cachep->allocflags |= GFP_DMA32;
if (flags & SLAB_RECLAIM_ACCOUNT)
cachep->allocflags |= __GFP_RECLAIMABLE;
cachep->size = size;
diff --git a/mm/slab.h b/mm/slab.h
index 4190c24ef0e9df..fcf717e12f0a86 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -127,7 +127,8 @@ static inline slab_flags_t kmem_cache_flags(unsigned int 
object_size,
 
 
 /* Legal flag mask for kmem_cache_create(), for various configurations */
-#define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | SLAB_PANIC | \
+#define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | \
+SLAB_CACHE_DMA32 | SLAB_PANIC | \
 SLAB_TYPESAFE_BY_RCU | SLAB_DEBUG_OBJECTS )
 
 #if defined(CONFIG_DEBUG_SLAB)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 70b0cc85db67f8..18b7b809c8d064 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -53,7 +53,7 @@ static DECLARE_WORK(slab_caches_to_rcu_destroy_work,
SLAB_FAILSLAB | SLAB_KASAN)
 
 #define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | \
-SLAB_ACCOUNT)
+SLAB_CACHE_DMA32 | SLAB_ACCOUNT)
 
 /*
  * Merge control. If this is set then no merging of slab caches will occur.
diff --git a/mm/slub.c b/mm/slub.c
index c229a9b7dd5448..4caadb926838ef 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3583,6 +3583,9 @@ static int calculate_sizes(struct kmem_cache *s, int 
forced_order)
if (s->flags & SLAB_CACHE_DMA)
s->allocflags |= GFP_DMA;
 
+   if (s->flags & SLAB_CACHE_DMA32)
+   s->allocflags |= GFP_DMA32;
+
if (s->flags & SLAB_RECLAIM_ACCOUNT)
s->allocflags |= __GFP_RECLAIMABLE;
 
@@ -5671,6 +5674,8 @@ static char *create_unique_id(struct kmem_cache *s)
 */
if (s->flags & SLAB_CACHE_DMA)
*p++ = 'd';
+   if (s->flags & SLAB_CACHE_DMA32)
+   *p++ = 'D';
if (s-&

[PATCH v5 3/3] mm: Add /sys/kernel/slab/cache/cache_dma32

2018-12-06 Thread Nicolas Boichat
A previous patch in this series adds support for SLAB_CACHE_DMA32
kmem caches. This adds the corresponding
/sys/kernel/slab/cache/cache_dma32 entries, and fixes slabinfo
tool.

Signed-off-by: Nicolas Boichat 
---

There were different opinions on whether this sysfs entry should
be added, so I'll leave it up to the mm/slub maintainers to decide
whether they want to pick this up, or drop it.

 Documentation/ABI/testing/sysfs-kernel-slab |  9 +
 mm/slub.c   | 11 +++
 tools/vm/slabinfo.c |  7 ++-
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-kernel-slab 
b/Documentation/ABI/testing/sysfs-kernel-slab
index 29601d93a1c2ea..d742c6cfdffbe9 100644
--- a/Documentation/ABI/testing/sysfs-kernel-slab
+++ b/Documentation/ABI/testing/sysfs-kernel-slab
@@ -106,6 +106,15 @@ Description:
are from ZONE_DMA.
Available when CONFIG_ZONE_DMA is enabled.
 
+What:  /sys/kernel/slab/cache/cache_dma32
+Date:  December 2018
+KernelVersion: 4.21
+Contact:   Nicolas Boichat 
+Description:
+   The cache_dma32 file is read-only and specifies whether objects
+   are from ZONE_DMA32.
+   Available when CONFIG_ZONE_DMA32 is enabled.
+
 What:  /sys/kernel/slab/cache/cpu_slabs
 Date:  May 2007
 KernelVersion: 2.6.22
diff --git a/mm/slub.c b/mm/slub.c
index 4caadb926838ef..840f3719d9d543 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5104,6 +5104,14 @@ static ssize_t cache_dma_show(struct kmem_cache *s, char 
*buf)
 SLAB_ATTR_RO(cache_dma);
 #endif
 
+#ifdef CONFIG_ZONE_DMA32
+static ssize_t cache_dma32_show(struct kmem_cache *s, char *buf)
+{
+   return sprintf(buf, "%d\n", !!(s->flags & SLAB_CACHE_DMA32));
+}
+SLAB_ATTR_RO(cache_dma32);
+#endif
+
 static ssize_t usersize_show(struct kmem_cache *s, char *buf)
 {
return sprintf(buf, "%u\n", s->usersize);
@@ -5444,6 +5452,9 @@ static struct attribute *slab_attrs[] = {
 #ifdef CONFIG_ZONE_DMA
_dma_attr.attr,
 #endif
+#ifdef CONFIG_ZONE_DMA32
+   _dma32_attr.attr,
+#endif
 #ifdef CONFIG_NUMA
_node_defrag_ratio_attr.attr,
 #endif
diff --git a/tools/vm/slabinfo.c b/tools/vm/slabinfo.c
index 334b16db0ebbe9..4ee1bf6e498dfa 100644
--- a/tools/vm/slabinfo.c
+++ b/tools/vm/slabinfo.c
@@ -29,7 +29,7 @@ struct slabinfo {
char *name;
int alias;
int refs;
-   int aliases, align, cache_dma, cpu_slabs, destroy_by_rcu;
+   int aliases, align, cache_dma, cache_dma32, cpu_slabs, destroy_by_rcu;
unsigned int hwcache_align, object_size, objs_per_slab;
unsigned int sanity_checks, slab_size, store_user, trace;
int order, poison, reclaim_account, red_zone;
@@ -531,6 +531,8 @@ static void report(struct slabinfo *s)
printf("** Hardware cacheline aligned\n");
if (s->cache_dma)
printf("** Memory is allocated in a special DMA zone\n");
+   if (s->cache_dma32)
+   printf("** Memory is allocated in a special DMA32 zone\n");
if (s->destroy_by_rcu)
printf("** Slabs are destroyed via RCU\n");
if (s->reclaim_account)
@@ -599,6 +601,8 @@ static void slabcache(struct slabinfo *s)
*p++ = '*';
if (s->cache_dma)
*p++ = 'd';
+   if (s->cache_dma32)
+   *p++ = 'D';
if (s->hwcache_align)
*p++ = 'A';
if (s->poison)
@@ -1205,6 +1209,7 @@ static void read_slab_dir(void)
slab->aliases = get_obj("aliases");
slab->align = get_obj("align");
slab->cache_dma = get_obj("cache_dma");
+   slab->cache_dma32 = get_obj("cache_dma32");
slab->cpu_slabs = get_obj("cpu_slabs");
slab->destroy_by_rcu = get_obj("destroy_by_rcu");
slab->hwcache_align = get_obj("hwcache_align");
-- 
2.20.0.rc2.403.gdbc3b29805-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

2018-12-06 Thread Nicolas Boichat
On Thu, Dec 6, 2018 at 5:37 PM Vlastimil Babka  wrote:
>
> On 12/6/18 4:49 AM, Nicolas Boichat wrote:
> >> So it would be fine even unchanged. The check would anyway need some
> >> more love to catch the same with __GFP_DMA to be consistent and cover
> >> all corner cases.
> > Yes, the test is not complete. If we really wanted this to be
> > accurate, we'd need to check that GFP_* exactly matches SLAB_CACHE_*.
> >
> > The only problem with dropping this is test that we should restore
> > GFP_DMA32 warning/errors somewhere else (as Christopher pointed out
> > here: https://lkml.org/lkml/2018/11/22/430), especially for kmalloc
> > case.
>
> I meant just dropping that patch hunk, not the whole test. Then the test
> stays as it is and will keep warning anyone calling kmalloc(GFP_DMA32).
> It would also warn anyone calling kmem_cache_alloc(GFP_DMA32) on
> SLAB_CACHE_DMA32 cache, but since the gfp can be just dropped, and you
> as the only user of this so far will do that, it's fine?

I missed your point, this would work fine indeed.

Thanks.

> > Maybe this can be done in kmalloc_slab.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

2018-12-05 Thread Nicolas Boichat
On Thu, Dec 6, 2018 at 11:32 AM Wei Yang  wrote:
>
> On Thu, Dec 06, 2018 at 08:41:36AM +0800, Nicolas Boichat wrote:
> >On Wed, Dec 5, 2018 at 8:18 PM Wei Yang  wrote:
> >>
> >> On Wed, Dec 05, 2018 at 03:39:51PM +0800, Nicolas Boichat wrote:
> >> >On Wed, Dec 5, 2018 at 3:25 PM Wei Yang  wrote:
> >> >>
> >> >> On Wed, Dec 05, 2018 at 01:48:27PM +0800, Nicolas Boichat wrote:
> >> >> >In some cases (e.g. IOMMU ARMv7s page allocator), we need to allocate
> >> >> >data structures smaller than a page with GFP_DMA32 flag.
> >> >> >
> >> >> >This change makes it possible to create a custom cache in DMA32 zone
> >> >> >using kmem_cache_create, then allocate memory using kmem_cache_alloc.
> >> >> >
> >> >> >We do not create a DMA32 kmalloc cache array, as there are currently
> >> >> >no users of kmalloc(..., GFP_DMA32). The new test in check_slab_flags
> >> >> >ensures that such calls still fail (as they do before this change).
> >> >> >
> >> >> >Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
> >> >> >Signed-off-by: Nicolas Boichat 
> >> >> >---
> >> >> >
> >> >> >Changes since v2:
> >> >> > - Clarified commit message
> >> >> > - Add entry in sysfs-kernel-slab to document the new sysfs file
> >> >> >
> >> >> >(v3 used the page_frag approach)
> >> >> >
> >> >> >Documentation/ABI/testing/sysfs-kernel-slab |  9 +
> >> >> > include/linux/slab.h|  2 ++
> >> >> > mm/internal.h   |  8 ++--
> >> >> > mm/slab.c   |  4 +++-
> >> >> > mm/slab.h   |  3 ++-
> >> >> > mm/slab_common.c|  2 +-
> >> >> > mm/slub.c   | 18 +-
> >> >> > 7 files changed, 40 insertions(+), 6 deletions(-)
> >> >> >
> >> >> >diff --git a/Documentation/ABI/testing/sysfs-kernel-slab 
> >> >> >b/Documentation/ABI/testing/sysfs-kernel-slab
> >> >> >index 29601d93a1c2ea..d742c6cfdffbe9 100644
> >> >> >--- a/Documentation/ABI/testing/sysfs-kernel-slab
> >> >> >+++ b/Documentation/ABI/testing/sysfs-kernel-slab
> >> >> >@@ -106,6 +106,15 @@ Description:
> >> >> >   are from ZONE_DMA.
> >> >> >   Available when CONFIG_ZONE_DMA is enabled.
> >> >> >
> >> >> >+What: /sys/kernel/slab/cache/cache_dma32
> >> >> >+Date: December 2018
> >> >> >+KernelVersion:4.21
> >> >> >+Contact:  Nicolas Boichat 
> >> >> >+Description:
> >> >> >+  The cache_dma32 file is read-only and specifies whether 
> >> >> >objects
> >> >> >+  are from ZONE_DMA32.
> >> >> >+  Available when CONFIG_ZONE_DMA32 is enabled.
> >> >> >+
> >> >> > What: /sys/kernel/slab/cache/cpu_slabs
> >> >> > Date: May 2007
> >> >> > KernelVersion:2.6.22
> >> >> >diff --git a/include/linux/slab.h b/include/linux/slab.h
> >> >> >index 11b45f7ae4057c..9449b19c5f107a 100644
> >> >> >--- a/include/linux/slab.h
> >> >> >+++ b/include/linux/slab.h
> >> >> >@@ -32,6 +32,8 @@
> >> >> > #define SLAB_HWCACHE_ALIGN((slab_flags_t __force)0x2000U)
> >> >> > /* Use GFP_DMA memory */
> >> >> > #define SLAB_CACHE_DMA((slab_flags_t 
> >> >> > __force)0x4000U)
> >> >> >+/* Use GFP_DMA32 memory */
> >> >> >+#define SLAB_CACHE_DMA32  ((slab_flags_t __force)0x8000U)
> >> >> > /* DEBUG: Store the last owner for bug hunting */
> >> >> > #define SLAB_STORE_USER   ((slab_flags_t 
> >> >> > __force)0x0001U)
> >> >> > /* Panic if kmem_cache_create() fails */
> >> >> >diff --git a/mm/internal.h b/mm/internal.h
> >> >> >index a2ee82a0cd44ae..fd244ad716eaf8 100644
> >> >

Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

2018-12-05 Thread Nicolas Boichat
On Wed, Dec 5, 2018 at 10:02 PM Vlastimil Babka  wrote:
>
> On 12/5/18 6:48 AM, Nicolas Boichat wrote:
> > In some cases (e.g. IOMMU ARMv7s page allocator), we need to allocate
> > data structures smaller than a page with GFP_DMA32 flag.
> >
> > This change makes it possible to create a custom cache in DMA32 zone
> > using kmem_cache_create, then allocate memory using kmem_cache_alloc.
> >
> > We do not create a DMA32 kmalloc cache array, as there are currently
> > no users of kmalloc(..., GFP_DMA32). The new test in check_slab_flags
> > ensures that such calls still fail (as they do before this change).
> >
> > Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
>
> Same as my comment for 1/3.

I'll drop.

> > Signed-off-by: Nicolas Boichat 
>
> In general,
> Acked-by: Vlastimil Babka 
>
> Some comments below:
>
> > ---
> >
> > Changes since v2:
> >  - Clarified commit message
> >  - Add entry in sysfs-kernel-slab to document the new sysfs file
> >
> > (v3 used the page_frag approach)
> >
> > Documentation/ABI/testing/sysfs-kernel-slab |  9 +
> >  include/linux/slab.h|  2 ++
> >  mm/internal.h   |  8 ++--
> >  mm/slab.c   |  4 +++-
> >  mm/slab.h   |  3 ++-
> >  mm/slab_common.c|  2 +-
> >  mm/slub.c   | 18 +-
> >  7 files changed, 40 insertions(+), 6 deletions(-)
> >
> > diff --git a/Documentation/ABI/testing/sysfs-kernel-slab 
> > b/Documentation/ABI/testing/sysfs-kernel-slab
> > index 29601d93a1c2ea..d742c6cfdffbe9 100644
> > --- a/Documentation/ABI/testing/sysfs-kernel-slab
> > +++ b/Documentation/ABI/testing/sysfs-kernel-slab
> > @@ -106,6 +106,15 @@ Description:
> >       are from ZONE_DMA.
> >   Available when CONFIG_ZONE_DMA is enabled.
> >
> > +What:/sys/kernel/slab/cache/cache_dma32
> > +Date:December 2018
> > +KernelVersion:   4.21
> > +Contact: Nicolas Boichat 
> > +Description:
> > + The cache_dma32 file is read-only and specifies whether 
> > objects
> > + are from ZONE_DMA32.
> > + Available when CONFIG_ZONE_DMA32 is enabled.
>
> I don't have a strong opinion. It's a new file, yeah, but consistent
> with already existing ones. I'd leave the decision with SL*B maintainers.
>
> >  What:/sys/kernel/slab/cache/cpu_slabs
> >  Date:May 2007
> >  KernelVersion:   2.6.22
> > diff --git a/include/linux/slab.h b/include/linux/slab.h
> > index 11b45f7ae4057c..9449b19c5f107a 100644
> > --- a/include/linux/slab.h
> > +++ b/include/linux/slab.h
> > @@ -32,6 +32,8 @@
> >  #define SLAB_HWCACHE_ALIGN   ((slab_flags_t __force)0x2000U)
> >  /* Use GFP_DMA memory */
> >  #define SLAB_CACHE_DMA   ((slab_flags_t __force)0x4000U)
> > +/* Use GFP_DMA32 memory */
> > +#define SLAB_CACHE_DMA32 ((slab_flags_t __force)0x8000U)
> >  /* DEBUG: Store the last owner for bug hunting */
> >  #define SLAB_STORE_USER  ((slab_flags_t __force)0x0001U)
> >  /* Panic if kmem_cache_create() fails */
> > diff --git a/mm/internal.h b/mm/internal.h
> > index a2ee82a0cd44ae..fd244ad716eaf8 100644
> > --- a/mm/internal.h
> > +++ b/mm/internal.h
> > @@ -14,6 +14,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >
> >  /*
> > @@ -34,9 +35,12 @@
> >  #define GFP_CONSTRAINT_MASK (__GFP_HARDWALL|__GFP_THISNODE)
> >
> >  /* Check for flags that must not be used with a slab allocator */
> > -static inline gfp_t check_slab_flags(gfp_t flags)
> > +static inline gfp_t check_slab_flags(gfp_t flags, slab_flags_t slab_flags)
> >  {
> > - gfp_t bug_mask = __GFP_DMA32 | __GFP_HIGHMEM | ~__GFP_BITS_MASK;
> > + gfp_t bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
> > +
> > + if (!IS_ENABLED(CONFIG_ZONE_DMA32) || !(slab_flags & 
> > SLAB_CACHE_DMA32))
> > + bug_mask |= __GFP_DMA32;
>
> I'll point out that this is not even strictly needed AFAICS, as only
> flags passed to kmem_cache_alloc() are checked - the cache->allocflags
> derived from SLAB_CACHE_DMA32 are appended only after check_slab_flags()
> (in both SLAB and SLUB AFAICS). And for a cache created with
> SLAB_CACHE_DMA32, the caller of kmem_cache_alloc() doesn'

Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

2018-12-05 Thread Nicolas Boichat
On Wed, Dec 5, 2018 at 8:18 PM Wei Yang  wrote:
>
> On Wed, Dec 05, 2018 at 03:39:51PM +0800, Nicolas Boichat wrote:
> >On Wed, Dec 5, 2018 at 3:25 PM Wei Yang  wrote:
> >>
> >> On Wed, Dec 05, 2018 at 01:48:27PM +0800, Nicolas Boichat wrote:
> >> >In some cases (e.g. IOMMU ARMv7s page allocator), we need to allocate
> >> >data structures smaller than a page with GFP_DMA32 flag.
> >> >
> >> >This change makes it possible to create a custom cache in DMA32 zone
> >> >using kmem_cache_create, then allocate memory using kmem_cache_alloc.
> >> >
> >> >We do not create a DMA32 kmalloc cache array, as there are currently
> >> >no users of kmalloc(..., GFP_DMA32). The new test in check_slab_flags
> >> >ensures that such calls still fail (as they do before this change).
> >> >
> >> >Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
> >> >Signed-off-by: Nicolas Boichat 
> >> >---
> >> >
> >> >Changes since v2:
> >> > - Clarified commit message
> >> > - Add entry in sysfs-kernel-slab to document the new sysfs file
> >> >
> >> >(v3 used the page_frag approach)
> >> >
> >> >Documentation/ABI/testing/sysfs-kernel-slab |  9 +
> >> > include/linux/slab.h|  2 ++
> >> > mm/internal.h   |  8 ++--
> >> > mm/slab.c   |  4 +++-
> >> > mm/slab.h   |  3 ++-
> >> > mm/slab_common.c|  2 +-
> >> > mm/slub.c   | 18 +-
> >> > 7 files changed, 40 insertions(+), 6 deletions(-)
> >> >
> >> >diff --git a/Documentation/ABI/testing/sysfs-kernel-slab 
> >> >b/Documentation/ABI/testing/sysfs-kernel-slab
> >> >index 29601d93a1c2ea..d742c6cfdffbe9 100644
> >> >--- a/Documentation/ABI/testing/sysfs-kernel-slab
> >> >+++ b/Documentation/ABI/testing/sysfs-kernel-slab
> >> >@@ -106,6 +106,15 @@ Description:
> >> >   are from ZONE_DMA.
> >> >   Available when CONFIG_ZONE_DMA is enabled.
> >> >
> >> >+What: /sys/kernel/slab/cache/cache_dma32
> >> >+Date: December 2018
> >> >+KernelVersion:4.21
> >> >+Contact:  Nicolas Boichat 
> >> >+Description:
> >> >+  The cache_dma32 file is read-only and specifies whether 
> >> >objects
> >> >+  are from ZONE_DMA32.
> >> >+  Available when CONFIG_ZONE_DMA32 is enabled.
> >> >+
> >> > What: /sys/kernel/slab/cache/cpu_slabs
> >> > Date: May 2007
> >> > KernelVersion:2.6.22
> >> >diff --git a/include/linux/slab.h b/include/linux/slab.h
> >> >index 11b45f7ae4057c..9449b19c5f107a 100644
> >> >--- a/include/linux/slab.h
> >> >+++ b/include/linux/slab.h
> >> >@@ -32,6 +32,8 @@
> >> > #define SLAB_HWCACHE_ALIGN((slab_flags_t __force)0x2000U)
> >> > /* Use GFP_DMA memory */
> >> > #define SLAB_CACHE_DMA((slab_flags_t __force)0x4000U)
> >> >+/* Use GFP_DMA32 memory */
> >> >+#define SLAB_CACHE_DMA32  ((slab_flags_t __force)0x8000U)
> >> > /* DEBUG: Store the last owner for bug hunting */
> >> > #define SLAB_STORE_USER   ((slab_flags_t __force)0x0001U)
> >> > /* Panic if kmem_cache_create() fails */
> >> >diff --git a/mm/internal.h b/mm/internal.h
> >> >index a2ee82a0cd44ae..fd244ad716eaf8 100644
> >> >--- a/mm/internal.h
> >> >+++ b/mm/internal.h
> >> >@@ -14,6 +14,7 @@
> >> > #include 
> >> > #include 
> >> > #include 
> >> >+#include 
> >> > #include 
> >> >
> >> > /*
> >> >@@ -34,9 +35,12 @@
> >> > #define GFP_CONSTRAINT_MASK (__GFP_HARDWALL|__GFP_THISNODE)
> >> >
> >> > /* Check for flags that must not be used with a slab allocator */
> >> >-static inline gfp_t check_slab_flags(gfp_t flags)
> >> >+static inline gfp_t check_slab_flags(gfp_t flags, slab_flags_t 
> >> >slab_flags)
> >> > {
> >> >-  gfp_t bug_mask = __GFP_DMA32 | __GFP_HIGHMEM | ~__GFP_BITS_MASK;
> >> >+  gfp_t bug_ma

Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

2018-12-05 Thread Nicolas Boichat
On Wed, Dec 5, 2018 at 5:56 PM Michal Hocko  wrote:
>
> On Wed 05-12-18 13:48:27, Nicolas Boichat wrote:
> > In some cases (e.g. IOMMU ARMv7s page allocator), we need to allocate
> > data structures smaller than a page with GFP_DMA32 flag.
> >
> > This change makes it possible to create a custom cache in DMA32 zone
> > using kmem_cache_create, then allocate memory using kmem_cache_alloc.
> >
> > We do not create a DMA32 kmalloc cache array, as there are currently
> > no users of kmalloc(..., GFP_DMA32). The new test in check_slab_flags
> > ensures that such calls still fail (as they do before this change).
>
> The changelog should be much more specific about decisions made here.
> First of all it would be nice to mention the usecase.

Ok, I'll copy most of the cover letter text here (i.e. the fact that
IOMMU wants physical memory <4GB for L2 page tables, why it's better
than genalloc/page_frag).

> Secondly, why do we need a new sysfs file? Who is going to consume it?

We have cache_dma, so it seems consistent to add cache_dma32.

I wasn't aware of tools/vm/slabinfo.c, so I can add support for
cache_dma32 in a follow-up patch. Any other user I should take care
of?

> Then why do we need SLAB_MERGE_SAME to cover GFP_DMA32 as well?

SLAB_MERGE_SAME tells us which flags _need_ to be the same for the
slabs to be merged. We don't want slab caches with GFP_DMA32 and
~GFP_DMA32 to be merged, so it should be in there.
(https://elixir.bootlin.com/linux/v4.19.6/source/mm/slab_common.c#L342).

> I
> thought the whole point is to use dedicated slab cache. Who is this
> going to merge with?

Well, if there was another SLAB cache requiring 1KB GFP_DMA32
elements, then I don't see why we would not merge the caches. This is
what happens with this IOMMU L2 tables cache pre-CONFIG_ZONE_DMA32 on
arm64 (output on some 3.18 kernel below), and what would happen on
arm32 since we still use GFP_DMA.

/sys/kernel/slab # ls -l | grep dt-0001024
drwxr-xr-x. 2 root root 0 Dec  5 02:25 :dt-0001024
lrwxrwxrwx. 1 root root 0 Dec  5 02:25 dma-kmalloc-1024 -> :dt-0001024
lrwxrwxrwx. 1 root root 0 Dec  5 02:25 io-pgtable_armv7s_l2 -> :dt-0001024

Thanks!

> --
> Michal Hocko
> SUSE Labs
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

2018-12-04 Thread Nicolas Boichat
On Wed, Dec 5, 2018 at 3:25 PM Wei Yang  wrote:
>
> On Wed, Dec 05, 2018 at 01:48:27PM +0800, Nicolas Boichat wrote:
> >In some cases (e.g. IOMMU ARMv7s page allocator), we need to allocate
> >data structures smaller than a page with GFP_DMA32 flag.
> >
> >This change makes it possible to create a custom cache in DMA32 zone
> >using kmem_cache_create, then allocate memory using kmem_cache_alloc.
> >
> >We do not create a DMA32 kmalloc cache array, as there are currently
> >no users of kmalloc(..., GFP_DMA32). The new test in check_slab_flags
> >ensures that such calls still fail (as they do before this change).
> >
> >Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
> >Signed-off-by: Nicolas Boichat 
> >---
> >
> >Changes since v2:
> > - Clarified commit message
> > - Add entry in sysfs-kernel-slab to document the new sysfs file
> >
> >(v3 used the page_frag approach)
> >
> >Documentation/ABI/testing/sysfs-kernel-slab |  9 +
> > include/linux/slab.h|  2 ++
> > mm/internal.h   |  8 ++--
> > mm/slab.c   |  4 +++-
> > mm/slab.h   |  3 ++-
> > mm/slab_common.c|  2 +-
> > mm/slub.c   | 18 +-
> > 7 files changed, 40 insertions(+), 6 deletions(-)
> >
> >diff --git a/Documentation/ABI/testing/sysfs-kernel-slab 
> >b/Documentation/ABI/testing/sysfs-kernel-slab
> >index 29601d93a1c2ea..d742c6cfdffbe9 100644
> >--- a/Documentation/ABI/testing/sysfs-kernel-slab
> >+++ b/Documentation/ABI/testing/sysfs-kernel-slab
> >@@ -106,6 +106,15 @@ Description:
> >       are from ZONE_DMA.
> >   Available when CONFIG_ZONE_DMA is enabled.
> >
> >+What: /sys/kernel/slab/cache/cache_dma32
> >+Date: December 2018
> >+KernelVersion:4.21
> >+Contact:  Nicolas Boichat 
> >+Description:
> >+  The cache_dma32 file is read-only and specifies whether 
> >objects
> >+  are from ZONE_DMA32.
> >+  Available when CONFIG_ZONE_DMA32 is enabled.
> >+
> > What: /sys/kernel/slab/cache/cpu_slabs
> > Date: May 2007
> > KernelVersion:2.6.22
> >diff --git a/include/linux/slab.h b/include/linux/slab.h
> >index 11b45f7ae4057c..9449b19c5f107a 100644
> >--- a/include/linux/slab.h
> >+++ b/include/linux/slab.h
> >@@ -32,6 +32,8 @@
> > #define SLAB_HWCACHE_ALIGN((slab_flags_t __force)0x2000U)
> > /* Use GFP_DMA memory */
> > #define SLAB_CACHE_DMA((slab_flags_t __force)0x4000U)
> >+/* Use GFP_DMA32 memory */
> >+#define SLAB_CACHE_DMA32  ((slab_flags_t __force)0x8000U)
> > /* DEBUG: Store the last owner for bug hunting */
> > #define SLAB_STORE_USER   ((slab_flags_t __force)0x0001U)
> > /* Panic if kmem_cache_create() fails */
> >diff --git a/mm/internal.h b/mm/internal.h
> >index a2ee82a0cd44ae..fd244ad716eaf8 100644
> >--- a/mm/internal.h
> >+++ b/mm/internal.h
> >@@ -14,6 +14,7 @@
> > #include 
> > #include 
> > #include 
> >+#include 
> > #include 
> >
> > /*
> >@@ -34,9 +35,12 @@
> > #define GFP_CONSTRAINT_MASK (__GFP_HARDWALL|__GFP_THISNODE)
> >
> > /* Check for flags that must not be used with a slab allocator */
> >-static inline gfp_t check_slab_flags(gfp_t flags)
> >+static inline gfp_t check_slab_flags(gfp_t flags, slab_flags_t slab_flags)
> > {
> >-  gfp_t bug_mask = __GFP_DMA32 | __GFP_HIGHMEM | ~__GFP_BITS_MASK;
> >+  gfp_t bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
> >+
> >+  if (!IS_ENABLED(CONFIG_ZONE_DMA32) || !(slab_flags & 
> >SLAB_CACHE_DMA32))
> >+  bug_mask |= __GFP_DMA32;
>
> The original version doesn't check CONFIG_ZONE_DMA32.
>
> Do we need to add this condition here?
> Could we just decide the bug_mask based on slab_flags?

We can. The reason I did it this way is that when we don't have
CONFIG_ZONE_DMA32, the compiler should be able to simplify to:

bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
if (true || ..) => if (true)
   bug_mask |= __GFP_DMA32;

Then just
bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK | __GFP_DMA32;

And since the function is inline, slab_flags would not even need to be
accessed at all.

> >
> >   if (unlikely(flags & bug_mask)) {
> >   gfp_t invalid_mask = flags & bug_mask;
> >diff --git a/m

Re: [PATCH v2 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables

2018-12-04 Thread Nicolas Boichat
On Wed, Dec 5, 2018 at 10:04 AM Nicolas Boichat  wrote:
>
> On Tue, Dec 4, 2018 at 10:35 PM Vlastimil Babka  wrote:
> >
> > On 12/4/18 10:37 AM, Nicolas Boichat wrote:
> > > On Sun, Nov 11, 2018 at 5:04 PM Nicolas Boichat  
> > > wrote:
> > >>
> > >> This is a follow-up to the discussion in [1], to make sure that the page
> > >> tables allocated by iommu/io-pgtable-arm-v7s are contained within 32-bit
> > >> physical address space.
> > >>
> > >> [1] 
> > >> https://lists.linuxfoundation.org/pipermail/iommu/2018-November/030876.html
> > >
> > > Hi everyone,
> > >
> > > Let's try to summarize here.
> > >
> > > First, we confirmed that this is a regression, and IOMMU errors happen
> > > on 4.19 and linux-next/master on MT8173 (elm, Acer Chromebook R13).
> > > The issue most likely starts from ad67f5a6545f ("arm64: replace
> > > ZONE_DMA with ZONE_DMA32"), i.e. 4.15, and presumably breaks a number
> > > of Mediatek platforms (and maybe others?).
> > >
> > > We have a few options here:
> > > 1. This series [2], that adds support for GFP_DMA32 slab caches,
> > > _without_ adding kmalloc caches (since there are no users of
> > > kmalloc(..., GFP_DMA32)). I think I've addressed all the comments on
> > > the 3 patches, and AFAICT this solution works fine.
> > > 2. genalloc. That works, but unless we preallocate 4MB for L2 tables
> > > (which is wasteful as we usually only need a handful of L2 tables),
> > > we'll need changes in the core (use GFP_ATOMIC) to allow allocating on
> > > demand, and as it stands we'd have no way to shrink the allocation.
> > > 3. page_frag [3]. That works fine, and the code is quite simple. One
> > > drawback is that fragments in partially freed pages cannot be reused
> > > (from limited experiments, I see that IOMMU L2 tables are rarely
> > > freed, so it's unlikely a whole page would get freed). But given the
> > > low number of L2 tables, maybe we can live with that.
> > >
> > > I think 2 is out. Any preference between 1 and 3? I think 1 makes
> > > better use of the memory, so that'd be my preference. But I'm probably
> > > missing something.
> >
> > I would prefer 1 as well. IIRC you already confirmed that alignment
> > requirements are not broken for custom kmem caches even in presence of
> > SLUB debug options (and I would say it's a bug to be fixed if they
> > weren't).
>
> > I just asked (and didn't get a reply I think) about your
> > ability to handle the GFP_ATOMIC allocation failures. They should be
> > rare when only single page allocations are needed for the kmem cache.
> > But in case they are not an option, then preallocating would be needed,
> > thus probably option 2.
>
> Oh, sorry, I missed your question.
>
> I don't have a full answer, but:
> - The allocations themselves are rare (I count a few 10s of L2 tables
> at most on my system, I assume we rarely have >100), and yes, we only
> need a single page, so the failures should be exceptional.
> - My change is probably not making anything worse: I assume that even
> with the current approach using GFP_DMA slab caches on older kernels,
> failures could potentially happen. I don't think we've seen those. If
> we are really concerned about this, maybe we'd need to modify
> mtk_iommu_map to not hold a spinlock (if that's possible), so we don't
> need to use GFP_ATOMIC. I suggest we just keep an eye on such issues,
> and address them if they show up (we can even revisit genalloc at that
> stage).
>
> Anyway, I'll clean up patches for 1 (mostly commit message changes
> based on the comments in the threads) and resend.

Done here: https://patchwork.kernel.org/cover/10713019/ .

> Thanks,
>
> > > [2] https://patchwork.kernel.org/cover/10677529/, 3 patches
> > > [3] https://patchwork.codeaurora.org/patch/671639/
> > >
> > > Thanks,
> > >
> > > Nicolas
> > >
> >
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v4 3/3] iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging

2018-12-04 Thread Nicolas Boichat
IOMMUs using ARMv7 short-descriptor format require page tables
(level 1 and 2) to be allocated within the first 4GB of RAM, even
on 64-bit systems.

For level 1/2 pages, ensure GFP_DMA32 is used if CONFIG_ZONE_DMA32
is defined (e.g. on arm64 platforms).

For level 2 pages, allocate a slab cache in SLAB_CACHE_DMA32.

Also, print an error when the physical address does not fit in
32-bit, to make debugging easier in the future.

Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
Signed-off-by: Nicolas Boichat 
---

Changes since v2:
 - Commit message

(v3 used the page_frag approach)

 drivers/iommu/io-pgtable-arm-v7s.c | 20 
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 445c3bde04800c..996f7b6d00b44a 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -161,6 +161,14 @@
 
 #define ARM_V7S_TCR_PD1BIT(5)
 
+#ifdef CONFIG_ZONE_DMA32
+#define ARM_V7S_TABLE_GFP_DMA GFP_DMA32
+#define ARM_V7S_TABLE_SLAB_CACHE SLAB_CACHE_DMA32
+#else
+#define ARM_V7S_TABLE_GFP_DMA GFP_DMA
+#define ARM_V7S_TABLE_SLAB_CACHE SLAB_CACHE_DMA
+#endif
+
 typedef u32 arm_v7s_iopte;
 
 static bool selftest_running;
@@ -198,13 +206,17 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
void *table = NULL;
 
if (lvl == 1)
-   table = (void *)__get_dma_pages(__GFP_ZERO, get_order(size));
+   table = (void *)__get_free_pages(
+   __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA, get_order(size));
else if (lvl == 2)
-   table = kmem_cache_zalloc(data->l2_tables, gfp | GFP_DMA);
+   table = kmem_cache_zalloc(data->l2_tables,
+ gfp | ARM_V7S_TABLE_GFP_DMA);
phys = virt_to_phys(table);
-   if (phys != (arm_v7s_iopte)phys)
+   if (phys != (arm_v7s_iopte)phys) {
/* Doesn't fit in PTE */
+   dev_err(dev, "Page table does not fit in PTE: %pa", );
goto out_free;
+   }
if (table && !(cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA)) {
dma = dma_map_single(dev, table, size, DMA_TO_DEVICE);
if (dma_mapping_error(dev, dma))
@@ -737,7 +749,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct 
io_pgtable_cfg *cfg,
data->l2_tables = kmem_cache_create("io-pgtable_armv7s_l2",
ARM_V7S_TABLE_SIZE(2),
ARM_V7S_TABLE_SIZE(2),
-   SLAB_CACHE_DMA, NULL);
+   ARM_V7S_TABLE_SLAB_CACHE, NULL);
if (!data->l2_tables)
goto out_free_data;
 
-- 
2.20.0.rc1.387.gf8505762e3-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v4 1/3] mm: slab/slub: Add check_slab_flags function to check for valid flags

2018-12-04 Thread Nicolas Boichat
Remove duplicated code between slab and slub, and will make it
easier to make the test more complicated in the next commits.

Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
Signed-off-by: Nicolas Boichat 
---
 mm/internal.h | 18 --
 mm/slab.c |  8 +---
 mm/slub.c |  8 +---
 3 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index f0c9ccde3bdb9e..a2ee82a0cd44ae 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -33,8 +33,22 @@
 /* Control allocation cpuset and node placement constraints */
 #define GFP_CONSTRAINT_MASK (__GFP_HARDWALL|__GFP_THISNODE)
 
-/* Do not use these with a slab allocator */
-#define GFP_SLAB_BUG_MASK (__GFP_DMA32|__GFP_HIGHMEM|~__GFP_BITS_MASK)
+/* Check for flags that must not be used with a slab allocator */
+static inline gfp_t check_slab_flags(gfp_t flags)
+{
+   gfp_t bug_mask = __GFP_DMA32 | __GFP_HIGHMEM | ~__GFP_BITS_MASK;
+
+   if (unlikely(flags & bug_mask)) {
+   gfp_t invalid_mask = flags & bug_mask;
+
+   flags &= ~bug_mask;
+   pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x 
(%pGg). Fix your code!\n",
+   invalid_mask, _mask, flags, );
+   dump_stack();
+   }
+
+   return flags;
+}
 
 void page_writeback_init(void);
 
diff --git a/mm/slab.c b/mm/slab.c
index 73fe23e649c91a..65a774f05e7836 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2643,13 +2643,7 @@ static struct page *cache_grow_begin(struct kmem_cache 
*cachep,
 * Be lazy and only check for valid flags here,  keeping it out of the
 * critical path in kmem_cache_alloc().
 */
-   if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
-   gfp_t invalid_mask = flags & GFP_SLAB_BUG_MASK;
-   flags &= ~GFP_SLAB_BUG_MASK;
-   pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x 
(%pGg). Fix your code!\n",
-   invalid_mask, _mask, flags, );
-   dump_stack();
-   }
+   flags = check_slab_flags(flags);
WARN_ON_ONCE(cachep->ctor && (flags & __GFP_ZERO));
local_flags = flags & (GFP_CONSTRAINT_MASK|GFP_RECLAIM_MASK);
 
diff --git a/mm/slub.c b/mm/slub.c
index c229a9b7dd5448..21a3f6866da472 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1685,13 +1685,7 @@ static struct page *allocate_slab(struct kmem_cache *s, 
gfp_t flags, int node)
 
 static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
 {
-   if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
-   gfp_t invalid_mask = flags & GFP_SLAB_BUG_MASK;
-   flags &= ~GFP_SLAB_BUG_MASK;
-   pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x 
(%pGg). Fix your code!\n",
-   invalid_mask, _mask, flags, );
-   dump_stack();
-   }
+   flags = check_slab_flags(flags);
 
return allocate_slab(s,
flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
-- 
2.20.0.rc1.387.gf8505762e3-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

2018-12-04 Thread Nicolas Boichat
In some cases (e.g. IOMMU ARMv7s page allocator), we need to allocate
data structures smaller than a page with GFP_DMA32 flag.

This change makes it possible to create a custom cache in DMA32 zone
using kmem_cache_create, then allocate memory using kmem_cache_alloc.

We do not create a DMA32 kmalloc cache array, as there are currently
no users of kmalloc(..., GFP_DMA32). The new test in check_slab_flags
ensures that such calls still fail (as they do before this change).

Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
Signed-off-by: Nicolas Boichat 
---

Changes since v2:
 - Clarified commit message
 - Add entry in sysfs-kernel-slab to document the new sysfs file

(v3 used the page_frag approach)

Documentation/ABI/testing/sysfs-kernel-slab |  9 +
 include/linux/slab.h|  2 ++
 mm/internal.h   |  8 ++--
 mm/slab.c   |  4 +++-
 mm/slab.h   |  3 ++-
 mm/slab_common.c|  2 +-
 mm/slub.c   | 18 +-
 7 files changed, 40 insertions(+), 6 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-kernel-slab 
b/Documentation/ABI/testing/sysfs-kernel-slab
index 29601d93a1c2ea..d742c6cfdffbe9 100644
--- a/Documentation/ABI/testing/sysfs-kernel-slab
+++ b/Documentation/ABI/testing/sysfs-kernel-slab
@@ -106,6 +106,15 @@ Description:
are from ZONE_DMA.
Available when CONFIG_ZONE_DMA is enabled.
 
+What:  /sys/kernel/slab/cache/cache_dma32
+Date:  December 2018
+KernelVersion: 4.21
+Contact:   Nicolas Boichat 
+Description:
+   The cache_dma32 file is read-only and specifies whether objects
+   are from ZONE_DMA32.
+   Available when CONFIG_ZONE_DMA32 is enabled.
+
 What:  /sys/kernel/slab/cache/cpu_slabs
 Date:  May 2007
 KernelVersion: 2.6.22
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 11b45f7ae4057c..9449b19c5f107a 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -32,6 +32,8 @@
 #define SLAB_HWCACHE_ALIGN ((slab_flags_t __force)0x2000U)
 /* Use GFP_DMA memory */
 #define SLAB_CACHE_DMA ((slab_flags_t __force)0x4000U)
+/* Use GFP_DMA32 memory */
+#define SLAB_CACHE_DMA32   ((slab_flags_t __force)0x8000U)
 /* DEBUG: Store the last owner for bug hunting */
 #define SLAB_STORE_USER((slab_flags_t __force)0x0001U)
 /* Panic if kmem_cache_create() fails */
diff --git a/mm/internal.h b/mm/internal.h
index a2ee82a0cd44ae..fd244ad716eaf8 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 /*
@@ -34,9 +35,12 @@
 #define GFP_CONSTRAINT_MASK (__GFP_HARDWALL|__GFP_THISNODE)
 
 /* Check for flags that must not be used with a slab allocator */
-static inline gfp_t check_slab_flags(gfp_t flags)
+static inline gfp_t check_slab_flags(gfp_t flags, slab_flags_t slab_flags)
 {
-   gfp_t bug_mask = __GFP_DMA32 | __GFP_HIGHMEM | ~__GFP_BITS_MASK;
+   gfp_t bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
+
+   if (!IS_ENABLED(CONFIG_ZONE_DMA32) || !(slab_flags & SLAB_CACHE_DMA32))
+   bug_mask |= __GFP_DMA32;
 
if (unlikely(flags & bug_mask)) {
gfp_t invalid_mask = flags & bug_mask;
diff --git a/mm/slab.c b/mm/slab.c
index 65a774f05e7836..2fd3b9a996cbe6 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2109,6 +2109,8 @@ int __kmem_cache_create(struct kmem_cache *cachep, 
slab_flags_t flags)
cachep->allocflags = __GFP_COMP;
if (flags & SLAB_CACHE_DMA)
cachep->allocflags |= GFP_DMA;
+   if (flags & SLAB_CACHE_DMA32)
+   cachep->allocflags |= GFP_DMA32;
if (flags & SLAB_RECLAIM_ACCOUNT)
cachep->allocflags |= __GFP_RECLAIMABLE;
cachep->size = size;
@@ -2643,7 +2645,7 @@ static struct page *cache_grow_begin(struct kmem_cache 
*cachep,
 * Be lazy and only check for valid flags here,  keeping it out of the
 * critical path in kmem_cache_alloc().
 */
-   flags = check_slab_flags(flags);
+   flags = check_slab_flags(flags, cachep->flags);
WARN_ON_ONCE(cachep->ctor && (flags & __GFP_ZERO));
local_flags = flags & (GFP_CONSTRAINT_MASK|GFP_RECLAIM_MASK);
 
diff --git a/mm/slab.h b/mm/slab.h
index 4190c24ef0e9df..fcf717e12f0a86 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -127,7 +127,8 @@ static inline slab_flags_t kmem_cache_flags(unsigned int 
object_size,
 
 
 /* Legal flag mask for kmem_cache_create(), for various configurations */
-#define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | SLAB_PANIC | \
+#define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | \
+SLAB_CACHE_DMA32 | SLAB_PANIC | \

[PATCH v4 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables

2018-12-04 Thread Nicolas Boichat
This is a follow-up to the discussion in [1], [2].

IOMMUs using ARMv7 short-descriptor format require page tables
(level 1 and 2) to be allocated within the first 4GB of RAM, even
on 64-bit systems.

For L1 tables that are bigger than a page, we can just use __get_free_pages
with GFP_DMA32 (on arm64 systems only, arm would still use GFP_DMA).

For L2 tables that only take 1KB, it would be a waste to allocate a full
page, so we considered 3 approaches:
 1. This series, adding support for GFP_DMA32 slab caches.
 2. genalloc, which requires pre-allocating the maximum number of L2 page
tables (4096, so 4MB of memory).
 3. page_frag, which is not very memory-efficient as it is unable to reuse
freed fragments until the whole page is freed. [3]

This series is the most memory-efficient approach.

[1] https://lists.linuxfoundation.org/pipermail/iommu/2018-November/030876.html
[2] https://lists.linuxfoundation.org/pipermail/iommu/2018-December/031696.html
[3] https://patchwork.codeaurora.org/patch/671639/

Changes since v1:
 - Add support for SLAB_CACHE_DMA32 in slab and slub (patches 1/2)
 - iommu/io-pgtable-arm-v7s (patch 3):
   - Changed approach to use SLAB_CACHE_DMA32 added by the previous
 commit.
   - Use DMA or DMA32 depending on the architecture (DMA for arm,
 DMA32 for arm64).

Changes since v2:
 - Reworded and expanded commit messages
 - Added cache_dma32 documentation in PATCH 2/3.

v3 used the page_frag approach, see [3].

Nicolas Boichat (3):
  mm: slab/slub: Add check_slab_flags function to check for valid flags
  mm: Add support for kmem caches in DMA32 zone
  iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging

 Documentation/ABI/testing/sysfs-kernel-slab |  9 
 drivers/iommu/io-pgtable-arm-v7s.c  | 20 +
 include/linux/slab.h|  2 ++
 mm/internal.h   | 22 +--
 mm/slab.c   | 10 +++--
 mm/slab.h   |  3 ++-
 mm/slab_common.c|  2 +-
 mm/slub.c   | 24 +++--
 8 files changed, 70 insertions(+), 22 deletions(-)

-- 
2.20.0.rc1.387.gf8505762e3-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables

2018-12-04 Thread Nicolas Boichat
On Tue, Dec 4, 2018 at 10:35 PM Vlastimil Babka  wrote:
>
> On 12/4/18 10:37 AM, Nicolas Boichat wrote:
> > On Sun, Nov 11, 2018 at 5:04 PM Nicolas Boichat  
> > wrote:
> >>
> >> This is a follow-up to the discussion in [1], to make sure that the page
> >> tables allocated by iommu/io-pgtable-arm-v7s are contained within 32-bit
> >> physical address space.
> >>
> >> [1] 
> >> https://lists.linuxfoundation.org/pipermail/iommu/2018-November/030876.html
> >
> > Hi everyone,
> >
> > Let's try to summarize here.
> >
> > First, we confirmed that this is a regression, and IOMMU errors happen
> > on 4.19 and linux-next/master on MT8173 (elm, Acer Chromebook R13).
> > The issue most likely starts from ad67f5a6545f ("arm64: replace
> > ZONE_DMA with ZONE_DMA32"), i.e. 4.15, and presumably breaks a number
> > of Mediatek platforms (and maybe others?).
> >
> > We have a few options here:
> > 1. This series [2], that adds support for GFP_DMA32 slab caches,
> > _without_ adding kmalloc caches (since there are no users of
> > kmalloc(..., GFP_DMA32)). I think I've addressed all the comments on
> > the 3 patches, and AFAICT this solution works fine.
> > 2. genalloc. That works, but unless we preallocate 4MB for L2 tables
> > (which is wasteful as we usually only need a handful of L2 tables),
> > we'll need changes in the core (use GFP_ATOMIC) to allow allocating on
> > demand, and as it stands we'd have no way to shrink the allocation.
> > 3. page_frag [3]. That works fine, and the code is quite simple. One
> > drawback is that fragments in partially freed pages cannot be reused
> > (from limited experiments, I see that IOMMU L2 tables are rarely
> > freed, so it's unlikely a whole page would get freed). But given the
> > low number of L2 tables, maybe we can live with that.
> >
> > I think 2 is out. Any preference between 1 and 3? I think 1 makes
> > better use of the memory, so that'd be my preference. But I'm probably
> > missing something.
>
> I would prefer 1 as well. IIRC you already confirmed that alignment
> requirements are not broken for custom kmem caches even in presence of
> SLUB debug options (and I would say it's a bug to be fixed if they
> weren't).

> I just asked (and didn't get a reply I think) about your
> ability to handle the GFP_ATOMIC allocation failures. They should be
> rare when only single page allocations are needed for the kmem cache.
> But in case they are not an option, then preallocating would be needed,
> thus probably option 2.

Oh, sorry, I missed your question.

I don't have a full answer, but:
- The allocations themselves are rare (I count a few 10s of L2 tables
at most on my system, I assume we rarely have >100), and yes, we only
need a single page, so the failures should be exceptional.
- My change is probably not making anything worse: I assume that even
with the current approach using GFP_DMA slab caches on older kernels,
failures could potentially happen. I don't think we've seen those. If
we are really concerned about this, maybe we'd need to modify
mtk_iommu_map to not hold a spinlock (if that's possible), so we don't
need to use GFP_ATOMIC. I suggest we just keep an eye on such issues,
and address them if they show up (we can even revisit genalloc at that
stage).

Anyway, I'll clean up patches for 1 (mostly commit message changes
based on the comments in the threads) and resend.

Thanks,

> > [2] https://patchwork.kernel.org/cover/10677529/, 3 patches
> > [3] https://patchwork.codeaurora.org/patch/671639/
> >
> > Thanks,
> >
> > Nicolas
> >
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables

2018-12-04 Thread Nicolas Boichat
On Sun, Nov 11, 2018 at 5:04 PM Nicolas Boichat  wrote:
>
> This is a follow-up to the discussion in [1], to make sure that the page
> tables allocated by iommu/io-pgtable-arm-v7s are contained within 32-bit
> physical address space.
>
> [1] 
> https://lists.linuxfoundation.org/pipermail/iommu/2018-November/030876.html

Hi everyone,

Let's try to summarize here.

First, we confirmed that this is a regression, and IOMMU errors happen
on 4.19 and linux-next/master on MT8173 (elm, Acer Chromebook R13).
The issue most likely starts from ad67f5a6545f ("arm64: replace
ZONE_DMA with ZONE_DMA32"), i.e. 4.15, and presumably breaks a number
of Mediatek platforms (and maybe others?).

We have a few options here:
1. This series [2], that adds support for GFP_DMA32 slab caches,
_without_ adding kmalloc caches (since there are no users of
kmalloc(..., GFP_DMA32)). I think I've addressed all the comments on
the 3 patches, and AFAICT this solution works fine.
2. genalloc. That works, but unless we preallocate 4MB for L2 tables
(which is wasteful as we usually only need a handful of L2 tables),
we'll need changes in the core (use GFP_ATOMIC) to allow allocating on
demand, and as it stands we'd have no way to shrink the allocation.
3. page_frag [3]. That works fine, and the code is quite simple. One
drawback is that fragments in partially freed pages cannot be reused
(from limited experiments, I see that IOMMU L2 tables are rarely
freed, so it's unlikely a whole page would get freed). But given the
low number of L2 tables, maybe we can live with that.

I think 2 is out. Any preference between 1 and 3? I think 1 makes
better use of the memory, so that'd be my preference. But I'm probably
missing something.

[2] https://patchwork.kernel.org/cover/10677529/, 3 patches
[3] https://patchwork.codeaurora.org/patch/671639/

Thanks,

Nicolas
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3, RFC] iommu/io-pgtable-arm-v7s: Use page_frag to request DMA32 memory

2018-12-04 Thread Nicolas Boichat
IOMMUs using ARMv7 short-descriptor format require page tables
(level 1 and 2) to be allocated within the first 4GB of RAM, even
on 64-bit systems.

For level 1/2 tables, ensure GFP_DMA32 is used if CONFIG_ZONE_DMA32
is defined (e.g. on arm64 platforms).

For level 2 tables (1 KB), we use page_frag to allocate these pages,
as we cannot directly use kmalloc (no slab cache for GFP_DMA32) or
kmem_cache (mm/ code treats GFP_DMA32 as an invalid flag).

One downside is that we only free the allocated page if all the
4 fragments (4 IOMMU L2 tables) are freed, but given that we
usually only allocate limited number of IOMMU L2 tables, this
should not have too much impact on memory usage: In the absolute
worst case (4096 L2 page tables, each on their own 4K page),
we would use 16 MB of memory for 4 MB of L2 tables.

Also, print an error when the physical address does not fit in
32-bit, to make debugging easier in the future.

Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
Signed-off-by: Nicolas Boichat 
---

As an alternative to the series [1], which adds support for GFP_DMA32
to kmem_cache in mm/. IMHO the solution in [1] is cleaner and more
efficient, as it allows freed fragments (L2 tables) to be reused, but
this approach does not require any core change.

[1] https://patchwork.kernel.org/cover/10677529/, 3 patches

 drivers/iommu/io-pgtable-arm-v7s.c | 32 --
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 445c3bde04800c..0de6a51eb6755f 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -161,6 +161,12 @@
 
 #define ARM_V7S_TCR_PD1BIT(5)
 
+#ifdef CONFIG_ZONE_DMA32
+#define ARM_V7S_TABLE_GFP_DMA GFP_DMA32
+#else
+#define ARM_V7S_TABLE_GFP_DMA GFP_DMA
+#endif
+
 typedef u32 arm_v7s_iopte;
 
 static bool selftest_running;
@@ -169,7 +175,7 @@ struct arm_v7s_io_pgtable {
struct io_pgtable   iop;
 
arm_v7s_iopte   *pgd;
-   struct kmem_cache   *l2_tables;
+   struct page_frag_cache  l2_tables;
spinlock_t  split_lock;
 };
 
@@ -198,13 +204,17 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
void *table = NULL;
 
if (lvl == 1)
-   table = (void *)__get_dma_pages(__GFP_ZERO, get_order(size));
+   table = (void *)__get_free_pages(
+   __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA, get_order(size));
else if (lvl == 2)
-   table = kmem_cache_zalloc(data->l2_tables, gfp | GFP_DMA);
+   table = page_frag_alloc(>l2_tables, size,
+   gfp | __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA);
phys = virt_to_phys(table);
-   if (phys != (arm_v7s_iopte)phys)
+   if (phys != (arm_v7s_iopte)phys) {
/* Doesn't fit in PTE */
+   dev_err(dev, "Page table does not fit in PTE: %pa", );
goto out_free;
+   }
if (table && !(cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA)) {
dma = dma_map_single(dev, table, size, DMA_TO_DEVICE);
if (dma_mapping_error(dev, dma))
@@ -227,7 +237,7 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
if (lvl == 1)
free_pages((unsigned long)table, get_order(size));
else
-   kmem_cache_free(data->l2_tables, table);
+   page_frag_free(table);
return NULL;
 }
 
@@ -244,7 +254,7 @@ static void __arm_v7s_free_table(void *table, int lvl,
if (lvl == 1)
free_pages((unsigned long)table, get_order(size));
else
-   kmem_cache_free(data->l2_tables, table);
+   page_frag_free(table);
 }
 
 static void __arm_v7s_pte_sync(arm_v7s_iopte *ptep, int num_entries,
@@ -515,7 +525,6 @@ static void arm_v7s_free_pgtable(struct io_pgtable *iop)
__arm_v7s_free_table(iopte_deref(pte, 1), 2, data);
}
__arm_v7s_free_table(data->pgd, 1, data);
-   kmem_cache_destroy(data->l2_tables);
kfree(data);
 }
 
@@ -729,17 +738,11 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct 
io_pgtable_cfg *cfg,
!(cfg->quirks & IO_PGTABLE_QUIRK_NO_PERMS))
return NULL;
 
-   data = kmalloc(sizeof(*data), GFP_KERNEL);
+   data = kzalloc(sizeof(*data), GFP_KERNEL);
if (!data)
return NULL;
 
spin_lock_init(>split_lock);
-   data->l2_tables = kmem_cache_create("io-pgtable_armv7s_l2",
-   ARM_V7S_TABLE_SIZE(2),
-   ARM_V7S_TABLE_SIZE(2),
-   SLAB_CACHE_DMA, NULL);
-   if (!data->l2_tables)
-   goto out_free_data;
 
data->iop.ops = 

Re: [PATCH v2 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables

2018-11-28 Thread Nicolas Boichat
On Mon, Nov 26, 2018 at 4:02 PM Christoph Hellwig  wrote:
>
> On Fri, Nov 23, 2018 at 01:23:41PM +0100, Vlastimil Babka wrote:
> > Is this also true for caches created by kmem_cache_create(), that
> > debugging options can result in not respecting the alignment passed to
> > kmem_cache_create()? That would be rather bad, IMHO.
>
> That's what I understood in the discussion.  If not it would make
> our live simpler, but would need to be well document.

>From my experiment, adding `slub_debug` to command line does _not_
break the alignment of kmem_cache_alloc'ed objects.

We do see an increase in slab_size
(/sys/kernel/slab/io-pgtable_armv7s_l2/slab_size), from 1024 to 3072
(probably because slub needs to allocate space on each side for the
red zone/padding, while keeping the alignment?)

> Christoph can probably explain the alignment choices in slub.
>
> >
> > > But I do agree with the sentiment of not wanting to spread GFP_DMA32
> > > futher into the slab allocator.
> >
> > I don't see a problem with GFP_DMA32 for custom caches. Generic
> > kmalloc() would be worse, since it would have to create a new array of
> > kmalloc caches. But that's already ruled out due to the alignment.
>
> True, purely slab probably isn't too bad.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables

2018-11-22 Thread Nicolas Boichat
On Fri, Nov 23, 2018 at 11:04 AM Nicolas Boichat  wrote:
>
> On Thu, Nov 22, 2018 at 4:23 PM Christoph Hellwig  wrote:
> >
> > On Wed, Nov 21, 2018 at 10:26:26PM +, Robin Murphy wrote:
> > > TBH, if this DMA32 stuff is going to be contentious we could possibly just
> > > rip out the offending kmem_cache - it seemed like good practice for the
> > > use-case, but provided kzalloc(SZ_1K, gfp | GFP_DMA32) can be relied upon 
> > > to
> > > give the same 1KB alignment and chance of succeeding as the equivalent
> > > kmem_cache_alloc(), then we could quite easily make do with that instead.
> >
> > Neither is the slab support for kmalloc, not do kmalloc allocations
> > have useful alignment apparently (at least if you use slub debug).
> >
> > But I do agree with the sentiment of not wanting to spread GFP_DMA32
> > futher into the slab allocator.
> >
> > I think you want a simple genalloc allocator for this rather special
> > use case.
>
> So I had a look at genalloc, we'd need to add pre-allocated memory
> using gen_pool_add [1]. There can be up to 4096 L2 page tables, so we
> may need to pre-allocate 4MB of memory (1KB per L2 page table). We
> could add chunks on demand, but then it'd be difficult to free them up
> (genalloc does not have a "gen_pool_remove" call). So basically if the
> full 4MB end up being requested, we'd be stuck with that until the
> iommu domain is freed (on the arm64 Mediatek platforms I looked at,
> there is only one iommu domain, and it never gets freed).

I tried out genalloc with pre-allocated 4MB, and that seems to work
fine. Allocating in chunks would require genalloc changes as
gen_pool_add calls kmalloc with just GFP_KERNEL [2], and we are in
atomic context in __arm_v7s_alloc_table...

[2] https://elixir.bootlin.com/linux/latest/source/lib/genalloc.c#L190

> page_frag would at least have a chance to reclaim those pages (if I
> understand Christoph's statement correctly)
>
> Robin: Do you have some ideas of the lifetime/usage of L2 tables? If
> they are usually few of them, or if they don't get reclaimed easily,
> some on demand genalloc allocation would be ok (or even 4MB allocation
> on init, if we're willing to take that hit). If they get allocated and
> freed together, maybe page_frag is a better option?
>
> Thanks,
>
> [1] https://www.kernel.org/doc/html/v4.19/core-api/genalloc.html
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables

2018-11-22 Thread Nicolas Boichat
On Thu, Nov 22, 2018 at 4:23 PM Christoph Hellwig  wrote:
>
> On Wed, Nov 21, 2018 at 10:26:26PM +, Robin Murphy wrote:
> > TBH, if this DMA32 stuff is going to be contentious we could possibly just
> > rip out the offending kmem_cache - it seemed like good practice for the
> > use-case, but provided kzalloc(SZ_1K, gfp | GFP_DMA32) can be relied upon to
> > give the same 1KB alignment and chance of succeeding as the equivalent
> > kmem_cache_alloc(), then we could quite easily make do with that instead.
>
> Neither is the slab support for kmalloc, not do kmalloc allocations
> have useful alignment apparently (at least if you use slub debug).
>
> But I do agree with the sentiment of not wanting to spread GFP_DMA32
> futher into the slab allocator.
>
> I think you want a simple genalloc allocator for this rather special
> use case.

So I had a look at genalloc, we'd need to add pre-allocated memory
using gen_pool_add [1]. There can be up to 4096 L2 page tables, so we
may need to pre-allocate 4MB of memory (1KB per L2 page table). We
could add chunks on demand, but then it'd be difficult to free them up
(genalloc does not have a "gen_pool_remove" call). So basically if the
full 4MB end up being requested, we'd be stuck with that until the
iommu domain is freed (on the arm64 Mediatek platforms I looked at,
there is only one iommu domain, and it never gets freed).

page_frag would at least have a chance to reclaim those pages (if I
understand Christoph's statement correctly)

Robin: Do you have some ideas of the lifetime/usage of L2 tables? If
they are usually few of them, or if they don't get reclaimed easily,
some on demand genalloc allocation would be ok (or even 4MB allocation
on init, if we're willing to take that hit). If they get allocated and
freed together, maybe page_frag is a better option?

Thanks,

[1] https://www.kernel.org/doc/html/v4.19/core-api/genalloc.html
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables

2018-11-21 Thread Nicolas Boichat
On Thu, Nov 22, 2018 at 10:36 AM Matthew Wilcox  wrote:
>
> On Wed, Nov 21, 2018 at 10:26:26PM +, Robin Murphy wrote:
> > These are IOMMU page tables, rather than CPU ones, so we're already well
> > outside arch code - indeed the original motivation of io-pgtable was to be
> > entirely independent of the p*d types and arch-specific MM code (this Armv7
> > short-descriptor format is already "non-native" when used by drivers in an
> > arm64 kernel).
>
> There was quite a lot of explanation missing from this patch description!

I totally agree ,-) I'm not familiar at all with either iommu or
mm/... Looks like the patchset triggered a helpful discussion, and I
understand the problem better now. I'll improve the description in the
next revision.

> > There are various efficiency reasons for using regular kernel memory instead
> > of coherent DMA allocations - for the most part it works well, we just have
> > the odd corner case like this one where the 32-bit format gets used on
> > 64-bit systems such that the tables themselves still need to be allocated
> > below 4GB (although the final output address can point at higher memory by
> > virtue of the IOMMU in question not implementing permissions and repurposing
> > some of those PTE fields as extra address bits).
> >
> > TBH, if this DMA32 stuff is going to be contentious we could possibly just
> > rip out the offending kmem_cache - it seemed like good practice for the
> > use-case, but provided kzalloc(SZ_1K, gfp | GFP_DMA32) can be relied upon to
> > give the same 1KB alignment and chance of succeeding as the equivalent
> > kmem_cache_alloc(), then we could quite easily make do with that instead.
>
> I think you should look at using the page_frag allocator here.  You can
> use whatever GFP_DMA flags you like.

I'll try that.

Thanks!
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 3/3] iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging

2018-11-21 Thread Nicolas Boichat
On Thu, Nov 22, 2018 at 2:02 AM Michal Hocko  wrote:
>
> On Wed 21-11-18 16:46:38, Will Deacon wrote:
> > On Sun, Nov 11, 2018 at 05:03:41PM +0800, Nicolas Boichat wrote:
> > > For level 1/2 pages, ensure GFP_DMA32 is used if CONFIG_ZONE_DMA32
> > > is defined (e.g. on arm64 platforms).
> > >
> > > For level 2 pages, allocate a slab cache in SLAB_CACHE_DMA32.
> > >
> > > Also, print an error when the physical address does not fit in
> > > 32-bit, to make debugging easier in the future.
> > >
> > > Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
> > > Signed-off-by: Nicolas Boichat 
> > > ---
> > >
> > > Changes since v1:
> > >  - Changed approach to use SLAB_CACHE_DMA32 added by the previous
> > >commit.
> > >  - Use DMA or DMA32 depending on the architecture (DMA for arm,
> > >DMA32 for arm64).
> > >
> > > drivers/iommu/io-pgtable-arm-v7s.c | 20 
> > >  1 file changed, 16 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
> > > b/drivers/iommu/io-pgtable-arm-v7s.c
> > > index 445c3bde04800c..996f7b6d00b44a 100644
> > > --- a/drivers/iommu/io-pgtable-arm-v7s.c
> > > +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> > > @@ -161,6 +161,14 @@
> > >
> > >  #define ARM_V7S_TCR_PD1BIT(5)
> > >
> > > +#ifdef CONFIG_ZONE_DMA32
> > > +#define ARM_V7S_TABLE_GFP_DMA GFP_DMA32
> > > +#define ARM_V7S_TABLE_SLAB_CACHE SLAB_CACHE_DMA32
> > > +#else
> > > +#define ARM_V7S_TABLE_GFP_DMA GFP_DMA
> > > +#define ARM_V7S_TABLE_SLAB_CACHE SLAB_CACHE_DMA
> > > +#endif
> >
> > It's a bit grotty that GFP_DMA32 doesn't just map to GFP_DMA on 32-bit
> > architectures, since then we wouldn't need this #ifdeffery afaict.
>
> But GFP_DMA32 should map to GFP_KERNEL on 32b, no? Or what exactly is
> going on in here?

GFP_DMA32 will fail due to check_slab_flags (aka GFP_SLAB_BUG_MASK
before patch 1/3 of this series)... But yes, it may be neater if there
was transparent remapping of GFP_DMA32/SLAB_CACHE_DMA32 to
GFP_DMA/SLAB_CACHE_DMA on 32-bit arch...

> --
> Michal Hocko
> SUSE Labs
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables

2018-11-21 Thread Nicolas Boichat
On Thu, Nov 22, 2018 at 6:27 AM Robin Murphy  wrote:
>
> On 2018-11-21 9:38 pm, Matthew Wilcox wrote:
> > On Wed, Nov 21, 2018 at 06:20:02PM +, Christopher Lameter wrote:
> >> On Sun, 11 Nov 2018, Nicolas Boichat wrote:
> >>
> >>> This is a follow-up to the discussion in [1], to make sure that the page
> >>> tables allocated by iommu/io-pgtable-arm-v7s are contained within 32-bit
> >>> physical address space.
> >>
> >> Page tables? This means you need a page frame? Why go through the slab
> >> allocators?
> >
> > Because this particular architecture has sub-page-size PMD page tables.
> > We desperately need to hoist page table allocation out of the architectures;
> > there're a bunch of different implementations and they're mostly bad,
> > one way or another.
>
> These are IOMMU page tables, rather than CPU ones, so we're already well
> outside arch code - indeed the original motivation of io-pgtable was to
> be entirely independent of the p*d types and arch-specific MM code (this
> Armv7 short-descriptor format is already "non-native" when used by
> drivers in an arm64 kernel).
>
> There are various efficiency reasons for using regular kernel memory
> instead of coherent DMA allocations - for the most part it works well,
> we just have the odd corner case like this one where the 32-bit format
> gets used on 64-bit systems such that the tables themselves still need
> to be allocated below 4GB (although the final output address can point
> at higher memory by virtue of the IOMMU in question not implementing
> permissions and repurposing some of those PTE fields as extra address bits).
>
> TBH, if this DMA32 stuff is going to be contentious we could possibly
> just rip out the offending kmem_cache - it seemed like good practice for
> the use-case, but provided kzalloc(SZ_1K, gfp | GFP_DMA32) can be relied
> upon to give the same 1KB alignment and chance of succeeding as the
> equivalent kmem_cache_alloc(), then we could quite easily make do with
> that instead.

Yes, but if we want to use kzalloc, we'll need to create
kmalloc_caches for DMA32, which seems wasteful as there are no other
users (see my comment here:
https://patchwork.kernel.org/patch/10677525/#22332697).

Thanks,

> Thanks,
> Robin.
>
> > For each level of page table we generally have three cases:
> >
> > 1. single page
> > 2. sub-page, naturally aligned
> > 3. multiple pages, naturally aligned
> >
> > for 1 and 3, the page allocator will do just fine.
> > for 2, we should have a per-MM page_frag allocator.  s390 already has
> > something like this, although it's more complicated.  ppc also has
> > something a little more complex for the cases when it's configured with
> > a 64k page size but wants to use a 4k page table entry.
> >
> > I'd like x86 to be able to simply do:
> >
> > #define pte_alloc_one(mm, addr)   page_alloc_table(mm, addr, 0)
> > #define pmd_alloc_one(mm, addr)   page_alloc_table(mm, addr, 0)
> > #define pud_alloc_one(mm, addr)   page_alloc_table(mm, addr, 0)
> > #define p4d_alloc_one(mm, addr)   page_alloc_table(mm, addr, 0)
> >
> > An architecture with 4k page size and needing a 16k PMD would do:
> >
> > #define pmd_alloc_one(mm, addr) page_alloc_table(mm, addr, 2)
> >
> > while an architecture with a 64k page size needing a 4k PTE would do:
> >
> > #define ARCH_PAGE_TABLE_FRAG
> > #define pte_alloc_one(mm, addr) pagefrag_alloc_table(mm, addr, 4096)
> >
> > I haven't had time to work on this, but perhaps someone with a problem
> > that needs fixing would like to, instead of burying yet another awful
> > implementation away in arch/ somewhere.
> >
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 2/3] mm: Add support for SLAB_CACHE_DMA32

2018-11-21 Thread Nicolas Boichat
On Thu, Nov 22, 2018 at 2:32 AM Christopher Lameter  wrote:
>
> On Sun, 11 Nov 2018, Nicolas Boichat wrote:
>
> > SLAB_CACHE_DMA32 is only available after explicit kmem_cache_create calls,
> > no default cache is created for kmalloc. Add a test in check_slab_flags
> > for this.
>
> This does not define the dma32 kmalloc array. Is that intentional?

Yes that's intentional, AFAICT there is no user, so there is no point
creating the cache.

 (okay, I could find one, but it's probably broken:
git grep GFP_DMA32 | grep k[a-z]*alloc
drivers/media/platform/vivid/vivid-osd.c: dev->video_vbase =
kzalloc(dev->video_buffer_size, GFP_KERNEL | GFP_DMA32);
).

> In that
> case you need to fail any request for GFP_DMA32 coming in via kmalloc.

Well, we do check for these in check_slab_flags (aka GFP_SLAB_BUG_MASK
before patch 1/3 of this series), so, with or without this patch,
calls with GFP_DMA32 will end up failing in check_slab_flags.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 2/3] mm: Add support for SLAB_CACHE_DMA32

2018-11-11 Thread Nicolas Boichat
SLAB_CACHE_DMA32 is only available after explicit kmem_cache_create calls,
no default cache is created for kmalloc. Add a test in check_slab_flags
for this.

Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
Signed-off-by: Nicolas Boichat 
---
 include/linux/slab.h |  2 ++
 mm/internal.h|  8 ++--
 mm/slab.c|  4 +++-
 mm/slab.h|  3 ++-
 mm/slab_common.c |  2 +-
 mm/slub.c| 18 +-
 6 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156f4..afc51ee1dae5d4 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -32,6 +32,8 @@
 #define SLAB_HWCACHE_ALIGN ((slab_flags_t __force)0x2000U)
 /* Use GFP_DMA memory */
 #define SLAB_CACHE_DMA ((slab_flags_t __force)0x4000U)
+/* Use GFP_DMA32 memory */
+#define SLAB_CACHE_DMA32   ((slab_flags_t __force)0x8000U)
 /* DEBUG: Store the last owner for bug hunting */
 #define SLAB_STORE_USER((slab_flags_t __force)0x0001U)
 /* Panic if kmem_cache_create() fails */
diff --git a/mm/internal.h b/mm/internal.h
index 7a500b232e4a43..2aa9c8491d2ca2 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 /*
@@ -34,9 +35,12 @@
 #define GFP_CONSTRAINT_MASK (__GFP_HARDWALL|__GFP_THISNODE)
 
 /* Check for flags that must not be used with a slab allocator */
-static inline gfp_t check_slab_flags(gfp_t flags)
+static inline gfp_t check_slab_flags(gfp_t flags, slab_flags_t slab_flags)
 {
-   gfp_t bug_mask = __GFP_DMA32 | __GFP_HIGHMEM | ~__GFP_BITS_MASK;
+   gfp_t bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
+
+   if (!IS_ENABLED(CONFIG_ZONE_DMA32) || !(slab_flags & SLAB_CACHE_DMA32))
+   bug_mask |= __GFP_DMA32;
 
if (unlikely(flags & bug_mask)) {
gfp_t invalid_mask = flags & bug_mask;
diff --git a/mm/slab.c b/mm/slab.c
index 251e09a5a3ef5c..6efcaad6a02b70 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2122,6 +2122,8 @@ int __kmem_cache_create(struct kmem_cache *cachep, 
slab_flags_t flags)
cachep->allocflags = __GFP_COMP;
if (flags & SLAB_CACHE_DMA)
cachep->allocflags |= GFP_DMA;
+   if (flags & SLAB_CACHE_DMA32)
+   cachep->allocflags |= GFP_DMA32;
if (flags & SLAB_RECLAIM_ACCOUNT)
cachep->allocflags |= __GFP_RECLAIMABLE;
cachep->size = size;
@@ -2656,7 +2658,7 @@ static struct page *cache_grow_begin(struct kmem_cache 
*cachep,
 * Be lazy and only check for valid flags here,  keeping it out of the
 * critical path in kmem_cache_alloc().
 */
-   flags = check_slab_flags(flags);
+   flags = check_slab_flags(flags, cachep->flags);
WARN_ON_ONCE(cachep->ctor && (flags & __GFP_ZERO));
local_flags = flags & (GFP_CONSTRAINT_MASK|GFP_RECLAIM_MASK);
 
diff --git a/mm/slab.h b/mm/slab.h
index 58c6c1c2a78ee3..9632772e14beb2 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -127,7 +127,8 @@ static inline slab_flags_t kmem_cache_flags(unsigned int 
object_size,
 
 
 /* Legal flag mask for kmem_cache_create(), for various configurations */
-#define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | SLAB_PANIC | \
+#define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | \
+SLAB_CACHE_DMA32 | SLAB_PANIC | \
 SLAB_TYPESAFE_BY_RCU | SLAB_DEBUG_OBJECTS )
 
 #if defined(CONFIG_DEBUG_SLAB)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 7eb8dc136c1cb8..f204385553bbac 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -53,7 +53,7 @@ static DECLARE_WORK(slab_caches_to_rcu_destroy_work,
SLAB_FAILSLAB | SLAB_KASAN)
 
 #define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | \
-SLAB_ACCOUNT)
+SLAB_CACHE_DMA32 | SLAB_ACCOUNT)
 
 /*
  * Merge control. If this is set then no merging of slab caches will occur.
diff --git a/mm/slub.c b/mm/slub.c
index 1cca562bebdc8d..c639bd008e8c11 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1681,7 +1681,7 @@ static struct page *allocate_slab(struct kmem_cache *s, 
gfp_t flags, int node)
 
 static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
 {
-   flags = check_slab_flags(flags);
+   flags = check_slab_flags(flags, s->flags);
 
return allocate_slab(s,
flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
@@ -3571,6 +3571,9 @@ static int calculate_sizes(struct kmem_cache *s, int 
forced_order)
if (s->flags & SLAB_CACHE_DMA)
s->allocflags |= GFP_DMA;
 
+   if (s->flags & SLAB_CACHE_DMA32)
+   s->allocflags |= GFP_DMA32;
+
if (s->flags & SLAB_RECLAIM_ACCOUNT)
s->allocflags |= __GFP_RECLAIM

[PATCH v2 3/3] iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging

2018-11-11 Thread Nicolas Boichat
For level 1/2 pages, ensure GFP_DMA32 is used if CONFIG_ZONE_DMA32
is defined (e.g. on arm64 platforms).

For level 2 pages, allocate a slab cache in SLAB_CACHE_DMA32.

Also, print an error when the physical address does not fit in
32-bit, to make debugging easier in the future.

Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
Signed-off-by: Nicolas Boichat 
---

Changes since v1:
 - Changed approach to use SLAB_CACHE_DMA32 added by the previous
   commit.
 - Use DMA or DMA32 depending on the architecture (DMA for arm,
   DMA32 for arm64).

drivers/iommu/io-pgtable-arm-v7s.c | 20 
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 445c3bde04800c..996f7b6d00b44a 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -161,6 +161,14 @@
 
 #define ARM_V7S_TCR_PD1BIT(5)
 
+#ifdef CONFIG_ZONE_DMA32
+#define ARM_V7S_TABLE_GFP_DMA GFP_DMA32
+#define ARM_V7S_TABLE_SLAB_CACHE SLAB_CACHE_DMA32
+#else
+#define ARM_V7S_TABLE_GFP_DMA GFP_DMA
+#define ARM_V7S_TABLE_SLAB_CACHE SLAB_CACHE_DMA
+#endif
+
 typedef u32 arm_v7s_iopte;
 
 static bool selftest_running;
@@ -198,13 +206,17 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
void *table = NULL;
 
if (lvl == 1)
-   table = (void *)__get_dma_pages(__GFP_ZERO, get_order(size));
+   table = (void *)__get_free_pages(
+   __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA, get_order(size));
else if (lvl == 2)
-   table = kmem_cache_zalloc(data->l2_tables, gfp | GFP_DMA);
+   table = kmem_cache_zalloc(data->l2_tables,
+ gfp | ARM_V7S_TABLE_GFP_DMA);
phys = virt_to_phys(table);
-   if (phys != (arm_v7s_iopte)phys)
+   if (phys != (arm_v7s_iopte)phys) {
/* Doesn't fit in PTE */
+   dev_err(dev, "Page table does not fit in PTE: %pa", );
goto out_free;
+   }
if (table && !(cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA)) {
dma = dma_map_single(dev, table, size, DMA_TO_DEVICE);
if (dma_mapping_error(dev, dma))
@@ -737,7 +749,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct 
io_pgtable_cfg *cfg,
data->l2_tables = kmem_cache_create("io-pgtable_armv7s_l2",
ARM_V7S_TABLE_SIZE(2),
ARM_V7S_TABLE_SIZE(2),
-   SLAB_CACHE_DMA, NULL);
+   ARM_V7S_TABLE_SLAB_CACHE, NULL);
if (!data->l2_tables)
goto out_free_data;
 
-- 
2.19.1.930.g4563a0d9d0-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 1/3] mm: slab/slub: Add check_slab_flags function to check for valid flags

2018-11-11 Thread Nicolas Boichat
Remove duplicated code between slab and slub, and will make it
easier to make the test more complicated in the next commits.

Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
Signed-off-by: Nicolas Boichat 
---
 mm/internal.h | 17 +++--
 mm/slab.c |  8 +---
 mm/slub.c |  8 +---
 3 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 3b1ec1412fd2cd..7a500b232e4a43 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -33,8 +33,22 @@
 /* Control allocation cpuset and node placement constraints */
 #define GFP_CONSTRAINT_MASK (__GFP_HARDWALL|__GFP_THISNODE)
 
-/* Do not use these with a slab allocator */
-#define GFP_SLAB_BUG_MASK (__GFP_DMA32|__GFP_HIGHMEM|~__GFP_BITS_MASK)
+/* Check for flags that must not be used with a slab allocator */
+static inline gfp_t check_slab_flags(gfp_t flags)
+{
+   gfp_t bug_mask = __GFP_DMA32 | __GFP_HIGHMEM | ~__GFP_BITS_MASK;
+
+   if (unlikely(flags & bug_mask)) {
+   gfp_t invalid_mask = flags & bug_mask;
+
+   flags &= ~bug_mask;
+   pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x 
(%pGg). Fix your code!\n",
+   invalid_mask, _mask, flags, );
+   dump_stack();
+   }
+
+   return flags;
+}
 
 void page_writeback_init(void);
 
diff --git a/mm/slab.c b/mm/slab.c
index 2a5654bb3b3ff3..251e09a5a3ef5c 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2656,13 +2656,7 @@ static struct page *cache_grow_begin(struct kmem_cache 
*cachep,
 * Be lazy and only check for valid flags here,  keeping it out of the
 * critical path in kmem_cache_alloc().
 */
-   if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
-   gfp_t invalid_mask = flags & GFP_SLAB_BUG_MASK;
-   flags &= ~GFP_SLAB_BUG_MASK;
-   pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x 
(%pGg). Fix your code!\n",
-   invalid_mask, _mask, flags, );
-   dump_stack();
-   }
+   flags = check_slab_flags(flags);
WARN_ON_ONCE(cachep->ctor && (flags & __GFP_ZERO));
local_flags = flags & (GFP_CONSTRAINT_MASK|GFP_RECLAIM_MASK);
 
diff --git a/mm/slub.c b/mm/slub.c
index e3629cd7aff164..1cca562bebdc8d 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1681,13 +1681,7 @@ static struct page *allocate_slab(struct kmem_cache *s, 
gfp_t flags, int node)
 
 static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
 {
-   if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
-   gfp_t invalid_mask = flags & GFP_SLAB_BUG_MASK;
-   flags &= ~GFP_SLAB_BUG_MASK;
-   pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x 
(%pGg). Fix your code!\n",
-   invalid_mask, _mask, flags, );
-   dump_stack();
-   }
+   flags = check_slab_flags(flags);
 
return allocate_slab(s,
flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
-- 
2.19.1.930.g4563a0d9d0-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables

2018-11-11 Thread Nicolas Boichat
This is a follow-up to the discussion in [1], to make sure that the page
tables allocated by iommu/io-pgtable-arm-v7s are contained within 32-bit
physical address space.

[1] https://lists.linuxfoundation.org/pipermail/iommu/2018-November/030876.html

Fixes since v1:
 - Add support for SLAB_CACHE_DMA32 in slab and slub (patches 1/2)
 - iommu/io-pgtable-arm-v7s (patch 3):
   - Changed approach to use SLAB_CACHE_DMA32 added by the previous
 commit.
   - Use DMA or DMA32 depending on the architecture (DMA for arm,
 DMA32 for arm64).

Nicolas Boichat (3):
  mm: slab/slub: Add check_slab_flags function to check for valid flags
  mm: Add support for SLAB_CACHE_DMA32
  iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging

 drivers/iommu/io-pgtable-arm-v7s.c | 20 
 include/linux/slab.h   |  2 ++
 mm/internal.h  | 21 +++--
 mm/slab.c  | 10 +++---
 mm/slab.h  |  3 ++-
 mm/slab_common.c   |  2 +-
 mm/slub.c  | 24 +---
 7 files changed, 60 insertions(+), 22 deletions(-)

-- 
2.19.1.930.g4563a0d9d0-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH RFC 1/3] mm: When CONFIG_ZONE_DMA32 is set, use DMA32 for SLAB_CACHE_DMA

2018-11-09 Thread Nicolas Boichat
On Fri, Nov 9, 2018 at 6:43 PM Vlastimil Babka  wrote:
>
> On 11/9/18 9:24 AM, Nicolas Boichat wrote:
> > Some callers, namely iommu/io-pgtable-arm-v7s, expect the physical
> > address returned by kmem_cache_alloc with GFP_DMA parameter to be
> > a 32-bit address.
> >
> > Instead of adding a separate SLAB_CACHE_DMA32 (and then audit
> > all the calls to check if they require memory from DMA or DMA32
> > zone), we simply allocate SLAB_CACHE_DMA cache in DMA32 region,
> > if CONFIG_ZONE_DMA32 is set.
> >
> > Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
> > Signed-off-by: Nicolas Boichat 
> > ---
> >  include/linux/slab.h | 13 -
> >  mm/slab.c|  2 +-
> >  mm/slub.c|  2 +-
> >  3 files changed, 14 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/slab.h b/include/linux/slab.h
> > index 918f374e7156f4..390afe90c5dec0 100644
> > --- a/include/linux/slab.h
> > +++ b/include/linux/slab.h
> > @@ -30,7 +30,7 @@
> >  #define SLAB_POISON  ((slab_flags_t __force)0x0800U)
> >  /* Align objs on cache lines */
> >  #define SLAB_HWCACHE_ALIGN   ((slab_flags_t __force)0x2000U)
> > -/* Use GFP_DMA memory */
> > +/* Use GFP_DMA or GFP_DMA32 memory */
> >  #define SLAB_CACHE_DMA   ((slab_flags_t __force)0x4000U)
> >  /* DEBUG: Store the last owner for bug hunting */
> >  #define SLAB_STORE_USER  ((slab_flags_t __force)0x0001U)
> > @@ -126,6 +126,17 @@
> >  #define ZERO_OR_NULL_PTR(x) ((unsigned long)(x) <= \
> >   (unsigned long)ZERO_SIZE_PTR)
> >
> > +/*
> > + * When ZONE_DMA32 is defined, have SLAB_CACHE_DMA allocate memory with
> > + * GFP_DMA32 instead of GFP_DMA, as this is what some of the callers
> > + * require (instead of duplicating cache for DMA and DMA32 zones).
> > + */
> > +#ifdef CONFIG_ZONE_DMA32
> > +#define SLAB_CACHE_DMA_GFP GFP_DMA32
> > +#else
> > +#define SLAB_CACHE_DMA_GFP GFP_DMA
> > +#endif
>
> AFAICS this will break e.g. x86 which can have both ZONE_DMA and
> ZONE_DMA32, and now you would make kmalloc(__GFP_DMA) return objects
> from ZONE_DMA32 instead of __ZONE_DMA, which can break something.

Oh, I was not aware that both ZONE_DMA and ZONE_DMA32 can be defined
at the same time. I guess the test should be inverted, something like
this (can be simplified...):
#ifdef CONFIG_ZONE_DMA
#define SLAB_CACHE_DMA_GFP GFP_DMA
#elif defined(CONFIG_ZONE_DMA32)
#define SLAB_CACHE_DMA_GFP GFP_DMA32
#else
#define SLAB_CACHE_DMA_GFP GFP_DMA // ?
#endif

> Also I'm probably missing the point of this all. In patch 3 you use
> __get_dma32_pages() thus __get_free_pages(__GFP_DMA32), which uses
> alloc_pages, thus the page allocator directly, and there's no slab
> caches involved.

__get_dma32_pages fixes level 1 page allocations in the patch 3.

This change fixes level 2 page allocations
(kmem_cache_zalloc(data->l2_tables, gfp | GFP_DMA)), by transparently
remapping GFP_DMA to an underlying ZONE_DMA32.

The alternative would be to create a new SLAB_CACHE_DMA32 when
CONFIG_ZONE_DMA32 is defined, but then I'm concerned that the callers
would need to choose between the 2 (GFP_DMA or GFP_DMA32...), and also
need to use some ifdefs (but maybe that's not a valid concern?).

> It makes little sense to involve slab for page table
> allocations anyway, as those tend to be aligned to a page size (or
> high-order page size). So what am I missing?

Level 2 tables are ARM_V7S_TABLE_SIZE(2) => 1kb, so we'd waste 3kb if
we allocated a full page.

Thanks,
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: iommu/io-pgtable-arm-v7s: About pagetable 33bit PA

2018-11-09 Thread Nicolas Boichat
Hi Robin/Yong,

On Fri, Nov 9, 2018 at 3:51 PM Yong Wu  wrote:
>
> On Thu, 2018-11-08 at 13:49 +, Robin Murphy wrote:
> > On 08/11/2018 07:31, Yong Wu wrote:
> > > Hi Robin,
> > >
> > > After the commit ad67f5a6545f ("arm64: replace ZONE_DMA with
> > > ZONE_DMA32"), we don't have ZONE_DMA in aarch64, but
> > > __arm_v7s_alloc_table[1] use the GFP_DMA to expect the physical address
> > > of pagetable should be 32bit.
> > >
> > > Right now we meet a issue that the lvl1 pagetable PA is 0x1_3ab6_ in
> > > the 4GB broad. then the IOMMU initialize failed.(This issue can be fixed
> > > if we revert ad67f5a6545f.)
> >
> > But that shouldn't actually be failing on MTK platforms anyway, right,
> > since your IOMMU is still capable of addressing that? Seems like we
> > overlooked that check in __arm_v7s_alloc_table(), where the conversion
> > by casting will want updating to something like
> > "iopte_to_paddr(paddr_to_iopte(...))" in patch #4 of your series.
>
> The current mt8183 IOMMU support the pagetable address over 4GB.
> Unfortunately the previous mt8173 don't support.(mt8173 IOMMU support
> mapping for the dram over 4GB while its pagetable still need locate
> below 4GB.)
>
> In order to unify the code, we still expect lvl2 pagetable base don't
> over 4GB. thus I didn't change it in the current patchset.
>
> >
> > > I planed to add GFP_DMA32 for pagetable allocation. the level1 was
> > > ok but level2 was failed. It looks that slab don't like GFP_DMA32[2].
> > > this is the warning log:
> > > =
> > > Unexpected gfp: 0x4 (GFP_DMA32). Fixing up to gfp: 0x488020 (GFP_ATOMIC|
> > > __GFP_ZERO). Fix your code!
> > > =
> > > I don't know why slab don't allow GFP_DMA32, the gfp flags seems only
> > > be bypassed to alloc_page. Is it possible that let slab allow GFP_DMA32
> > > for aarch64?
> > I had a bit of a look into it some time ago, and I don't recall seeing
> > any obvious reason that the kmem_cache API couldn't support ZONE_DMA32
> > in general (either via a separate SLAB_CACHE_DMA32, or just overloading
> > SLAB_CACHE_DMA depending on which of CONFIG_ZONE_DMA/CONFIG_ZONE_DMA32
> > are enabled) - I guess that's just never been implemented since nobody
> > needed it before.
>
> Thanks for the comment. We could try a patch for this.

I made an attempt here, if that's helpful:
https://lore.kernel.org/patchwork/cover/1009116/ .

I'm still a bit uncomfortable with GFP_DMA passed to kmem_cache_alloc
suddenly becoming GFP_DMA32 in the underlying call, but I'm not sure
if there is a better way (apart from duplicating the caches, and a lot
of ifdefs in callers).

Thanks,

> >
> > Robin.
> >
> > > We notice that this has been discussed[3]. but if we use alloc_page for
> > > the level2 pagetable, It may waste lots of memory.
> > >
> > > Any suggestion is appreciated.
> > >
> > >
> > > Reported-by: Nicolas Boichat 
> > >
> > > [1]
> > > https://elixir.bootlin.com/linux/v4.20-rc1/source/drivers/iommu/io-pgtable-arm-v7s.c#L200
> > > [2] https://elixir.bootlin.com/linux/v4.20-rc1/source/mm/internal.h#L37
> > > [3] https://patchwork.kernel.org/patch/10474289/
> > >
>
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH RFC 1/3] mm: When CONFIG_ZONE_DMA32 is set, use DMA32 for SLAB_CACHE_DMA

2018-11-09 Thread Nicolas Boichat
Some callers, namely iommu/io-pgtable-arm-v7s, expect the physical
address returned by kmem_cache_alloc with GFP_DMA parameter to be
a 32-bit address.

Instead of adding a separate SLAB_CACHE_DMA32 (and then audit
all the calls to check if they require memory from DMA or DMA32
zone), we simply allocate SLAB_CACHE_DMA cache in DMA32 region,
if CONFIG_ZONE_DMA32 is set.

Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
Signed-off-by: Nicolas Boichat 
---
 include/linux/slab.h | 13 -
 mm/slab.c|  2 +-
 mm/slub.c|  2 +-
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156f4..390afe90c5dec0 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -30,7 +30,7 @@
 #define SLAB_POISON((slab_flags_t __force)0x0800U)
 /* Align objs on cache lines */
 #define SLAB_HWCACHE_ALIGN ((slab_flags_t __force)0x2000U)
-/* Use GFP_DMA memory */
+/* Use GFP_DMA or GFP_DMA32 memory */
 #define SLAB_CACHE_DMA ((slab_flags_t __force)0x4000U)
 /* DEBUG: Store the last owner for bug hunting */
 #define SLAB_STORE_USER((slab_flags_t __force)0x0001U)
@@ -126,6 +126,17 @@
 #define ZERO_OR_NULL_PTR(x) ((unsigned long)(x) <= \
(unsigned long)ZERO_SIZE_PTR)
 
+/*
+ * When ZONE_DMA32 is defined, have SLAB_CACHE_DMA allocate memory with
+ * GFP_DMA32 instead of GFP_DMA, as this is what some of the callers
+ * require (instead of duplicating cache for DMA and DMA32 zones).
+ */
+#ifdef CONFIG_ZONE_DMA32
+#define SLAB_CACHE_DMA_GFP GFP_DMA32
+#else
+#define SLAB_CACHE_DMA_GFP GFP_DMA
+#endif
+
 #include 
 
 struct mem_cgroup;
diff --git a/mm/slab.c b/mm/slab.c
index 2a5654bb3b3ff3..8810daa052dcdc 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2121,7 +2121,7 @@ int __kmem_cache_create(struct kmem_cache *cachep, 
slab_flags_t flags)
cachep->flags = flags;
cachep->allocflags = __GFP_COMP;
if (flags & SLAB_CACHE_DMA)
-   cachep->allocflags |= GFP_DMA;
+   cachep->allocflags |= SLAB_CACHE_DMA_GFP;
if (flags & SLAB_RECLAIM_ACCOUNT)
cachep->allocflags |= __GFP_RECLAIMABLE;
cachep->size = size;
diff --git a/mm/slub.c b/mm/slub.c
index e3629cd7aff164..fdd05323e54cbd 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3575,7 +3575,7 @@ static int calculate_sizes(struct kmem_cache *s, int 
forced_order)
s->allocflags |= __GFP_COMP;
 
if (s->flags & SLAB_CACHE_DMA)
-   s->allocflags |= GFP_DMA;
+   s->allocflags |= SLAB_CACHE_DMA_GFP;
 
if (s->flags & SLAB_RECLAIM_ACCOUNT)
s->allocflags |= __GFP_RECLAIMABLE;
-- 
2.19.1.930.g4563a0d9d0-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH RFC 3/3] iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging

2018-11-09 Thread Nicolas Boichat
For level 1 pages, use __get_dma32_pages to make sure physical
memory address is 32-bit.

For level 2 pages, kmem_cache_zalloc with GFP_DMA has been modified
in a previous patch to be allocated in DMA32 zone.

Also, print an error when the physical address does not fit in
32-bit, to make debugging easier in the future.

Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
Signed-off-by: Nicolas Boichat 
---
 drivers/iommu/io-pgtable-arm-v7s.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 445c3bde04800c..862fd404de84d8 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -198,13 +198,15 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
void *table = NULL;
 
if (lvl == 1)
-   table = (void *)__get_dma_pages(__GFP_ZERO, get_order(size));
+   table = (void *)__get_dma32_pages(__GFP_ZERO, get_order(size));
else if (lvl == 2)
table = kmem_cache_zalloc(data->l2_tables, gfp | GFP_DMA);
phys = virt_to_phys(table);
-   if (phys != (arm_v7s_iopte)phys)
+   if (phys != (arm_v7s_iopte)phys) {
/* Doesn't fit in PTE */
+   dev_err(dev, "Page table does not fit: %pa", );
goto out_free;
+   }
if (table && !(cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA)) {
dma = dma_map_single(dev, table, size, DMA_TO_DEVICE);
if (dma_mapping_error(dev, dma))
-- 
2.19.1.930.g4563a0d9d0-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH RFC 2/3] include/linux/gfp.h: Add __get_dma32_pages macro

2018-11-09 Thread Nicolas Boichat
Some callers (e.g. iommu/io-pgtable-arm-v7s) require DMA32 memory
when calling __get_dma_pages. Add a new macro for that purpose.

Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
Signed-off-by: Nicolas Boichat 
---
 include/linux/gfp.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 76f8db0b0e715c..50e04896b78017 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -535,6 +535,8 @@ void * __meminit alloc_pages_exact_nid(int nid, size_t 
size, gfp_t gfp_mask);
 
 #define __get_dma_pages(gfp_mask, order) \
__get_free_pages((gfp_mask) | GFP_DMA, (order))
+#define __get_dma32_pages(gfp_mask, order) \
+   __get_free_pages((gfp_mask) | GFP_DMA32, (order))
 
 extern void __free_pages(struct page *page, unsigned int order);
 extern void free_pages(unsigned long addr, unsigned int order);
-- 
2.19.1.930.g4563a0d9d0-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH RFC 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables

2018-11-09 Thread Nicolas Boichat
This is a follow-up to the discussion in [1], to make sure that the page tables
allocated by iommu/io-pgtable-arm-v7s are contained within 32-bit physical
address space.

[1] https://lists.linuxfoundation.org/pipermail/iommu/2018-November/030876.html

Nicolas Boichat (3):
  mm: When CONFIG_ZONE_DMA32 is set, use DMA32 for SLAB_CACHE_DMA
  include/linux/gfp.h: Add __get_dma32_pages macro
  iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging

 drivers/iommu/io-pgtable-arm-v7s.c |  6 --
 include/linux/gfp.h|  2 ++
 include/linux/slab.h   | 13 -
 mm/slab.c  |  2 +-
 mm/slub.c  |  2 +-
 5 files changed, 20 insertions(+), 5 deletions(-)

-- 
2.19.1.930.g4563a0d9d0-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu