Re: [RFC PATCH 15/18] cgroup: Introduce ioasids controller
Hi Jean-Philippe,

On Fri, 5 Mar 2021 09:30:49 +0100, Jean-Philippe Brucker wrote:

> On Thu, Mar 04, 2021 at 09:46:03AM -0800, Jacob Pan wrote:
> [...]
> > Right, I was assuming three use cases of IOASIDs:
> > 1. host supervisor SVA (not a concern, just one init_mm to bind)
> > 2. host user SVA, either one IOASID per process or perhaps some private
> >    IOASID for private address space
> > 3. VM use for guest SVA, each IOASID is bound to a guest process
> >
> > My current cgroup proposal applies to #3 with IOASID_SET_TYPE_MM, which
> > is allocated by the new /dev/ioasid interface.
> >
> > For #2, I was thinking you can limit the host process via the PIDs
> > cgroup, i.e. limit fork.
>
> That works but isn't perfect, because the hardware resource of shared
> address spaces can be much lower than the PID limit - 16k ASIDs on Arm.
> To allow an admin to fairly distribute that resource we could introduce
> another cgroup just to limit the number of shared address spaces, but
> limiting the number of IOASIDs does the trick.

Makes sense. It would be cleaner to have a single approach to limit
IOASIDs (as Jason asked).

> > So the host IOASIDs are currently allocated from the system pool with a
> > quota chosen by iommu_sva_init() in my patch; 0 means unlimited, use
> > whatever is available. https://lkml.org/lkml/2021/2/28/18
>
> Yes that's sensible, but it would be good to plan the cgroup user
> interface to work for #2 as well, even if we don't implement it right
> away.

Will do it in the next version.

> Thanks,
> Jean

Thanks,

Jacob
Re: [RFC PATCH 15/18] cgroup: Introduce ioasids controller
On Fri, Mar 05, 2021 at 09:30:49AM +0100, Jean-Philippe Brucker wrote:
> That works but isn't perfect, because the hardware resource of shared
> address spaces can be much lower than the PID limit - 16k ASIDs on Arm. To

Sorry, I meant 16-bit here - 64k.

Thanks,
Jean
Re: [RFC PATCH 15/18] cgroup: Introduce ioasids controller
On Thu, Mar 04, 2021 at 09:46:03AM -0800, Jacob Pan wrote:
> Hi Jean-Philippe,
>
> On Thu, 4 Mar 2021 10:49:37 +0100, Jean-Philippe Brucker wrote:
> [...]
> > Yes, even for host SVA it would be good to have a cgroup. Currently the
> > number of shared address spaces is naturally limited by the number of
> > processes, which can be controlled with rlimit and cgroup. But on Arm
> > the hardware limit on shared address spaces is 64k (number of ASIDs),
> > easily exhausted with the default PASID and PID limits. So a cgroup for
> > managing this resource is more than welcome.
> >
> > It looks like your current implementation is very dependent on
> > IOASID_SET_TYPE_MM? I'll need to do more reading about cgroup to see
> > how easily it can be adapted to host SVA, which uses
> > IOASID_SET_TYPE_NULL.
>
> Right, I was assuming three use cases of IOASIDs:
> 1. host supervisor SVA (not a concern, just one init_mm to bind)
> 2. host user SVA, either one IOASID per process or perhaps some private
>    IOASID for private address space
> 3. VM use for guest SVA, each IOASID is bound to a guest process
>
> My current cgroup proposal applies to #3 with IOASID_SET_TYPE_MM, which
> is allocated by the new /dev/ioasid interface.
>
> For #2, I was thinking you can limit the host process via the PIDs
> cgroup, i.e. limit fork.

That works but isn't perfect, because the hardware resource of shared
address spaces can be much lower than the PID limit - 16k ASIDs on Arm. To
allow an admin to fairly distribute that resource we could introduce
another cgroup just to limit the number of shared address spaces, but
limiting the number of IOASIDs does the trick.

> So the host IOASIDs are currently allocated from the system pool with a
> quota chosen by iommu_sva_init() in my patch; 0 means unlimited, use
> whatever is available. https://lkml.org/lkml/2021/2/28/18

Yes that's sensible, but it would be good to plan the cgroup user
interface to work for #2 as well, even if we don't implement it right
away.

Thanks,
Jean
Re: [RFC PATCH 15/18] cgroup: Introduce ioasids controller
Hi Jason,

On Thu, 4 Mar 2021 15:02:53 -0400, Jason Gunthorpe wrote:
> On Thu, Mar 04, 2021 at 11:01:44AM -0800, Jacob Pan wrote:
> > > For something like qemu I'd expect to put the qemu process in a
> > > cgroup with 1 PASID. Who cares what qemu uses the PASID for, or how
> > > it was allocated?
> >
> > For vSVA, we will need one PASID per guest process. But that is up to
> > the admin based on whether or how many SVA capable devices are directly
> > assigned.
>
> I hope the virtual IOMMU driver can communicate the PASID limit and
> the cgroup machinery in the guest can know what the actual limit is.

For VT-d, the emulated vIOMMU can communicate to the guest IOMMU driver how
many PASID bits are supported (the PASID size field in the extended
capability register). But it cannot communicate how many PASIDs are in the
pool (host cgroup capacity). The QEMU process may not be the only one in a
cgroup, so it cannot give hard guarantees. I don't see a good way to
communicate accurately at runtime as the process migrates or limits change.

We were thinking to adopt the "Limits" model as defined in the cgroup-v2
documentation:

"
Limits
------

A child can only consume upto the configured amount of the resource.
Limits can be over-committed - the sum of the limits of children can
exceed the amount of resource available to the parent.
"

So the guest cgroup would still think it has the full 20 bits of PASID at
its disposal, but PASID allocation may fail before reaching the full 20
bits (2M). Similarly on the host side, we only enforce the limit set by
the cgroup but do not guarantee it.

> I was thinking of a case where qemu is using a single PASID to setup
> the guest kVA or similar

Got it.

> Jason

Thanks,

Jacob
Re: [RFC PATCH 15/18] cgroup: Introduce ioasids controller
On Thu, Mar 04, 2021 at 11:01:44AM -0800, Jacob Pan wrote:
> > For something like qemu I'd expect to put the qemu process in a cgroup
> > with 1 PASID. Who cares what qemu uses the PASID for, or how it was
> > allocated?
>
> For vSVA, we will need one PASID per guest process. But that is up to the
> admin based on whether or how many SVA capable devices are directly
> assigned.

I hope the virtual IOMMU driver can communicate the PASID limit and
the cgroup machinery in the guest can know what the actual limit is.

I was thinking of a case where qemu is using a single PASID to setup
the guest kVA or similar

Jason
Re: [RFC PATCH 15/18] cgroup: Introduce ioasids controller
Hi Jason,

On Thu, 4 Mar 2021 13:54:02 -0400, Jason Gunthorpe wrote:
> On Thu, Mar 04, 2021 at 09:46:03AM -0800, Jacob Pan wrote:
>
> > Right, I was assuming three use cases of IOASIDs:
> > 1. host supervisor SVA (not a concern, just one init_mm to bind)
> > 2. host user SVA, either one IOASID per process or perhaps some private
> >    IOASID for private address space
> > 3. VM use for guest SVA, each IOASID is bound to a guest process
> >
> > My current cgroup proposal applies to #3 with IOASID_SET_TYPE_MM, which
> > is allocated by the new /dev/ioasid interface.
> >
> > For #2, I was thinking you can limit the host process via the PIDs
> > cgroup, i.e. limit fork. So the host IOASIDs are currently allocated
> > from the system pool with a quota chosen by iommu_sva_init() in my
> > patch; 0 means unlimited, use whatever is available.
> > https://lkml.org/lkml/2021/2/28/18
>
> Why do we need two pools?
>
> If PASIDs are limited then why does it matter how the PASID was
> allocated? Either the thing requesting it is below the limit, or it
> isn't.

You are right. It should be tracked based on the process regardless of
whether it is allocated by the user (/dev/ioasid) or indirectly by kernel
drivers during iommu_sva_bind_device(). Need to consolidate both #2 and #3
and decouple cgroup and IOASID set.

> For something like qemu I'd expect to put the qemu process in a cgroup
> with 1 PASID. Who cares what qemu uses the PASID for, or how it was
> allocated?

For vSVA, we will need one PASID per guest process. But that is up to the
admin based on whether or how many SVA capable devices are directly
assigned.

> Jason

Thanks,

Jacob
Re: [RFC PATCH 15/18] cgroup: Introduce ioasids controller
On Thu, Mar 04, 2021 at 09:46:03AM -0800, Jacob Pan wrote:
> Right, I was assuming three use cases of IOASIDs:
> 1. host supervisor SVA (not a concern, just one init_mm to bind)
> 2. host user SVA, either one IOASID per process or perhaps some private
>    IOASID for private address space
> 3. VM use for guest SVA, each IOASID is bound to a guest process
>
> My current cgroup proposal applies to #3 with IOASID_SET_TYPE_MM, which
> is allocated by the new /dev/ioasid interface.
>
> For #2, I was thinking you can limit the host process via the PIDs
> cgroup, i.e. limit fork. So the host IOASIDs are currently allocated from
> the system pool with a quota chosen by iommu_sva_init() in my patch; 0
> means unlimited, use whatever is available.
> https://lkml.org/lkml/2021/2/28/18

Why do we need two pools?

If PASIDs are limited then why does it matter how the PASID was
allocated? Either the thing requesting it is below the limit, or it
isn't.

For something like qemu I'd expect to put the qemu process in a cgroup
with 1 PASID. Who cares what qemu uses the PASID for, or how it was
allocated?

Jason
Re: [RFC PATCH 15/18] cgroup: Introduce ioasids controller
Hi Jean-Philippe,

On Thu, 4 Mar 2021 10:49:37 +0100, Jean-Philippe Brucker wrote:
> On Wed, Mar 03, 2021 at 04:02:05PM -0800, Jacob Pan wrote:
> [...]
> > While I am trying to fit the IOASIDs cgroup into the misc cgroup
> > proposal, I'd like to have a direction check on whether this idea of
> > using cgroup for IOASID/PASID resource management is viable.
>
> Yes, even for host SVA it would be good to have a cgroup. Currently the
> number of shared address spaces is naturally limited by the number of
> processes, which can be controlled with rlimit and cgroup. But on Arm the
> hardware limit on shared address spaces is 64k (number of ASIDs), easily
> exhausted with the default PASID and PID limits. So a cgroup for managing
> this resource is more than welcome.
>
> It looks like your current implementation is very dependent on
> IOASID_SET_TYPE_MM? I'll need to do more reading about cgroup to see how
> easily it can be adapted to host SVA, which uses IOASID_SET_TYPE_NULL.

Right, I was assuming three use cases of IOASIDs:
1. host supervisor SVA (not a concern, just one init_mm to bind)
2. host user SVA, either one IOASID per process or perhaps some private
   IOASID for private address space
3. VM use for guest SVA, each IOASID is bound to a guest process

My current cgroup proposal applies to #3 with IOASID_SET_TYPE_MM, which is
allocated by the new /dev/ioasid interface.

For #2, I was thinking you can limit the host process via the PIDs cgroup,
i.e. limit fork. So the host IOASIDs are currently allocated from the
system pool with a quota chosen by iommu_sva_init() in my patch; 0 means
unlimited, use whatever is available. https://lkml.org/lkml/2021/2/28/18

> Thanks,
> Jean

Thanks,

Jacob
Re: [RFC PATCH 15/18] cgroup: Introduce ioasids controller
On Wed, Mar 03, 2021 at 04:02:05PM -0800, Jacob Pan wrote:
> Hi Jacob,
>
> On Wed, 3 Mar 2021 13:17:26 -0800, Jacob Pan wrote:
> [...]
> > The interface definitely can be reused. But IOASID has a different
> > behavior in terms of migration and ownership checking. I guess SEV key
> > IDs are not tied to a process whereas IOASIDs are. Perhaps this can be
> > solved by adding
> > + .can_attach = ioasids_can_attach,
> > + .cancel_attach = ioasids_cancel_attach,
> > Let me give it a try and come back.
>
> While I am trying to fit the IOASIDs cgroup into the misc cgroup
> proposal, I'd like to have a direction check on whether this idea of
> using cgroup for IOASID/PASID resource management is viable.

Yes, even for host SVA it would be good to have a cgroup. Currently the
number of shared address spaces is naturally limited by the number of
processes, which can be controlled with rlimit and cgroup. But on Arm the
hardware limit on shared address spaces is 64k (number of ASIDs), easily
exhausted with the default PASID and PID limits. So a cgroup for managing
this resource is more than welcome.

It looks like your current implementation is very dependent on
IOASID_SET_TYPE_MM? I'll need to do more reading about cgroup to see how
easily it can be adapted to host SVA, which uses IOASID_SET_TYPE_NULL.

Thanks,
Jean
Re: [RFC PATCH 15/18] cgroup: Introduce ioasids controller
Hi Jacob,

On Wed, 3 Mar 2021 13:17:26 -0800, Jacob Pan wrote:
> Hi Tejun,
>
> On Wed, 3 Mar 2021 10:44:28 -0500, Tejun Heo wrote:
> [...]
> > Please take a look at the proposed misc controller:
> >
> > http://lkml.kernel.org/r/20210302081705.1990283-2-vipi...@google.com
> >
> > Would that fit your bill?
>
> The interface definitely can be reused. But IOASID has a different
> behavior in terms of migration and ownership checking. I guess SEV key
> IDs are not tied to a process whereas IOASIDs are. Perhaps this can be
> solved by adding
> + .can_attach = ioasids_can_attach,
> + .cancel_attach = ioasids_cancel_attach,
> Let me give it a try and come back.

While I am trying to fit the IOASIDs cgroup into the misc cgroup proposal,
I'd like to have a direction check on whether this idea of using cgroup for
IOASID/PASID resource management is viable.

Alex/Jason/Jean and everyone, your feedback is much appreciated.

> Thanks for the pointer.
>
> Jacob
>
> > Thanks.

Thanks,

Jacob
Re: [RFC PATCH 15/18] cgroup: Introduce ioasids controller
On Wed, Mar 03, 2021 at 04:02:05PM -0800, Jacob Pan wrote:
> > The interface definitely can be reused. But IOASID has a different
> > behavior in terms of migration and ownership checking. I guess SEV key
> > IDs are not tied to a process whereas IOASIDs are. Perhaps this can be
> > solved by adding
> > + .can_attach = ioasids_can_attach,
> > + .cancel_attach = ioasids_cancel_attach,
> > Let me give it a try and come back.
>
> While I am trying to fit the IOASIDs cgroup into the misc cgroup
> proposal, I'd like to have a direction check on whether this idea of
> using cgroup for IOASID/PASID resource management is viable.
>
> Alex/Jason/Jean and everyone, your feedback is much appreciated.

IMHO I can't think of anything else to enforce some limit on a scarce HW
resource that unprivileged userspace can consume.

Jason
Re: [RFC PATCH 15/18] cgroup: Introduce ioasids controller
Hi Tejun,

On Wed, 3 Mar 2021 10:44:28 -0500, Tejun Heo wrote:
> On Sat, Feb 27, 2021 at 02:01:23PM -0800, Jacob Pan wrote:
> > IOASIDs are used to associate DMA requests with virtual address spaces.
> > They are a system-wide limited resource made available to userspace
> > applications, be it VMs or user-space device drivers.
> >
> > This RFC patch introduces a cgroup controller to address the following
> > problems:
> > 1. Some user applications exhaust all the available IOASIDs, thus
> >    depriving others on the same host.
> > 2. System admins need to provision VMs based on their needs for
> >    IOASIDs, e.g. the number of VMs with assigned devices that perform
> >    DMA requests with PASID.
>
> Please take a look at the proposed misc controller:
>
> http://lkml.kernel.org/r/20210302081705.1990283-2-vipi...@google.com
>
> Would that fit your bill?

The interface definitely can be reused. But IOASID has a different
behavior in terms of migration and ownership checking. I guess SEV key
IDs are not tied to a process whereas IOASIDs are. Perhaps this can be
solved by adding

+ .can_attach = ioasids_can_attach,
+ .cancel_attach = ioasids_cancel_attach,

Let me give it a try and come back.

Thanks for the pointer.

Jacob

> Thanks.

Thanks,

Jacob
Re: [RFC PATCH 15/18] cgroup: Introduce ioasids controller
On Sat, Feb 27, 2021 at 02:01:23PM -0800, Jacob Pan wrote:
> IOASIDs are used to associate DMA requests with virtual address spaces.
> They are a system-wide limited resource made available to userspace
> applications, be it VMs or user-space device drivers.
>
> This RFC patch introduces a cgroup controller to address the following
> problems:
> 1. Some user applications exhaust all the available IOASIDs, thus
>    depriving others on the same host.
> 2. System admins need to provision VMs based on their needs for IOASIDs,
>    e.g. the number of VMs with assigned devices that perform DMA requests
>    with PASID.

Please take a look at the proposed misc controller:

http://lkml.kernel.org/r/20210302081705.1990283-2-vipi...@google.com

Would that fit your bill?

Thanks.

--
tejun
[RFC PATCH 15/18] cgroup: Introduce ioasids controller
IOASIDs are used to associate DMA requests with virtual address spaces.
They are a system-wide limited resource made available to userspace
applications, be it VMs or user-space device drivers.

This RFC patch introduces a cgroup controller to address the following
problems:
1. Some user applications exhaust all the available IOASIDs, thus
   depriving others on the same host.
2. System admins need to provision VMs based on their needs for IOASIDs,
   e.g. the number of VMs with assigned devices that perform DMA requests
   with PASID.

This patch is nowhere near completion; it merely provides the basic
functionality for resource distribution and cgroup hierarchy
organizational changes. Since this is part of a greater effort to enable
Shared Virtual Address (SVA) virtualization, we would like to have a
direction check and collect feedback early.

For details, please refer to the documentation:
Documentation/admin-guide/cgroup-v1/ioasids.rst

Signed-off-by: Jacob Pan
---
 include/linux/cgroup_subsys.h |   4 +
 include/linux/ioasid.h        |  17 ++
 init/Kconfig                  |   7 +
 kernel/cgroup/Makefile        |   1 +
 kernel/cgroup/ioasids.c       | 345 ++
 5 files changed, 374 insertions(+)
 create mode 100644 kernel/cgroup/ioasids.c

diff --git a/include/linux/cgroup_subsys.h b/include/linux/cgroup_subsys.h
index acb77dcff3b4..cda75ecdcdcb 100644
--- a/include/linux/cgroup_subsys.h
+++ b/include/linux/cgroup_subsys.h
@@ -57,6 +57,10 @@ SUBSYS(hugetlb)
 SUBSYS(pids)
 #endif
 
+#if IS_ENABLED(CONFIG_CGROUP_IOASIDS)
+SUBSYS(ioasids)
+#endif
+
 #if IS_ENABLED(CONFIG_CGROUP_RDMA)
 SUBSYS(rdma)
 #endif
diff --git a/include/linux/ioasid.h b/include/linux/ioasid.h
index 4547086797df..5ea4710efb02 100644
--- a/include/linux/ioasid.h
+++ b/include/linux/ioasid.h
@@ -135,8 +135,25 @@ void ioasid_set_for_each_ioasid(struct ioasid_set *sdata, void *data);
 int ioasid_register_notifier_mm(struct mm_struct *mm, struct notifier_block *nb);
 void ioasid_unregister_notifier_mm(struct mm_struct *mm, struct notifier_block *nb);
+#ifdef CONFIG_CGROUP_IOASIDS
+int ioasid_cg_charge(struct ioasid_set *set);
+void ioasid_cg_uncharge(struct ioasid_set *set);
+#else
+/* No cgroup control, allocation will proceed until the total pool runs out */
+static inline int ioasid_cg_charge(struct ioasid_set *set)
+{
+	return 0;
+}
+
+static inline void ioasid_cg_uncharge(struct ioasid_set *set)
+{
+}
+#endif /* CGROUP_IOASIDS */
 bool ioasid_queue_work(struct work_struct *work);
+
 #else /* !CONFIG_IOASID */
+
 static inline void ioasid_install_capacity(ioasid_t total)
 {
 }
diff --git a/init/Kconfig b/init/Kconfig
index b77c60f8b963..9a23683dad98 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1017,6 +1017,13 @@ config CGROUP_PIDS
 	  since the PIDs limit only affects a process's ability to fork, not to
 	  attach to a cgroup.
 
+config CGROUP_IOASIDS
+	bool "IOASIDs controller"
+	depends on IOASID
+	help
+	  Provides enforcement of IO Address Space ID limits in the scope of a
+	  cgroup.
+
 config CGROUP_RDMA
 	bool "RDMA controller"
 	help
diff --git a/kernel/cgroup/Makefile b/kernel/cgroup/Makefile
index 5d7a76bfbbb7..c5ad7c9a2305 100644
--- a/kernel/cgroup/Makefile
+++ b/kernel/cgroup/Makefile
@@ -3,6 +3,7 @@ obj-y := cgroup.o rstat.o namespace.o cgroup-v1.o freezer.o
 
 obj-$(CONFIG_CGROUP_FREEZER) += legacy_freezer.o
 obj-$(CONFIG_CGROUP_PIDS) += pids.o
+obj-$(CONFIG_CGROUP_IOASIDS) += ioasids.o
 obj-$(CONFIG_CGROUP_RDMA) += rdma.o
 obj-$(CONFIG_CPUSETS) += cpuset.o
 obj-$(CONFIG_CGROUP_DEBUG) += debug.o
diff --git a/kernel/cgroup/ioasids.c b/kernel/cgroup/ioasids.c
new file mode 100644
index ..ac43813da6ad
--- /dev/null
+++ b/kernel/cgroup/ioasids.c
@@ -0,0 +1,345 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * IO Address Space ID limiting controller for cgroups.
+ *
+ */
+#define pr_fmt(fmt)	"ioasids_cg: " fmt
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#define IOASIDS_MAX_STR "max"
+static DEFINE_MUTEX(ioasids_cg_lock);
+
+struct ioasids_cgroup {
+	struct cgroup_subsys_state	css;
+	atomic64_t			counter;
+	atomic64_t			limit;
+	struct cgroup_file		events_file;
+	/* Number of times allocations failed because limit was hit. */
+	atomic64_t			events_limit;
+};
+
+static struct ioasids_cgroup *css_ioasids(struct cgroup_subsys_state *css)
+{
+	return container_of(css, struct ioasids_cgroup, css);
+}
+
+static struct ioasids_cgroup *parent_ioasids(struct ioasids_cgroup *ioasids)
+{
+	return css_ioasids(ioasids->css.parent);
+}
+
+static struct cgroup_subsys_state *
+ioasids_css_alloc(struct cgroup_subsys_state *parent)
+{
+	struct ioasids_cgroup *ioasids;
+
+	ioasids = kzalloc(sizeof(struct i