Re: [PATCHv6 1/3] rdmacg: Added rdma cgroup controller

2016-02-21 Thread Leon Romanovsky
On Sun, Feb 21, 2016 at 07:41:08PM +0530, Parav Pandit wrote:
> CONFIG_CGROUP_RDMA
> 
> On Sun, Feb 21, 2016 at 7:15 PM, Leon Romanovsky  wrote:
> > On Sun, Feb 21, 2016 at 05:03:05PM +0530, Parav Pandit wrote:
> >> On Sun, Feb 21, 2016 at 1:13 PM, Leon Romanovsky  wrote:
> >> > On Sat, Feb 20, 2016 at 04:30:04PM +0530, Parav Pandit wrote:
> >> > Can you place this ifdef before declaring struct rdma_cgroup?
> >> Yes. I missed out this cleanup. Done locally now.
> >
> > Great, additional thing which spotted my attention was related to
> > declaring and using the new cgroups functions. There are number of
> > places where you protected the calls by specific ifdefs in the
> > IB/core c-files and not in h-files as it is usually done.
> >
> ib_device_register_rdmacg, ib_device_unregister_rdmacg are the only
> two functions called from IB/core as its tied to functionality.
> They can also be implemented as NULL call when CONFIG_CGROUP_RDMA is 
> undefined.
> (Similar to ib_rdmacg_try_charge and others).
> I didn't do because occurrence of call of register and unregister is
> limited to single file and only twice compare to charge/uncharge
> functions.
> Either way is fine with me, I can make the changes which you
> described. Let me know.

Please do.
IMHO, it is better to have one place that handles all relevant ifdefs
and functions. IB/core doesn't need to know about the cgroup implementation.
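
The header-side approach asked for here can be sketched roughly as follows.
This is a minimal user-space illustration, not the code from the patch:
struct demo_ib_device and the demo_* function names are hypothetical
stand-ins, and only the CONFIG_CGROUP_RDMA symbol comes from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the real device structure. */
struct demo_ib_device {
	int registered;
};

/* ---- the part that would live in the header ---- */
#ifdef CONFIG_CGROUP_RDMA
/* Controller enabled: the calls do real work. */
static inline int demo_device_register_rdmacg(struct demo_ib_device *dev)
{
	dev->registered = 1;
	return 0;
}

static inline void demo_device_unregister_rdmacg(struct demo_ib_device *dev)
{
	dev->registered = 0;
}
#else
/* Controller disabled: no-op stubs, so callers need no ifdef. */
static inline int demo_device_register_rdmacg(struct demo_ib_device *dev)
{
	(void)dev;
	return 0;
}

static inline void demo_device_unregister_rdmacg(struct demo_ib_device *dev)
{
	(void)dev;
}
#endif

/* ---- the part that would live in the .c file: no ifdef needed ---- */
static int demo_register_device(struct demo_ib_device *dev)
{
	int ret;

	assert(dev != NULL);
	ret = demo_device_register_rdmacg(dev);
	if (ret)
		return ret;
	/* ... the rest of device registration would go here ... */
	return 0;
}
```

With this layout the caller compiles unchanged whether or not the symbol is
defined, which is the "one place for all ifdefs" property described above.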

Thanks
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv6 1/3] rdmacg: Added rdma cgroup controller

2016-02-21 Thread Leon Romanovsky
On Sun, Feb 21, 2016 at 05:03:05PM +0530, Parav Pandit wrote:
> On Sun, Feb 21, 2016 at 1:13 PM, Leon Romanovsky  wrote:
> > On Sat, Feb 20, 2016 at 04:30:04PM +0530, Parav Pandit wrote:
> > Can you place this ifdef before declaring struct rdma_cgroup?
> Yes. I missed out this cleanup. Done locally now.

Great. One additional thing that caught my attention is related to
declaring and using the new cgroup functions. There are a number of
places where you protected the calls with specific ifdefs in the
IB/core .c files and not in the .h files, as is usually done.

> 
> > Thanks


Re: [PATCHv6 1/3] rdmacg: Added rdma cgroup controller

2016-02-21 Thread Parav Pandit
On Sun, Feb 21, 2016 at 8:39 PM, Leon Romanovsky  wrote:
> On Sun, Feb 21, 2016 at 07:41:08PM +0530, Parav Pandit wrote:
>> CONFIG_CGROUP_RDMA
>>
>> On Sun, Feb 21, 2016 at 7:15 PM, Leon Romanovsky  wrote:
>> > On Sun, Feb 21, 2016 at 05:03:05PM +0530, Parav Pandit wrote:
>> >> On Sun, Feb 21, 2016 at 1:13 PM, Leon Romanovsky  wrote:
>> >> > On Sat, Feb 20, 2016 at 04:30:04PM +0530, Parav Pandit wrote:
>> >> > Can you place this ifdef before declaring struct rdma_cgroup?
>> >> Yes. I missed out this cleanup. Done locally now.
>> >
>> > Great, additional thing which spotted my attention was related to
>> > declaring and using the new cgroups functions. There are number of
>> > places where you protected the calls by specific ifdefs in the
>> > IB/core c-files and not in h-files as it is usually done.
>> >
>> ib_device_register_rdmacg, ib_device_unregister_rdmacg are the only
>> two functions called from IB/core as its tied to functionality.
>> They can also be implemented as NULL call when CONFIG_CGROUP_RDMA is 
>> undefined.
>> (Similar to ib_rdmacg_try_charge and others).
>> I didn't do because occurrence of call of register and unregister is
>> limited to single file and only twice compare to charge/uncharge
>> functions.
>> Either way is fine with me, I can make the changes which you
>> described. Let me know.
>
> Please do,
> IMHO, it is better to have one place which handles all relevant ifdefs
> and functions. IB/core doesn't need to know about cgroups implementation.
>
ok. Done. Thanks for the review. I will accumulate more comments from
Tejun and others before spinning v7.


Re: [PATCH] Documentation/memory-barriers: fix wrong comment in example

2016-02-21 Thread SeongJae Park
On Sun, Feb 21, 2016 at 2:25 PM, Paul E. McKenney
 wrote:
> On Sun, Feb 21, 2016 at 07:50:19AM +0900, SeongJae Park wrote:
>> On Sun, Feb 21, 2016 at 4:57 AM, Paul E. McKenney
>>  wrote:
>> > On Sat, Feb 20, 2016 at 03:01:08PM +0900, SeongJae Park wrote:
>> >> There is a wrong comment in the example for compiler store omit behavior.  It
>> >> shows an example of the problem and then the problem-solved version of the code.
>> >> However, the comment in the solved version is still the same as in the unsolved
>> >> version.  Fix the wrong statement with this commit.
>> >>
>> >> Signed-off-by: SeongJae Park 
>> >
>> > Hmmm...  The code between the two stores of zero to "a" is intended to
>> > remain the same in the broken and fixed versions.  So the only change
>> > is from "a = 0" to "WRITE_ONCE(a, 0)".  Note that it is some other
>> > CPU that did the third store to "a".
>>
>> Agree, of course.
>>
>> >
>> > Or am I missing your point here?
>>
>> My point is about the comment.
>> I thought the comment in broken version is saying "Below line(a = 0) says
>> it will store to variable 'a', but it will not in actual because a compiler
>> can omit it".
>> However, in fixed version, because the compiler cannot omit the store
>> now, I thought the comment also should be changed to say the difference
>> between broken and fixed version.
>>
>> If I am understanding anything wrong, please let me know.
>
> Hmmm...  The intent of the comment is to act as a placeholder for
> arbitrary code that does not affect the value of "a".  The current
> comment is clearly not doing that for you.  Possible changes include:
>
> o   Adding text to the comment making the intent more clear.
> o   Replacing the comment with a function call, perhaps to
>     does_not_change_a() or some similar name.
> o   Keeping the current comment, but adding a call to something
>     like does_not_change_a() after it.
>
> Other thoughts?

Ah, now I understand the original intent.  Thank you for the kind explanation.
I think your third option will be most helpful for me. How about the
patch below?

BTW, the problem looks trivial rather than critical.
If you think so too, please feel free to ignore my patch.


Thanks,
SeongJae Park


== >3 ==
From 77e0b1c77d64c358b329b097cffcdacd2c484867 Mon Sep 17 00:00:00 2001
From: SeongJae Park 
Date: Sun, 21 Feb 2016 15:18:16 +0900
Subject: [PATCH] Documentation/memory-barriers: polish compiler store omit
 example

The comments in the compiler store omission examples in memory-barriers.txt
stand for arbitrary code that could appear at that point.  However, a reader
could interpret the comment as an explanation of the line below it.  This
commit expresses the intent more explicitly by adding a function call below
the comment.

Signed-off-by: SeongJae Park 
---
 Documentation/memory-barriers.txt | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 904ee42..3a17d66 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -1460,6 +1460,7 @@ of optimizations:

  a = 0;
  /* Code that does not store to variable a. */
+ does_not_change_a();
  a = 0;

  The compiler sees that the value of variable 'a' is already zero, so
@@ -1472,6 +1473,7 @@ of optimizations:

  WRITE_ONCE(a, 0);
  /* Code that does not store to variable a. */
+ does_not_change_a();
  WRITE_ONCE(a, 0);

  (*) The compiler is within its rights to reorder memory accesses unless
-- 
1.9.1


>
> Thanx, Paul
>
>> Thanks,
>> SeongJae Park
>>
>> >
>> > Thanx, Paul
>> >
>> >> ---
>> >>  Documentation/memory-barriers.txt | 2 +-
>> >>  1 file changed, 1 insertion(+), 1 deletion(-)
>> >>
>> >> diff --git a/Documentation/memory-barriers.txt 
>> >> b/Documentation/memory-barriers.txt
>> >> index 061ff29..b4754c7 100644
>> >> --- a/Documentation/memory-barriers.txt
>> >> +++ b/Documentation/memory-barriers.txt
>> >> @@ -1471,7 +1471,7 @@ of optimizations:
>> >>   wrong guess:
>> >>
>> >>   WRITE_ONCE(a, 0);
>> >> - /* Code that does not store to variable a. */
>> >> + /* Code that does store to variable a. */
>> >>   WRITE_ONCE(a, 0);
>> >>
>> >>   (*) The compiler is within its rights to reorder memory accesses unless
>> >> --
>> >> 1.9.1
>> >>
>> >
>>
>


[PATCH RESEND] TTY, devpts: document pty count limiting

2016-02-21 Thread Konstantin Khlebnikov
The logic was changed in kernel 3.4 by commit e9aba5158a80
("tty: rework pty count limiting") but is still not documented.

Sysctl kernel.pty.max works as a global limit; kernel.pty.reserve ptys
are reserved for the initial devpts instance (mounted without "newinstance").
A per-instance limit can also be set with the mount option "max=%d".

Signed-off-by: Konstantin Khlebnikov 
---
 Documentation/filesystems/devpts.txt | 9 +++++++++
 Documentation/sysctl/kernel.txt      | 1 +
 2 files changed, 10 insertions(+)

diff --git a/Documentation/filesystems/devpts.txt b/Documentation/filesystems/devpts.txt
index 68dffd87f9b7..30d2fcb32f72 100644
--- a/Documentation/filesystems/devpts.txt
+++ b/Documentation/filesystems/devpts.txt
@@ -51,6 +51,15 @@ where 'ns_exec -cm /bin/bash' calls clone() with CLONE_NEWNS flag and execs
 /bin/bash in the child process.  A pty created by the sshd is not visible in
 the original mount of /dev/pts.
 
+Total count of pty pairs in all instances is limited by sysctls:
+kernel.pty.max = 4096  - global limit
+kernel.pty.reserve = 1024  - reserve for initial instance
+kernel.pty.nr  - current count of ptys
+
+Per-instance limit could be set by adding mount option "max=".
+This feature was added in kernel 3.4 together with sysctl kernel.pty.reserve.
+In kernels older than 3.4 sysctl kernel.pty.max works as per-instance limit.
+
 User-space changes
 ------------------
 
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index a93b414672a7..d05e70b7d8dd 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -64,6 +64,7 @@ show up in /proc/sys/kernel:
 - printk_delay
 - printk_ratelimit
 - printk_ratelimit_burst
+- pty ==> Documentation/filesystems/devpts.txt
 - randomize_va_space
 - real-root-dev   ==> Documentation/initrd.txt
 - reboot-cmd  [ SPARC only ]
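
For readers who want to inspect the limits described in this patch, the
following is a minimal user-space sketch that reads the three sysctls through
their /proc/sys files. The helper names are hypothetical; the file paths are
the usual /proc/sys mapping of kernel.pty.max, kernel.pty.reserve and
kernel.pty.nr, and the code makes no assumption about how the kernel applies
them.

```c
#include <stdio.h>

/*
 * Read a single integer from a file.
 * Returns the value, or -1 if the file is missing or unparsable.
 */
static long demo_read_long(const char *path)
{
	FILE *f = fopen(path, "r");
	long value = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &value) != 1)
		value = -1;
	fclose(f);
	return value;
}

/* Print the global limit, the reserve and the current pty count. */
static void demo_print_pty_limits(void)
{
	printf("kernel.pty.max     = %ld\n",
	       demo_read_long("/proc/sys/kernel/pty/max"));
	printf("kernel.pty.reserve = %ld\n",
	       demo_read_long("/proc/sys/kernel/pty/reserve"));
	printf("kernel.pty.nr      = %ld\n",
	       demo_read_long("/proc/sys/kernel/pty/nr"));
}
```

Calling demo_print_pty_limits() from main() prints one line per sysctl; a
value of -1 means the file could not be read on the running system.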



Re: [PATCHv6 1/3] rdmacg: Added rdma cgroup controller

2016-02-21 Thread Parav Pandit
CONFIG_CGROUP_RDMA

On Sun, Feb 21, 2016 at 7:15 PM, Leon Romanovsky  wrote:
> On Sun, Feb 21, 2016 at 05:03:05PM +0530, Parav Pandit wrote:
>> On Sun, Feb 21, 2016 at 1:13 PM, Leon Romanovsky  wrote:
>> > On Sat, Feb 20, 2016 at 04:30:04PM +0530, Parav Pandit wrote:
>> > Can you place this ifdef before declaring struct rdma_cgroup?
>> Yes. I missed out this cleanup. Done locally now.
>
> Great, additional thing which spotted my attention was related to
> declaring and using the new cgroups functions. There are number of
> places where you protected the calls by specific ifdefs in the
> IB/core c-files and not in h-files as it is usually done.
>
ib_device_register_rdmacg and ib_device_unregister_rdmacg are the only
two functions called from IB/core, as they are tied to functionality.
They can also be implemented as NULL (no-op) calls when CONFIG_CGROUP_RDMA is
undefined (similar to ib_rdmacg_try_charge and others).
I didn't do that because the calls to register and unregister are
limited to a single file and occur only twice, compared to the
charge/uncharge functions.
Either way is fine with me; I can make the changes you
described. Let me know.

>>
>> > Thanks


Re: [PATCHv6 1/3] rdmacg: Added rdma cgroup controller

2016-02-21 Thread Parav Pandit
On Sun, Feb 21, 2016 at 1:13 PM, Leon Romanovsky  wrote:
> On Sat, Feb 20, 2016 at 04:30:04PM +0530, Parav Pandit wrote:
> Can you place this ifdef before declaring struct rdma_cgroup?
Yes. I missed out this cleanup. Done locally now.

> Thanks


Re: [PATCHv6 1/3] rdmacg: Added rdma cgroup controller

2016-02-21 Thread Leon Romanovsky
On Sat, Feb 20, 2016 at 04:30:04PM +0530, Parav Pandit wrote:
> Added rdma cgroup controller that does accounting, limit enforcement
> on rdma/IB verbs and hw resources.
> 
> Added rdma cgroup header file which defines its APIs to perform
> charging/uncharging functionality and device registration which will
> participate in controller functions of accounting and limit
> enforcements. It also defines rdmacg_device structure to bind IB stack
> and RDMA cgroup controller.
> 
> RDMA resources are tracked using resource pool. Resource pool is per
> device, per cgroup entity which allows setting up accounting limits
> on per device basis.
> 
> Resources are not defined by the RDMA cgroup, instead they are defined
> by the external module IB stack. This allows extending IB stack
> without changing kernel, as IB stack is going through changes
> and enhancements.
> 
> Resource pool is created/destroyed dynamically whenever
> charging/uncharging occurs respectively and whenever user
> configuration is done. It's a tradeoff of memory vs little more code
> space that creates resource pool whenever necessary,
> instead of creating them during cgroup creation and device registration
> time.
> 
> Signed-off-by: Parav Pandit 
> ---
>  include/linux/cgroup_rdma.h   |  53 +++
>  include/linux/cgroup_subsys.h |   4 +
>  init/Kconfig  |  10 +
>  kernel/Makefile   |   1 +
>  kernel/cgroup_rdma.c  | 753 ++
>  5 files changed, 821 insertions(+)
>  create mode 100644 include/linux/cgroup_rdma.h
>  create mode 100644 kernel/cgroup_rdma.c
> 
> diff --git a/include/linux/cgroup_rdma.h b/include/linux/cgroup_rdma.h
> new file mode 100644
> index 000..b370733
> --- /dev/null
> +++ b/include/linux/cgroup_rdma.h
> @@ -0,0 +1,53 @@
> +#ifndef _CGROUP_RDMA_H
> +#define _CGROUP_RDMA_H
> +
> +#include 
> +
> +struct rdma_cgroup {
> +#ifdef CONFIG_CGROUP_RDMA
> + struct cgroup_subsys_state  css;
> +
> + spinlock_t rpool_list_lock; /* protects resource pool list */
> + struct list_head rpool_head;/* head to keep track of all resource
> +  * pools that belongs to this cgroup.
> +  */
> +#endif
> +};
> +
> +#ifdef CONFIG_CGROUP_RDMA

I'm sure you were already asked about this, but why do you need an ifdef
embedded in struct rdma_cgroup and then the same one again right after it?
Can you place this ifdef before declaring struct rdma_cgroup?

> +
> +struct rdmacg_device;
> +

Thanks


Re: [PATCH v5 0/3] init: add support to directly boot to a mapped device

2016-02-21 Thread Alasdair G Kergon
On Sat, Feb 20, 2016 at 10:13:49AM -0800, Kees Cook wrote:
> This is a resurrection of a patch series from a few years back, first
> brought to the dm maintainers in 2010. It creates a way to define dm
> devices on the kernel command line for systems that do not use an
> initramfs, or otherwise need a dm running before init starts.
> 
> This has been used by Chrome OS for several years, and now by Brillo
> (and likely Android soon).
> 
> The last version was v4:
> https://patchwork.kernel.org/patch/104860/
> https://patchwork.kernel.org/patch/104861/
 
Inconsistencies in the terminology here can be sorted out during review,
and I see that you've taken on board some of my review comments from
2010, but what are your responses to the rest of them?

Alasdair



Re: [PATCH] Documentation/memory-barriers: fix wrong comment in example

2016-02-21 Thread Paul E. McKenney
On Sun, Feb 21, 2016 at 07:50:19AM +0900, SeongJae Park wrote:
> On Sun, Feb 21, 2016 at 4:57 AM, Paul E. McKenney
>  wrote:
> > On Sat, Feb 20, 2016 at 03:01:08PM +0900, SeongJae Park wrote:
> >> There is a wrong comment in the example for compiler store omit behavior.  It
> >> shows an example of the problem and then the problem-solved version of the code.
> >> However, the comment in the solved version is still the same as in the unsolved
> >> version.  Fix the wrong statement with this commit.
> >>
> >> Signed-off-by: SeongJae Park 
> >
> > Hmmm...  The code between the two stores of zero to "a" is intended to
> > remain the same in the broken and fixed versions.  So the only change
> > is from "a = 0" to "WRITE_ONCE(a, 0)".  Note that it is some other
> > CPU that did the third store to "a".
> 
> Agree, of course.
> 
> >
> > Or am I missing your point here?
> 
> My point is about the comment.
> I thought the comment in broken version is saying "Below line(a = 0) says
> it will store to variable 'a', but it will not in actual because a compiler
> can omit it".
> However, in fixed version, because the compiler cannot omit the store
> now, I thought the comment also should be changed to say the difference
> between broken and fixed version.
> 
> If I am understanding anything wrong, please let me know.

Hmmm...  The intent of the comment is to act as a placeholder for
arbitrary code that does not affect the value of "a".  The current
comment is clearly not doing that for you.  Possible changes include:

o   Adding text to the comment making the intent more clear.
o   Replacing the comment with a function call, perhaps to
    does_not_change_a() or some similar name.
o   Keeping the current comment, but adding a call to something
    like does_not_change_a() after it.

Other thoughts?

Thanx, Paul
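
The two variants under discussion can be put side by side in a small
user-space sketch. DEMO_WRITE_ONCE_INT() is a simplified, int-only stand-in
for the kernel's WRITE_ONCE() and is an assumption made for illustration;
does_not_change_a() is the placeholder name suggested above.

```c
/*
 * Simplified stand-in for the kernel's WRITE_ONCE(), for int only.
 * Assumption: a volatile access is enough to illustrate the point;
 * the real kernel macro is more general.
 */
#define DEMO_WRITE_ONCE_INT(var, val) (*(volatile int *)&(var) = (val))

static int a;

/* Placeholder for arbitrary code that does not store to 'a'. */
static void does_not_change_a(void)
{
}

/*
 * Plain stores: the compiler may conclude that 'a' is already zero
 * and omit the second store.
 */
static void store_plain(void)
{
	a = 0;
	does_not_change_a();
	a = 0;
}

/*
 * Volatile stores: both stores must be emitted, even though the code
 * between them is unchanged.
 */
static void store_once(void)
{
	DEMO_WRITE_ONCE_INT(a, 0);
	does_not_change_a();
	DEMO_WRITE_ONCE_INT(a, 0);
}
```

The code between the two stores is identical in both functions; only the
form of the stores differs. A single-threaded run cannot observe the omitted
store, since it only matters when some other CPU writes to 'a' in between.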

> Thanks,
> SeongJae Park
> 
> >
> > Thanx, Paul
> >
> >> ---
> >>  Documentation/memory-barriers.txt | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/Documentation/memory-barriers.txt 
> >> b/Documentation/memory-barriers.txt
> >> index 061ff29..b4754c7 100644
> >> --- a/Documentation/memory-barriers.txt
> >> +++ b/Documentation/memory-barriers.txt
> >> @@ -1471,7 +1471,7 @@ of optimizations:
> >>   wrong guess:
> >>
> >>   WRITE_ONCE(a, 0);
> >> - /* Code that does not store to variable a. */
> >> + /* Code that does store to variable a. */
> >>   WRITE_ONCE(a, 0);
> >>
> >>   (*) The compiler is within its rights to reorder memory accesses unless
> >> --
> >> 1.9.1
> >>
> >
> 



Re: [PATCH v16 1/6] fpga: add bindings document for fpga region

2016-02-21 Thread Rob Herring
On Fri, Feb 05, 2016 at 04:44:46PM -0600, Josh Cartwright wrote:
> Hey Alan-
> 
> First off, thanks for all of your (and others') work on this.
> 
> On Fri, Feb 05, 2016 at 03:29:58PM -0600, at...@opensource.altera.com wrote:
> > From: Alan Tull 
> > 
> > New bindings document for FPGA Region to support programming
> > FPGA's under Device Tree control
> > 
> > Signed-off-by: Alan Tull 
> > Signed-off-by: Moritz Fischer 
> [..]
> > ---
> >  .../devicetree/bindings/fpga/fpga-region.txt   |  348 
> > 
> >  1 file changed, 348 insertions(+)
> >  create mode 100644 Documentation/devicetree/bindings/fpga/fpga-region.txt
> > 
> > diff --git a/Documentation/devicetree/bindings/fpga/fpga-region.txt 
> > b/Documentation/devicetree/bindings/fpga/fpga-region.txt
> > new file mode 100644
> > index 000..ccd7127
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/fpga/fpga-region.txt
> [..]
> > +FPGA Manager & FPGA Manager Framework
> > + * An FPGA Manager is a hardware block that programs an FPGA under the control
> > +   of a host processor.
> > + * The FPGA Manager Framework provides drivers and functions to program an
> > +   FPGA.
> > +
> > +FPGA Bridge Framework
> > + * Provides drivers and functions to control bridges that enable/disable
> > +   data to the FPGA.
> > + * FPGA Bridges should be disabled while the FPGA is being programmed to
> > +   prevent spurious data on the bus.
> > + * FPGA Bridges may not be needed in implementations where the FPGA Manager
> > +   handles this.
> 
> It still seems strange for me architecturally for the FPGA Bridge to be
> a first-class top-level concept in your architecture, as they are a
> reflection of the SoC FPGA manager design.  That is, I would expect the
> bridges not to be associated with the FPGA Region, but with the FPGA
> manager.
> 
> Although, I will concede that you've made it possible to not use
> FPGA Bridges (like on Zynq where they aren't necessary), so maybe it
> doesn't matter, just smells strangely.

In general, DT models buses in the node hierarchy. To go from one bus to 
another, you need a bridge. Going from an onchip bus to an FPGA bus 
has to have some sort of bridge logic in between for isolation 
minimally. Zynq has to have something similar. Perhaps the bridge 
control is not part of the bridges themselves?

Rob


Re: [PATCHv6 0/3] rdmacg: IB/core: rdma controller support

2016-02-21 Thread Parav Pandit
Hi Tejun, Doug,

I would like direction from you on how we intend to merge this code,
so that I can generate the next patch, v7, against the right tree.

A few options that come to mind are:
1. Shall we merge this code through Doug's linux-rdma tree, where there
are no merge conflicts in cgroup?
Or
2. Shall we merge this as two separate patches: one from the cgroup side
through Tejun's cgroups tree as kernel support, and a second through
linux-rdma for the IB core changes that make use of those features?
Or
3. Some other option that you have in mind.

I was thinking of (1) as that has the least churn in development, test
and merge efforts.

Regards,
Parav Pandit

On Sat, Feb 20, 2016 at 4:30 PM, Parav Pandit  wrote:
> Overview:
> Currently user space applications can easily take away all the rdma
> device specific resources such as AH, CQ, QP, MR etc. As a result, other
> applications in other cgroups or kernel space ULPs may not even get a chance
> to allocate any rdma resources. This results in service unavailability.
>
> RDMA cgroup addresses this issue by allowing resource accounting,
> limit enforcement on per cgroup, per rdma device basis.
>
> Resources are not defined by the RDMA cgroup. Resources are defined
> by RDMA/IB stack. This allows rdma cgroup to remain constant while RDMA/IB
> stack can evolve without the need of rdma cgroup update. A new
> resource can be easily added by the RDMA/IB stack without touching
> rdma cgroup.
>
> RDMA uverbs layer will enforce limits on well defined RDMA verb
> resources without any HCA vendor device driver involvement.
>
> RDMA uverbs layer will not do accounting of hw vendor specific resources.
> Instead rdma cgroup provides set of APIs through which vendor specific
> drivers can do resource accounting by making use of rdma cgroup.
>
> Resource limit enforcement is hierarchical.
>
> When process is migrated with active RDMA resources, rdma cgroup
> continues to uncharge original cgroup for allocated resource. New resource
> is charged to current process's cgroup, which means if the process is
> migrated with active resources, for new resources it will be charged to
> new cgroup and old resources will be correctly uncharged from old cgroup.
>
> Changes from v5:
>  * (To address comments from Tejun)
>1. Removed two types of resource pool, made it a single type (as Tejun
>   described in past comment)
>2. Removed match tokens and have array definition like "qp", "mr",
>   "cq" etc.
>3. Wrote small parser and avoided match_token API as that won't work
>   due to different array definitions
>4. Removed one-off remove API to unconfigure cgroup, instead all
>   resource should be set to max.
>5. Removed resource pool type (user/default), instead having
>   max_num_cnt, when ref_cnt drops to zero and
>   max_num_cnt = total_rescource_cnt, pool is freed.
>6. Resource definition ownership is now only with IB stack at single
>   header file, no longer in each low level driver.
>   This goes through IB maintainer and other reviewers eyes.
>   This continue to give flexibility to not force kernel upgrade for
>   few enums additions for new resource type.
>7. Wherever possible pool lock is pushed out, except for hierarchical
>   charging/uncharging points, as it is not possible to do so, due to
>   iterative process involves blocking allocations of rpool. Coming up
>   more levels up to release locks doesn't make any sense either.
>   This is anyway slow path where rpool is not allocated. Except for
>   typical first resource allocation, this is less travelled path.
>8. Avoided %d manipulation due to removal of match_token and replaced
>   with seq_putc etc friend functions.
>  * Other minor cleanups.
>  * Fixed rdmacg_register_device to return error in case IB stack
>tries to register more than 64 resources.
>  * Fixed not allowing negative value on resource setting.
>  * Fixed cleaning up resource pools during device removal.
>  * Simplified and renamed table length field to use ARRAY_SIZE macro.
>  * Updated documentation to reflect single resource pool and shorter
>file names.
>
> Changes from v4:
>  * Fixed compilation errors for lockdep_assert_held reported by kbuild
>test robot
>  * Fixed compilation warning reported by coccinelle for extra
>semicolon.
>  * Fixed compilation error for inclusion of linux/parser.h which
>cannot be included in any header file, as that triggers multiple
>inclusion error. parser.h is included in C files which intent to
>use it.
>  * Removed unused header file inclusion in cgroup_rdma.c
>
> Changes from v3:
>  * (To address comments from Tejun)
>1. Renamed cg_resource to rdmacg_resource
>2. Merged dealloc_cg_rpool and _dealloc_cg_rpool to single function
>3. Renamed _find_cg_rpool to find_cg_rpool_locked()
>5. Removed RDMACG_MAX_RESOURCE_INDEX limitation
>6. Fixed few alignments.
>7. Improved description for RDMA cgroup configurat