Re: [PATCH v4 4/6] ipc: Clamp msgmni and shmmni to the real IPCMNI limit

2018-03-13 Thread Waiman Long
On 03/13/2018 04:29 PM, Eric W. Biederman wrote:
> Waiman Long  writes:
>
>> On 03/13/2018 02:17 PM, Eric W. Biederman wrote:
>>> Waiman Long  writes:
>>>
>>>> A user can write arbitrary integer values to msgmni and shmmni sysctl
>>>> parameters without getting error, but the actual limit is really
>>>> IPCMNI (32k). This can mislead users as they think they can get a
>>>> value that is not real.
>>>>
>>>> Enforcing the limit by failing the sysctl parameter write, however,
>>>> can break existing user applications.
>>> Which applications? Examples, please.
>>>
>>> I am seeing this patchset late but it looks like a whole lot of changes
>>> to avoid a theoretical possibility.
>>>
>>> Changes that have an impact on more than just the ipc code you are
>>> patching.
>>>
>>> That makes me feel very uncomfortable with these changes.
>>>
>>> Eric
>> This patchset is constructed to address a customer request that there is
>> no easy way to find out the actual usable range of a sysctl parameter.
>> In this particular case, the customer wants to use more than 32k shared
>> memory segments. They can put in a large value into shmmni, but the
>> application didn't work properly because shmmni was internally clamped
>> to 32k without any visible sign that a smaller limit has been imposed.
>>
>> Out of a concern that there might be customers out there setting those
>> sysctl parameters outside of the allowable range without knowing it,
>> just enforcing the right limits may have the undesirable consequence of
>> breaking their existing setup scripts. I don't have a concrete example of
>> customers doing that, but it won't look good if we wait until
>> the complaints come in.
>>
>> The new code won't affect existing code unless the necessary flag is
>> set. So would you mind elaborating what other impact do you see that
>> will affect other non-IPC code in an undesirable way?
> The increase in size of struct ctl_table.  Every caller is affected.
> Plus it increases everyone's cognitive load to figure out what is
> this flags field as they fill out ctl_table.

As said earlier, there is a way to add the new flags without increasing
the size of the structure. The flags field was originally a uint16_t, which
does not increase the size of the structure. It was changed in v4 to
provide more space for future extension. I am going to change it back to
uint16_t. It can certainly be changed again in the future if we really
need more than 16 bits.
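
The size argument above can be checked with a small user-space sketch. The
structs below are simplified, hypothetical stand-ins for the leading fields
of struct ctl_table (they are not the kernel definition), and the result
assumes a common 32-bit or 64-bit ABI where a 2-byte hole follows the
16-bit mode field:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the leading fields of a ctl_table-like struct. */
struct tbl_base {
	const char *procname;
	void *data;
	int maxlen;
	unsigned short mode;	/* 16-bit mode, followed by a padding hole */
	void *child;
};

/* Same layout with a 16-bit flags field placed in the hole after mode. */
struct tbl_flags16 {
	const char *procname;
	void *data;
	int maxlen;
	unsigned short mode;
	uint16_t flags;
	void *child;
};

/* Same layout with a 32-bit flags field, which no longer fits in the hole. */
struct tbl_flags32 {
	const char *procname;
	void *data;
	int maxlen;
	unsigned short mode;
	uint32_t flags;
	void *child;
};

/* Print the three sizes so the effect of each flags width is visible. */
void print_table_sizes(void)
{
	printf("base=%zu flags16=%zu flags32=%zu\n",
	       sizeof(struct tbl_base),
	       sizeof(struct tbl_flags16),
	       sizeof(struct tbl_flags32));
}
```

On such ABIs the 16-bit variant should report the same size as the base
struct, while the 32-bit variant grows.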

> Just introducing a proc_dointvec_minmax_clamped follows the existing
> pattern and it makes it easier for everyone who reads the code.

It was the approach that was taken in my v1 patch. I changed it to use a
flag in v2 as it is more general. It also makes it easier to extend the
feature in the future. Adding more proc_handlers also has the same issue
of increasing the cognitive load of choosing which proc_handler to use in
the ctl_table. This is the side effect of adding more complexity and there
is no way to work around it.
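
For readers following the flag-versus-handler discussion, the two write
behaviors being compared can be sketched in user space as below. The
helper store_ranged() and its parameters are hypothetical and are not a
kernel API; the only detail taken from the existing kernel is that an
out-of-range write to a minmax handler fails with -EINVAL:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model of one write of `val` to a ranged sysctl backed by
 * *dst. With clamp == false an out-of-range value is rejected and *dst is
 * left untouched. With clamp == true the value is pulled to the nearest
 * bound and *clamped is set so the caller could log a warning.
 */
int store_ranged(int *dst, int val, int min, int max, bool clamp,
		 bool *clamped)
{
	assert(min <= max);
	*clamped = false;

	if (val < min || val > max) {
		if (!clamp)
			return -EINVAL;	/* strict mode: fail the write */
		val = (val < min) ? min : max;
		*clamped = true;	/* clamping mode: accept, but adjusted */
	}

	*dst = val;
	return 0;
}
```

Either a per-entry flag or a dedicated handler could select the clamp
branch; the stored result is the same, and the disagreement in this thread
is about where that choice should live.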

> It strikes me as quite peculiar that the response to a bug report where
> the complaint is that an error is not given, is to continue the current
> behavior without giving an error.
The customer actually has 2 requests. First, they want it to be easier to
figure out the real ranges of some of the sysctl parameters.
Second, they want to increase the IPCMNI limit to more than 32k. This
patch tries to address the first one. I will leave the second one to a
future patch once this one is done. As said before, the reason for
adding a clamping mode is to avoid regression. If you guys strongly
believe that this is not an issue at all, I am fine with going along and
enforcing the limit by failing writes of invalid values.

I am also working on a follow-up patch to add an iteration API to dump
out the ranges of the relevant sysctl parameters that accept a range of
values. That will require using a flag to annotate ctl_table entries
that have ranges.

> Arguably the simplest fix here would be to kill IPCMNI entirely.  Assign
> the shmids from a sequence counter.  And place those structures in a
> rbtree indexed by shmni.  There are 32bit fields but I don't think we
> must keep the low 16bits for an index into an array and the high 16bits
> as the actual sequence number.

Yes, that is on my to-do list.

Cheers,
Longman



Re: [PATCH v4 4/6] ipc: Clamp msgmni and shmmni to the real IPCMNI limit

2018-03-13 Thread Eric W. Biederman
Waiman Long  writes:

> On 03/13/2018 02:17 PM, Eric W. Biederman wrote:
>> Waiman Long  writes:
>>
>>> A user can write arbitrary integer values to msgmni and shmmni sysctl
>>> parameters without getting error, but the actual limit is really
>>> IPCMNI (32k). This can mislead users as they think they can get a
>>> value that is not real.
>>>
>>> Enforcing the limit by failing the sysctl parameter write, however,
>>> can break existing user applications.
>> Which applications? Examples, please.
>>
>> I am seeing this patchset late but it looks like a whole lot of changes
>> to avoid a theoretical possibility.
>>
>> Changes that have an impact on more than just the ipc code you are
>> patching.
>>
>> That makes me feel very uncomfortable with these changes.
>>
>> Eric
>
> This patchset is constructed to address a customer request that there is
> no easy way to find out the actual usable range of a sysctl parameter.
> In this particular case, the customer wants to use more than 32k shared
> memory segments. They can put in a large value into shmmni, but the
> application didn't work properly because shmmni was internally clamped
> to 32k without any visible sign that a smaller limit has been imposed.
>
> Out of a concern that there might be customers out there setting those
> sysctl parameters outside of the allowable range without knowing it,
> just enforcing the right limits may have the undesirable consequence of
> breaking their existing setup scripts. I don't have a concrete example of
> customers doing that, but it won't look good if we wait until
> the complaints come in.
>
> The new code won't affect existing code unless the necessary flag is
> set. So would you mind elaborating what other impact do you see that
> will affect other non-IPC code in an undesirable way?

The increase in size of struct ctl_table.  Every caller is affected.
Plus it increases everyone's cognitive load to figure out what is
this flags field as they fill out ctl_table.

Just introducing a proc_dointvec_minmax_clamped follows the existing
pattern and it makes it easier for everyone who reads the code.



It strikes me as quite peculiar that the response to a bug report where
the complaint is that an error is not given, is to continue the current
behavior without giving an error.


Arguably the simplest fix here would be to kill IPCMNI entirely.  Assign
the shmids from a sequence counter.  And place those structures in a
rbtree indexed by shmni.  There are 32bit fields but I don't think we
must keep the low 16bits for an index into an array and the high 16bits
as the actual sequence number.

Except for the checkpoint/restart case, which is arguably much too
specific about how these ids are assigned, that would give much more
freedom and allow people the number of shm segments that they actually
want to use.
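
The id layout described above (low 16 bits as an array index, high 16 bits
as a sequence number) can be sketched as follows. The names are
hypothetical and the widths follow the wording of this message only; the
kernel's actual macros may differ:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical widths, taken from the description in this message. */
#define DEMO_IDX_BITS 16u
#define DEMO_IDX_MASK ((1u << DEMO_IDX_BITS) - 1u)

/* Pack a sequence number and an array index into one 32-bit id. */
static inline uint32_t demo_make_id(uint32_t seq, uint32_t idx)
{
	assert(seq <= DEMO_IDX_MASK && idx <= DEMO_IDX_MASK);
	return (seq << DEMO_IDX_BITS) | idx;
}

/* Recover the array index: this is the part that caps live objects. */
static inline uint32_t demo_id_index(uint32_t id)
{
	return id & DEMO_IDX_MASK;
}

/* Recover the sequence number used to tell reused slots apart. */
static inline uint32_t demo_id_seq(uint32_t id)
{
	return id >> DEMO_IDX_BITS;
}
```

Because the index half is a fixed-width array slot, the number of
simultaneously live ids is bounded by that width. Handing out ids from a
plain counter and looking them up in a tree, as suggested above, would
remove that bound.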


For a further complication I don't expect you can get away with changing
the size or the fields in struct ctl_table in the kernel your customers
are running.


So please use a new function, not flags; it will simplify everyone's life.
If you can, please actually fix this so you can have more shmids; that
would be the really classy thing to do.

Eric


Re: [PATCH v4 4/6] ipc: Clamp msgmni and shmmni to the real IPCMNI limit

2018-03-13 Thread Waiman Long
On 03/13/2018 02:17 PM, Eric W. Biederman wrote:
> Waiman Long  writes:
>
>> A user can write arbitrary integer values to msgmni and shmmni sysctl
>> parameters without getting error, but the actual limit is really
>> IPCMNI (32k). This can mislead users as they think they can get a
>> value that is not real.
>>
>> Enforcing the limit by failing the sysctl parameter write, however,
>> can break existing user applications.
> Which applications? Examples, please.
>
> I am seeing this patchset late but it looks like a whole lot of changes
> to avoid a theoretical possibility.
>
> Changes that have an impact on more than just the ipc code you are
> patching.
>
> That makes me feel very uncomfortable with these changes.
>
> Eric

This patchset is constructed to address a customer request that there is
no easy way to find out the actual usable range of a sysctl parameter.
In this particular case, the customer wants to use more than 32k shared
memory segments. They can put in a large value into shmmni, but the
application didn't work properly because shmmni was internally clamped
to 32k without any visible sign that a smaller limit has been imposed.
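
The silent capping described above can be modeled with the sketch below.
It is only an illustration under stated assumptions: demo_can_add() and
DEMO_IPCMNI are hypothetical names, and the real allocation path is not
shown in this thread.

```c
#include <assert.h>
#include <errno.h>

#define DEMO_IPCMNI 32768	/* stands in for the 32k IPCMNI limit */

/*
 * Hypothetical check run before creating one more object. `ctlmni` is
 * whatever value was written to the sysctl; `in_use` is the number of
 * objects that already exist.
 */
int demo_can_add(int in_use, int ctlmni)
{
	int limit = ctlmni;

	assert(in_use >= 0);
	if (limit > DEMO_IPCMNI)
		limit = DEMO_IPCMNI;	/* capped here, with no error or log */
	if (in_use >= limit)
		return -ENOSPC;
	return 0;
}
```

With the sysctl set to 100000, the 32769th allocation still fails in this
model, and nothing at write time told the administrator that the larger
value would not take effect.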

Out of a concern that there might be customers out there setting those
sysctl parameters outside of the allowable range without knowing it,
just enforcing the right limits may have the undesirable consequence of
breaking their existing setup scripts. I don't have a concrete example of
customers doing that, but it won't look good if we wait until
the complaints come in.

The new code won't affect existing code unless the necessary flag is
set. So would you mind elaborating what other impact do you see that
will affect other non-IPC code in an undesirable way?

Cheers,
Longman





Re: [PATCH v4 4/6] ipc: Clamp msgmni and shmmni to the real IPCMNI limit

2018-03-13 Thread Eric W. Biederman
Waiman Long  writes:

> A user can write arbitrary integer values to msgmni and shmmni sysctl
> parameters without getting error, but the actual limit is really
> IPCMNI (32k). This can mislead users as they think they can get a
> value that is not real.
>
> Enforcing the limit by failing the sysctl parameter write, however,
> can break existing user applications.

Which applications? Examples, please.

I am seeing this patchset late but it looks like a whole lot of changes
to avoid a theoretical possibility.

Changes that have an impact on more than just the ipc code you are
patching.

That makes me feel very uncomfortable with these changes.

Eric


> Instead, the range clamping flag
> is set to enforce the limit without failing existing user code. Users
> can easily figure out if the sysctl parameter value is out of range
> by either reading back the parameter value or checking the kernel
> ring buffer for warning.
>
> Signed-off-by: Waiman Long 
> ---
>  ipc/ipc_sysctl.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/ipc/ipc_sysctl.c b/ipc/ipc_sysctl.c
> index 8ad93c2..1955dd4 100644
> --- a/ipc/ipc_sysctl.c
> +++ b/ipc/ipc_sysctl.c
> @@ -99,6 +99,7 @@ static int proc_ipc_auto_msgmni(struct ctl_table *table, int write,
>  static int zero;
>  static int one = 1;
>  static int int_max = INT_MAX;
> +static int ipc_mni = IPCMNI;
>  
>  static struct ctl_table ipc_kern_table[] = {
>   {
> @@ -120,7 +121,10 @@ static int proc_ipc_auto_msgmni(struct ctl_table *table, int write,
>   .data   = &init_ipc_ns.shm_ctlmni,
>   .maxlen = sizeof(init_ipc_ns.shm_ctlmni),
>   .mode   = 0644,
> - .proc_handler   = proc_ipc_dointvec,
> + .proc_handler   = proc_ipc_dointvec_minmax,
> + .extra1 = &zero,
> + .extra2 = &ipc_mni,
> + .flags  = CTL_FLAGS_CLAMP_RANGE,
>   },
>   {
>   .procname   = "shm_rmid_forced",
> @@ -147,7 +151,8 @@ static int proc_ipc_auto_msgmni(struct ctl_table *table, int write,
>   .mode   = 0644,
>   .proc_handler   = proc_ipc_dointvec_minmax,
>   .extra1 = &zero,
> - .extra2 = &int_max,
> + .extra2 = &ipc_mni,
> + .flags  = CTL_FLAGS_CLAMP_RANGE,
>   },
>   {
>   .procname   = "auto_msgmni",


[PATCH v4 4/6] ipc: Clamp msgmni and shmmni to the real IPCMNI limit

2018-03-12 Thread Waiman Long
A user can write arbitrary integer values to msgmni and shmmni sysctl
parameters without getting error, but the actual limit is really
IPCMNI (32k). This can mislead users as they think they can get a
value that is not real.

Enforcing the limit by failing the sysctl parameter write, however,
can break existing user applications. Instead, the range clamping flag
is set to enforce the limit without failing existing user code. Users
can easily figure out if the sysctl parameter value is out of range
by either reading back the parameter value or checking the kernel
ring buffer for warning.

Signed-off-by: Waiman Long 
---
 ipc/ipc_sysctl.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/ipc/ipc_sysctl.c b/ipc/ipc_sysctl.c
index 8ad93c2..1955dd4 100644
--- a/ipc/ipc_sysctl.c
+++ b/ipc/ipc_sysctl.c
@@ -99,6 +99,7 @@ static int proc_ipc_auto_msgmni(struct ctl_table *table, int write,
 static int zero;
 static int one = 1;
 static int int_max = INT_MAX;
+static int ipc_mni = IPCMNI;
 
 static struct ctl_table ipc_kern_table[] = {
 	{
@@ -120,7 +121,10 @@ static int proc_ipc_auto_msgmni(struct ctl_table *table, int write,
 		.data		= &init_ipc_ns.shm_ctlmni,
 		.maxlen		= sizeof(init_ipc_ns.shm_ctlmni),
 		.mode		= 0644,
-		.proc_handler	= proc_ipc_dointvec,
+		.proc_handler	= proc_ipc_dointvec_minmax,
+		.extra1		= &zero,
+		.extra2		= &ipc_mni,
+		.flags		= CTL_FLAGS_CLAMP_RANGE,
 	},
 	{
 		.procname	= "shm_rmid_forced",
@@ -147,7 +151,8 @@ static int proc_ipc_auto_msgmni(struct ctl_table *table, int write,
 		.mode		= 0644,
 		.proc_handler	= proc_ipc_dointvec_minmax,
 		.extra1		= &zero,
-		.extra2		= &int_max,
+		.extra2		= &ipc_mni,
+		.flags		= CTL_FLAGS_CLAMP_RANGE,
 	},
 	{
 		.procname	= "auto_msgmni",
-- 
1.8.3.1