date:20170907

On Thu, 2017-09-07 at 22:14 -0700, Guenter Roeck wrote:
> On 09/07/2017 08:40 PM, Andrew Jeffery wrote:
> > On Fri, 2017-09-08 at 12:51 +1000, Andrew Jeffery wrote:
> > > > I can't test with devicetree. x86 system.
> > > >   
> > > > 2,100+ iterations with your driver, no failures.
> > > 
> > > Great. I really appreciate your testing here, so thanks for your
> > > efforts. I owe you a few drinks if we ever happen to meet.
> > 
> > Actually, on that, how did you chop out the devicetree parts? Did you
> > keep the configuration writes at the end of max31785_of_fan_config()
> > and max31785_of_tmp_config()? Or did you just not call them? These two
> > functions cause the bulk of the bus traffic with on probe.
> > 
> 
> fan config hardcoded, four fans connected. Still no failure.

Great. Thanks again for your efforts here.

Andrew

signature.asc
Description: This is a digitally signed message part

Re: [PATCH] hwmon: pmbus: Make reg check and clear faults functions return errors


On 09/07/2017 08:40 PM, Andrew Jeffery wrote:

On Fri, 2017-09-08 at 12:51 +1000, Andrew Jeffery wrote:

I can't test with devicetree. x86 system.
  
2,100+ iterations with your driver, no failures.


Great. I really appreciate your testing here, so thanks for your
efforts. I owe you a few drinks if we ever happen to meet.


Actually, on that, how did you chop out the devicetree parts? Did you
keep the configuration writes at the end of max31785_of_fan_config()
and max31785_of_tmp_config()? Or did you just not call them? These two
functions cause the bulk of the bus traffic with on probe.


fan config hardcoded, four fans connected. Still no failure.

Guenter

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] hwmon: pmbus: Make reg check and clear faults functions return errors

On Thu, 2017-09-07 at 21:43 -0700, Guenter Roeck wrote:
> On 09/07/2017 08:40 PM, Andrew Jeffery wrote:
> > On Fri, 2017-09-08 at 12:51 +1000, Andrew Jeffery wrote:
> > > > I can't test with devicetree. x86 system.
> > > >   
> > > > 2,100+ iterations with your driver, no failures.
> > > 
> > > Great. I really appreciate your testing here, so thanks for your
> > > efforts. I owe you a few drinks if we ever happen to meet.
> > 
> > Actually, on that, how did you chop out the devicetree parts? Did you
> > keep the configuration writes at the end of max31785_of_fan_config()
> > and max31785_of_tmp_config()? Or did you just not call them? These two
> > functions cause the bulk of the bus traffic with on probe.
> > 
> 
> I didn't change to code, just compiled and run it. Guess that means
> this part was skipped.

Right, yeah looking at it a bit more, dev->of_node will be NULL for
for_each_child_of_node(dev->of_node, child), therefore the loop body
won't execute.

> 
> I'll replaced the fan configuration with some hard-coded values.
> We'll see if it makes a difference.

Sounds good.

Andrew

signature.asc
Description: This is a digitally signed message part

Re: [PATCH] hwmon: pmbus: Make reg check and clear faults functions return errors


On 09/07/2017 08:40 PM, Andrew Jeffery wrote:

On Fri, 2017-09-08 at 12:51 +1000, Andrew Jeffery wrote:

I can't test with devicetree. x86 system.
  
2,100+ iterations with your driver, no failures.


Great. I really appreciate your testing here, so thanks for your
efforts. I owe you a few drinks if we ever happen to meet.


Actually, on that, how did you chop out the devicetree parts? Did you
keep the configuration writes at the end of max31785_of_fan_config()
and max31785_of_tmp_config()? Or did you just not call them? These two
functions cause the bulk of the bus traffic with on probe.



I didn't change to code, just compiled and run it. Guess that means
this part was skipped.

I'll replaced the fan configuration with some hard-coded values.
We'll see if it makes a difference.

Guenter

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] hwmon: pmbus: Make reg check and clear faults functions return errors

On Fri, 2017-09-08 at 12:51 +1000, Andrew Jeffery wrote:
> > I can't test with devicetree. x86 system.
> > 
> > 2,100+ iterations with your driver, no failures.
> 
> Great. I really appreciate your testing here, so thanks for your
> efforts. I owe you a few drinks if we ever happen to meet.

Actually, on that, how did you chop out the devicetree parts? Did you
keep the configuration writes at the end of max31785_of_fan_config()
and max31785_of_tmp_config()? Or did you just not call them? These two
functions cause the bulk of the bus traffic with on probe.

Andrew

signature.asc
Description: This is a digitally signed message part

Re: [PATCH] hwmon: pmbus: Make reg check and clear faults functions return errors

On Thu, 2017-09-07 at 19:17 -0700, Guenter Roeck wrote:
> On 09/07/2017 07:06 PM, Andrew Jeffery wrote:
> > On Thu, 2017-09-07 at 18:26 -0700, Guenter Roeck wrote:
> > > On 09/07/2017 06:02 PM, Andrew Jeffery wrote:
> > > > On Thu, 2017-09-07 at 17:27 -0700, Guenter Roeck wrote:
> > > > > On 09/07/2017 08:22 AM, Andrew Jeffery wrote:
> > > > > > On Thu, 2017-09-07 at 06:40 -0700, Guenter Roeck wrote:
> > > > > > > On 09/06/2017 04:32 PM, Andrew Jeffery wrote:
> > > > > > > 
> > > > > > > > >  
> > > > > > > > > Guess I need to dig up my eval board and see if I can 
> > > > > > > > > reproduce the problem.
> > > > > > > > > Seems you are saying that the problem is always seen when 
> > > > > > > > > issuing a sequence
> > > > > > > > > of "clear faults" commands on multiple pages ?
> > > > > > > > 
> > > > > > > > Yeah. We're also seeing bad behaviour under other command 
> > > > > > > > sequences as well,
> > > > > > > > which lead to this hack of a work-around patch[1].
> > > > > > > > 
> > > > > > > > I'd be very interested in the results of testing against the 
> > > > > > > > eval board. I
> > > > > > > > don't have access to one and it seems Maxim have discontinued 
> > > > > > > > them.
> > > > > > > > 
> > > > > > > 
> > > > > > > Do you have a somewhat reliable means to reproduce the problem ?
> > > > > > 
> > > > > > It seems we hit a bunch of problems by just continually
> > > > > > binding/unbinding the driver, if you don't apply that hacky oneshot
> > > > > > retry patch. We can hit problems (in our design?) with something 
> > > > > > like:
> > > > > > 
> > > > > > # cd /sys/bus/i2c/drivers/max31785; \
> > > > > > echo $addr > unbind; \
> > > > > > while echo $addr > bind; \
> > > > > > do echo $addr > unbind; echo -n .; done;
> > > > > > 
> > > > > > It should hit issues covered by this patch, as the register checks 
> > > > > > are
> > > > > > used in the operations used by probe.
> > > > > > 
> > > > > 
> > > > > Hmm ... I didn't use your driver but my prototype driver which also 
> > > > > supports
> > > > > temperature and voltage attributes, so if anything it should create 
> > > > > more
> > > > > stress on the chip.
> > > > 
> > > > I did add the temp and voltage attributes...
> > > > 
> > > > Any chance you can give mine a try? I don't know what I would have done
> > > > to invoke this kind of behaviour, so it would be useful to know whether
> > > > or not it happens with one driver but not the other.
> > > > 
> > > 
> > > Will do.
> > 
> > Thanks. For reference, here's a devicetree description:
> > 
> > https://github.com/openbmc/linux/blob/dev-4.10/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts#L283
> > 
> 
> I can't test with devicetree. x86 system.
> 
> 2,100+ iterations with your driver, no failures.

Great. I really appreciate your testing here, so thanks for your
efforts. I owe you a few drinks if we ever happen to meet.

> 
> Either it is because my chip is a MAX31785 (not A), or the configuration 
> makes a difference,
> or it is your hardware.

Yep. My understanding is the A variant is just a difference of
microcode, but who knows what affect that could have. 

> 
> I'll try to connect a couple of fans next (so far I did without) and try 
> again.

Keep me posted if you do.

Thanks again.

Andrew

> 
> Guenter

signature.asc
Description: This is a digitally signed message part

Re: [PATCH] hwmon: pmbus: Make reg check and clear faults functions return errors

On 09/07/2017 07:06 PM, Andrew Jeffery wrote:

On Thu, 2017-09-07 at 18:26 -0700, Guenter Roeck wrote:

On 09/07/2017 06:02 PM, Andrew Jeffery wrote:

On Thu, 2017-09-07 at 17:27 -0700, Guenter Roeck wrote:

On 09/07/2017 08:22 AM, Andrew Jeffery wrote:

On Thu, 2017-09-07 at 06:40 -0700, Guenter Roeck wrote:

On 09/06/2017 04:32 PM, Andrew Jeffery wrote:

Guess I need to dig up my eval board and see if I can reproduce the problem.

Seems you are saying that the problem is always seen when issuing a sequence
of "clear faults" commands on multiple pages ?

Yeah. We're also seeing bad behaviour under other command sequences as well,
which lead to this hack of a work-around patch[1].

I'd be very interested in the results of testing against the eval board. I
don't have access to one and it seems Maxim have discontinued them.

Do you have a somewhat reliable means to reproduce the problem ?

It seems we hit a bunch of problems by just continually
binding/unbinding the driver, if you don't apply that hacky oneshot
retry patch. We can hit problems (in our design?) with something like:

# cd /sys/bus/i2c/drivers/max31785; \
echo $addr > unbind; \
while echo $addr > bind; \
do echo $addr > unbind; echo -n .; done;

It should hit issues covered by this patch, as the register checks are
used in the operations used by probe.

Hmm ... I didn't use your driver but my prototype driver which also supports
temperature and voltage attributes, so if anything it should create more
stress on the chip.

I did add the temp and voltage attributes...

Any chance you can give mine a try? I don't know what I would have done
to invoke this kind of behaviour, so it would be useful to know whether
or not it happens with one driver but not the other.

Will do.

Thanks. For reference, here's a devicetree description:

https://github.com/openbmc/linux/blob/dev-4.10/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts#L283

I can't test with devicetree. x86 system.

2,100+ iterations with your driver, no failures.

Either it is because my chip is a MAX31785 (not A), or the configuration makes
a difference,
or it is your hardware.

I'll try to connect a couple of fans next (so far I did without) and try again.

Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: [PATCH] hwmon: pmbus: Make reg check and clear faults functions return errors

On Thu, 2017-09-07 at 18:26 -0700, Guenter Roeck wrote:
> On 09/07/2017 06:02 PM, Andrew Jeffery wrote:
> > On Thu, 2017-09-07 at 17:27 -0700, Guenter Roeck wrote:
> > > On 09/07/2017 08:22 AM, Andrew Jeffery wrote:
> > > > On Thu, 2017-09-07 at 06:40 -0700, Guenter Roeck wrote:
> > > > > On 09/06/2017 04:32 PM, Andrew Jeffery wrote:
> > > > > 
> > > > > > > 
> > > > > > > Guess I need to dig up my eval board and see if I can reproduce 
> > > > > > > the problem.
> > > > > > > Seems you are saying that the problem is always seen when issuing 
> > > > > > > a sequence
> > > > > > > of "clear faults" commands on multiple pages ?
> > > > > > 
> > > > > > Yeah. We're also seeing bad behaviour under other command sequences 
> > > > > > as well,
> > > > > > which lead to this hack of a work-around patch[1].
> > > > > > 
> > > > > > I'd be very interested in the results of testing against the eval 
> > > > > > board. I
> > > > > > don't have access to one and it seems Maxim have discontinued them.
> > > > > > 
> > > > > 
> > > > > Do you have a somewhat reliable means to reproduce the problem ?
> > > > 
> > > > It seems we hit a bunch of problems by just continually
> > > > binding/unbinding the driver, if you don't apply that hacky oneshot
> > > > retry patch. We can hit problems (in our design?) with something like:
> > > > 
> > > > # cd /sys/bus/i2c/drivers/max31785; \
> > > > echo $addr > unbind; \
> > > > while echo $addr > bind; \
> > > > do echo $addr > unbind; echo -n .; done;
> > > > 
> > > > It should hit issues covered by this patch, as the register checks are
> > > > used in the operations used by probe.
> > > > 
> > > 
> > > Hmm ... I didn't use your driver but my prototype driver which also 
> > > supports
> > > temperature and voltage attributes, so if anything it should create more
> > > stress on the chip.
> > 
> > I did add the temp and voltage attributes...
> > 
> > Any chance you can give mine a try? I don't know what I would have done
> > to invoke this kind of behaviour, so it would be useful to know whether
> > or not it happens with one driver but not the other.
> > 
> 
> Will do.

Thanks. For reference, here's a devicetree description:

https://github.com/openbmc/linux/blob/dev-4.10/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts#L283


> 
> > >   No error so far, after running the script for a couple
> > > of minutes. How long does it take for errors to appear, and how do I see
> > > that there is an error ?
> > 
> > I'm seeing failures after anything from a handful of bind/unbinds, to
> > hundreds of bind/unbinds. It seems to vary.
> > 
> > > Does the driver fail to instantiate ?
> > 
> > Typically probe fails so the loop exits. It usually gets -EIO and the
> > shell spits out "No such device".
> > 
> > Thanks for testing, it's a useful data point for us hunting down the
> > source of our problems.
> > 
> 
> I aborted the test after ~2,500 loops without error.

Yeah, I'd consider that fairly stable.

Cheers,

Andrew

signature.asc
Description: This is a digitally signed message part

Re: [PATCH] hwmon: pmbus: Make reg check and clear faults functions return errors


On 09/07/2017 06:02 PM, Andrew Jeffery wrote:

On Thu, 2017-09-07 at 17:27 -0700, Guenter Roeck wrote:

On 09/07/2017 08:22 AM, Andrew Jeffery wrote:

On Thu, 2017-09-07 at 06:40 -0700, Guenter Roeck wrote:

On 09/06/2017 04:32 PM, Andrew Jeffery wrote:


Guess I need to dig up my eval board and see if I can reproduce the problem.

Seems you are saying that the problem is always seen when issuing a sequence
of "clear faults" commands on multiple pages ?


Yeah. We're also seeing bad behaviour under other command sequences as well,
which lead to this hack of a work-around patch[1].

I'd be very interested in the results of testing against the eval board. I
don't have access to one and it seems Maxim have discontinued them.



Do you have a somewhat reliable means to reproduce the problem ?


It seems we hit a bunch of problems by just continually
binding/unbinding the driver, if you don't apply that hacky oneshot
retry patch. We can hit problems (in our design?) with something like:

# cd /sys/bus/i2c/drivers/max31785; \
echo $addr > unbind; \
while echo $addr > bind; \
do echo $addr > unbind; echo -n .; done;

It should hit issues covered by this patch, as the register checks are
used in the operations used by probe.



Hmm ... I didn't use your driver but my prototype driver which also supports
temperature and voltage attributes, so if anything it should create more
stress on the chip.


I did add the temp and voltage attributes...

Any chance you can give mine a try? I don't know what I would have done
to invoke this kind of behaviour, so it would be useful to know whether
or not it happens with one driver but not the other.



Will do.


  No error so far, after running the script for a couple
of minutes. How long does it take for errors to appear, and how do I see
that there is an error ?


I'm seeing failures after anything from a handful of bind/unbinds, to
hundreds of bind/unbinds. It seems to vary.


Does the driver fail to instantiate ?


Typically probe fails so the loop exits. It usually gets -EIO and the
shell spits out "No such device".

Thanks for testing, it's a useful data point for us hunting down the
source of our problems.


I aborted the test after ~2,500 loops without error.

Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] hwmon: pmbus: Make reg check and clear faults functions return errors

On Thu, 2017-09-07 at 17:27 -0700, Guenter Roeck wrote:
> On 09/07/2017 08:22 AM, Andrew Jeffery wrote:
> > On Thu, 2017-09-07 at 06:40 -0700, Guenter Roeck wrote:
> > > On 09/06/2017 04:32 PM, Andrew Jeffery wrote:
> > > 
> > > > >    
> > > > > Guess I need to dig up my eval board and see if I can reproduce the 
> > > > > problem.
> > > > > Seems you are saying that the problem is always seen when issuing a 
> > > > > sequence
> > > > > of "clear faults" commands on multiple pages ?
> > > > 
> > > > Yeah. We're also seeing bad behaviour under other command sequences as 
> > > > well,
> > > > which lead to this hack of a work-around patch[1].
> > > > 
> > > > I'd be very interested in the results of testing against the eval 
> > > > board. I
> > > > don't have access to one and it seems Maxim have discontinued them.
> > > > 
> > > 
> > > Do you have a somewhat reliable means to reproduce the problem ?
> > 
> > It seems we hit a bunch of problems by just continually
> > binding/unbinding the driver, if you don't apply that hacky oneshot
> > retry patch. We can hit problems (in our design?) with something like:
> > 
> > # cd /sys/bus/i2c/drivers/max31785; \
> > echo $addr > unbind; \
> > while echo $addr > bind; \
> > do echo $addr > unbind; echo -n .; done;
> > 
> > It should hit issues covered by this patch, as the register checks are
> > used in the operations used by probe.
> > 
> 
> Hmm ... I didn't use your driver but my prototype driver which also supports
> temperature and voltage attributes, so if anything it should create more
> stress on the chip.

I did add the temp and voltage attributes...

Any chance you can give mine a try? I don't know what I would have done
to invoke this kind of behaviour, so it would be useful to know whether
or not it happens with one driver but not the other.

>  No error so far, after running the script for a couple
> of minutes. How long does it take for errors to appear, and how do I see
> that there is an error ? 

I'm seeing failures after anything from a handful of bind/unbinds, to
hundreds of bind/unbinds. It seems to vary. 

> Does the driver fail to instantiate ?

Typically probe fails so the loop exits. It usually gets -EIO and the
shell spits out "No such device".

Thanks for testing, it's a useful data point for us hunting down the
source of our problems.

Andrew

signature.asc
Description: This is a digitally signed message part

Re: [PATCH] hwmon: pmbus: Make reg check and clear faults functions return errors


On 09/07/2017 08:22 AM, Andrew Jeffery wrote:

On Thu, 2017-09-07 at 06:40 -0700, Guenter Roeck wrote:

On 09/06/2017 04:32 PM, Andrew Jeffery wrote:

   
Guess I need to dig up my eval board and see if I can reproduce the problem.

Seems you are saying that the problem is always seen when issuing a sequence
of "clear faults" commands on multiple pages ?


Yeah. We're also seeing bad behaviour under other command sequences as well,
which lead to this hack of a work-around patch[1].

I'd be very interested in the results of testing against the eval board. I
don't have access to one and it seems Maxim have discontinued them.



Do you have a somewhat reliable means to reproduce the problem ?


It seems we hit a bunch of problems by just continually
binding/unbinding the driver, if you don't apply that hacky oneshot
retry patch. We can hit problems (in our design?) with something like:

# cd /sys/bus/i2c/drivers/max31785; \
echo $addr > unbind; \
while echo $addr > bind; \
do echo $addr > unbind; echo -n .; done;

It should hit issues covered by this patch, as the register checks are
used in the operations used by probe.



Hmm ... I didn't use your driver but my prototype driver which also supports
temperature and voltage attributes, so if anything it should create more
stress on the chip. No error so far, after running the script for a couple
of minutes. How long does it take for errors to appear, and how do I see
that there is an error ? Does the driver fail to instantiate ?

Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [v7 5/5] mm, oom: cgroup v2 mount option to disable cgroup-aware OOM killer

2017-09-07 Thread David Rientjes

On Thu, 7 Sep 2017, Christopher Lameter wrote:

> > SGI required it when it was introduced simply to avoid the very expensive
> > tasklist scan.  Adding Christoph Lameter to the cc since he was involved
> > back then.
> 
> Really? From what I know and worked on way back when: The reason was to be
> able to contain the affected application in a cpuset. Multiple apps may
> have been running in multiple cpusets on a large NUMA machine and the OOM
> condition in one cpuset should not affect the other. It also helped to
> isolate the application behavior causing the oom in numerous cases.
> 
> Doesnt this requirement transfer to cgroups in the same way?
> 
> Left SGI in 2008 so adding Dimitri who may know about the current
> situation. Robin Holt also left SGI as far as I know.
> 

It may have been Paul Jackson, but I remember the oom_kill_allocating_task 
knob being required due to very slow oom killer due to the very lengthy 
iteration of the tasklist.  It would be helpful if someone from SGI could 
confirm whether or not they actively use this sysctl.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [v7 5/5] mm, oom: cgroup v2 mount option to disable cgroup-aware OOM killer

On Thu, 7 Sep 2017, Roman Gushchin wrote:

> On Thu, Sep 07, 2017 at 10:03:24AM -0500, Christopher Lameter wrote:
> > On Thu, 7 Sep 2017, Roman Gushchin wrote:
> >
> > > > Really? From what I know and worked on way back when: The reason was to 
> > > > be
> > > > able to contain the affected application in a cpuset. Multiple apps may
> > > > have been running in multiple cpusets on a large NUMA machine and the 
> > > > OOM
> > > > condition in one cpuset should not affect the other. It also helped to
> > > > isolate the application behavior causing the oom in numerous cases.
> > > >
> > > > Doesnt this requirement transfer to cgroups in the same way?
> > >
> > > We have per-node memory stats and plan to use them during the OOM victim
> > > selection. Hopefully it can help.
> >
> > One of the OOM causes could be that memory was restricted to a certain
> > node set. Killing the allocating task is (was?) default behavior in that
> > case so that the task that has the restrictions is killed. Not any task
> > that may not have the restrictions and woiuld not experience OOM.
>
> As I can see, it's not the default behavior these days. If we have a way
> to select a victim between memcgs/tasks which are actually using
> the corresponding type of memory, it's much better than to kill
> an allocating task.

Kill the whole set of processes constituting an app in a cgroup or so
sounds good to me.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [v7 5/5] mm, oom: cgroup v2 mount option to disable cgroup-aware OOM killer

2017-09-07 Thread Roman Gushchin

On Thu, Sep 07, 2017 at 10:03:24AM -0500, Christopher Lameter wrote:
> On Thu, 7 Sep 2017, Roman Gushchin wrote:
> 
> > > Really? From what I know and worked on way back when: The reason was to be
> > > able to contain the affected application in a cpuset. Multiple apps may
> > > have been running in multiple cpusets on a large NUMA machine and the OOM
> > > condition in one cpuset should not affect the other. It also helped to
> > > isolate the application behavior causing the oom in numerous cases.
> > >
> > > Doesnt this requirement transfer to cgroups in the same way?
> >
> > We have per-node memory stats and plan to use them during the OOM victim
> > selection. Hopefully it can help.
> 
> One of the OOM causes could be that memory was restricted to a certain
> node set. Killing the allocating task is (was?) default behavior in that
> case so that the task that has the restrictions is killed. Not any task
> that may not have the restrictions and woiuld not experience OOM.

As I can see, it's not the default behavior these days. If we have a way
to select a victim between memcgs/tasks which are actually using
the corresponding type of memory, it's much better than to kill
an allocating task.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [v7 5/5] mm, oom: cgroup v2 mount option to disable cgroup-aware OOM killer

On Wed, 6 Sep 2017, Michal Hocko wrote:

> I am not sure this is how things evolved actually. This is way before
> my time so my git log interpretation might be imprecise. We do have
> oom_badness heuristic since out_of_memory has been introduced and
> oom_kill_allocating_task has been introduced much later because of large
> boxes with zillions of tasks (SGI I suspect) which took too long to
> select a victim so David has added this heuristic.

Nope. The logic was required for tasks that run out of memory when the
restriction on the allocation did not allow the use of all of memory.
cpuset restrictions and memory policy restrictions where the prime
considerations at the time.

It has *nothing* to do with zillions of tasks. Its amusing that the SGI
ghost is still haunting the discussion here. The company died a couple of
years ago finally (ok somehow HP has an "SGI" brand now I believe). But
there are multiple companies that have large NUMA configurations and they
all have configurations where they want to restrict allocations of a
process to subset of system memory. This is even more important now that
we get new forms of memory (NVDIMM, PCI-E device memory etc). You need to
figure out what to do with allocations that fail because the *allowed*
memory pools are empty.

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [v7 5/5] mm, oom: cgroup v2 mount option to disable cgroup-aware OOM killer

On Tue, 5 Sep 2017, Michal Hocko wrote:

> I would argue that we should simply deprecate and later drop the sysctl.
> I _strongly_ suspect anybody is using this. If yes it is not that hard
> to change the kernel command like rather than select the sysctl. The
> deprecation process would be
>   - warn when somebody writes to the sysctl and check both boot
> and sysctl values
>   [ wait some time ]
>   - keep the sysctl but return EINVAL
>   [ wait some time ]
>   - remove the sysctl

Note that the behavior that would be enabled by the sysctl is the default
behavior for the case of a constrained allocation. If a process does an
mbind to numa node 3 and it runs out of memory then that process should be
killed and the rest is fine.

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [v7 2/5] mm, oom: cgroup-aware OOM killer

On Mon, 4 Sep 2017, Roman Gushchin wrote

> To address these issues, cgroup-aware OOM killer is introduced.

You are missing a major issue here. Processes may have allocation
constraints to memory nodes, special DMA zones etc etc. OOM conditions on
such resource constricted allocations need to be dealt with. Killing
processes that do not allocate with the same restrictions may not do
anything to improve conditions.

> But a user can change this behavior by enabling the per-cgroup
> oom_kill_all_tasks option. If set, it causes the OOM killer treat
> the whole cgroup as an indivisible memory consumer. In case if it's
> selected as on OOM victim, all belonging tasks will be killed.

Sounds good in general. Unless the cgroup or processes therein run out of
memory due to memory access restrictions. How do you detect that and how
it is handled?

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [v7 5/5] mm, oom: cgroup v2 mount option to disable cgroup-aware OOM killer

2017-09-07 Thread Johannes Weiner

On Wed, Sep 06, 2017 at 10:28:59AM +0200, Michal Hocko wrote:
> On Tue 05-09-17 17:53:44, Johannes Weiner wrote:
> > The cgroup-awareness in the OOM killer is exactly the same thing. It
> > should have been the default from the beginning, because the user
> > configures a group of tasks to be an interdependent, terminal unit of
> > memory consumption, and it's undesirable for the OOM killer to ignore
> > this intention and compare members across these boundaries.
> 
> I would agree if that was true in general. I can completely see how the
> cgroup awareness is useful in e.g. containerized environments (especially
> with kill-all enabled) but memcgs are used in a large variety of
> usecases and I cannot really say all of them really demand the new
> semantic. Say I have a workload which doesn't want to see reclaim
> interference from others on the same machine. Why should I kill a
> process from that particular memcg just because it is the largest one
> when there is a memory hog/leak outside of this memcg?

Sure, it's always possible to come up with a config for which this
isn't the optimal behavior. But this is about picking a default that
makes sense to most users, and that type of cgroup usage just isn't
the common case.

> From my point of view the safest (in a sense of the least surprise)
> way to go with opt-in for the new heuristic. I am pretty sure all who
> would benefit from the new behavior will enable it while others will not
> regress in unexpected way.

This thinking simply needs to be balanced against the need to make an
unsurprising and consistent final interface.

The current behavior breaks isolation by letting tasks in different
cgroups compete with each other during an OOM kill. While you can
rightfully argue that it's possible for usecases to rely on this, you
cannot tell me that this is the least-surprising thing we can offer
users; certainly not new users, but also not many/most existing ones.

> We can talk about the way _how_ to control these oom strategies, of
> course. But I would be really reluctant to change the default which is
> used for years and people got used to it.

I really doubt there are many cgroup users that rely on that
particular global OOM behavior.

We have to agree to disagree, I guess.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] hwmon: pmbus: Make reg check and clear faults functions return errors

On Thu, 2017-09-07 at 06:40 -0700, Guenter Roeck wrote:
> On 09/06/2017 04:32 PM, Andrew Jeffery wrote:
> 
> > >   
> > > Guess I need to dig up my eval board and see if I can reproduce the 
> > > problem.
> > > Seems you are saying that the problem is always seen when issuing a 
> > > sequence
> > > of "clear faults" commands on multiple pages ?
> > 
> > Yeah. We're also seeing bad behaviour under other command sequences as well,
> > which lead to this hack of a work-around patch[1].
> > 
> > I'd be very interested in the results of testing against the eval board. I
> > don't have access to one and it seems Maxim have discontinued them.
> > 
> 
> Do you have a somewhat reliable means to reproduce the problem ?

It seems we hit a bunch of problems by just continually
binding/unbinding the driver, if you don't apply that hacky oneshot
retry patch. We can hit problems (in our design?) with something like:

# cd /sys/bus/i2c/drivers/max31785; \
echo $addr > unbind; \
while echo $addr > bind; \
do echo $addr > unbind; echo -n .; done;

It should hit issues covered by this patch, as the register checks are
used in the operations used by probe.

Andrew

signature.asc
Description: This is a digitally signed message part

Re: [v7 5/5] mm, oom: cgroup v2 mount option to disable cgroup-aware OOM killer

On Thu, 7 Sep 2017, Roman Gushchin wrote:

> > Really? From what I know and worked on way back when: The reason was to be
> > able to contain the affected application in a cpuset. Multiple apps may
> > have been running in multiple cpusets on a large NUMA machine and the OOM
> > condition in one cpuset should not affect the other. It also helped to
> > isolate the application behavior causing the oom in numerous cases.
> >
> > Doesnt this requirement transfer to cgroups in the same way?
>
> We have per-node memory stats and plan to use them during the OOM victim
> selection. Hopefully it can help.

One of the OOM causes could be that memory was restricted to a certain
node set. Killing the allocating task is (was?) default behavior in that
case so that the task that has the restrictions is killed. Not any task
that may not have the restrictions and woiuld not experience OOM.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [v7 5/5] mm, oom: cgroup v2 mount option to disable cgroup-aware OOM killer

2017-09-07 Thread Roman Gushchin

On Thu, Sep 07, 2017 at 09:43:30AM -0500, Christopher Lameter wrote:
> On Wed, 6 Sep 2017, David Rientjes wrote:
> 
> > > The oom_kill_allocating_task sysctl which causes the OOM killer
> > > to simple kill the allocating task is useless. Killing the random
> > > task is not the best idea.
> > >
> > > Nobody likes it, and hopefully nobody uses it.
> > > We want to completely deprecate it at some point.
> > >
> >
> > SGI required it when it was introduced simply to avoid the very expensive
> > tasklist scan.  Adding Christoph Lameter to the cc since he was involved
> > back then.
> 
> Really? From what I know and worked on way back when: The reason was to be
> able to contain the affected application in a cpuset. Multiple apps may
> have been running in multiple cpusets on a large NUMA machine and the OOM
> condition in one cpuset should not affect the other. It also helped to
> isolate the application behavior causing the oom in numerous cases.
> 
> Doesnt this requirement transfer to cgroups in the same way?

We have per-node memory stats and plan to use them during the OOM victim
selection. Hopefully it can help.

> 
> Left SGI in 2008 so adding Dimitri who may know about the current
> situation. Robin Holt also left SGI as far as I know.

Thanks!
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [v7 5/5] mm, oom: cgroup v2 mount option to disable cgroup-aware OOM killer

On Wed, 6 Sep 2017, David Rientjes wrote:

> > The oom_kill_allocating_task sysctl which causes the OOM killer
> > to simple kill the allocating task is useless. Killing the random
> > task is not the best idea.
> >
> > Nobody likes it, and hopefully nobody uses it.
> > We want to completely deprecate it at some point.
> >
>
> SGI required it when it was introduced simply to avoid the very expensive
> tasklist scan.  Adding Christoph Lameter to the cc since he was involved
> back then.

Really? From what I know and worked on way back when: The reason was to be
able to contain the affected application in a cpuset. Multiple apps may
have been running in multiple cpusets on a large NUMA machine and the OOM
condition in one cpuset should not affect the other. It also helped to
isolate the application behavior causing the oom in numerous cases.

Doesnt this requirement transfer to cgroups in the same way?

Left SGI in 2008 so adding Dimitri who may know about the current
situation. Robin Holt also left SGI as far as I know.

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] hwmon: pmbus: Make reg check and clear faults functions return errors