Re: [PATCH] add some drop_caches documentation and info messsge
Hi!

> > > hmpf. This patch worries me. If there are people out there who are
> > > regularly using drop_caches because the VM sucks, it seems pretty
> > > obnoxious of us to go dumping stuff into their syslog. What are they
> > > supposed to do? Stop using drop_caches?
> >
> > People use drop_caches because they _think_ the VM sucks, or they
> > _think_ they're "tuning" their system. _They_ are supposed to stop
> > using drop_caches. :)
>
> Well who knows. Could be that people's vm *does* suck. Or they have
> some particularly peculiar workload or requirement[*]. Or their VM
> *used* to suck, and the drop_caches is not really needed any more but
> it's there in vendor-provided code and they can't practically prevent
> it. Or they have ipw wifi that does order 5 allocation :-).

I've seen drop_caches used in some android code, as part of SD card
handling IIRC.

> > What kind of interface _is_ it in the first place? Is it really a
> > production-level thing that we expect users to be poking at? Or, is it
> > a rarely-used debugging and benchmarking knob which is fair game for us
> > to tweak like this?
>
> It was a rarely-used mainly-developer-only thing which, apparently, real
> people found useful at some point in the past. Perhaps we should never
> have offered it.

And yes, documentation would be good. IIRC you claimed that drop_caches
is not safe to use a year or so ago; is that still true?

									Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel"
in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
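[For readers unfamiliar with the knob under discussion: it is the sysctl
file /proc/sys/vm/drop_caches. A minimal sketch of its use follows; the
wrapper function name is mine, not any kernel interface, and the write
is guarded so the sketch is safe to run unprivileged. Note drop_caches
only discards *clean* caches and never writes anything out, hence the
sync first.]

```shell
# drop_cache_mode: hypothetical wrapper around /proc/sys/vm/drop_caches.
# 1 = free page cache, 2 = free reclaimable slab (dentries and inodes),
# 3 = both. Writing the file requires root.
drop_cache_mode() {
    case "$1" in
        1) desc="page cache" ;;
        2) desc="reclaimable slab (dentries and inodes)" ;;
        3) desc="page cache and reclaimable slab" ;;
        *) echo "usage: drop_cache_mode 1|2|3" >&2; return 1 ;;
    esac
    echo "dropping: $desc"
    sync    # write back dirty pages; drop_caches only discards clean ones
    if [ "$(id -u)" -eq 0 ] && [ -w /proc/sys/vm/drop_caches ]; then
        echo "$1" > /proc/sys/vm/drop_caches 2>/dev/null || true
    else
        echo "(not root, skipping the actual write)" >&2
    fi
}

drop_cache_mode 3
```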
Re: [PATCH] add some drop_caches documentation and info messsge
On Wed, Oct 31, 2012 at 06:31:54PM +0100, Pavel Machek wrote:
> Hmm? When I resume from hibernate, I want to use my machine.

Well, in my case, with a workstation with 8 GB, the only time the
swapin is noticeable is when I try to use firefox with a couple dozen
tabs open. Once that thing is swapped in, system perf is back to
normal. I'll bet that even this slowdown would disappear if I used an
SSD.

But I can imagine some workloads where swapping everything back in
could be discomforting.

> Kernel will not normally swap anything in automatically. Some people
> do swapoff -a; swapon -a to work around that. (And yes, maybe some
> automatic-swap-in-when-there's-plenty-of-RAM would be useful.)

That's a good idea, actually.

So, in any case, the current situation is fine as it is, I'd say:
people can decide whether they want to drop caches before suspending
or not.

Problem solved.

Thanks.

--
Regards/Gruss,
Boris.
Re: [PATCH] add some drop_caches documentation and info messsge
On Mon 2012-10-29 10:58:19, Borislav Petkov wrote:
> On Mon, Oct 29, 2012 at 09:59:59AM +0100, Jiri Kosina wrote:
> > You might or might not want to do that. Dropping caches around suspend
> > makes the hibernation process itself faster, but the realtime response
> > of the applications afterwards is worse, as everything touched by user
> > has to be paged in again.

Also note that page-in is slower than reading the hibernation image,
because it is not compressed, and involves seeking.

> Right, do you know of a real use-case where people hibernate, then
> resume and still care about applications response time right afterwards?

Hmm? When I resume from hibernate, I want to use my machine. *Everyone*
cares about resume time afterwards. You move your mouse, and you don't
want to wait for X to be paged in.

> Besides, once everything is swapped back in, perf. is back to normal,
> i.e. like before suspending.

Kernel will not normally swap anything in automatically. Some people do
swapoff -a; swapon -a to work around that. (And yes, maybe some
automatic-swap-in-when-there's-plenty-of-RAM would be useful.)

									Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
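[The swapoff/swapon workaround Pavel mentions can be sketched as a tiny
script. This is an illustrative sketch, not a recommendation: cycling
swap off forces every swapped-out page back into RAM at once, so it
needs enough free memory to hold them all; the guard below makes the
sketch safe to run unprivileged.]

```shell
# Force swapped-out pages back into RAM by cycling swap off and on.
swap_back_in() {
    if [ "$(id -u)" -eq 0 ] && command -v swapoff >/dev/null 2>&1; then
        # swapoff -a reads every swapped-out page back in; swapon -a
        # re-enables the swap areas listed in /etc/fstab afterwards.
        swapoff -a 2>/dev/null && swapon -a 2>/dev/null
        echo "swap cycled"
    else
        echo "would run: swapoff -a && swapon -a (requires root)"
    fi
}

swap_back_in
```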
Re: [PATCH] add some drop_caches documentation and info messsge
On Mon, Oct 29, 2012 at 11:01:59AM +0100, Jiri Kosina wrote:
> Well, if the point of dropping caches is lowering the resume time, then
> the point is rendered moot as soon as you switch to your browser and
> have to wait a noticeable amount of time until it starts reacting.

Not the resume time - the suspend time. If, say, one has 8 GB of memory
and Linux nicely spreads all over it in caches, you don't want to wait
too long for the suspend image creation.

And nowadays, since you can have 8 GB in a laptop, you really want to
keep that image minimal so that suspend-to-disk is quick. The penalty
of faulting everything back in is a cost we'd be willing to pay, I
guess.

Thanks.

--
Regards/Gruss,
Boris.
Re: [PATCH] add some drop_caches documentation and info messsge
On Mon, 29 Oct 2012, Borislav Petkov wrote:
> > You might or might not want to do that. Dropping caches around suspend
> > makes the hibernation process itself faster, but the realtime response
> > of the applications afterwards is worse, as everything touched by user
> > has to be paged in again.
>
> Right, do you know of a real use-case where people hibernate, then
> resume and still care about applications response time right afterwards?

Well, if the point of dropping caches is lowering the resume time, then
the point is rendered moot as soon as you switch to your browser and
have to wait a noticeable amount of time until it starts reacting.

--
Jiri Kosina
SUSE Labs
Re: [PATCH] add some drop_caches documentation and info messsge
On Mon, Oct 29, 2012 at 09:59:59AM +0100, Jiri Kosina wrote:
> You might or might not want to do that. Dropping caches around suspend
> makes the hibernation process itself faster, but the realtime response
> of the applications afterwards is worse, as everything touched by user
> has to be paged in again.

Right, do you know of a real use-case where people hibernate, then
resume and still care about applications response time right afterwards?

Besides, once everything is swapped back in, perf. is back to normal,
i.e. like before suspending.

Thanks.

--
Regards/Gruss,
Boris.
Re: [PATCH] add some drop_caches documentation and info messsge
On Wed, 24 Oct 2012, Andrew Morton wrote:
> > > > I have drop_caches in my suspend-to-disk script so that the hibernation
> > > > image is kept at minimum and suspend times are as small as possible.
> > >
> > > hm, that sounds smart.
> > >
> > > > Would that be a valid use-case?
> > >
> > > I'd say so, unless we change the kernel to do that internally. We do
> > > have the hibernation-specific shrink_all_memory() in the vmscan code.
> > > We didn't see fit to document _why_ that exists, but IIRC it's there to
> > > create enough free memory for hibernation to be able to successfully
> > > complete, but no more.
> >
> > That's correct.
>
> Well, my point was: how about the idea of reclaiming clean pagecache
> (and inodes, dentries, etc) before hibernation so we read/write less
> disk data?

You might or might not want to do that. Dropping caches around suspend
makes the hibernation process itself faster, but the realtime response
of the applications afterwards is worse, as everything touched by user
has to be paged in again.

--
Jiri Kosina
SUSE Labs
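[The kind of suspend-to-disk wrapper being discussed might look like the
sketch below. The paths are the standard procfs/sysfs ones, but the
script itself is my own illustration; it only *prints* the commands a
real script would execute, since actually writing "disk" to
/sys/power/state hibernates the machine.]

```shell
# Sketch: shrink caches before hibernating so the image stays small,
# accepting slower page-ins after resume (the trade-off discussed above).
# Printed rather than executed, to keep the sketch side-effect free.
hibernate_small_image() {
    echo "sync"
    echo "echo 3 > /proc/sys/vm/drop_caches"
    echo "echo disk > /sys/power/state"
}

hibernate_small_image
```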
Re: [PATCH] add some drop_caches documentation and info messsge
On Wed, Oct 24, 2012 at 01:48:36PM -0700, Andrew Morton wrote:
> Dave Hansen wrote:
> > What kind of interface _is_ it in the first place? Is it really a
> > production-level thing that we expect users to be poking at? Or, is it
> > a rarely-used debugging and benchmarking knob which is fair game for us
> > to tweak like this?
>
> It was a rarely-used mainly-developer-only thing which, apparently, real
> people found useful at some point in the past. Perhaps we should never
> have offered it.

I've found it useful on occasion when generating large public keys.
When key generation hangs due to not-enough-entropy, dropping all
caches (followed by an intensive read) has allowed the system to
collect enough entropy to let the key generation finish.

Usefulness of the trick is probably going the way of the dodo, thanks
to SSDs becoming more common.

--
Mika Boström                 Individualisti, eksistentialisti,
www.iki.fi/bostik            rationalisti ja mulkvisti
GPG: 0x2AED22CC; 6FC9 8375 31B7 3BA2 B5DC 484E F19F 8AD6 2AED 22CC
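[For context, the pool level this trick relies on is visible at
/proc/sys/kernel/random/entropy_avail, so its effect can be measured
before and after. A small read-only sketch, with a fallback so it runs
anywhere:]

```shell
# Report the kernel's current entropy estimate, if available.
entropy_avail() {
    f=/proc/sys/kernel/random/entropy_avail
    if [ -r "$f" ]; then
        echo "entropy_avail: $(cat "$f") bits"
    else
        echo "entropy_avail: unavailable on this system"
    fi
}

entropy_avail
```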
Re: [PATCH] add some drop_caches documentation and info messsge
On Wednesday, October 24, 2012 06:17:52 PM Andrew Morton wrote:
> On Thu, 25 Oct 2012 00:04:46 +0200 "Rafael J. Wysocki" wrote:
> > On Wednesday 24 of October 2012 14:13:03 Andrew Morton wrote:
> > > On Wed, 24 Oct 2012 23:06:00 +0200 Borislav Petkov wrote:
> > > > On Wed, Oct 24, 2012 at 01:48:36PM -0700, Andrew Morton wrote:
> > > > > Well who knows. Could be that people's vm *does* suck. Or they have
> > > > > some particularly peculiar workload or requirement[*]. Or their VM
> > > > > *used* to suck, and the drop_caches is not really needed any more but
> > > > > it's there in vendor-provided code and they can't practically prevent
> > > > > it.
> > > >
> > > > I have drop_caches in my suspend-to-disk script so that the hibernation
> > > > image is kept at minimum and suspend times are as small as possible.
> > >
> > > hm, that sounds smart.
> > >
> > > > Would that be a valid use-case?
> > >
> > > I'd say so, unless we change the kernel to do that internally. We do
> > > have the hibernation-specific shrink_all_memory() in the vmscan code.
> > > We didn't see fit to document _why_ that exists, but IIRC it's there to
> > > create enough free memory for hibernation to be able to successfully
> > > complete, but no more.
> >
> > That's correct.
>
> Well, my point was: how about the idea of reclaiming clean pagecache
> (and inodes, dentries, etc) before hibernation so we read/write less
> disk data?

We may actually want to write more into the image to improve
post-resume responsiveness.

> Given that it's so easy to do from the hibernation script, I guess
> there's not much point...

Well, I'd say so. :-)

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
Re: [PATCH] add some drop_caches documentation and info messsge
On Thu 25-10-12 04:57:11, Dave Hansen wrote:
[...]
> Here's the problem: Joe Kernel Developer gets a bug report, usually
> something like "the kernel is slow", or "the kernel is eating up all my
> memory". We then start going and digging in to the problem with the
> usual tools. We almost *ALWAYS* get dmesg, and it's reasonably common,
> but less likely, that we get things like vmstat along with such a bug
> report.
>
> Joe Kernel Developer digs in the statistics or the dmesg and tries to
> figure out what happened. I've run in to a couple of cases in practice
> (and I assume Michal has too) where the bug reporter was using
> drop_caches _heavily_ and did not realize the implications. It was
> quite hard to track down exactly how the page cache and dentries/inodes
> were getting purged.

Yes, very same here. Not that I would meet issues like that often, but
it has happened a few times in the past and it was always a lot of
burnt time.

> There are rarely oopses involved in these scenarios.
>
> The primary goal of this patch is to make debugging those scenarios
> easier so that we can quickly realize that drop_caches is the reason our
> caches went away, not some anomalous VM activity. A secondary goal is
> to tell the user: "Hey, maybe this isn't something you want to be doing
> all the time."

--
Michal Hocko
SUSE Labs
Re: [PATCH] add some drop_caches documentation and info messsge
On Thu, Oct 25, 2012 at 04:57:11AM -0700, Dave Hansen wrote:
> On 10/25/2012 02:24 AM, Borislav Petkov wrote:
> > But let's discuss this a bit further. So, for the benchmarking aspect,
> > you're either going to have to always require dmesg along with
> > benchmarking results or /proc/vmstat, depending on where the drop_caches
> > stats end up.
> >
> > Is this how you envision it?
> >
> > And then there are the VM bug cases, where you might not always get
> > full dmesg from a panicked system. In that case, you'd want the kernel
> > tainting thing too, so that it at least appears in the oops backtrace.
> >
> > Although the tainting thing might not be enough - a user could
> > drop_caches at some point in time and the oops happening much later
> > could be unrelated, but that can't be expressed in taint flags.
>
> Here's the problem: Joe Kernel Developer gets a bug report, usually
> something like "the kernel is slow", or "the kernel is eating up all my
> memory". We then start going and digging in to the problem with the
> usual tools. We almost *ALWAYS* get dmesg, and it's reasonably common,
> but less likely, that we get things like vmstat along with such a bug
> report.
>
> Joe Kernel Developer digs in the statistics or the dmesg and tries to
> figure out what happened. I've run in to a couple of cases in practice
> (and I assume Michal has too) where the bug reporter was using
> drop_caches _heavily_ and did not realize the implications. It was
> quite hard to track down exactly how the page cache and dentries/inodes
> were getting purged.
>
> There are rarely oopses involved in these scenarios.
>
> The primary goal of this patch is to make debugging those scenarios
> easier so that we can quickly realize that drop_caches is the reason our
> caches went away, not some anomalous VM activity. A secondary goal is
> to tell the user: "Hey, maybe this isn't something you want to be doing
> all the time."

Ok, understood.

So you will be requiring dmesg; ok, then it makes sense. This way
you're also getting timestamps of when exactly and how many times
drop_caches was used. For that, though, you'll need to add the
timestamp explicitly to the printk because CONFIG_PRINTK_TIME is not
always enabled.

Thanks.

--
Regards/Gruss,
Boris.
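[A side note on the timestamp point: CONFIG_PRINTK_TIME only sets the
boot-time default; on a running system the flag is exposed under
/sys/module/printk/parameters/time, so a bug reporter could check (or,
as root, enable) it before reproducing. A read-only sketch with a
fallback for systems where the parameter is not visible:]

```shell
# Check whether printk timestamps are currently enabled.
printk_time_status() {
    p=/sys/module/printk/parameters/time
    if [ -r "$p" ]; then
        echo "printk timestamps: $(cat "$p")"
    else
        echo "printk timestamps: unknown (parameter not readable here)"
    fi
}

printk_time_status
```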
Re: [PATCH] add some drop_caches documentation and info messsge
On Wed 24-10-12 18:35:43, KOSAKI Motohiro wrote:
> >> I have drop_caches in my suspend-to-disk script so that the hibernation
> >> image is kept at minimum and suspend times are as small as possible.
> >
> > hm, that sounds smart.
> >
> >> Would that be a valid use-case?
> >
> > I'd say so, unless we change the kernel to do that internally. We do
> > have the hibernation-specific shrink_all_memory() in the vmscan code.
> > We didn't see fit to document _why_ that exists, but IIRC it's there to
> > create enough free memory for hibernation to be able to successfully
> > complete, but no more.
>
> shrink_all_memory() drops the minimum memory needed for hibernation;
> that's a trade-off matter.
>
> - drop all page cache
>   pros.
>     speeds up hibernation time
>   cons.
>     after coming back from hibernation, the system works very slowly
>     for a while until it gets enough file cache
>
> - drop minimum page cache
>   pros.
>     the system works quickly when coming back from hibernation
>   cons.
>     relatively large hibernation time
>
> So, I'm not fond of changing the hibernation default. hmmm... Does
> adding a tracepoint instead of printk make sense?

I guess you mean trace_printk. I have seen that one used for debugging
purposes only, but it seems like it could be used here. CONFIG_TRACING
seems to be enabled on most distribution kernels.

I am just worried that it needs debugfs mounted, and my recollection is
that this has some security implications, so there might be some
pushback on mounting it on production systems, which would defeat the
primary motivation. Maybe this concern is not that important wrt.
excessive logging, though.

I can live with this solution as well if people really hate the logging
approach.

--
Michal Hocko
SUSE Labs
Re: [PATCH] add some drop_caches documentation and info messsge
On Wed 24-10-12 12:54:39, Andrew Morton wrote:
> On Wed, 24 Oct 2012 08:29:45 +0200 Michal Hocko wrote:
[...]
> hmpf. This patch worries me. If there are people out there who are
> regularly using drop_caches because the VM sucks, it seems pretty
> obnoxious of us to go dumping stuff into their syslog. What are they
> supposed to do? Stop using drop_caches? But that would unfix the
> problem which they fixed with drop_caches in the first case.
>
> And they might not even have control over the code - they need to go
> back to their supplier and say "please send me a new version", along
> with all the additional costs and risks involved in an update.

I understand your worries and that's why I suggested a higher log level
which is under the admin's control. Does even that sound too excessive?

> > > More friendly alternatives might be:
> > >
> > > - Taint the kernel. But that will only become apparent with an oops
> > >   trace or similar.
> > >
> > > - Add a drop_caches counter and make that available in /proc/vmstat,
> > >   show_mem() output and perhaps other places.
> >
> > We would lose timing and originating process name in both cases, which
> > can be really helpful while debugging. It is fair to say that we could
> > deduce the timing if we are collecting /proc/meminfo or /proc/vmstat
> > already, and we do collect them often, but this is not the case all of
> > the time, and sometimes it is important to know _who_ is doing all this.
>
> But how important is all that? The main piece of information the
> kernel developer wants is "this guy is using drop_caches a lot". All
> the other info is peripheral and can be gathered by other means if so
> desired.

Well, I have experienced a debugging session where I suspected that an
excessive drop_caches was going on, but I had a hard time proving who
was doing it (the customer, of course, claimed they were not doing
anything like that), so we went through many loops until we could point
the finger.

--
Michal Hocko
SUSE Labs
Re: [PATCH] add some drop_caches documentation and info messsge
On 10/25/2012 02:24 AM, Borislav Petkov wrote:
> But let's discuss this a bit further. So, for the benchmarking aspect,
> you're either going to have to always require dmesg along with
> benchmarking results or /proc/vmstat, depending on where the drop_caches
> stats end up.
>
> Is this how you envision it?
>
> And then there are the VM bug cases, where you might not always get
> full dmesg from a panicked system. In that case, you'd want the kernel
> tainting thing too, so that it at least appears in the oops backtrace.
>
> Although the tainting thing might not be enough - a user could
> drop_caches at some point in time and the oops happening much later
> could be unrelated but that can't be expressed in taint flags.

Here's the problem: Joe Kernel Developer gets a bug report, usually
something like "the kernel is slow", or "the kernel is eating up all my
memory". We then start going and digging in to the problem with the
usual tools. We almost *ALWAYS* get dmesg, and it's reasonably common,
but less likely, that we get things like vmstat along with such a bug
report.

Joe Kernel Developer digs in the statistics or the dmesg and tries to
figure out what happened. I've run in to a couple of cases in practice
(and I assume Michal has too) where the bug reporter was using
drop_caches _heavily_ and did not realize the implications. It was
quite hard to track down exactly how the page cache and dentries/inodes
were getting purged.

There are rarely oopses involved in these scenarios.

The primary goal of this patch is to make debugging those scenarios
easier so that we can quickly realize that drop_caches is the reason our
caches went away, not some anomalous VM activity. A secondary goal is
to tell the user: "Hey, maybe this isn't something you want to be doing
all the time."
Re: [PATCH] add some drop_caches documentation and info messsge
On Wed, Oct 24, 2012 at 08:56:45PM -0400, KOSAKI Motohiro wrote: > > That effectively means removing it from the kernel since distros ship > > with those config options off. We don't want to do that since there > > _are_ valid, occasional uses like benchmarking that we want to be > > consistent. > > Agreed. We never want to remove a valid interface. Ok, duly noted. But let's discuss this a bit further. So, for the benchmarking aspect, you're either going to have to always require dmesg along with benchmarking results or /proc/vmstat, depending on where the drop_caches stats end up. Is this how you envision it? And then there are the VM bug cases, where you might not always get full dmesg from a panicked system. In that case, you'd want the kernel tainting thing too, so that it at least appears in the oops backtrace. Although the tainting thing might not be enough - a user could drop_caches at some point in time and the oops happening much later could be unrelated but that can't be expressed in taint flags. So you'd need some sort of a drop_caches counter, I'd guess. Or a last-drop_caches timestamp or something. Am I understanding the intent correctly? Thanks. -- Regards/Gruss, Boris.
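No such counter exists in /proc/vmstat as of this thread, but if a field were added along the lines Boris sketches, sampling it from a benchmark harness would be a one-liner. A minimal sketch, with the caveat that the `drop_caches` field name is entirely hypothetical and the file path is parameterized so the helper can be shown against a scratch file:

```shell
# Hypothetical sketch: read a "drop_caches" event counter out of a
# vmstat-style file of "name value" lines.  No such counter exists in
# the real /proc/vmstat; the field name here is made up.
read_drop_caches_count() {
    # $1: path to a vmstat-style file (the real one would be /proc/vmstat)
    awk '$1 == "drop_caches" { print $2 }' "$1"
}

# Demonstration against a fabricated vmstat file:
printf 'nr_free_pages 123456\ndrop_caches 7\n' > /tmp/fake_vmstat
read_drop_caches_count /tmp/fake_vmstat   # prints: 7
```

A benchmark script would sample the counter before and after a run to verify nobody dropped caches mid-measurement.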
Re: [PATCH] add some drop_caches documentation and info messsge
On Wed 24-10-12 12:54:39, Andrew Morton wrote:
> On Wed, 24 Oct 2012 08:29:45 +0200 Michal Hocko <mho...@suse.cz> wrote:
> [...]
> hmpf. This patch worries me. If there are people out there who are
> regularly using drop_caches because the VM sucks, it seems pretty
> obnoxious of us to go dumping stuff into their syslog. What are they
> supposed to do? Stop using drop_caches?
>
> But that would unfix the problem which they fixed with drop_caches in
> the first case. And they might not even have control over the code -
> they need to go back to their supplier and say "please send me a new
> version", along with all the additional costs and risks involved in an
> update.

I understand your worries and that's why I suggested a higher log level
which is under the admin's control. Does even that sound too excessive?

> > > More friendly alternatives might be:
> > >
> > > - Taint the kernel. But that will only become apparent with an oops
> > >   trace or similar.
> > >
> > > - Add a drop_caches counter and make that available in /proc/vmstat,
> > >   show_mem() output and perhaps other places.
> >
> > We would lose timing and originating process name in both cases which
> > can be really helpful while debugging. It is fair to say that we could
> > deduce the timing if we are collecting /proc/meminfo or /proc/vmstat
> > already and we do collect them often but this is not the case all of
> > the time and sometimes it is important to know _who_ is doing all this.
>
> But how important is all that? The main piece of information the kernel
> developer wants is "this guy is using drop_caches a lot". All the other
> info is peripheral and can be gathered by other means if so desired.

Well, I have experienced a debugging session where I suspected that
excessive drop_caches use was going on but I had a hard time proving who
was doing it (the customer, of course, claimed they were not doing
anything like that), so we went through many loops until we could point
the finger.
--
Michal Hocko
SUSE Labs
Re: [PATCH] add some drop_caches documentation and info messsge
On Wed 24-10-12 18:35:43, KOSAKI Motohiro wrote:
> >> I have drop_caches in my suspend-to-disk script so that the hibernation
> >> image is kept at minimum and suspend times are as small as possible.
> >
> > hm, that sounds smart.
> >
> >> Would that be a valid use-case?
> >
> > I'd say so, unless we change the kernel to do that internally. We do
> > have the hibernation-specific shrink_all_memory() in the vmscan code.
> > We didn't see fit to document _why_ that exists, but IIRC it's there to
> > create enough free memory for hibernation to be able to successfully
> > complete, but no more.
>
> shrink_all_memory() drops the minimum amount of memory needed for
> hibernation; it's a trade-off:
>
> - drop all page cache
>   pros: speeds up hibernation
>   cons: after resuming from hibernation, the system runs very slowly for
>         a while until it regains enough file cache
>
> - drop the minimum page cache
>   pros: the system is responsive right after resuming from hibernation
>   cons: relatively long hibernation time
>
> So I'm not keen on changing the hibernation default.
>
> hmmm... Does adding a tracepoint instead of the printk make sense?

I guess you mean trace_printk. I have seen that one used for debugging
purposes only but it seems like it could be used here. CONFIG_TRACING
seems to be enabled on most distribution kernels. I am just worried that
it needs debugfs mounted, and my recollection is that this has some
security implications, so there might be some pushback on mounting it on
production systems, which would defeat the primary motivation. Maybe
this concern is not that important wrt. excessive logging, though.

I can live with this solution as well if people really hate the logging
approach.

--
Michal Hocko
SUSE Labs
Re: [PATCH] add some drop_caches documentation and info messsge
On Thu, Oct 25, 2012 at 04:57:11AM -0700, Dave Hansen wrote:
> On 10/25/2012 02:24 AM, Borislav Petkov wrote:
> > But let's discuss this a bit further. So, for the benchmarking aspect,
> > you're either going to have to always require dmesg along with
> > benchmarking results or /proc/vmstat, depending on where the drop_caches
> > stats end up.
> >
> > Is this how you envision it?
> >
> > And then there are the VM bug cases, where you might not always get
> > full dmesg from a panicked system. In that case, you'd want the kernel
> > tainting thing too, so that it at least appears in the oops backtrace.
> >
> > Although the tainting thing might not be enough - a user could
> > drop_caches at some point in time and the oops happening much later
> > could be unrelated but that can't be expressed in taint flags.
>
> Here's the problem: Joe Kernel Developer gets a bug report, usually
> something like "the kernel is slow", or "the kernel is eating up all my
> memory". We then start going and digging in to the problem with the
> usual tools. We almost *ALWAYS* get dmesg, and it's reasonably common,
> but less likely, that we get things like vmstat along with such a bug
> report. Joe Kernel Developer digs in the statistics or the dmesg and
> tries to figure out what happened.
>
> I've run in to a couple of cases in practice (and I assume Michal has
> too) where the bug reporter was using drop_caches _heavily_ and did not
> realize the implications. It was quite hard to track down exactly how
> the page cache and dentries/inodes were getting purged. There are
> rarely oopses involved in these scenarios.
>
> The primary goal of this patch is to make debugging those scenarios
> easier so that we can quickly realize that drop_caches is the reason our
> caches went away, not some anomalous VM activity. A secondary goal is
> to tell the user: "Hey, maybe this isn't something you want to be doing
> all the time."

Ok, understood. So you will be requiring dmesg, ok, then it makes sense.
This way you're also getting timestamps of when exactly and how many
times drop_caches was used.

For that, though, you'll need to add the timestamp explicitly to the
printk because CONFIG_PRINTK_TIME is not always enabled.

Thanks.

--
Regards/Gruss,
Boris.
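The dmesg-based workflow Boris describes can be sketched as follows. The message text assumes the wording of the patch in this thread ("dropped kernel caches") plus a CONFIG_PRINTK_TIME-style timestamp, and the fabricated log file stands in for a real dmesg capture:

```shell
# Sketch: recover the times and frequency of drop_caches use from a
# saved dmesg log, assuming the message format of the patch under
# discussion ("comm (pid): dropped kernel caches: N").
grep_drop_caches() {
    # $1: path to a saved dmesg log
    grep 'dropped kernel caches' "$1"
}

# Demonstration with a fabricated log line:
printf '[  532.117841] bash (2912): dropped kernel caches: 3\n' > /tmp/fake_dmesg
grep_drop_caches /tmp/fake_dmesg
```

With CONFIG_PRINTK_TIME enabled, the bracketed timestamps give the "when exactly" part; the line count gives the "how many times".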
Re: [PATCH] add some drop_caches documentation and info messsge
On Thu 25-10-12 04:57:11, Dave Hansen wrote:
[...]
> Here's the problem: Joe Kernel Developer gets a bug report, usually
> something like "the kernel is slow", or "the kernel is eating up all my
> memory". We then start going and digging in to the problem with the
> usual tools. We almost *ALWAYS* get dmesg, and it's reasonably common,
> but less likely, that we get things like vmstat along with such a bug
> report. Joe Kernel Developer digs in the statistics or the dmesg and
> tries to figure out what happened.
>
> I've run in to a couple of cases in practice (and I assume Michal has
> too) where the bug reporter was using drop_caches _heavily_ and did not
> realize the implications. It was quite hard to track down exactly how
> the page cache and dentries/inodes were getting purged.

Yes, very same here. Not that I meet issues like that often, but it has
happened a few times in the past and it was always a lot of burnt time.

> There are rarely oopses involved in these scenarios.
>
> The primary goal of this patch is to make debugging those scenarios
> easier so that we can quickly realize that drop_caches is the reason our
> caches went away, not some anomalous VM activity. A secondary goal is
> to tell the user: "Hey, maybe this isn't something you want to be doing
> all the time."

--
Michal Hocko
SUSE Labs
Re: [PATCH] add some drop_caches documentation and info messsge
On Wednesday, October 24, 2012 06:17:52 PM Andrew Morton wrote:
> On Thu, 25 Oct 2012 00:04:46 +0200 "Rafael J. Wysocki" <r...@sisk.pl> wrote:
> > On Wednesday 24 of October 2012 14:13:03 Andrew Morton wrote:
> > > On Wed, 24 Oct 2012 23:06:00 +0200 Borislav Petkov <b...@alien8.de> wrote:
> > > > On Wed, Oct 24, 2012 at 01:48:36PM -0700, Andrew Morton wrote:
> > > > > Well who knows. Could be that people's vm *does* suck. Or they have
> > > > > some particularly peculiar workload or requirement[*]. Or their VM
> > > > > *used* to suck, and the drop_caches is not really needed any more but
> > > > > it's there in vendor-provided code and they can't practically prevent
> > > > > it.
> > > >
> > > > I have drop_caches in my suspend-to-disk script so that the hibernation
> > > > image is kept at minimum and suspend times are as small as possible.
> > >
> > > hm, that sounds smart.
> > >
> > > > Would that be a valid use-case?
> > >
> > > I'd say so, unless we change the kernel to do that internally. We do
> > > have the hibernation-specific shrink_all_memory() in the vmscan code.
> > > We didn't see fit to document _why_ that exists, but IIRC it's there to
> > > create enough free memory for hibernation to be able to successfully
> > > complete, but no more.
> >
> > That's correct.
>
> Well, my point was: how about the idea of reclaiming clean pagecache
> (and inodes, dentries, etc) before hibernation so we read/write less
> disk data?

We may actually want to write more into the image to improve post-resume
responsiveness.

> Given that it's so easy to do from the hibernation script, I guess
> there's not much point...

Well, I'd say so. :-)

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
Re: [PATCH] add some drop_caches documentation and info messsge
On Thu, 25 Oct 2012 00:04:46 +0200 "Rafael J. Wysocki" wrote: > On Wednesday 24 of October 2012 14:13:03 Andrew Morton wrote: > > On Wed, 24 Oct 2012 23:06:00 +0200 > > Borislav Petkov wrote: > > > > > On Wed, Oct 24, 2012 at 01:48:36PM -0700, Andrew Morton wrote: > > > > Well who knows. Could be that people's vm *does* suck. Or they have > > > > some particularly peculiar workload or requirement[*]. Or their VM > > > > *used* to suck, and the drop_caches is not really needed any more but > > > > it's there in vendor-provided code and they can't practically prevent > > > > it. > > > > > > I have drop_caches in my suspend-to-disk script so that the hibernation > > > image is kept at minimum and suspend times are as small as possible. > > > > hm, that sounds smart. > > > > > Would that be a valid use-case? > > > > I'd say so, unless we change the kernel to do that internally. We do > > have the hibernation-specific shrink_all_memory() in the vmscan code. > > We didn't see fit to document _why_ that exists, but IIRC it's there to > > create enough free memory for hibernation to be able to successfully > > complete, but no more. > > That's correct. Well, my point was: how about the idea of reclaiming clean pagecache (and inodes, dentries, etc) before hibernation so we read/write less disk data? Given that it's so easy to do from the hibernation script, I guess there's not much point...
Re: [PATCH] add some drop_caches documentation and info messsge
On Wed, Oct 24, 2012 at 6:57 PM, Dave Hansen wrote: > On 10/24/2012 03:48 PM, Borislav Petkov wrote: >> On Wed, Oct 24, 2012 at 02:18:38PM -0700, Dave Hansen wrote: >>> Sounds fairly valid to me. But, it's also one that would not be harmed >>> or disrupted in any way because of a single additional printk() during >>> each suspend-to-disk operation. >> >> back to the drop_caches patch. How about we hide the drop_caches >> interface behind some mm debugging option in "Kernel Hacking"? Assuming >> we don't need it otherwise on production kernels. Probably make it >> depend on CONFIG_DEBUG_VM like CONFIG_DEBUG_VM_RB or so. >> >> And then also add it to /proc/vmstat, in addition. > > That effectively means removing it from the kernel since distros ship > with those config options off. We don't want to do that since there > _are_ valid, occasional uses like benchmarking that we want to be > consistent. Agreed. We never want to remove a valid interface.
Re: [PATCH] add some drop_caches documentation and info messsge
On 10/24/2012 03:48 PM, Borislav Petkov wrote: > On Wed, Oct 24, 2012 at 02:18:38PM -0700, Dave Hansen wrote: >> Sounds fairly valid to me. But, it's also one that would not be harmed >> or disrupted in any way because of a single additional printk() during >> each suspend-to-disk operation. > > back to the drop_caches patch. How about we hide the drop_caches > interface behind some mm debugging option in "Kernel Hacking"? Assuming > we don't need it otherwise on production kernels. Probably make it > depend on CONFIG_DEBUG_VM like CONFIG_DEBUG_VM_RB or so. > > And then also add it to /proc/vmstat, in addition. That effectively means removing it from the kernel since distros ship with those config options off. We don't want to do that since there _are_ valid, occasional uses like benchmarking that we want to be consistent.
Re: [PATCH] add some drop_caches documentation and info messsge
On Wed, Oct 24, 2012 at 02:18:38PM -0700, Dave Hansen wrote: > Sounds fairly valid to me. But, it's also one that would not be harmed > or disrupted in any way because of a single additional printk() during > each suspend-to-disk operation. Btw, back to the drop_caches patch. How about we hide the drop_caches interface behind some mm debugging option in "Kernel Hacking"? Assuming we don't need it otherwise on production kernels. Probably make it depend on CONFIG_DEBUG_VM like CONFIG_DEBUG_VM_RB or so. And then also add it to /proc/vmstat, in addition. -- Regards/Gruss, Boris.
Re: [PATCH] add some drop_caches documentation and info messsge
>> I have drop_caches in my suspend-to-disk script so that the hibernation
>> image is kept at minimum and suspend times are as small as possible.
>
> hm, that sounds smart.
>
>> Would that be a valid use-case?
>
> I'd say so, unless we change the kernel to do that internally. We do
> have the hibernation-specific shrink_all_memory() in the vmscan code.
> We didn't see fit to document _why_ that exists, but IIRC it's there to
> create enough free memory for hibernation to be able to successfully
> complete, but no more.

shrink_all_memory() drops the minimum amount of memory needed for
hibernation; it's a trade-off:

- drop all page cache
  pros: speeds up hibernation
  cons: after resuming from hibernation, the system runs very slowly for
        a while until it regains enough file cache

- drop the minimum page cache
  pros: the system is responsive right after resuming from hibernation
  cons: relatively long hibernation time

So I'm not keen on changing the hibernation default.

hmmm... Does adding a tracepoint instead of the printk make sense?
Re: [PATCH] add some drop_caches documentation and info messsge
On Wednesday 24 of October 2012 14:13:03 Andrew Morton wrote: > On Wed, 24 Oct 2012 23:06:00 +0200 > Borislav Petkov wrote: > > > On Wed, Oct 24, 2012 at 01:48:36PM -0700, Andrew Morton wrote: > > > Well who knows. Could be that people's vm *does* suck. Or they have > > > some particularly peculiar workload or requirement[*]. Or their VM > > > *used* to suck, and the drop_caches is not really needed any more but > > > it's there in vendor-provided code and they can't practically prevent > > > it. > > > > I have drop_caches in my suspend-to-disk script so that the hibernation > > image is kept at minimum and suspend times are as small as possible. > > hm, that sounds smart. > > > Would that be a valid use-case? > > I'd say so, unless we change the kernel to do that internally. We do > have the hibernation-specific shrink_all_memory() in the vmscan code. > We didn't see fit to document _why_ that exists, but IIRC it's there to > create enough free memory for hibernation to be able to successfully > complete, but no more. That's correct. > Who owns hibernation nowadays? Rafael, I guess? I'm still maintaining it. Thanks, Rafael -- I speak only for myself. Rafael J. Wysocki, Intel Open Source Technology Center.
Re: [PATCH] add some drop_caches documentation and info messsge
On 10/24/2012 02:06 PM, Borislav Petkov wrote: > On Wed, Oct 24, 2012 at 01:48:36PM -0700, Andrew Morton wrote: >> Well who knows. Could be that people's vm *does* suck. Or they have >> some particularly peculiar workload or requirement[*]. Or their VM >> *used* to suck, and the drop_caches is not really needed any more but >> it's there in vendor-provided code and they can't practically prevent >> it. > > I have drop_caches in my suspend-to-disk script so that the hibernation > image is kept at minimum and suspend times are as small as possible. > > Would that be a valid use-case? Sounds fairly valid to me. But, it's also one that would not be harmed or disrupted in any way because of a single additional printk() during each suspend-to-disk operation.
Re: [PATCH] add some drop_caches documentation and info messsge
On Wed, 24 Oct 2012 23:06:00 +0200 Borislav Petkov wrote: > On Wed, Oct 24, 2012 at 01:48:36PM -0700, Andrew Morton wrote: > > Well who knows. Could be that people's vm *does* suck. Or they have > > some particularly peculiar workload or requirement[*]. Or their VM > > *used* to suck, and the drop_caches is not really needed any more but > > it's there in vendor-provided code and they can't practically prevent > > it. > > I have drop_caches in my suspend-to-disk script so that the hibernation > image is kept at minimum and suspend times are as small as possible. hm, that sounds smart. > Would that be a valid use-case? I'd say so, unless we change the kernel to do that internally. We do have the hibernation-specific shrink_all_memory() in the vmscan code. We didn't see fit to document _why_ that exists, but IIRC it's there to create enough free memory for hibernation to be able to successfully complete, but no more. Who owns hibernation nowadays? Rafael, I guess?
Re: [PATCH] add some drop_caches documentation and info messsge
On Wed, Oct 24, 2012 at 01:48:36PM -0700, Andrew Morton wrote: > Well who knows. Could be that people's vm *does* suck. Or they have > some particularly peculiar workload or requirement[*]. Or their VM > *used* to suck, and the drop_caches is not really needed any more but > it's there in vendor-provided code and they can't practically prevent > it. I have drop_caches in my suspend-to-disk script so that the hibernation image is kept at minimum and suspend times are as small as possible. Would that be a valid use-case? -- Regards/Gruss, Boris.
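Boris's suspend-to-disk wrapper could look roughly like the sketch below. The function name is made up, and the two knob paths are passed as parameters so the logic can be demonstrated against scratch files; real use would pass /proc/sys/vm/drop_caches and /sys/power/state (and requires root):

```shell
# Hedged sketch of a suspend-to-disk script that drops clean caches
# first so the hibernation image (and thus suspend time) stays small.
hibernate_small_image() {
    drop=$1    # e.g. /proc/sys/vm/drop_caches
    state=$2   # e.g. /sys/power/state
    sync                   # drop_caches only frees *clean* cache, so flush dirty data first
    echo 3 > "$drop"       # 3 = page cache + dentries and inodes
    echo disk > "$state"   # trigger suspend-to-disk
}

# Dry run against scratch files instead of the real knobs:
hibernate_small_image /tmp/fake_drop_caches /tmp/fake_power_state
```

The `sync` matters: without it, dirty pages cannot be dropped and the image shrinks less than expected.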
Re: [PATCH] add some drop_caches documentation and info messsge
On Wed, 24 Oct 2012 13:28:19 -0700 Dave Hansen wrote: > On 10/24/2012 12:54 PM, Andrew Morton wrote: > > hmpf. This patch worries me. If there are people out there who are > > regularly using drop_caches because the VM sucks, it seems pretty > > obnoxious of us to go dumping stuff into their syslog. What are they > > supposed to do? Stop using drop_caches? > > People use drop_caches because they _think_ the VM sucks, or they > _think_ they're "tuning" their system. _They_ are supposed to stop > using drop_caches. :) Well who knows. Could be that people's vm *does* suck. Or they have some particularly peculiar workload or requirement[*]. Or their VM *used* to suck, and the drop_caches is not really needed any more but it's there in vendor-provided code and they can't practically prevent it. [*] If your workload consists of having to handle large bursts of data with minimum latency and then waiting around for another burst, it makes sense to drop all your cached data between bursts. > What kind of interface _is_ it in the first place? Is it really a > production-level thing that we expect users to be poking at? Or, is it > a rarely-used debugging and benchmarking knob which is fair game for us > to tweak like this? It was a rarely-used mainly-developer-only thing which, apparently, real people found useful at some point in the past. Perhaps we should never have offered it. > Do we have any valid uses of drop_caches where the printk() would truly > _be_ disruptive? Are those cases where we _also_ have real kernel bugs > or issues that we should be working? If it disrupts them and they go to > their vendor or the community directly, it gives us at least a shot at > fixing the real problems (or fixing the "invalid" use). Heaven knows - I'm just going from what Michal has told me and various rumors which keep surfacing on the internet ;) > Adding taint, making this a single-shot printk, or adding vmstat > counters are all good ideas.
I guess I think the disruption is a > feature because I hope it will draw some folks out of the woodwork. I had a "send mail to a...@zip.com.au" printk in 3c59x.c many years ago. For about two months. It took *years* before I stopped getting emails ;) Gee, I dunno. I have issues with it :( We could do printk_ratelimited(one-hour) but I suspect that would defeat Michal's purpose.
Re: [PATCH] add some drop_caches documentation and info messsge
On 10/24/2012 12:54 PM, Andrew Morton wrote: > hmpf. This patch worries me. If there are people out there who are > regularly using drop_caches because the VM sucks, it seems pretty > obnoxious of us to go dumping stuff into their syslog. What are they > supposed to do? Stop using drop_caches? People use drop_caches because they _think_ the VM sucks, or they _think_ they're "tuning" their system. _They_ are supposed to stop using drop_caches. :) What kind of interface _is_ it in the first place? Is it really a production-level thing that we expect users to be poking at? Or, is it a rarely-used debugging and benchmarking knob which is fair game for us to tweak like this? Do we have any valid uses of drop_caches where the printk() would truly _be_ disruptive? Are those cases where we _also_ have real kernel bugs or issues that we should be working? If it disrupts them and they go to their vendor or the community directly, it gives us at least a shot at fixing the real problems (or fixing the "invalid" use). Adding taint, making this a single-shot printk, or adding vmstat counters are all good ideas. I guess I think the disruption is a feature because I hope it will draw some folks out of the woodwork.
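For reference, the knob being debated is a single procfs file, documented in Documentation/sysctl/vm.txt. A minimal sketch of its use follows; the helper name and the optional scratch-file parameter are purely illustrative (real use writes to /proc/sys/vm/drop_caches and requires root):

```shell
# Sketch of drop_caches usage.  Accepted values per the kernel docs:
# 1 = free page cache, 2 = free dentries and inodes, 3 = both.
drop_caches() {
    level=$1
    target=${2:-/proc/sys/vm/drop_caches}
    sync                      # only clean cache can be dropped; flush dirty data first
    echo "$level" > "$target"
}

# Demonstration against a scratch file instead of the real knob:
drop_caches 3 /tmp/demo_drop_caches
```

Benchmarkers call this between runs to start from a cold cache; that is the "valid, occasional use" the thread keeps coming back to.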
Re: [PATCH] add some drop_caches documentation and info messsge
On Wed, 24 Oct 2012 08:29:45 +0200 Michal Hocko wrote: > > > > > > + printk(KERN_NOTICE "%s (%d): dropped kernel caches: %d\n", > > > + current->comm, task_pid_nr(current), > > > sysctl_drop_caches); > > > > urgh. Are we really sure we want to do this? The system operators who > > are actually using this thing will hate us :( > > I have no problems with lowering the priority (how do you see > KERN_INFO?) but shouldn't this message kick them that they are doing > something wrong? Or if somebody uses that for "benchmarking" to have a > clean table before start is this really that invasive? hmpf. This patch worries me. If there are people out there who are regularly using drop_caches because the VM sucks, it seems pretty obnoxious of us to go dumping stuff into their syslog. What are they supposed to do? Stop using drop_caches? But that would unfix the problem which they fixed with drop_caches in the first case. And they might not even have control over the code - they need to go back to their supplier and say "please send me a new version", along with all the additional costs and risks involved in an update. > > More friendly alternatives might be: > > > > - Taint the kernel. But that will only become apparent with an oops > > trace or similar. > > > > - Add a drop_caches counter and make that available in /proc/vmstat, > > show_mem() output and perhaps other places. > > We would lose timing and originating process name in both cases which > can be really helpful while debugging. It is fair to say that we could > deduce the timing if we are collecting /proc/meminfo or /proc/vmstat > already and we do collect them often but this is not the case all of the > time and sometimes it is important to know _who_ is doing all this. But how important is all that? The main piece of information the kernel developer wants is "this guy is using drop_caches a lot". All the other info is peripheral and can be gathered by other means if so desired.
-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] add some drop_caches documentation and info messsge
On Tue 23-10-12 16:45:46, Andrew Morton wrote:
> On Fri, 12 Oct 2012 14:57:08 +0200 Michal Hocko wrote:
>
> > Hi,
> > I would like to resurrect the following Dave's patch. The last time it
> > was posted was here https://lkml.org/lkml/2010/9/16/250 and there
> > didn't seem to be any strong opposition.
> > Kosaki was worried about possible excessive logging when somebody drops
> > caches too often (but then he claimed he didn't have a strong opinion
> > on that) but I would say the opposite. If somebody does that then I
> > would really like to know it from the log when supporting a system,
> > because it almost for sure means that there is something fishy going
> > on. It is also worth mentioning that only root can write drop_caches,
> > so this is not a flooding attack vector.
> > I am bringing this up again because it can be really helpful when
> > chasing strange performance issues which (surprise surprise) turn out
> > to be related to artificially dropped caches, done because the admin
> > thinks this would help...
> >
> > I have just refreshed the original patch on top of the current mm tree
> > but I could live with KERN_INFO as well if people think that KERN_NOTICE
> > is too hysterical.
> > ---
> > >From 1f4058be9b089bc9d43d71bc63989335d7637d8d Mon Sep 17 00:00:00 2001
> > From: Dave Hansen
> > Date: Fri, 12 Oct 2012 14:30:54 +0200
> > Subject: [PATCH] add some drop_caches documentation and info messsge
> >
> > There is plenty of anecdotal evidence and a load of blog posts
> > suggesting that using "drop_caches" periodically keeps your system
> > running in "tip top shape". Perhaps adding some kernel
> > documentation will increase the amount of accurate data on its use.
> >
> > If we are not shrinking caches effectively, then we have real bugs.
> > Using drop_caches will simply mask the bugs and make them harder
> > to find, but certainly does not fix them, nor is it an appropriate
> > "workaround" to limit the size of the caches.
> >
> > It's a great debugging tool, and is really handy for doing things
> > like repeatable benchmark runs. So, add a bit more documentation
> > about it, and add a little KERN_NOTICE. It should help developers
> > who are chasing down reclaim-related bugs.
> >
> > ...
> >
> > +	printk(KERN_NOTICE "%s (%d): dropped kernel caches: %d\n",
> > +		current->comm, task_pid_nr(current), sysctl_drop_caches);
>
> urgh. Are we really sure we want to do this? The system operators who
> are actually using this thing will hate us :(

I have no problems with lowering the priority (how do you see
KERN_INFO?) but shouldn't this message kick them that they are doing
something wrong? Or if somebody uses that for "benchmarking" to have a
clean table before the start, is this really that invasive?

> More friendly alternatives might be:
>
> - Taint the kernel. But that will only become apparent with an oops
>   trace or similar.
>
> - Add a drop_caches counter and make that available in /proc/vmstat,
>   show_mem() output and perhaps other places.

We would lose the timing and originating process name in both cases,
which can be really helpful while debugging. It is fair to say that we
could deduce the timing if we are collecting /proc/meminfo or
/proc/vmstat already, and we do collect them often, but this is not the
case all of the time, and sometimes it is important to know _who_ is
doing all this.
--
Michal Hocko
SUSE Labs
Re: [PATCH] add some drop_caches documentation and info messsge
On 10/24/2012 12:54 PM, Andrew Morton wrote:
> hmpf. This patch worries me. If there are people out there who are
> regularly using drop_caches because the VM sucks, it seems pretty
> obnoxious of us to go dumping stuff into their syslog. What are they
> supposed to do? Stop using drop_caches?

People use drop_caches because they _think_ the VM sucks, or they
_think_ they're "tuning" their system. _They_ are supposed to stop
using drop_caches. :)

What kind of interface _is_ it in the first place? Is it really a
production-level thing that we expect users to be poking at? Or, is it
a rarely-used debugging and benchmarking knob which is fair game for us
to tweak like this?

Do we have any valid uses of drop_caches where the printk() would truly
_be_ disruptive? Are those cases where we _also_ have real kernel bugs
or issues that we should be working on? If it disrupts them and they go
to their vendor or the community directly, it gives us at least a shot
at fixing the real problems (or fixing the invalid use).

Adding taint, making this a single-shot printk, or adding vmstat
counters are all good ideas. I guess I think the disruption is a
feature, because I hope it will draw some folks out of the woodwork.
Re: [PATCH] add some drop_caches documentation and info messsge
On Wed, 24 Oct 2012 13:28:19 -0700 Dave Hansen d...@linux.vnet.ibm.com wrote:

> On 10/24/2012 12:54 PM, Andrew Morton wrote:
> > hmpf. This patch worries me. If there are people out there who are
> > regularly using drop_caches because the VM sucks, it seems pretty
> > obnoxious of us to go dumping stuff into their syslog. What are they
> > supposed to do? Stop using drop_caches?
>
> People use drop_caches because they _think_ the VM sucks, or they
> _think_ they're "tuning" their system. _They_ are supposed to stop
> using drop_caches. :)

Well who knows. Could be that people's vm *does* suck. Or they have
some particularly peculiar workload or requirement[*]. Or their VM
*used* to suck, and the drop_caches is not really needed any more, but
it's there in vendor-provided code and they can't practically prevent
it.

[*] If your workload consists of having to handle large bursts of data
with minimum latency and then waiting around for another burst, it
makes sense to drop all your cached data between bursts.

> What kind of interface _is_ it in the first place? Is it really a
> production-level thing that we expect users to be poking at? Or, is it
> a rarely-used debugging and benchmarking knob which is fair game for us
> to tweak like this?

It was a rarely-used mainly-developer-only thing which, apparently, real
people found useful at some point in the past. Perhaps we should never
have offered it.

> Do we have any valid uses of drop_caches where the printk() would truly
> _be_ disruptive? Are those cases where we _also_ have real kernel bugs
> or issues that we should be working on? If it disrupts them and they go
> to their vendor or the community directly, it gives us at least a shot
> at fixing the real problems (or fixing the invalid use).

Heaven knows - I'm just going from what Michal has told me and various
rumors which keep surfacing on the internet ;)

> Adding taint, making this a single-shot printk, or adding vmstat
> counters are all good ideas. I guess I think the disruption is a
> feature, because I hope it will draw some folks out of the woodwork.

I had a "send mail to a...@zip.com.au" printk in 3c59x.c many years ago.
For about two months. It took *years* before I stopped getting emails ;)

Gee, I dunno. I have issues with it :( We could do
printk_ratelimited(one-hour) but I suspect that would defeat Michal's
purpose.
Re: [PATCH] add some drop_caches documentation and info messsge
On Wed, Oct 24, 2012 at 01:48:36PM -0700, Andrew Morton wrote:
> Well who knows. Could be that people's vm *does* suck. Or they have
> some particularly peculiar workload or requirement[*]. Or their VM
> *used* to suck, and the drop_caches is not really needed any more, but
> it's there in vendor-provided code and they can't practically prevent
> it.

I have drop_caches in my suspend-to-disk script so that the hibernation
image is kept at a minimum and suspend times are as small as possible.

Would that be a valid use-case?

--
Regards/Gruss,
Boris.
Re: [PATCH] add some drop_caches documentation and info messsge
On Wed, 24 Oct 2012 23:06:00 +0200 Borislav Petkov b...@alien8.de wrote:

> On Wed, Oct 24, 2012 at 01:48:36PM -0700, Andrew Morton wrote:
> > Well who knows. Could be that people's vm *does* suck. Or they have
> > some particularly peculiar workload or requirement[*]. Or their VM
> > *used* to suck, and the drop_caches is not really needed any more, but
> > it's there in vendor-provided code and they can't practically prevent
> > it.
>
> I have drop_caches in my suspend-to-disk script so that the hibernation
> image is kept at a minimum and suspend times are as small as possible.

hm, that sounds smart.

> Would that be a valid use-case?

I'd say so, unless we change the kernel to do that internally.

We do have the hibernation-specific shrink_all_memory() in the vmscan
code. We didn't see fit to document _why_ that exists, but IIRC it's
there to create enough free memory for hibernation to be able to
successfully complete, but no more.

Who owns hibernation nowadays? Rafael, I guess?
Re: [PATCH] add some drop_caches documentation and info messsge
On 10/24/2012 02:06 PM, Borislav Petkov wrote:
> On Wed, Oct 24, 2012 at 01:48:36PM -0700, Andrew Morton wrote:
> > Well who knows. Could be that people's vm *does* suck. Or they have
> > some particularly peculiar workload or requirement[*]. Or their VM
> > *used* to suck, and the drop_caches is not really needed any more, but
> > it's there in vendor-provided code and they can't practically prevent
> > it.
>
> I have drop_caches in my suspend-to-disk script so that the hibernation
> image is kept at a minimum and suspend times are as small as possible.
>
> Would that be a valid use-case?

Sounds fairly valid to me. But, it's also one that would not be harmed
or disrupted in any way by a single additional printk() during each
suspend-to-disk operation.
Re: [PATCH] add some drop_caches documentation and info messsge
On Wednesday 24 of October 2012 14:13:03 Andrew Morton wrote:
> On Wed, 24 Oct 2012 23:06:00 +0200 Borislav Petkov b...@alien8.de wrote:
> > On Wed, Oct 24, 2012 at 01:48:36PM -0700, Andrew Morton wrote:
> > > Well who knows. Could be that people's vm *does* suck. Or they have
> > > some particularly peculiar workload or requirement[*]. Or their VM
> > > *used* to suck, and the drop_caches is not really needed any more,
> > > but it's there in vendor-provided code and they can't practically
> > > prevent it.
> >
> > I have drop_caches in my suspend-to-disk script so that the hibernation
> > image is kept at a minimum and suspend times are as small as possible.
>
> hm, that sounds smart.
>
> > Would that be a valid use-case?
>
> I'd say so, unless we change the kernel to do that internally.
>
> We do have the hibernation-specific shrink_all_memory() in the vmscan
> code. We didn't see fit to document _why_ that exists, but IIRC it's
> there to create enough free memory for hibernation to be able to
> successfully complete, but no more.

That's correct.

> Who owns hibernation nowadays? Rafael, I guess?

I'm still maintaining it.

Thanks,
Rafael

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
Re: [PATCH] add some drop_caches documentation and info messsge
> > I have drop_caches in my suspend-to-disk script so that the hibernation
> > image is kept at a minimum and suspend times are as small as possible.
>
> hm, that sounds smart.
>
> > Would that be a valid use-case?
>
> I'd say so, unless we change the kernel to do that internally.
>
> We do have the hibernation-specific shrink_all_memory() in the vmscan
> code. We didn't see fit to document _why_ that exists, but IIRC it's
> there to create enough free memory for hibernation to be able to
> successfully complete, but no more.

shrink_all_memory() drops only the minimum amount of memory needed for
hibernation. It's a trade-off:

- drop all page cache
  pros: shorter hibernation time
  cons: after resuming from hibernation, the system runs very slowly
        for a while, until it has built up enough file cache again

- drop the minimum amount of page cache
  pros: the system is responsive right away after resuming from
        hibernation
  cons: relatively long hibernation time

So I'm not keen on changing the hibernation default. hmmm...

Does adding a tracepoint instead of the printk make sense?
Re: [PATCH] add some drop_caches documentation and info messsge
On Wed, Oct 24, 2012 at 02:18:38PM -0700, Dave Hansen wrote:
> Sounds fairly valid to me. But, it's also one that would not be harmed
> or disrupted in any way because of a single additional printk() during
> each suspend-to-disk operation.

Btw, back to the drop_caches patch.

How about we hide the drop_caches interface behind some mm debugging
option in Kernel Hacking? Assuming we don't need it otherwise on
production kernels. Probably make it depend on CONFIG_DEBUG_VM like
CONFIG_DEBUG_VM_RB or so.

And then also add it to /proc/vmstat, in addition.

--
Regards/Gruss,
Boris.
Re: [PATCH] add some drop_caches documentation and info messsge
On 10/24/2012 03:48 PM, Borislav Petkov wrote:
> On Wed, Oct 24, 2012 at 02:18:38PM -0700, Dave Hansen wrote:
> > Sounds fairly valid to me. But, it's also one that would not be harmed
> > or disrupted in any way because of a single additional printk() during
> > each suspend-to-disk operation.
>
> back to the drop_caches patch. How about we hide the drop_caches
> interface behind some mm debugging option in Kernel Hacking? Assuming
> we don't need it otherwise on production kernels. Probably make it
> depend on CONFIG_DEBUG_VM like CONFIG_DEBUG_VM_RB or so.
>
> And then also add it to /proc/vmstat, in addition.

That effectively means removing it from the kernel, since distros ship
with those config options off. We don't want to do that, since there
_are_ valid, occasional uses like benchmarking that we want to be
consistent.
Re: [PATCH] add some drop_caches documentation and info messsge
On Wed, Oct 24, 2012 at 6:57 PM, Dave Hansen d...@linux.vnet.ibm.com wrote:
> On 10/24/2012 03:48 PM, Borislav Petkov wrote:
> > back to the drop_caches patch. How about we hide the drop_caches
> > interface behind some mm debugging option in Kernel Hacking? Assuming
> > we don't need it otherwise on production kernels. Probably make it
> > depend on CONFIG_DEBUG_VM like CONFIG_DEBUG_VM_RB or so.
> >
> > And then also add it to /proc/vmstat, in addition.
>
> That effectively means removing it from the kernel, since distros ship
> with those config options off. We don't want to do that, since there
> _are_ valid, occasional uses like benchmarking that we want to be
> consistent.

Agreed, we never want to remove a valid interface.
Re: [PATCH] add some drop_caches documentation and info messsge
On Thu, 25 Oct 2012 00:04:46 +0200 Rafael J. Wysocki r...@sisk.pl wrote:

> On Wednesday 24 of October 2012 14:13:03 Andrew Morton wrote:
> > On Wed, 24 Oct 2012 23:06:00 +0200 Borislav Petkov b...@alien8.de wrote:
> > > I have drop_caches in my suspend-to-disk script so that the
> > > hibernation image is kept at a minimum and suspend times are as
> > > small as possible.
> >
> > hm, that sounds smart.
> >
> > > Would that be a valid use-case?
> >
> > I'd say so, unless we change the kernel to do that internally.
> >
> > We do have the hibernation-specific shrink_all_memory() in the vmscan
> > code. We didn't see fit to document _why_ that exists, but IIRC it's
> > there to create enough free memory for hibernation to be able to
> > successfully complete, but no more.
>
> That's correct.

Well, my point was: how about the idea of reclaiming clean pagecache
(and inodes, dentries, etc) before hibernation so we read/write less
disk data? Given that it's so easy to do from the hibernation script, I
guess there's not much point...
Re: [PATCH] add some drop_caches documentation and info messsge
On Fri, 12 Oct 2012 14:57:08 +0200 Michal Hocko wrote:

> Hi,
> I would like to resurrect the following Dave's patch. The last time it
> was posted was here https://lkml.org/lkml/2010/9/16/250 and there
> didn't seem to be any strong opposition.
> Kosaki was worried about possible excessive logging when somebody drops
> caches too often (but then he claimed he didn't have a strong opinion
> on that) but I would say the opposite. If somebody does that then I
> would really like to know it from the log when supporting a system,
> because it almost for sure means that there is something fishy going
> on. It is also worth mentioning that only root can write drop_caches,
> so this is not a flooding attack vector.
> I am bringing this up again because it can be really helpful when
> chasing strange performance issues which (surprise surprise) turn out
> to be related to artificially dropped caches, done because the admin
> thinks this would help...
>
> I have just refreshed the original patch on top of the current mm tree
> but I could live with KERN_INFO as well if people think that KERN_NOTICE
> is too hysterical.
> ---
> >From 1f4058be9b089bc9d43d71bc63989335d7637d8d Mon Sep 17 00:00:00 2001
> From: Dave Hansen
> Date: Fri, 12 Oct 2012 14:30:54 +0200
> Subject: [PATCH] add some drop_caches documentation and info messsge
>
> There is plenty of anecdotal evidence and a load of blog posts
> suggesting that using "drop_caches" periodically keeps your system
> running in "tip top shape". Perhaps adding some kernel
> documentation will increase the amount of accurate data on its use.
>
> If we are not shrinking caches effectively, then we have real bugs.
> Using drop_caches will simply mask the bugs and make them harder
> to find, but certainly does not fix them, nor is it an appropriate
> "workaround" to limit the size of the caches.
>
> It's a great debugging tool, and is really handy for doing things
> like repeatable benchmark runs. So, add a bit more documentation
> about it, and add a little KERN_NOTICE. It should help developers
> who are chasing down reclaim-related bugs.
>
> ...
>
> +	printk(KERN_NOTICE "%s (%d): dropped kernel caches: %d\n",
> +		current->comm, task_pid_nr(current), sysctl_drop_caches);

urgh. Are we really sure we want to do this? The system operators who
are actually using this thing will hate us :(

More friendly alternatives might be:

- Taint the kernel. But that will only become apparent with an oops
  trace or similar.

- Add a drop_caches counter and make that available in /proc/vmstat,
  show_mem() output and perhaps other places.

I suspect the /proc/vmstat counter will suffice - if someone is having
vm issues, we'll be seeing their /proc/vmstat at some stage, and if the
drop_caches counter is high, that's enough to get suspicious?
Re: [PATCH] add some drop_caches documentation and info messsge
On 10/12/2012 05:57 AM, Michal Hocko wrote:
> I would like to resurrect the following Dave's patch. The last time it
> was posted was here https://lkml.org/lkml/2010/9/16/250 and there
> didn't seem to be any strong opposition.
> Kosaki was worried about possible excessive logging when somebody drops
> caches too often (but then he claimed he didn't have a strong opinion
> on that) but I would say the opposite. If somebody does that then I
> would really like to know it from the log when supporting a system,
> because it almost for sure means that there is something fishy going
> on. It is also worth mentioning that only root can write drop_caches,
> so this is not a flooding attack vector.

Just read through the patch again. Still looks great to me. Thanks for
bringing it up again, Michal!
Re: [PATCH] add some drop_caches documentation and info messsge
(2012/10/12 21:57), Michal Hocko wrote:
> Hi,
> I would like to resurrect the following Dave's patch. The last time it
> was posted was here https://lkml.org/lkml/2010/9/16/250 and there
> didn't seem to be any strong opposition.
> Kosaki was worried about possible excessive logging when somebody drops
> caches too often (but then he claimed he didn't have a strong opinion
> on that) but I would say the opposite. If somebody does that then I
> would really like to know it from the log when supporting a system,
> because it almost for sure means that there is something fishy going
> on. It is also worth mentioning that only root can write drop_caches,
> so this is not a flooding attack vector.
> I am bringing this up again because it can be really helpful when
> chasing strange performance issues which (surprise surprise) turn out
> to be related to artificially dropped caches, done because the admin
> thinks this would help...
>
> I have just refreshed the original patch on top of the current mm tree
> but I could live with KERN_INFO as well if people think that KERN_NOTICE
> is too hysterical.
> ---
> >From 1f4058be9b089bc9d43d71bc63989335d7637d8d Mon Sep 17 00:00:00 2001
> From: Dave Hansen
> Date: Fri, 12 Oct 2012 14:30:54 +0200
> Subject: [PATCH] add some drop_caches documentation and info messsge
>
> There is plenty of anecdotal evidence and a load of blog posts
> suggesting that using "drop_caches" periodically keeps your system
> running in "tip top shape". Perhaps adding some kernel
> documentation will increase the amount of accurate data on its use.
>
> If we are not shrinking caches effectively, then we have real bugs.
> Using drop_caches will simply mask the bugs and make them harder
> to find, but certainly does not fix them, nor is it an appropriate
> "workaround" to limit the size of the caches.
>
> It's a great debugging tool, and is really handy for doing things
> like repeatable benchmark runs. So, add a bit more documentation
> about it, and add a little KERN_NOTICE. It should help developers
> who are chasing down reclaim-related bugs.
>
> [mho...@suse.cz: refreshed to current -mm tree]
> Signed-off-by: Dave Hansen
> Reviewed-by: KAMEZAWA Hiroyuki
> Acked-by: Michal Hocko

Acked-by: KAMEZAWA Hiroyuki
Re: [PATCH] add some drop_caches documentation and info message
On 10/12/2012 05:57 AM, Michal Hocko wrote:
> I would like to resurrect the following patch from Dave. It was last
> posted at https://lkml.org/lkml/2010/9/16/250 and there didn't seem to
> be any strong opposition.
>
> Kosaki was worried about possible excessive logging when somebody drops
> caches too often (though he then said he didn't have a strong opinion
> on it), but I would say the opposite: if somebody does that, I would
> really like to know it from the log when supporting a system, because
> it almost certainly means something fishy is going on. It is also worth
> mentioning that only root can write to drop_caches, so this is not a
> flooding attack vector.

Just read through the patch again. Still looks great to me. Thanks for bringing it up again, Michal!
Re: [PATCH] add some drop_caches documentation and info message
On Fri, Oct 12, 2012 at 8:57 AM, Michal Hocko wrote:
> Hi,
> I would like to resurrect the following patch from Dave. It was last
> posted at https://lkml.org/lkml/2010/9/16/250 and there didn't seem to
> be any strong opposition.
>
> Kosaki was worried about possible excessive logging when somebody drops
> caches too often (though he then said he didn't have a strong opinion
> on it), but I would say the opposite: if somebody does that, I would
> really like to know it from the log when supporting a system, because
> it almost certainly means something fishy is going on. It is also worth
> mentioning that only root can write to drop_caches, so this is not a
> flooding attack vector.
>
> I am bringing this up again because it can be really helpful when
> chasing strange performance issues which (surprise, surprise) turn out
> to be related to caches dropped artificially because the admin thinks
> it will help...
>
> I have just refreshed the original patch on top of the current mm tree,
> but I could live with KERN_INFO as well if people think that
> KERN_NOTICE is too hysterical.
> ---
> From 1f4058be9b089bc9d43d71bc63989335d7637d8d Mon Sep 17 00:00:00 2001
> From: Dave Hansen
> Date: Fri, 12 Oct 2012 14:30:54 +0200
> Subject: [PATCH] add some drop_caches documentation and info message
>
> There is plenty of anecdotal evidence and a load of blog posts
> suggesting that using "drop_caches" periodically keeps your system
> running in "tip top shape". Perhaps adding some kernel documentation
> will increase the amount of accurate data on its use.
>
> If we are not shrinking caches effectively, then we have real bugs.
> Using drop_caches will simply mask the bugs and make them harder to
> find, but it certainly does not fix them, nor is it an appropriate
> "workaround" to limit the size of the caches.
>
> It's a great debugging tool, and is really handy for doing things like
> repeatable benchmark runs. So, add a bit more documentation about it,
> and add a little KERN_NOTICE. It should help developers who are chasing
> down reclaim-related bugs.
>
> [mho...@suse.cz: refreshed to current -mm tree]
> Signed-off-by: Dave Hansen
> Reviewed-by: KAMEZAWA Hiroyuki
> Acked-by: Michal Hocko

Looks fine.

Acked-by: KOSAKI Motohiro