Re: Break 2.4 VM in five easy steps
On Mon, Jun 11, 2001 at 04:04:45PM -0300, Rik van Riel wrote:
> On Mon, 11 Jun 2001, Maciej Zenczykowski wrote:
> > On Fri, 8 Jun 2001, Pavel Machek wrote:
> >
> > > That modulo is likely slower than dereference.
> > >
> > > > +	if (count % 256 == 0) {
> >
> > You are forgetting that this case should be converted to an and-255
> > or a plain byte reference by any optimizing compiler

You read too much into my choice - 256 is a random number ;)

> What matters is that this thing calls schedule() unconditionally
> every 256th time. Checking current->need_resched will only call
> schedule if it is needed ... not only that, but it will also
> call schedule FASTER if it is needed.

I will try this later today, but it seems right enough.
generic_file_write seems to do enough other work that a dereference
vs. an and-255 shouldn't be too bad...

Bernd Jendrissek

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Break 2.4 VM in five easy steps
On Mon, 11 Jun 2001, Maciej Zenczykowski wrote:
> On Fri, 8 Jun 2001, Pavel Machek wrote:
>
> > That modulo is likely slower than dereference.
> >
> > > +	if (count % 256 == 0) {
>
> You are forgetting that this case should be converted to an and-255
> or a plain byte reference by any optimizing compiler

Not relevant. What matters is that this thing calls schedule()
unconditionally every 256th time. Checking current->need_resched
will only call schedule if it is needed ... not only that, but it
will also call schedule FASTER if it is needed.

regards,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/		http://www.conectiva.com/
http://distro.conectiva.com/
Re: Break 2.4 VM in five easy steps
On Fri, 8 Jun 2001, Pavel Machek wrote:
> That modulo is likely slower than dereference.
>
> > +	if (count % 256 == 0) {

You are forgetting that this case should be converted to an and-255 or
a plain byte reference by any optimizing compiler - and gcc surely is
one. On x86 this code can be reduced to around 2 cycles (Pentium: mov,
or, jnz, with preceding code intertwined to cancel stalls and jnz being
likely in the code buffer)...

Maciek
Re: Break 2.4 VM in five easy steps
Hi!

> If this solves your problem, use it; if your name is Linus or Alan,
> ignore or do it right please.

Well, I guess you should use CONDITIONAL_SCHEDULE (if it is not defined
as a macro, do "if (current->need_resched) schedule()"). That modulo is
likely slower than a dereference.

> diff -u -r1.1 -r1.2
> --- linux-hack/mm/filemap.c	2001/06/06 21:16:28	1.1
> +++ linux-hack/mm/filemap.c	2001/06/07 08:57:52	1.2
> @@ -2599,6 +2599,11 @@
>  		char *kaddr;
>  		int deactivate = 1;
>  
> +		/* bernd-hack: give other processes a chance to run */
> +		if (count % 256 == 0) {
> +			schedule();
> +		}
> +
>  		/*
>  		 * Try to find the page in the cache. If it isn't there,
>  		 * allocate a free page.

--
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.
Re: Break 2.4 VM in five easy steps
Hi!

> But if the page in memory is 'dirty', you can't be efficient with
> swapping *in* the page. The page on disk is invalid and should be
> released, or am I missing something?

Yes. You are missing fragmentation. This keeps it low.

								Pavel
Re: Break 2.4 VM in five easy steps
>I realize that assembly is platform-specific. Being
>that I use the IA32 class machine, that's what I
>would write for. Others who use other platforms could
>do the deed for their native language.

Meaning we'd still need a good C implementation anyway for the 75% of
platforms nobody's going to get around to writing an assembly
implementation for this year, so we might as well do that first, eh?

As for IA32 being everywhere, 16 bit 8086 was everywhere until 1990 or
so. And 64 bitness is right around the corner. (iTanic is a pointless
way of de-optimizing for memory bus bandwidth, which is your real
bottleneck and not whatever happens inside a chip you've clock
multiplied by a factor of 12 or more. But x86-64 looks seriously cool
if AMD would get off their rear and actually implement sledgehammer in
silicon within our lifetimes. And that's probably transmeta's way of
going 64 bit eventually too. (And that was obvious even BEFORE the
cross licensing agreement was announced.))

And interestingly, an assembly routine optimized for 386 assembly just
might get beaten by C code compiled for Athlon optimization. It's not
JUST "IA32". Memory management code probably has to know about the PAE
addressing extensions, different translation lookaside buffer versions,
and interacting with the wonderful wide world of DMA. Luckily in the
kernel we just don't do floating point (MMX/3DNow/whatever it was
they're so proud of in Pentium 4 whose acronym I've forgotten at the
moment. Not SLS, that was a linux distribution...)

If you're a dyed-in-the-wool assembly hacker, go help the GCC/EGCS
folks make a better compiler. They could use you. The kernel isn't the
place for assembly optimization.

>Being that most users are on the IA32 platform, I'm
>sure they wouldn't reject an assembly solution to
>this problem.

If it's unreadable to C hackers, so that nobody understands it, so that
it's black magic that positively invites subtle bugs from other code
that has to interface with it...
Yes they darn well WOULD reject it. Simplicity and clarity are actually
slightly MORE important than raw performance, since if you just wait
six months the midrange hardware gets 30% faster.

The ONLY assembly that's left in the kernel is the stuff that's
unavoidable, like boot sectors and the setup code that bootstraps the
first kernel init function in C, or perhaps the occasional driver
that's so amazingly timing dependent it's effectively real-time
programming at the nanosecond level. (And for most of those, they've
either faked a C solution or restricted the assembly to 5 lines in the
middle of a bunch of C code. Memo: this is the kind of thing where
profanity gets into kernel comments.) And of course there are a few
assembly macros for half-dozen line things like spinlocks that either
can't be done any other way or are real bottleneck cases where the cost
of the extra opacity (which is a major cost, that is definitely taken
into consideration) honestly is worth it.

> As for kernel acceptance, that's an
>issue for the political eggheads. Not my forte. :-)

The problem in this case is that an O(n^2) or worse algorithm is being
used. Converting it to assembly isn't going to fix something that gets
quadratically worse as memory grows; it just means that instead of
blowing up at 2 gigs it now blows up at 6 gigs. That's not a long term
solution. If eliminating 5 lines of assembly is a good thing, rewriting
an entire subsystem in assembly isn't going to happen. Trust us on this
one.

Rob
Re: VM Report was:Re: Break 2.4 VM in five easy steps
Mike Galbraith <[EMAIL PROTECTED]> writes:

> On Fri, 8 Jun 2001, John Stoffel wrote:
>
> > Mike> OK, riddle me this. If this test is a crummy test, just how is
> > Mike> it that I was able to warn Rik in advance that when 2.4.5 was
> > Mike> released, he should expect complaints? How did I _know_ that?
> > Mike> The answer is that I fiddle with Rik's code a lot, and I test
> > Mike> with this test because it tells me a lot. It may not tell you
> > Mike> anything, but it does me.
>
> > I never said it was a crummy test, please do not read more into my
> > words than was written. What I was trying to get across is that just
> > one test (such as a compile of the kernel) isn't perfect at showing
> > where the problems are with the VM sub-system.
>
> Hmm...
>
> Tobias> Could you please explain what is good about this test? I
> Tobias> understand that it will stress the VM, but will it do so in a
> Tobias> realistic and relevant way?
>
> > I agree, this isn't really a good test case. I'd rather see what
> > happens when you fire up a gimp session to edit an image which is
> > *almost* the size of RAM, or even just 50% the size of ram. Then how
> > does that affect your other processes that are running at the same
> > time?
>
> ...but anyway, yes, it is just one test from any number of possibles.

One great test that I'm using regularly to see what's goin' on is at
http://lxr.linux.no/. It is a cool utility to cross reference your
Linux kernel source tree, and in the meantime eat gobs of memory, do
lots of I/O, and burn many CPU cycles (all at the same time). An ideal
test, if you ask me, and if anybody has the time, it would be nice to
see different timing numbers when run on different kernels. Just make
sure you run it on the same kernel tree to make the results
reproducible. It has three passes, and the third one is the most
interesting one (use vmstat 1 to see why).
When run with a 64MB RAM configuration it would swap heavily, with
128MB somewhat, and at 192MB maybe not (depending on the other
applications running at the same time).

Try it, it is a nice utility, and a great test. :)

--
Zlatko
Re: Break 2.4 VM in five easy steps
On Sat, 9 Jun 2001, Rik van Riel wrote:

>> Why are half the people here trying to hide behind this diskspace
>> is cheap argument? If we rely on that, then Linux sucks shit.
>
>Never mind them, I haven't seen any of them contribute
>VM code, even ;)

Nor have I, but I think you guys working on it will get it cleaned up
eventually. What bugs me is people trying to pretend that it isn't
important to fix, or that spending money to get newer hardware is an
acceptable solution.

>OTOH, disk space _is_ cheap, so the other VM - performance
>related - VM bugs do have a somewhat higher priority at the
>moment.

Yes, it is cheap. It isn't always an acceptable workaround though, so
I'm glad you guys are working on it - even if we have to wait a bit. I
have faith in the system. ;o)

--
Mike A. Harris - Linux advocate - Open Source advocate
Opinions and viewpoints expressed are solely my own.
Re: Break 2.4 VM in five easy steps
On Wed, 6 Jun 2001, Mike A. Harris wrote:

> Why are half the people here trying to hide behind this diskspace
> is cheap argument? If we rely on that, then Linux sucks shit.

Never mind them, I haven't seen any of them contribute VM code, even ;)

OTOH, disk space _is_ cheap, so the other - performance related - VM
bugs do have a somewhat higher priority at the moment.

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/		http://distro.conectiva.com/

Send all your spam to [EMAIL PROTECTED] (spam digging piggy)
Re: Break 2.4 VM in five easy steps
On 6 Jun 2001, Eric W. Biederman wrote:

> Derek Glidden <[EMAIL PROTECTED]> writes:
>
> > The problem I reported is not that 2.4 uses huge amounts of swap but
> > that trying to recover that swap off of disk under 2.4 can leave the
> > machine in an entirely unresponsive state, while 2.2 handles
> > identical situations gracefully.
>
> The interesting thing from other reports is that it appears to be
> kswapd using up CPU resources.

This part is being worked on, expect a solution for this thing soon...

Rik
Re: Break 2.4 VM in five easy steps
On Wed, 6 Jun 2001, Derek Glidden wrote:

> Or are you saying that if someone is unhappy with a particular
> situation, they should just keep their mouth shut and accept it?

There are lots of options ...

1) wait until somebody fixes the problem
2) fix the problem yourself
3) start infinite flamewars and make developers so sick of the
   problem nobody wants to fix it
4) pay someone to fix the problem ;)

Rik
Re: Break 2.4 VM in five easy steps
On Wed, 6 Jun 2001, Sean Hunter wrote:

> A working VM would have several differences from what we have in my
> opinion, among which are:
>
> - It wouldn't require 8GB of swap on my large boxes
> - It wouldn't suffer from the "bounce buffer" bug on my large boxes
> - It wouldn't cause the disk drive on my laptop to be _constantly_
>   in use even when all I have done is spawned a shell session and
>   have no large apps or daemons running.
> - It wouldn't kill things saying it was OOM unless it was OOM.

I fully agree these problems need to be fixed. I just wish I had the
time to tackle all of them right now ;)

We should be close to getting the 3rd problem fixed, and the deadlock
problem with the bounce buffers seems to be fixed already. Getting
reclaiming of swap space and OOM fixed is a matter of time ... I hope
I'll have that time in the near future.

regards,

Rik
Re: VM Report was:Re: Break 2.4 VM in five easy steps
> reads the RTC device. The patched RTC driver can then
> measure the elapsed time between the interrupt and the
> read from userspace. Voila: latency.

Interesting, but I'm not sure there's much advantage over doing it
entirely in user-space with the normal /dev/rtc:

	http://brain.mcmaster.ca/~hahn/realfeel.c

It just prints out the raw time difference from when rtc should have
woken up the program. You can do your own histogram; for summary
purposes, something like stdev is probably best.
Re: Break 2.4 VM in five easy steps
On 6 Jun 2001, Miles Lane wrote:

>> Precisely. Saying 8x RAM doesn't change it either. Sometime
>> next week I'm going to purposefully put a new 60Gb disk in on a
>> separate controller as pure swap on top of 256Mb of RAM. My
>> guess is after bootup, and login, I'll have 48Gb of stuff in
>> swap "just in case".
>
>Mike and others, I am getting tired of your comments. Sheesh.

And I'm tired of having people tell me, or tell others, to buy a faster
computer or more RAM to work around a real technical problem. If a dual
1Ghz system with 1Gb of RAM and 60GB of disk space spread across 3 U160
drives is not a modern fast workstation, I don't know what is. My
300Mhz system, however, works on its own stuff, and doesn't need
upgrading.

>The various developers who actually work on the VM have already
>acknowledged the issues and are exploring fixes, including at
>least one patch that already exists.

Precisely, which underscores what I'm saying: the problem is
acknowledged, and being worked on by talented hackers who know what
they are doing - so why must people keep saying "get more disk space,
it is cheap" et al.? That is totally nonuseful advice in most cases.
Many have pointed out already, for example, how impossible that would
be in a 500 computer webserver farm.

>It seems clear that the uproar from the people who are having
>trouble with the new VM's handling of swap space have been
>heard and folks are going to fix these problems. It may not
>happen today or tomorrow, but soon. What the heck else do you
>want?

I agree with you. What I want, when someone talks about this stuff or
inquires about it, is for people to stop telling them that their
computer is out of date and they should upgrade it, as that is bogus
advice. "It worked fine yesterday, why should I upgrade" reigns
supreme.

>Making inflammatory remarks about the current situation does
>nothing to help get the problems fixed, it just wastes our time
>and bandwidth.
It's not like there is someone forcing you to read it, though.

>So please, if you have new facts that you want to offer that
>will help us characterize and understand these VM issues better
>or discover new problems, feel free to share them. But if you
>just want to rant, I, for one, would rather you didn't.

Point noted, however that isn't going to stop anyone from speaking
their personal opinion on things. Freedom of speech.

--
Mike A. Harris - Linux advocate - Open Source advocate
Opinions and viewpoints expressed are solely my own.
Re: VM Report was:Re: Break 2.4 VM in five easy steps
On Sat, 9 Jun 2001, Jonathan Morton wrote:

> >> On the subject of Mike Galbraith's kernel compilation test, how
> >> much physical RAM does he have for his machine, what type of CPU
> >> is it, and what (approximate) type of device does he use for swap?
> >
> >It's a PIII/500 with one ide disk.
>
> ...with how much RAM? That's the important bit.

Duh! :) I'm a dipstick. 128mb.

	-Mike
Re: VM Report was:Re: Break 2.4 VM in five easy steps
On Fri, 8 Jun 2001, Marcelo Tosatti wrote:

> On Fri, 8 Jun 2001, John Stoffel wrote:
>
> > More importantly, a *repeatable* set of tests is what is needed to
> > test the VM and get consistent results from run to run, so you can
> > see how your changes are impacting performance. The kernel compile
> > doesn't really have any one process grow to a large fraction of
> > memory, so dropping in a compile which *does* is a good thing.
>
> I agree with you.
>
> Mike, I'm sure you have noticed that the stock kernel gives much
> better results than mine or Jonathan's patch.

I noticed that Jonathan brought back waiting.. that (among others) made
me very interested.

> Now the stock kernel gives us crappy interactivity compared to my
> patch. (Note: my patch still does not give me the interactivity I
> want under high VM loads, but I hope to get there soon).

(And that's why) Among other things (yes, I do love throughput) I've
poked at the interactivity problem. I can't improve it anymore without
doing some strategic waiting :( I used to be able to help it a little
by doing a careful roll-up in scrub size as load builds.. trying to
smooth the transition from latency oriented to hammer down throughput.

> BTW, we are talking with the OSDL (http://www.osdlab.org) guys about
> a possibility to set up a test system which would run a variety of
> benchmarks to give us results for different kinds of workloads. If
> that ever happens, we'll probably get rid of most of these testing
> problems.

Excellent!

	-Mike
Re: VM Report was:Re: Break 2.4 VM in five easy steps
On Fri, 8 Jun 2001, Tobias Ringstrom wrote:

> On Fri, 8 Jun 2001, Mike Galbraith wrote:
> > On Fri, 8 Jun 2001, Tobias Ringstrom wrote:
> > > On Fri, 8 Jun 2001, Mike Galbraith wrote:
> > > > I gave this a shot at my favorite vm beater test (make -j30
> > > > bzImage) while testing some other stuff today.
> > >
> > > Could you please explain what is good about this test? I
> > > understand that it will stress the VM, but will it do so in a
> > > realistic and relevant way?
> >
> > Can you explain what is bad about this test? ;) It spins the same
> > VM wheels
>
> I think a load of ~30 is quite uncommon, and therefore it is unclear
> to me that it would be a test representative of most normal loads.

It's not supposed to be representative. It's supposed to take the box
rapidly (but not instantly) from idle through low->medium->high load
and maintain solid throughput.

> > as any other load does. What's the difference if I have a bunch of
> > httpd allocating or a bunch of cc1/as/ld? This load has a modest
> > cachable data set and is compute bound.. and above all gives very
> > repeatable results.
>
> Not a big difference. The difference I was thinking about is the
> difference between spawning lots of processes allocating, using and
> freeing lots of memory, compared to a case where you have a few
> processes touching a lot of already allocated pages in some pattern.
> I was wondering whether optimizing for your case would be good or bad
> for the other case. I know, I know, I should do more testing myself.
> And I should probably not ask you, since you really really like your
> test, and you will probably just say yes... ;-)

It's not a matter of optimizing for my case.. that would be horrible.
It's a matter of whether the vm is capable of rapid and correct
responses.

> At home, I'm running a couple of computers. One of them is a slow
> computer running Linux, serving mail, NFS, SMB, etc. I'm usually
> logged in on a couple of virtual consoles. On this machine, I do not
> mind if all shells, daemons and other idle processes are being
> swapped out in favor of disk cache for the NFS and SMB serving. In
> fact, that is a very good thing, and I want it that way.
>
> Another machine is my desktop machine. When using this machine, I
> really hate when my emacsen, browsers, xterms, etc are swapped out
> just to give me some stupid disk cache for my xmms or compilations.
> I do not care if a kernel compile is a little slower as long as my
> applications are snappy.
>
> How could Linux predict this? It is a matter of taste, IMHO.

I have no idea. It would be _wonderful_ if it could detect interactive
tasks and give them preferential treatment.

> > I use it to watch reaction to surge. I watch for the vm to build to
> > a solid maximum throughput without thrashing. That's the portion of
> > VM that I'm interested in, so that's what I test. Besides :) I
> > simply don't have the hardware to try to simulate hairy chested
> > server loads. There are lots of folks with hairy chested boxes..
> > they should test that stuff.
>
> Agreed. More testing is needed. Now if we would have those knobs and
> wheels to turn, we could perhaps also tune our systems to behave as
> we like, and submit that as well. Right now you need to be a kernel
> hacker, and see through all the magic with shm, mmap, a bunch of
> caches, page lists, etc. I'd give a lot for a nice picture (or state
> diagram) showing the lifetime of a page, but I have not found such a
> picture anywhere. Besides, the VM seems to change every new release
> anyway.
>
> > I've been repeating ~this test since 2.0 times, and have noticed a
> > 1:1 relationship. When I notice that my box is ~happy doing this
> > load test, I also notice very few VM gripes hitting the list.
>
> Ok, but as you say, we need more tests.
>
> > > Isn't the interesting case when you have a number of processes
> > > using lots of memory, but only a part of all that memory is being
> > > actively used, and that memory fits in RAM. In that case, the VM
> > > should make sure that the unused memory is swapped out. In RAM
> > > you should have the used memory, but also disk cache if there is
> > > any RAM left. Does the current VM handle this case fine yet?
> > > IMHO, this is the case most people care about. It is definitely
> > > the case I care about, at least. :-)
> >
> > The interesting case is _every_ case. Try seeing my particular test
> > as a simulation of a small classroom box with 30 students compiling
> > their assignments and it'll suddenly become quite realistic. You'll
> > notice by the numbers I post that I was very careful to not
> > overload the box in a ridiculous manner when selecting the total
> > size of the job.. it's just a heavily loaded box. This test does
> > not overload my IO resources, so it tests the VM's ability to
> > choose and move the right stuff at the right time to get the job
> > done with a minimum of additional overhead.

I did not understand th
Re: VM Report was:Re: Break 2.4 VM in five easy steps
>> On the subject of Mike Galbraith's kernel compilation test, how much
>> physical RAM does he have for his machine, what type of CPU is it,
>> and what (approximate) type of device does he use for swap? I'll see
>> if I can partially duplicate his results at this end. So far all my
>> tests have been done with a fast CPU - perhaps I should try the
>> P166/MMX or even try loading linux-pmac onto my 8100.
>
>It's a PIII/500 with one ide disk.

...with how much RAM? That's the important bit.

--
from:     Jonathan "Chromatix" Morton
mail:     [EMAIL PROTECTED]  (not for attachments)

The key to knowledge is not to rely on people to teach you it.

GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
Re: VM Report was:Re: Break 2.4 VM in five easy steps
On Sat, 9 Jun 2001, Jonathan Morton wrote:

> On the subject of Mike Galbraith's kernel compilation test, how much
> physical RAM does he have for his machine, what type of CPU is it,
> and what (approximate) type of device does he use for swap? I'll see
> if I can partially duplicate his results at this end. So far all my
> tests have been done with a fast CPU - perhaps I should try the
> P166/MMX or even try loading linux-pmac onto my 8100.

It's a PIII/500 with one ide disk.

	-Mike
Re: VM Report was:Re: Break 2.4 VM in five easy steps
On Fri, 8 Jun 2001, Mike Galbraith wrote:

> On Fri, 8 Jun 2001, John Stoffel wrote:
> > I agree, this isn't really a good test case. I'd rather see what
> > happens when you fire up a gimp session to edit an image which is
> > *almost* the size of RAM, or even just 50% the size of ram.
>
> OK, riddle me this. If this test is a crummy test, just how is it

Personally, I'd like to see BOTH of these tests, and many many more.
Preferably handed to the VM hackers as various colourful graphs that
allow even severely undercaffeinated hackers to see how things changed
for the good or the bad between kernel revisions.

cheers,

Rik
Re: VM Report was:Re: Break 2.4 VM in five easy steps
Jonathan Morton wrote:
>
> [ Re-entering discussion after too long a day and a long sleep... ]
>
> >> There is the problem in terms of some people want pure interactive
> >> performance, while others are looking for throughput over all
> >> else, but those are both extremes of the spectrum. Though I
> >> suspect raw throughput is the less wanted (in terms of numbers of
> >> systems) than keeping interactive response good during VM
> >> pressure.
> >
> >And this raises a very very important point: raw throughput wins
> >enterprise-like benchmarks, and the enterprise people are the ones
> >who pay most of the hackers here. (including me and Rik)
>
> Very true. As well as the fact that interactivity is much harder to
> measure. The question is, what is interactivity (from the kernel's
> perspective)? It usually means small(ish) processes with intermittent
> working-set and CPU requirements. These types of process can safely
> be swapped out when not immediately in use, but the kernel has to be
> able to page them in quite quickly when needed. Doing that under
> heavy load is very non-trivial.

For the low-latency stuff, latency can be defined as the worst-case
time to schedule a userspace process in response to an interrupt. That
metric is also appropriate in this case (latency equals interactivity),
although here you don't need to be so fanatical about the *worst
case*. A few scheduling blips here are less fatal.

I have tools to measure latency (aka interactivity). At

	http://www.uow.edu.au/~andrewm/linux/schedlat.html#downloads

there is a kernel patch called `rtc-debug' which causes the PC RTC to
generate a stream of interrupts. A user-space task called `amlat'
responds to those interrupts and reads the RTC device. The patched RTC
driver can then measure the elapsed time between the interrupt and the
read from userspace. Voila: latency.

When you close the RTC device (by killing amlat), the RTC driver will
print out a histogram of the latencies.
`amlat' at present gives itself SCHED_RR policy and runs under mlockall() - for your testing you'll need to delete those lines. So: simply apply rtc-debug, run `amlat' and kill it when you've finished the workload.

The challenge will be to relate the latency histogram to human-perceived interactivity. I'm not sure of the best way of doing that. Perhaps monitor the 90th percentile, and aim to keep it below 100 milliseconds. Also, `amlat' should do a bit of disk I/O as well.

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: VM Report was:Re: Break 2.4 VM in five easy steps
[ Re-entering discussion after too long a day and a long sleep... ]

>> There is the problem that some people want pure interactive
>> performance, while others are looking for throughput over all else,
>> but those are both extremes of the spectrum. Though I suspect
>> raw throughput is less wanted (in terms of numbers of systems)
>> than keeping interactive response good during VM pressure.
>
> And this raises a very very important point: raw throughput wins
> enterprise-like benchmarks, and the enterprise people are the ones who pay
> most of the hackers here. (including me and Rik)

Very true. As well as the fact that interactivity is much harder to measure. The question is, what is interactivity (from the kernel's perspective)? It usually means small(ish) processes with intermittent working-set and CPU requirements. These types of process can safely be swapped out when not immediately in use, but the kernel has to be able to page them in quite quickly when needed. Doing that under heavy load is very non-trivial.

It can also mean multimedia applications with a continuous (maybe small) working set, a continuous but not 100% CPU usage, and the special property that the user WILL notice if this process gets swapped out even briefly. mpg123 and XMMS fall into this category, and I sometimes tried running these alongside my compilation tests to see how they fared. I think I had it going fairly well towards the end, with mpg123 stuttering relatively rarely and briefly while VM load was high.

On the subject of Mike Galbraith's kernel compilation test, how much physical RAM does he have for his machine, what type of CPU is it, and what (approximate) type of device does he use for swap? I'll see if I can partially duplicate his results at this end. So far all my tests have been done with a fast CPU - perhaps I should try the P166/MMX or even try loading linux-pmac onto my 8100.
-- from: Jonathan "Chromatix" Morton mail: [EMAIL PROTECTED] (not for attachments) The key to knowledge is not to rely on people to teach you it. GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
Re: VM Report was:Re: Break 2.4 VM in five easy steps
On Fri, 8 Jun 2001, John Stoffel wrote:

> Marcelo> Now the stock kernel gives us crappy interactivity compared
> Marcelo> to my patch. (Note: my patch still does not give me the
> Marcelo> interactivity I want under high VM loads, but I hope to get
> Marcelo> there soon).
>
> This raises the important question, how can we objectively measure
> interactive response in the kernel and relate it to the user's
> perceived interactive response? If we could come up with some sort of
> testing system that would show us this, it would help a lot, since we
> could just have people run tests in a more automatic and repeatable
> manner.
>
> And I think it would also help us automatically tune the kernel, since
> it would have a knowledge of its own performance.
>
> There is the problem that some people want pure interactive
> performance, while others are looking for throughput over all else,
> but those are both extremes of the spectrum. Though I suspect
> raw throughput is less wanted (in terms of numbers of systems)
> than keeping interactive response good during VM pressure.

And this raises a very very important point: raw throughput wins enterprise-like benchmarks, and the enterprise people are the ones who pay most of the hackers here. (including me and Rik) We have to be careful about that.

> I have zero knowledge of how we could do this, but giving the kernel
> some counters, even if only for use during debugging runs, which would
> give us some objective feedback on performance would be a big win.
>
> Having people just send in reports of "I ran X,Y,Z and it was slow"
> doesn't help us, since it's so hard to re-create their environment so
> you can run tests against it.

Let's wait for some test system to be set up (e.g. the OSDL thing). Once that's done, I'm sure we will find out some way of doing it.

Well, good weekend for you too.
Re: VM Report was:Re: Break 2.4 VM in five easy steps
Marcelo> Now the stock kernel gives us crappy interactivity compared
Marcelo> to my patch. (Note: my patch still does not give me the
Marcelo> interactivity I want under high VM loads, but I hope to get
Marcelo> there soon).

This raises the important question: how can we objectively measure interactive response in the kernel and relate it to the user's perceived interactive response? If we could come up with some sort of testing system that would show us this, it would help a lot, since we could just have people run tests in a more automatic and repeatable manner.

And I think it would also help us automatically tune the kernel, since it would have a knowledge of its own performance.

There is the problem that some people want pure interactive performance, while others are looking for throughput over all else, but those are both extremes of the spectrum. Though I suspect raw throughput is less wanted (in terms of numbers of systems) than keeping interactive response good during VM pressure.

I have zero knowledge of how we could do this, but giving the kernel some counters, even if only for use during debugging runs, which would give us some objective feedback on performance would be a big win.

Having people just send in reports of "I ran X,Y,Z and it was slow" doesn't help us, since it's so hard to re-create their environment so you can run tests against it.

Anyway, enjoy the weekend all.

John

John Stoffel - Senior Unix Systems Administrator - Lucent Technologies
[EMAIL PROTECTED] - http://www.lucent.com - 978-952-7548
Re: VM Report was:Re: Break 2.4 VM in five easy steps
On Fri, 8 Jun 2001, John Stoffel wrote:

> Mike> OK, riddle me this. If this test is a crummy test, just how is
> Mike> it that I was able to warn Rik in advance that when 2.4.5 was
> Mike> released, he should expect complaints? How did I _know_ that?
> Mike> The answer is that I fiddle with Rik's code a lot, and I test
> Mike> with this test because it tells me a lot. It may not tell you
> Mike> anything, but it does me.
>
> I never said it was a crummy test, please do not read more into my
> words than was written. What I was trying to get across is that just
> one test (such as a compile of the kernel) isn't perfect at showing
> where the problems are with the VM sub-system.
>
> Jonathan Morton has been using another large compile to also test the
> sub-system, and it includes a compile which puts a large, single
> process pressure on the VM. I consider this to be a more
> representative test of how the VM deals with pressure.
>
> The kernel compile is an ok test of basic VM handling, but from what
> I've been hearing on linux-kernel and linux-mm is that the VM goes to
> crap when you have a mix of stuff running, and one (or more) processes
> starts up or grows much larger and starts impacting the system
> performance.
>
> I'm also not knocking your contributions to this discussion, so stop
> being so touchy. I was trying to contribute and say (albeit poorly)
> that a *mix* of tests is needed to test the VM.
>
> More importantly, a *repeatable* set of tests is what is needed to
> test the VM and get consistent results from run to run, so you can see
> how your changes are impacting performance. The kernel compile
> doesn't really have any one process grow to a large fraction of
> memory, so dropping in a compile which *does* is a good thing.

I agree with you. Mike, I'm sure you have noticed that the stock kernel gives much better results (in throughput) than mine or Jonathan's patch. But the stock kernel gives us crappy interactivity compared to my patch.
(Note: my patch still does not give me the interactivity I want under high VM loads, but I hope to get there soon).

BTW, we are talking with the OSDL (http://www.osdlab.org) guys about the possibility of setting up a test system which would run a variety of benchmarks to give us results for different kinds of workloads. If that ever happens, we'll probably get rid of most of these testing problems.
Re: VM Report was:Re: Break 2.4 VM in five easy steps
On Fri, 8 Jun 2001, Mike Galbraith wrote:

> On Fri, 8 Jun 2001, Tobias Ringstrom wrote:
> > On Fri, 8 Jun 2001, Mike Galbraith wrote:
> > > I gave this a shot at my favorite vm beater test (make -j30 bzImage)
> > > while testing some other stuff today.
> >
> > Could you please explain what is good about this test? I understand that
> > it will stress the VM, but will it do so in a realistic and relevant way?
>
> Can you explain what is bad about this test? ;) It spins the same VM wheels

I think a load of ~30 is quite uncommon, and therefore it is unclear to me that it would be a test that would be representative of most normal loads.

> as any other load does. What's the difference if I have a bunch of httpd
> allocating or a bunch of cc1/as/ld? This load has a modest cachable data
> set and is compute bound.. and above all gives very repeatable results.

Not a big difference. The difference I was thinking about is the difference between spawning lots of processes allocating, using and freeing lots of memory, compared to a case where you have a few processes touching a lot of already allocated pages in some pattern. I was wondering whether optimizing for your case would be good or bad for the other case. I know, I know, I should do more testing myself. And I should probably not ask you, since you really really like your test, and you will probably just say yes... ;-)

At home, I'm running a couple of computers. One of them is a slow computer running Linux, serving mail, NFS, SMB, etc. I'm usually logged in on a couple of virtual consoles. On this machine, I do not mind if all shells, daemons and other idle processes are being swapped out in favor of disk cache for the NFS and SMB serving. In fact, that is a very good thing, and I want it that way.

Another machine is my desktop machine. When using this machine, I really hate it when my emacsen, browsers, xterms, etc are swapped out just to give me some stupid disk cache for my xmms or compilations.
I do not care if a kernel compile is a little slower as long as my applications are snappy. How could Linux predict this? It is a matter of taste, IMHO.

> I use it to watch reaction to surge. I watch for the vm to build to a
> solid maximum throughput without thrashing. That's the portion of VM
> that I'm interested in, so that's what I test. Besides :) I simply don't
> have the hardware to try to simulate hairy chested server loads. There
> are lots of folks with hairy chested boxes.. they should test that stuff.

Agreed. More testing is needed. Now if we had those knobs and wheels to turn, we could perhaps also tune our systems to behave as we like, and submit that as well. Right now you need to be a kernel hacker, and see through all the magic with shm, mmap, a bunch of caches, page lists, etc. I'd give a lot for a nice picture (or state diagram) showing the lifetime of a page, but I have not found such a picture anywhere. Besides, the VM seems to change with every new release anyway.

> I've been repeating ~this test since 2.0 times, and have noticed a 1:1
> relationship. When I notice that my box is ~happy doing this load test,
> I also notice very few VM gripes hitting the list.

Ok, but as you say, we need more tests.

> > Isn't the interesting case when you have a number of processes using lots
> > of memory, but only a part of all that memory is being actively used, and
> > that memory fits in RAM. In that case, the VM should make sure that the
> > not used memory is swapped out. In RAM you should have the used memory,
> > but also disk cache if there is any RAM left. Does the current VM handle
> > this case fine yet? IMHO, this is the case most people care about. It is
> > definitely the case I care about, at least. :-)
>
> The interesting case is _every_ case. Try seeing my particular test as
> a simulation of a small classroom box with 30 students compiling their
> assignments and it'll suddenly become quite realistic.
> You'll notice by the numbers I post that I was very careful to not
> overload the box in a ridiculous manner when selecting the total size of
> the job.. it's just a heavily loaded box. This test does not overload my
> IO resources, so it tests the VM's ability to choose and move the right
> stuff at the right time to get the job done with a minimum of additional
> overhead.

I did not understand those numbers when I saw them the first time. Now, I must say that your test does not look as silly as it did before.

> The current VM handles things generally well imho, but has problems
> regulating itself under load. My test load hits the VM right in its
> weakest point (not _that_ weak, but..) by starting at zero and building
> rapidly to max.. and keeping it _right there_.
>
> > I'm not saying that it's a completely uninteresting case when your active
> > memory is bigger than your RAM of course, but perhaps there should be other
> > algorithms handling that case, such as putting some of the swapping
> > processes to sleep for some time, especially if you have lots of processes
> > competing for the memory.
Re: VM Report was:Re: Break 2.4 VM in five easy steps
On Fri, 8 Jun 2001, John Stoffel wrote:

> Mike> OK, riddle me this. If this test is a crummy test, just how is
> Mike> it that I was able to warn Rik in advance that when 2.4.5 was
> Mike> released, he should expect complaints? How did I _know_ that?
> Mike> The answer is that I fiddle with Rik's code a lot, and I test
> Mike> with this test because it tells me a lot. It may not tell you
> Mike> anything, but it does me.
>
> I never said it was a crummy test, please do not read more into my
> words than was written. What I was trying to get across is that just
> one test (such as a compile of the kernel) isn't perfect at showing
> where the problems are with the VM sub-system.

Hmm...

Tobias> Could you please explain what is good about this test? I
Tobias> understand that it will stress the VM, but will it do so in a
Tobias> realistic and relevant way?

I agree, this isn't really a good test case. I'd rather see what happens when you fire up a gimp session to edit an image which is *almost* the size of RAM, or even just 50% the size of ram. Then how does that affect your other processes that are running at the same time?

...but anyway, yes, it's just one test from any number of possibles.

> Jonathan Morton has been using another large compile to also test the
> sub-system, and it includes a compile which puts a large, single
> process pressure on the VM. I consider this to be a more
> representative test of how the VM deals with pressure.

What does 'more representative' mean, given that the VM must react to every situation it runs into?

> The kernel compile is an ok test of basic VM handling, but from what

Now we're communicating. I never said it was more than that ;-)

> I've been hearing on linux-kernel and linux-mm is that the VM goes to
> crap when you have a mix of stuff running, and one (or more) processes
> starts up or grows much larger and starts impacting the system
> performance.
> I'm also not knocking your contributions to this discussion, so stop
> being so touchy. I was trying to contribute and say (albeit poorly)
> that a *mix* of tests is needed to test the VM.

Yes, more people need to test. I don't need to do all of those other tests (I don't have the right toys); more people need to do repeatable tests.

> More importantly, a *repeatable* set of tests is what is needed to
> test the VM and get consistent results from run to run, so you can see
> how your changes are impacting performance. The kernel compile
> doesn't really have any one process grow to a large fraction of
> memory, so dropping in a compile which *does* is a good thing.

I know I'm only watching basic functionality. I'm watching basic functionality with one very consistent test run very consistently.

-Mike
Re: VM Report was:Re: Break 2.4 VM in five easy steps
Mike> OK, riddle me this. If this test is a crummy test, just how is Mike> it that I was able to warn Rik in advance that when 2.4.5 was Mike> released, he should expect complaints? How did I _know_ that? Mike> The answer is that I fiddle with Rik's code a lot, and I test Mike> with this test because it tells me a lot. It may not tell you Mike> anything, but it does me. I never said it was a crummy test, please do not read more into my words than was written. What I was trying to get across is that just one test (such as a compile of the kernel) isn't perfect at showing where the problems are with the VM sub-system. Jonathan Morton has been using another large compile to also test the sub-system, and it includes a compile which puts a large, single process pressure on the VM. I consider this to be a more representative test of how the VM deals with pressure. The kernel compile is an ok test of basic VM handling, but from what I've been hearing on linux-kernel and linux-mm is that the VM goes to crap when you have a mix of stuff running, and one (or more) processes starts up or grows much larger and starts impacting the system performance. I'm also not knocking your contributions to this discussion, so stop being so touchy. I was trying to contribute and say (albeit poorly) that a *mix* of tests is needed to test the VM. More importantly, a *repeatable* set of tests is what is needed to test the VM and get consistent results from run to run, so you can see how your changes are impacting performance. The kernel compile doesn't really have any one process grow to a large fraction of memory, so dropping in a compile which *does* is a good thing. 
John

John Stoffel - Senior Unix Systems Administrator - Lucent Technologies
[EMAIL PROTECTED] - http://www.lucent.com - 978-952-7548
Re: VM Report was:Re: Break 2.4 VM in five easy steps
On Fri, 8 Jun 2001, John Stoffel wrote:

> > "Tobias" == Tobias Ringstrom <[EMAIL PROTECTED]> writes:
>
> Tobias> On Fri, 8 Jun 2001, Mike Galbraith wrote:
>
> >> I gave this a shot at my favorite vm beater test (make -j30 bzImage)
> >> while testing some other stuff today.
>
> Tobias> Could you please explain what is good about this test? I
> Tobias> understand that it will stress the VM, but will it do so in a
> Tobias> realistic and relevant way?
>
> I agree, this isn't really a good test case. I'd rather see what
> happens when you fire up a gimp session to edit an image which is
> *almost* the size of RAM, or even just 50% the size of ram. Then how
> does that affect your other processes that are running at the same
> time?

OK, riddle me this. If this test is a crummy test, just how is it that I was able to warn Rik in advance that when 2.4.5 was released, he should expect complaints? How did I _know_ that? The answer is that I fiddle with Rik's code a lot, and I test with this test because it tells me a lot. It may not tell you anything, but it does me.

-Mike
Re: VM Report was:Re: Break 2.4 VM in five easy steps
On Fri, 8 Jun 2001, Tobias Ringstrom wrote:

> On Fri, 8 Jun 2001, Mike Galbraith wrote:
> > I gave this a shot at my favorite vm beater test (make -j30 bzImage)
> > while testing some other stuff today.
>
> Could you please explain what is good about this test? I understand that
> it will stress the VM, but will it do so in a realistic and relevant way?

Can you explain what is bad about this test? ;) It spins the same VM wheels as any other load does. What's the difference if I have a bunch of httpd allocating or a bunch of cc1/as/ld? This load has a modest cachable data set and is compute bound.. and above all gives very repeatable results.

I use it to watch reaction to surge. I watch for the vm to build to a solid maximum throughput without thrashing. That's the portion of VM that I'm interested in, so that's what I test. Besides :) I simply don't have the hardware to try to simulate hairy chested server loads. There are lots of folks with hairy chested boxes.. they should test that stuff.

I've been repeating ~this test since 2.0 times, and have noticed a 1:1 relationship. When I notice that my box is ~happy doing this load test, I also notice very few VM gripes hitting the list.

> Isn't the interesting case when you have a number of processes using lots
> of memory, but only a part of all that memory is being actively used, and
> that memory fits in RAM. In that case, the VM should make sure that the
> not used memory is swapped out. In RAM you should have the used memory,
> but also disk cache if there is any RAM left. Does the current VM handle
> this case fine yet? IMHO, this is the case most people care about. It is
> definitely the case I care about, at least. :-)

The interesting case is _every_ case. Try seeing my particular test as a simulation of a small classroom box with 30 students compiling their assignments and it'll suddenly become quite realistic.
You'll notice by the numbers I post that I was very careful not to overload the box in a ridiculous manner when selecting the total size of the job.. it's just a heavily loaded box. This test does not overload my IO resources, so it tests the VM's ability to choose and move the right stuff at the right time to get the job done with a minimum of additional overhead.

The current VM handles things generally well imho, but has problems regulating itself under load. My test load hits the VM right in its weakest point (not _that_ weak, but..) by starting at zero and building rapidly to max.. and keeping it _right there_.

> I'm not saying that it's a completely uninteresting case when your active
> memory is bigger than your RAM of course, but perhaps there should be other
> algorithms handling that case, such as putting some of the swapping
> processes to sleep for some time, especially if you have lots of processes
> competing for the memory. I may be wrong, but it seems to me that your
> testcase falls into this second category (also known as thrashing).

Thrashing? Let's look at some numbers. (not the ugly ones, the ~ok ones ;)

real    9m12.198s    (make -j 30 bzImage)
user    7m41.290s
sys     0m34.840s

user  : 0:07:47.69  76.8%   page in : 452632
nice  : 0:00:00.00   0.0%   page out: 399847
system: 0:01:17.08  12.7%   swap in :  75338
idle  : 0:01:03.97  10.5%   swap out:  88291

real    8m6.994s     (make bzImage)
user    7m34.350s
sys     0m26.550s

user  : 0:07:37.52  78.4%   page in :  90546
nice  : 0:00:00.00   0.0%   page out:  18164
system: 0:01:26.13  14.8%   swap in :      1
idle  : 0:00:39.69   6.8%   swap out:      0

...look at cpu utilization. One minute + tiny change to complete the large job vs the small (VM footprint) job. The box is not thrashing, it's working its little silicon butt off. What I'm testing is the VM's ability to handle load without thrashing so badly that it loses throughput bigtime or stalls itself.. its ability to regulate itself.
I consider a minute and a half to be ~acceptable, a minute to be good, and 30 seconds to be excellent. That's just my own little VM performance thermometer.

> And at last, a humble request: Every problem I've had with the VM has been
> that it either swapped out too many processes and used too much cache, or
> the other way around. I'd really enjoy a way to tune this behaviour, if
> possible.

Tunables aren't really practical in VM (imho). If there were a dozen knobs, you'd have to turn a dozen knobs a dozen times a day. VM has to be self-regulating.

In case you can't tell (by the length of this reply), I like my favorite little generic throughput test a LOT :-)

-Mike
Re: VM Report was:Re: Break 2.4 VM in five easy steps
> "Tobias" == Tobias Ringstrom <[EMAIL PROTECTED]> writes:

Tobias> On Fri, 8 Jun 2001, Mike Galbraith wrote:

>> I gave this a shot at my favorite vm beater test (make -j30 bzImage)
>> while testing some other stuff today.

Tobias> Could you please explain what is good about this test? I
Tobias> understand that it will stress the VM, but will it do so in a
Tobias> realistic and relevant way?

I agree, this isn't really a good test case. I'd rather see what happens when you fire up a gimp session to edit an image which is *almost* the size of RAM, or even just 50% the size of ram. Then how does that affect your other processes that are running at the same time?

This testing could even be automated with the script-foo stuff to get consistent results across runs, which is the prime requirement of any sort of testing.

On another issue, in swap.c we have two defines for buffer_mem and page_cache, but the first maxes out at 60%, while the cache maxes out at 75%. Shouldn't they both be lower numbers? Or at least equally sized? I've set my page_cache maximum to be 60; I'll be trying to test it over the weekend, but good weather will keep me outside doing other stuff...

Thanks,
John

John Stoffel - Senior Unix Systems Administrator - Lucent Technologies
[EMAIL PROTECTED] - http://www.lucent.com - 978-952-7548
Re: VM Report was:Re: Break 2.4 VM in five easy steps
On Fri, 8 Jun 2001, Mike Galbraith wrote:

> I gave this a shot at my favorite vm beater test (make -j30 bzImage)
> while testing some other stuff today.

Could you please explain what is good about this test? I understand that it will stress the VM, but will it do so in a realistic and relevant way?

Isn't the interesting case when you have a number of processes using lots of memory, but only a part of all that memory is being actively used, and that memory fits in RAM? In that case, the VM should make sure that the unused memory is swapped out. In RAM you should have the used memory, but also disk cache if there is any RAM left. Does the current VM handle this case fine yet? IMHO, this is the case most people care about. It is definitely the case I care about, at least. :-)

I'm not saying that it's a completely uninteresting case when your active memory is bigger than your RAM of course, but perhaps there should be other algorithms handling that case, such as putting some of the swapping processes to sleep for some time, especially if you have lots of processes competing for the memory. I may be wrong, but it seems to me that your testcase falls into this second category (also known as thrashing).

And at last, a humble request: Every problem I've had with the VM has been that it either swapped out too many processes and used too much cache, or the other way around. I'd really enjoy a way to tune this behaviour, if possible.

/Tobias
Re: VM Report was:Re: Break 2.4 VM in five easy steps
On Fri, 8 Jun 2001, Jonathan Morton wrote:

> http://www.chromatix.uklinux.net/linux-patches/vm-update-2.patch
>
> Try this. I can't guarantee it's SMP-safe yet (I'm leaving the gurus to
> that, but they haven't told me about any errors in the past hour so I'm
> assuming they aren't going to find anything glaringly wrong...), but you
> might like to see if your performance improves with it. It also fixes the
> OOM-killer bug, which you refer to above.
>
> Some measurements, from my own box (1GHz Athlon, 256Mb RAM):
>
> For the following benchmarks, physical memory availability was reduced
> according to the parameter in the left column. The benchmark is the
> wall-clock time taken to compile MySQL.
>
> mem=   2.4.5     earlier tweaks   now
> 48M    8m30s     6m30s            5m58s
> 32M    unknown   2h15m            12m34s
>
> The following was performed with all 256Mb RAM available. This is
> compilation of MySQL using make -j 15.
>
> kernel:     2.4.5   now
> time:       6m30s   6m15s
> peak swap:  190M    70M
>
> For the following test, the 256Mb swap partition on my IDE drive was
> disabled and replaced with a 1Gb swapfile on my Ultra160 SCSI drive. This
> is compilation of MySQL using make -j 20.
>
> kernel:     2.4.5   now
> time:       7m20s   6m30s
> peak swap:  370M    254M
>
> Draw your own conclusions. :)

(ok ;)

Hi,

I gave this a shot at my favorite vm beater test (make -j30 bzImage) while testing some other stuff today. Seven identical runs, six slightly different kernels plus yours.
real    11m23.522s   2.4.5.vm-update-2
user     7m59.170s
sys      0m37.030s
user  : 0:08:07.06  65.6%   page in : 642402
nice  : 0:00:00.00   0.0%   page out: 676820
system: 0:02:09.44  17.4%   swap in : 105965
idle  : 0:02:05.66  16.9%   swap out: 162603

real    10m9.512s    2.4.5.virgin
user     7m55.520s
sys      0m35.460s
user  : 0:08:02.66  72.2%   page in : 535186
nice  : 0:00:00.00   0.0%   page out: 377992
system: 0:01:37.78  14.6%   swap in :  99445
idle  : 0:01:28.14  13.2%   swap out:  81926

real    10m48.939s   2.4.5.virgin+reclaim.marcelo
user     7m54.960s
sys      0m36.240s
user  : 0:08:02.33  68.0%   page in : 566239
nice  : 0:00:00.00   0.0%   page out: 431874
system: 0:01:56.02  16.4%   swap in : 108633
idle  : 0:01:50.61  15.6%   swap out:  96415

real    9m54.466s    2.4.5.virgin+reclaim.mike (icky 'bleeder valve')
user     7m57.370s
sys      0m35.890s
user  : 0:08:04.74  74.1%   page in : 527678
nice  : 0:00:00.00   0.0%   page out: 405259
system: 0:01:12.01  11.0%   swap in :  98616
idle  : 0:01:37.47  14.9%   swap out:  91492

real    9m12.198s    2.4.5.tweak
user     7m41.290s
sys      0m34.840s
user  : 0:07:47.69  76.8%   page in : 452632
nice  : 0:00:00.00   0.0%   page out: 399847
system: 0:01:17.08  12.7%   swap in :  75338
idle  : 0:01:03.97  10.5%   swap out:  88291

real    9m41.563s    2.4.5.tweak+reclaim.marcelo
user     7m59.880s
sys      0m34.690s
user  : 0:08:07.22  73.4%   page in : 515433
nice  : 0:00:00.00   0.0%   page out: 545762
system: 0:01:35.34  14.4%   swap in :  88425
idle  : 0:01:21.11  12.2%   swap out: 125967

real    9m47.682s    2.4.5.tweak+reclaim.mike
user     8m2.190s
sys      0m34.550s
user  : 0:08:09.57  75.7%   page in : 513166
nice  : 0:00:00.00   0.0%   page out: 473539
system: 0:01:20.27  12.4%   swap in :  83127
idle  : 0:01:16.89  11.9%   swap out: 108886

Conclusion: your patch hits the cache too hard and pays through the nose for doing so.. at least under this heavyweight load it does.
Re: Break 2.4 VM in five easy steps
On Thu, Jun 07, 2001 at 03:38:35PM -0600, Brian D Heaton wrote:
> Maybe I'm missing something. I just tried this (with the 262144k/1
> and 128k/2048 params) and my results are within .1s of each other. This is
> without any special patches. Am I doing something wrong?

Oh, I don't mean the time elapsed; it's that nothing _else_ can happen
while dd is hogging the kernel.

> Oh yes -
>
> SMP - dual PIII866/133

Yes, this is what you are doing wrong ;) My hypothesis is that in your
case, one CPU gets pegged copying pages from /dev/zero into dd's buffer,
while the other CPU can do things like updating mouse cursors, running
setiathome, etc. What happens if you do *two* dd tortures with huge
buffers at the same time? And then, please don't happen to have a quad
box!

I don't know if my symptom (loss of interactivity on heavy writing) is
related to swapoff -a causing the same symptom on deeply-swapped boxes.

BTW keep in mind my 4-liner is based more on voodoo than on analysis.

Bernd Jendrissek
Re: Break 2.4 VM in five easy steps
On my everyday desktop workstation (PII 350) I have 64MB of RAM and use
300MB of swap, 150MB on each hard disk. After upgrading to 2.4, and
maintaining the same set of applications (KDE, Netscape & friends), the
machine's performance is _definitely_ much worse, in terms of both
responsiveness and throughput. Most applications just take much longer to
load, and once you've done something that required more memory for a while
(like compiling a kernel, opening a large JPEG in gimp, etc.) it takes a
long time to come back to normal. Strangely, with 2.4 the workstation just
feels as if someone stole the 64MB DIMM and put in a 16MB one!!

One thing I find strange is that with 2.4, if you run top or something
similar, you notice that the memory allocated for cache is almost always
more than half of total RAM. I don't remember seeing this with the 2.2
kernel series...

Anyway, I think there is something really broken with respect to the 2.4
VM. It is just NOT acceptable that when you run the same set of apps and
the same type of work and you upgrade your kernel, your hardware is no
longer up to the job, when it fitted perfectly well before. This is just
the MS way of solving problems.

Best regards

Claudio Martins

On Wed, Jun 06, 2001 at 06:58:39AM -0700, Gerhard Mack wrote:
>
> I have several boxes with 2x ram as swap and performance still sucks
> compared to 2.2.17.
>
Re: VM Report was:Re: Break 2.4 VM in five easy steps
At 12:29 am +0100 8/6/2001, Shane Nay wrote:
>(VM report at Marcelo Tosatti's request. He has mentioned that rather than
>complaining about the VM, people should mention what their experiences
>were. I have tried to do so in the way that he asked.)

>> By performance you mean interactivity or throughput?
>
>Interactivity. I don't have any throughput needs to speak of.
>
>I just ran a barrage of tests on my machine, and the smallest it would ever
>make the cache was 16M; it would prefer to kill processes rather than make
>the cache smaller than that.

http://www.chromatix.uklinux.net/linux-patches/vm-update-2.patch

Try this. I can't guarantee it's SMP-safe yet (I'm leaving the gurus to
that, but they haven't told me about any errors in the past hour so I'm
assuming they aren't going to find anything glaringly wrong...), but you
might like to see if your performance improves with it. It also fixes the
OOM-killer bug, which you refer to above.

Some measurements, from my own box (1GHz Athlon, 256Mb RAM):

For the following benchmarks, physical memory availability was reduced
according to the parameter in the left column. The benchmark is the
wall-clock time taken to compile MySQL.

mem=   2.4.5     earlier tweaks   now
48M    8m30s     6m30s            5m58s
32M    unknown   2h15m            12m34s

The following was performed with all 256Mb RAM available. This is
compilation of MySQL using make -j 15.

kernel:    2.4.5   now
time:      6m30s   6m15s
peak swap: 190M    70M

For the following test, the 256Mb swap partition on my IDE drive was
disabled and replaced with a 1Gb swapfile on my Ultra160 SCSI drive. This
is compilation of MySQL using make -j 20.

kernel:    2.4.5   now
time:      7m20s   6m30s
peak swap: 370M    254M

Draw your own conclusions. :)

--
from: Jonathan "Chromatix" Morton
mail: [EMAIL PROTECTED] (not for attachments)

The key to knowledge is not to rely on people to teach you it.

GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V?
PS PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
VM Report was:Re: Break 2.4 VM in five easy steps
(VM report at Marcelo Tosatti's request. He has mentioned that rather than
complaining about the VM, people should mention what their experiences
were. I have tried to do so in the way that he asked.)

> 1) Describe what you're running. (your workload)

A lot of daemons, all on a private network so there is no throughput load
on them. About 13 rxvt's, freeamp actively playing music at all times,
xemacs with 25 active buffers, a few instances of vi, opera, no "desktop
env", just windowmaker. (Though I have a few KDE2 apps open, and one or
two GTK based apps open, so lots of library code swapping in and out I
imagine.)

Now what I've noticed lately is this: with 2.4.2 my machine would lock
quite frequently when I was compiling code and had other apps that were
allocating memory. With 2.4.5 I haven't had that behaviour, but I've been
much lighter on my machine. (I was doing full toolchain builds with 2.4.2
when I had the real problems.) But processes were still running when the
machine would lock..., like the mp3 player was still playing, I noticed
one time.

With 2.4.5 (not -ac) I haven't had any deadlocks, but the system seems
very sluggish at acute moments. While doing absolutely nothing processor
intensive (I've been loading up top and ps'ing with regularity when this
happens, looking for kswapd going crazy), when I switch between workspaces
the refresh is much more sluggish on occasion, like I can watch windows
appear. Almost like a micro freeze really. (AMD T-Bird 1.333GHz 256MB-DDR)

> 2) Describe what you're feeling. (eg "interactivity is crap when I run
> this or that thing", etc)

Freeing memory takes *forever*, but I think that's a function of how I'm
allocating in this polygon rendering routine I'm working on. It literally
sucks up vast numbers of cycles and makes picogui totally unusable. But I
think this is unrelated to the kernel..., I think that's just because I
haven't implemented re-use in memory structures for the polygon routine.
(It's malloc/freeing massive numbers of small chunks of memory rather than
doing its own memory management, probably related to glibc memory
organization.)

Here's a vmstat line after 8 days of uptime and before contrived mem tests:

 procs                    memory     swap        io    system       cpu
 r  b  w  swpd  free  buff  cache   si  so   bi  bo   in  cs   us sy id
 1  0  0     0  3056  7856 121872    0   0    7   4   3716      1  0 40

> If we need more info than that I'll request in private.
>
> Also send these reports to the linux-mm list, so other VM hackers can also
> get those reports and we avoid traffic on lk.

> By performance you mean interactivity or throughput?

Interactivity. I don't have any throughput needs to speak of.

I just ran a barrage of tests on my machine, and the smallest it would
ever make the cache was 16M; it would prefer to kill processes rather than
make the cache smaller than that.

Contrived stressor program: (pseudo code)

fork(); fork(); fork(); fork(); //16 total processes
for (i=0;i

> Just do what I described above.

Done :).

Thanks,
Shane Nay.
Re: Break 2.4 VM in five easy steps
On Thu, 7 Jun 2001, Shane Nay wrote:

> On Thursday 07 June 2001 13:00, Marcelo Tosatti wrote:
> > On Thu, 7 Jun 2001, Shane Nay wrote:
> > > (Oh, BTW, I really appreciate the work that people have done on the VM,
> > > but folks that are just talking..., well, think clearly before you
> > > impact other people that are writing code.)
> >
> > If all the people talking were reporting results we would be really happy.
> >
> > Seriously, we really lack VM reports.
>
> Okay, I've had some problems with the VM on my machine; what is the most
> useful way to compile reports for you?

1) Describe what you're running. (your workload)

2) Describe what you're feeling. (eg "interactivity is crap when I run
this or that thing", etc)

If we need more info than that I'll request in private.

Also send these reports to the linux-mm list, so other VM hackers can also
get those reports and we avoid traffic on lk.

> I have modified the kernel for a few different ports fixing bugs, and
> device drivers, etc., but the VM is all Greek to me; I can just see
> that caching is hyper aggressive and doesn't look like it's going back
> to the pool..., which results in sluggish performance.

By performance you mean interactivity or throughput?

> Now I know from the work that I've done that anecdotal information is
> almost never even remotely useful.

If we need more info, we will request.

> Therefore is there any body of information that I can read up on to
> create a useful set of data points for you or other VM hackers to
> look at? (Or maybe some report in the past that you thought was
> especially useful?)

Just do what I described above.

Thanks
Re: Break 2.4 VM in five easy steps
"Eric W. Biederman" wrote: > LA Walsh <[EMAIL PROTECTED]> writes: > > > Now for whatever reason, since 2.4, I consistently use at least > > a few Mb of swap -- stands at 5Meg now. Weird -- but I notice things > > like nscd running 7 copies that take 72M. Seems like overkill for > > a laptop. > > So the question becomes why you are seeing an increased swap usage. > Currently there are two canidates in the 2.4.x code path. > > 1) Delayed swap deallocation, when a program exits after it >has gone into swap it's swap usage is not freed. Ouch. --- Double ouch. Swap is backing a non-existent program? > > > 2) Increased tenacity of swap caching. In particular in 2.2.x if a page >that was in the swap cache was written to the the page in the swap >space would be removed. In 2.4.x the location in swap space is >retained with the goal of getting more efficient swap-ins. But if the page in memory is 'dirty', you can't be efficient with swapping *in* the page. The page on disk is invalid and should be released, or am I missing something? > Neither of the known canidates from increasing the swap load applies > when you aren't swapping in the first place. They may aggrevate the > usage of swap when you are already swapping but they do not cause > swapping themselves. This is why the intial recommendation for > increased swap space size was made. If you are swapping we will use > more swap. > > However what pushes your laptop over the edge into swapping is an > entirely different question. And probably what should be solved. On my laptop, it is insignificant and to my knowledge has no measurable impact. It seems like there is always 3-5 Meg used in swap no matter what's running (or not) on the system. > > I think that is the point -- it was supported in 2.2, it is, IMO, > > a serious regression that it is not supported in 2.4. > > The problem with this general line of arguing is that it lumps a whole > bunch of real issues/regressions into one over all perception. 
> Since there are multiple reasons people are seeing problems, they need
> to be tracked down with specifics.
---
Uhhh, yeah, sorta -- it's addressing the statement that a "new requirement
of 2.4 is to have double the swap space". If everyone agrees that's a
problem, then yes, we can go into the specifics of what is causing or
contributing to the problem. It's getting past the attitude of some people
that 2xMem for swap is somehow "normal and acceptable -- deal with it".

In my case, it seems like 10Mb of swap would be all that would generally
be used (I don't think I've ever seen swap usage over 7Mb) on a 512M
system. To be told "oh, you're wrong, you *should* have 1Gig or you are
operating in an 'unsupported' or non-standard configuration" -- I find
that very user-unfriendly.

> The swapoff case comes down to dead swap pages in the swap cache,
> which greatly increases the number of swap pages and slows the system
> down; but since these pages are trivial to free we don't generate any
> I/O, so we don't wait for I/O and thus never enter the scheduler,
> making nothing else in the system runnable.
---
I haven't ever *noticed* this on my machine, but that could be because
there isn't much in swap to begin with? Could be I was just blissfully
ignorant of the time it took to do a swapoff. Hmmm, let's see... Just
tried it. I didn't get a total lockup, but cursor movement was definitely
jerky:

> time sudo swapoff -a
real    0m10.577s
user    0m0.000s
sys     0m9.430s

Looking at vmstat, the needed space was taken mostly out of the page cache
(86M->81.8M) and about 700K each out of free and buff.

> Your case is significantly different. I don't know if you are seeing
> any issues with swapping at all. With a 5M usage it may simply be
> totally unused pages being pushed out to the swap space.
---
Probably -- I guess the page cache and disk buffers put enough pressure to
push some things off to swap.

-linda
--
The above thoughts and writings are my own. They may have nothing to do
with the opinions of my employer. :-)
L A Walsh | Senior MTS, Trust Tech, Core Linux, SGI
[EMAIL PROTECTED] | Voice: (650) 933-5338
Re: Break 2.4 VM in five easy steps
Uh, last I checked, on my Linux-based embedded device I didn't want to
swap to flash. Hmm.., now why was that..., oh, that's right, it's *much*
more expensive than memory, oh yes, and it actually gets FRIED when you
write to a block more than 100k times. Oh, what was that other thing...,
oh yes, and it's SOLDERED ON THE BOARD. Damn..., guess I just lost a grand
or so.

Seriously folks, Linux isn't just for big webservers...

Thanks,
Shane Nay.

(Oh, BTW, I really appreciate the work that people have done on the VM,
but folks that are just talking..., well, think clearly before you impact
other people that are writing code.)

On Wednesday 06 June 2001 02:57, Dr S.M. Huen wrote:
> On Wed, 6 Jun 2001, Sean Hunter wrote:
> > For large memory boxes, this is ridiculous. Should I have 8GB of swap?
>
> Do I understand you correctly?
> ECC grade SDRAM for your 8GB server costs £335 per GB as 512MB sticks even
> at today's silly prices (Crucial). Ultra160 SCSI costs £8.93/GB as 73GB
> drives.
>
> It will cost you 19x as much to put the RAM in as to put the
> developers' recommended amount of swap space in to back up that RAM. The
> developers gave their reasons for this design some time ago, and if the
> ONLY problem was that it required you to allocate more swap, why should
> it be a priority item to fix it for those that refuse to do so? By all
> means fix it urgently where it doesn't work when used as advised, but
> demanding priority for fixing a problem encountered when a user refuses
> to use it in the manner specified seems very unreasonable. If you can
> afford 4GB RAM, you certainly can afford 8GB swap.
Re: Break 2.4 VM in five easy steps
On Thursday 07 June 2001 13:00, Marcelo Tosatti wrote:
> On Thu, 7 Jun 2001, Shane Nay wrote:
> > (Oh, BTW, I really appreciate the work that people have done on the VM,
> > but folks that are just talking..., well, think clearly before you impact
> > other people that are writing code.)
>
> If all the people talking were reporting results we would be really happy.
>
> Seriously, we really lack VM reports.

Okay, I've had some problems with the VM on my machine; what is the most
useful way to compile reports for you? I have modified the kernel for a
few different ports fixing bugs, and device drivers, etc., but the VM is
all Greek to me. I can just see that caching is hyper aggressive and
doesn't look like it's going back to the pool..., which results in
sluggish performance.

Now I know from the work that I've done that anecdotal information is
almost never even remotely useful. Therefore is there any body of
information that I can read up on to create a useful set of data points
for you or other VM hackers to look at? (Or maybe some report in the past
that you thought was especially useful?)

Thank You,
Shane Nay.

(I have in the past had many problems with the VM on embedded machines as
well, but I'm not actively working on any right this second..., though my
Psion is sitting next to me begging for me to run some VM tests on it :)
Re: Break 2.4 VM in five easy steps
On Thu, 7 Jun 2001, Shane Nay wrote:

> (Oh, BTW, I really appreciate the work that people have done on the VM, but
> folks that are just talking..., well, think clearly before you impact other
> people that are writing code.)

If all the people talking were reporting results we would be really happy.

Seriously, we really lack VM reports.
Re: Break 2.4 VM in five easy steps
On 07 Jun 2001 11:49:47 -0400, Derek Glidden wrote:
> Miles Lane wrote:
> >
> > So please, if you have new facts that you want to offer that
> > will help us characterize and understand these VM issues better
> > or discover new problems, feel free to share them. But if you
> > just want to rant, I, for one, would rather you didn't.
>
> *sigh*
>
> Not to prolong an already pointless thread, but that really was the
> intent of my original message. I had figured out a specific way, with
> easy-to-follow steps, to make the VM misbehave under very certain
> conditions. I even offered to help figure out a solution in any way I
> could, considering I'm not familiar with kernel code.
>
> However, I guess this whole "too much swap" issue has a lot of people on
> edge and immediately assumed I was talking about this subject, without
> actually reading my original message.

Actually, I think your original message was useful. It has spurred a
reevaluation of some design assumptions implicit in the VM in the 2.4
series and has also surfaced some bugs.

It was not you who I felt was sending inflammatory remarks; it was the
folks who have been bellyaching about the current swap disk space
requirements without offering any new information to help developers
remedy the situation.

So, thanks for bringing the topic up. :-)

Cheers,
Miles
Re: Break 2.4 VM in five easy steps
On Thu, 7 Jun 2001, Mike Galbraith wrote:

> On 6 Jun 2001, Eric W. Biederman wrote:
>
> > Mike Galbraith <[EMAIL PROTECTED]> writes:
> >
> > > > If you could confirm this by calling swapoff sometime other than at
> > > > reboot time. That might help. Say by running top on the console.
> > >
> > > The thing goes comatose here too. SCHED_RR vmstat doesn't run, console
> > > switch is nogo...
> > >
> > > After running his memory hog, swapoff took 18 seconds. I hacked a
> > > bleeder valve for dead swap pages, and it dropped to 4 seconds.. still
> > > utterly comatose for those 4 seconds though.
> >
> > At the top of the while(1) loop in try_to_unuse what happens if you
> > put in:
> >
> >     if (need_resched) schedule();
> >
> > It should be outside all of the locks. It might just be a matter of
> > everything serializing on the SMP locks, and the kernel refusing to
> > preempt itself.
>
> That did it.

What about including this workaround in the kernel?
Re: Break 2.4 VM in five easy steps
On Thursday, 07 June 2001, at 09:23:42 +0200, Helge Hafting wrote:

> Derek Glidden wrote:
> >
> > Helge Hafting wrote:
> [...]
> The machine froze 10 seconds or so at the end of the minute, I can
> imagine that biting with bigger swap.

Same behavior here with a Pentium III 600, 128 MB RAM and 128 MB of swap.
I filled mem and swap with the infamous glob() "bug" (ls ../*/.. etc.),
ran swapoff, and the machine stayed very responsive except for the last
10-15 seconds before swapoff ended. Even scrolling complex pages with
Mozilla 0.9 worked smoothly :).

--
José Luis Domingo López
Linux Registered User #189436     Debian GNU/Linux Potato (P166 64 MB RAM)

jdomingo EN internautas PUNTO org  => Spam? Face the consequences.
jdomingo AT internautas DOT org    => Spam at your own risk
Re: Break 2.4 VM in five easy steps
Helge Hafting <[EMAIL PROTECTED]> writes:

> A problem with this is that normal paging-in is allowed to page other
> things out as well. But you can't have that when swap is about to
> be turned off. My guess is that swapoff functionality was perceived to
> be so seldom used that they didn't bother too much with scheduling
> or efficiency.

There is some truth in that. You aren't allowed to allocate new pages in
the swap space currently being removed, however. The current swapoff code
removes pages from the current swap space without breaking any sharing
between swap pages. Depending on your load this may be important. Fixing
swapoff to be more efficient while at the same time keeping sharing
between pages is tricky. That swapoff never sleeps, under loads that are
easy to trigger in 2.4, is a big bug.

> I don't have the same problem myself though. Shutting down with
> 30M or so in swap never takes unusual time on 2.4.x kernels here,
> with a 300MHz processor. I did a test while typing this letter,
> almost filling the 96M swap partition with 88M. swapoff
> took 1 minute at 100% cpu. This is long, but the machine was responsive
> most of that time. I.e. no worse than during a kernel compile.
> The machine froze 10 seconds or so at the end of the minute, I can
> imagine that biting with bigger swap.

O.k., so at some point you actually wait for I/O and other processes get a
chance to run. On the larger machines we never wait for I/O and thus never
schedule at all.

The problem is now understood. Now we just need to fix it.

Eric
Re: Break 2.4 VM in five easy steps
LA Walsh <[EMAIL PROTECTED]> writes:

> Now for whatever reason, since 2.4, I consistently use at least
> a few Mb of swap -- stands at 5Meg now. Weird -- but I notice things
> like nscd running 7 copies that take 72M. Seems like overkill for
> a laptop.

So the question becomes why you are seeing an increased swap usage.
Currently there are two candidates in the 2.4.x code path.

1) Delayed swap deallocation: when a program exits after it has gone into
   swap, its swap usage is not freed. Ouch.

2) Increased tenacity of swap caching. In particular, in 2.2.x if a page
   that was in the swap cache was written to, the page in the swap space
   would be removed. In 2.4.x the location in swap space is retained with
   the goal of getting more efficient swap-ins.

Neither of the known candidates for increasing the swap load applies when
you aren't swapping in the first place. They may aggravate the usage of
swap when you are already swapping, but they do not cause swapping
themselves. This is why the initial recommendation for increased swap
space size was made. If you are swapping we will use more swap.

However, what pushes your laptop over the edge into swapping is an
entirely different question. And probably what should be solved.

> I think that is the point -- it was supported in 2.2; it is, IMO,
> a serious regression that it is not supported in 2.4.

The problem with this general line of arguing is that it lumps a whole
bunch of real issues/regressions into one overall perception. Since there
are multiple reasons people are seeing problems, they need to be tracked
down with specifics.

The swapoff case comes down to dead swap pages in the swap cache, which
greatly increases the number of swap pages and slows the system down; but
since these pages are trivial to free we don't generate any I/O, so we
don't wait for I/O and thus never enter the scheduler, making nothing else
in the system runnable.

Your case is significantly different. I don't know if you are seeing any
issues with swapping at all. With a 5M usage it may simply be totally
unused pages being pushed out to the swap space.

Eric
Re: Break 2.4 VM in five easy steps
Miles Lane wrote:
>
> So please, if you have new facts that you want to offer that
> will help us characterize and understand these VM issues better
> or discover new problems, feel free to share them. But if you
> just want to rant, I, for one, would rather you didn't.

*sigh*

Not to prolong an already pointless thread, but that really was the intent
of my original message. I had figured out a specific way, with
easy-to-follow steps, to make the VM misbehave under very certain
conditions. I even offered to help figure out a solution in any way I
could, considering I'm not familiar with kernel code.

However, I guess this whole "too much swap" issue has a lot of people on
edge, and they immediately assumed I was talking about this subject
without actually reading my original message.

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
#!/usr/bin/perl -w
$_='while(read+STDIN,$_,2048){$a=29;$b=73;$c=142;$t=255;@t=map
{$_%16or$t^=$c^=($m=(11,10,116,100,11,122,20,100)[$_/16%8])&110;
$t^=(72,@z=(64,72,$a^=12*($_%16-2?0:$m&17)),$b^=$_%64?12:0,@z)
[$_%8]}(16..271);if((@a=unx"C*",$_)[20]&48){$h=5;$_=unxb24,join
"",@b=map{xB8,unxb8,chr($_^$a[--$h+84])}@ARGV;s/...$/1$&/;$d=
unxV,xb25,$_;$e=256|(ord$b[4])<<9|ord$b[3];$d=$d>>8^($f=$t&($d
>>12^$d>>4^$d^$d/8))<<17,$e=$e>>8^($t&($g=($q=$e>>14&7^$e)^$q*
8^$q<<6))<<9,$_=$t[$_]^(($h>>=8)+=$f+(~$g&$t))for@a[128..$#a]}
print+x"C*",@a}';s/x/pack+/g;eval

usage: qrpff 153 2 8 105 225 < /mnt/dvd/VOB_FILENAME \
  | extract_mpeg2 | mpeg2dec -

http://www.eff.org/  http://www.opendvd.org/
http://www.cs.cmu.edu/~dst/DeCSS/Gallery/
Re: Break 2.4 VM in five easy steps
On Thu, 7 Jun 2001, Bulent Abali wrote:

> I happened to see this one with a debugger attached to the serial port.
> The system was alive. I think I was watching the free page count and
> it was decreasing very slowly, maybe a couple of pages per second. The
> bigger the swap usage, the longer it takes to do swapoff. For example,
> if I had 1GB in the swap space then it would take maybe half an hour
> to shut down...

I took a ~300ms ktrace snapshot of the no-IO spot with 2.4.4.ikd..

 % TOTAL   TOTAL USECS   AVG/CALL   NCALLS
 0.0693%        208.54       0.40      517   c012d4b9 __free_pages
 0.0755%        227.34       1.01      224   c012cb67 __free_pages_ok
 ...
34.7195%     104515.15       0.95   110049   c012de73 unuse_vma
53.3435%     160578.37     303.55      529   c012dd38 __swap_free

Total entries: 131051   Total usecs: 301026.93   Idle: 0.00%

Andrew Morton could be right about that loop not being wonderful.

        -Mike
Re: Break 2.4 VM in five easy steps
"Eric W. Biederman" wrote: > There are cetain scenario's where you can't avoid virtual mem = > min(RAM,swap). Which is what I was trying to say, (bad formula). What > happens is that pages get referenced evenly enough and quickly enough > that you simply cannot reuse the on disk pages. Basically in the > worst case all of RAM is pretty much in flight doing I/O. This is > true of all paging systems. So, if I understand, you are talking about thrashing behavior where your active set is larger than physical ram. If that is the case then requiring 2X+ swap for "better" performance is reasonable. However, if your active set is truely larger than your physical memory on a consistant basis, in this day, the solution is usually "add more RAM". I may be wrong, but my belief is that with today's computers people are used to having enough memory to do their normal tasks and that swap is for "peak loads" that don't occur on a sustained basis. Of course I imagine that this is my belief as it is my own practice/view. I want to have considerably more memory than my normal working set. Swap on my laptop disk is *slow*. It's a low-power, low-RPM, slow seek rate all to conserve power (difference between spinning/off = 1W). So I have 50% of my phys mem on swap -- because I want to 'feel' it when I goto swap and start looking for memory hogs. For me, the pathological case is touching swap *at all*. So the idea of the entire active set being >=phys mem is already broken on my setup. Thus my expectation of swap only as 'warning'/'buffer' zone. Now for whatever reason, since 2.4, I consistently use at least a few Mb of swap -- stands at 5Meg now. Weird -- but I notice things like nscd running 7 copies that take 72M. Seems like overkill for a laptop. > However just because in the worst case virtual mem = min(RAM,swap), is > no reason other cases should use that much swap. 
If you are doing a > lot of swapping it is more efficient to plan on mem = min(RAM,swap) as > well, because frequently you can save on I/O operations by simply > reusing the existing swap page. --- Agreed. But planning your swap space for a worst case scenario that you never hit is wasteful. My worst case is using any swap. The system should be able to live with swap=1/2*phys in my situation. I don't think I'm unique in this respect. > It's a theoretical worst case and they all have it. In practice it is > very hard to find a work load where practically every page in the > system is close to the I/O point howerver. --- Well exactly the point. It was in such situations in some older systems that some programs were swapped out and temporarily made unavailable for running (they showed up in the 'w' space in vmstat). > Except for removing pages that aren't used paging with swap < RAM is > not useful. Simply removing pages that aren't in active use but might > possibly be used someday is a common case, so it is worth supporting. --- I think that is the point -- it was supported in 2.2, it is, IMO, a serious regression that it is not supported in 2.4. -linda -- The above thoughts and | They may have nothing to do with writings are my own. | the opinions of my employer. :-) L A Walsh| Senior MTS, Trust Tech., Core Linux, SGI [EMAIL PROTECTED] | Voice: (650) 933-5338 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Break 2.4 VM in five easy steps
>> O.k. I think I'm ready to nominate the dead swap pages for the big
>> 2.4.x VM bug award. So we are burning cpu cycles in sys_swapoff
>> instead of being IO bound? Just wanting to understand this the cheap way :)
>
> There's no IO being done whatsoever (that I can see with only a blinky).
> I can fire up ktrace and find out exactly what's going on if that would
> be helpful. Eating the dead swap pages from the active page list prior
> to swapoff cures all but a short freeze. Eating the rest (few of those)
> might cure the rest, but I doubt it.
>
> -Mike

1) I second Mike's observation. swapoff, whether from the command line or during shutdown, just hangs there. No disk I/O is being done, as I could see from the blinkers. This is not an I/O-boundness issue; it is more like a deadlock. I happened to see this once with a debugger attached to the serial port, and the system was alive. I think I was watching the free page count, and it was decreasing very slowly, maybe a couple of pages per second. The bigger the swap usage, the longer it takes to do swapoff. For example, if I had 1GB in the swap space, then it would take maybe half an hour to shut down...

2) Now, why I would have 1GB in the swap space is another problem. Here is what I observe, and it doesn't make much sense to me. Let's say I have 1GB of memory and plenty of swap, and let's say there is a process a little less than 1GB in size. Suppose the system starts swapping because it is short a few megabytes of memory. Within *seconds* of swapping, I see the swap disk usage balloon to nearly 1GB. Nearly the entire memory moves into the page cache. If you run xosview you will know what I mean: memory usage suddenly turns from green to red :-). And I know for a fact that my disk cannot do 1GB per second :-). The SHARE column of the big process in "top" goes up by hundreds of megabytes.
So it appears to me that the MM is marking the whole process memory to be swapped out, probably reserving nearly 1GB in the swap space, and furthermore apparently moving the entire process's pages to the page cache. You would think that if you are short a few MB of memory, the MM would put a few MB worth of pages in the swap. But it wants to move entire processes into swap. When the 1GB process exits, the swap usage doesn't change (dead swap pages?). And shutdown or swapoff will take forever due to #1 above.

Bulent
Re: Break 2.4 VM in five easy steps
First things first: 1) Please Cc: me when responding, 2) apologies for dropping any References: headers, 3) sorry for bad formatting.

"Jeffrey W. Baker" wrote:
> On Tue, 5 Jun 2001, Derek Glidden wrote:
> > This isn't trying to test extreme low-memory pressure, just how the
> > system handles recovering from going somewhat into swap, which is a
> > real day-to-day problem for me, because I often run a couple of apps
> > that most of the time live in RAM, but during heavy computation runs
> > can go a couple hundred megs into swap for a few minutes at a time.
> > Whenever that happens, my machine always starts acting up afterwards,
> > so I started investigating and found some really strange stuff going on.

Has anyone else noticed the difference between

	dd if=/dev/zero of=bigfile bs=16384k count=1

and

	dd if=/dev/zero of=bigfile bs=8k count=2048

deleting 'bigfile' each time before use? (Those of you with lots of memory may (or may not!) want to try bs=262144k.)

Once, a few months ago, I thought I traced this to the loop at line ~2597 in linux/mm/filemap.c:generic_file_write:

2593            remove_suid(inode);
2594            inode->i_ctime = inode->i_mtime = CURRENT_TIME;
2595            mark_inode_dirty_sync(inode);
2596
2597            while (count) {
2598                    unsigned long index, offset;
2599                    char *kaddr;
2600                    int deactivate = 1;
...
2659
2660                    if (status < 0)
2661                            break;
2662            }
2663            *ppos = pos;
2664
2665            if (cached_page)

It appears to me that the process pseudo-spins (it *does* do useful work) in this loop for as long as there are pages available. BTW, while the big-bs dd is running, the disk is active. I assume that writes are indeed scheduled and start happening even while we're still dirtying pages?

Does this freezing effect occur on SMP machines too?
Oops, had access to one until this morning :( Would an SMP box still have a 'spare' cpu which isn't dirtying pages like crazy, and can therefore do things like updating mouse cursors, etc.?

Bernd Jendrissek

P.S. Here's my patch that cures this one symptom; it smells and looks ugly, I know, but at least my mouse cursor doesn't jump across the whole screen when I do the dd torture. I have no idea if this is right or not, whether I'm allowed to call schedule() inside generic_file_write or not, etc. And the '256' is just random - small enough to let the cursor move, but large enough to do work between schedule()s. If this solves your problem, use it; if your name is Linus or Alan, ignore it or do it right please.

diff -u -r1.1 -r1.2
--- linux-hack/mm/filemap.c	2001/06/06 21:16:28	1.1
+++ linux-hack/mm/filemap.c	2001/06/07 08:57:52	1.2
@@ -2599,6 +2599,11 @@
 		char *kaddr;
 		int deactivate = 1;
 
+		/* bernd-hack: give other processes a chance to run */
+		if (count % 256 == 0) {
+			schedule();
+		}
+
 		/*
 		 * Try to find the page in the cache. If it isn't there,
 		 * allocate a free page.
Re: Break 2.4 VM in five easy steps
Linus Torvalds <[EMAIL PROTECTED]> writes:
> On 7 Jun 2001, Eric W. Biederman wrote:
>
> No - I suspect that we're not actually doing all that much IO at all, and
> the real reason for the lock-up is just that the current algorithm is so
> bad that when it starts to act exponentially worse it really _is_ taking
> minutes of CPU time following pointers and generally not being very nice
> on the CPU cache etc..

Hmm. Unless I am mistaken, the complexity is O(SwapPages * VMSize), which is very bad, but nowhere near exponentially horrible.

> The bulk of the work is walking the process page tables thousands and
> thousands of times. Expensive.

Definitely. I played with following the page tables in a good way a while back, and even when you do it right the process is slow.

Is

	if (need_resched) {
		schedule();
	}

a good idiom to use when you know you have a loop that will take a long time? Even if we do this right, we should do our best to avoid starving other processes in the system.

Hmm. There is a nasty case with turning the walk inside out. When we read a page into RAM there could still be other users of that page that refer to the swap entry, so we cannot immediately remove the page from the swap cache -- unless we want to break sharing and increase the demands upon the virtual memory when we are shrinking it...

> > If this is going on I think we need to look at our delayed
> > deallocation policy a little more carefully.
>
> Agreed. I already talked in private with some people about just
> re-visiting the issue of the lazy de-allocation. It has nice properties,
> but it certainly appears as if the nasty cases just plain outweigh the
> advantages.

I'm trying to remember the advantages, besides not having to care that a page is a swap page in free_pte. If there really is some value in not handling the pages there (and I seem to recall something about pages under I/O), it might at least be worth putting the pages on their own LRU list.
So that kswapd can crunch through the list whenever it wakes up and give back a bunch of free pages.

Eric
Re: Break 2.4 VM in five easy steps
On 7 Jun 2001, Eric W. Biederman wrote:
> Mike Galbraith <[EMAIL PROTECTED]> writes:
>
> > On 7 Jun 2001, Eric W. Biederman wrote:
> >
> > > Does this improve the swapoff speed or just allow other programs to
> > > run at the same time? If it is still slow under that kind of load it
> > > would be interesting to know what is taking up all the time.
> > >
> > > If it is no longer slow a patch should be made and sent to Linus.
> >
> > No, it only cures the freeze. The other appears to be the slow code
> > pointed out by Andrew Morton being tickled by dead swap pages.
>
> O.k. I think I'm ready to nominate the dead swap pages for the big
> 2.4.x VM bug award. So we are burning cpu cycles in sys_swapoff
> instead of being IO bound? Just wanting to understand this the cheap way :)

There's no IO being done whatsoever (that I can see with only a blinky). I can fire up ktrace and find out exactly what's going on if that would be helpful. Eating the dead swap pages from the active page list prior to swapoff cures all but a short freeze. Eating the rest (few of those) might cure the rest, but I doubt it.

-Mike
Re: Break 2.4 VM in five easy steps
On 7 Jun 2001, Eric W. Biederman wrote:
> [EMAIL PROTECTED] (Linus Torvalds) writes:
> >
> > Somebody interested in trying the above add? And looking for other more
> > obvious bandaid fixes. It won't "fix" swapoff per se, but it might make
> > it bearable and bring it to the 2.2.x levels.
>
> A little bit. The one really bad behavior of not letting any other
> processes run seems to be fixed with an explicit:
>	if (need_resched) {
>		schedule();
>	}
>
> What I can't figure out is why this is necessary. Because we should
> be sleeping in alloc_pages if nowhere else.

No - I suspect that we're not actually doing all that much IO at all, and the real reason for the lock-up is just that the current algorithm is so bad that when it starts to act exponentially worse it really _is_ taking minutes of CPU time following pointers and generally not being very nice on the CPU cache etc..

The bulk of the work is walking the process page tables thousands and thousands of times. Expensive.

> If this is going on I think we need to look at our delayed
> deallocation policy a little more carefully.

Agreed. I already talked in private with some people about just re-visiting the issue of the lazy de-allocation. It has nice properties, but it certainly appears as if the nasty cases just plain outweigh the advantages.

Linus
Re: Break 2.4 VM in five easy steps
Mike Galbraith <[EMAIL PROTECTED]> writes:
> On 7 Jun 2001, Eric W. Biederman wrote:
>
> > Does this improve the swapoff speed or just allow other programs to
> > run at the same time? If it is still slow under that kind of load it
> > would be interesting to know what is taking up all the time.
> >
> > If it is no longer slow a patch should be made and sent to Linus.
>
> No, it only cures the freeze. The other appears to be the slow code
> pointed out by Andrew Morton being tickled by dead swap pages.

O.k. I think I'm ready to nominate the dead swap pages for the big 2.4.x VM bug award. So we are burning cpu cycles in sys_swapoff instead of being IO bound? Just wanting to understand this the cheap way :)

Eric
Re: Break 2.4 VM in five easy steps
[EMAIL PROTECTED] (Linus Torvalds) writes:
>
> Somebody interested in trying the above add? And looking for other more
> obvious bandaid fixes. It won't "fix" swapoff per se, but it might make
> it bearable and bring it to the 2.2.x levels.

A little bit. The one really bad behavior of not letting any other processes run seems to be fixed with an explicit:

	if (need_resched) {
		schedule();
	}

What I can't figure out is why this is necessary, because we should be sleeping in alloc_pages if nowhere else. I suppose if the bulk of our effort really is freeing dead swap cache pages, we can spin without sleeping and never let another process run, because we are busily recycling dead swap cache pages. Does this sound right?

If this is going on I think we need to look at our delayed deallocation policy a little more carefully. I suspect we should have code in kswapd actively removing these dead swap cache pages. After we get the latency improvements in exit, these pages do absolutely nothing for us except clog up the whole system and generally give the 2.4 VM a bad name.

Anyone care to check my analysis?

> Is anybody interested in making "swapoff()" better? Please speak up..

Interested. But finding the time...

Eric
Re: Break 2.4 VM in five easy steps
On 7 Jun 2001, Eric W. Biederman wrote:
> Mike Galbraith <[EMAIL PROTECTED]> writes:
>
> > On 6 Jun 2001, Eric W. Biederman wrote:
> >
> > > Mike Galbraith <[EMAIL PROTECTED]> writes:
> > >
> > > > > If you could confirm this by calling swapoff sometime other
> > > > > than at reboot time. That might help. Say by running top on
> > > > > the console.
> > > >
> > > > The thing goes comatose here too. SCHED_RR vmstat doesn't run,
> > > > console switch is nogo...
> > > >
> > > > After running his memory hog, swapoff took 18 seconds. I hacked
> > > > a bleeder valve for dead swap pages, and it dropped to 4
> > > > seconds.. still utterly comatose for those 4 seconds though.
> > >
> > > At the top of the while(1) loop in try_to_unuse what happens if
> > > you put in:
> > >	if (need_resched) schedule();
> > > It should be outside all of the locks. It might just be a matter
> > > of everything serializing on the SMP locks, and the kernel
> > > refusing to preempt itself.
> >
> > That did it.
>
> Does this improve the swapoff speed or just allow other programs to
> run at the same time? If it is still slow under that kind of load it
> would be interesting to know what is taking up all the time.
>
> If it is no longer slow a patch should be made and sent to Linus.

No, it only cures the freeze. The other appears to be the slow code pointed out by Andrew Morton being tickled by dead swap pages.

-Mike
Re: Break 2.4 VM in five easy steps
Derek Glidden wrote:
>
> Helge Hafting wrote:
> >
> > The drive is inactive because it isn't needed, the machine is
> > running loops on data in memory. And it is unresponsive because
> > nothing else is scheduled, maybe "swapoff" is easier to implement
>
> I don't quite get what you're saying. If the system becomes
> unresponsive because the VM swap recovery parts of the kernel are
> interfering with the kernel scheduler then that's also bad, because
> there absolutely *are* other processes that should be getting time,
> like the console windows/shells at which I'm logged in. If they
> aren't getting it specifically because the VM is preventing them from
> receiving execution time, then that's another bug.

Sure. The kernel doing a big job without scheduling anything is a problem.

> I'm not familiar enough with the swapping bits of the kernel code, so
> I could be totally wrong, but turning off a swap file/partition should
> just call the same parts of the VM subsystem that would normally try
> to recover swap space under memory pressure.

A problem with this is that normal paging-in is allowed to page other things out as well. But you can't have that when swap is about to be turned off. My guess is that swapoff functionality was perceived to be so seldom used that they didn't bother too much with scheduling or efficiency.

I don't have the same problem myself, though. Shutting down with 30M or so in swap never takes unusual time on 2.4.x kernels here, with a 300MHz processor. I did a test while typing this letter, almost filling the 96M swap partition with 88M. swapoff took 1 minute at 100% cpu. This is long, but the machine was responsive most of that time, i.e. no worse than during a kernel compile. The machine froze for 10 seconds or so at the end of the minute; I can imagine that biting with bigger swap.
Helge Hafting
Re: Break 2.4 VM in five easy steps
Mike Galbraith <[EMAIL PROTECTED]> writes:
> On 6 Jun 2001, Eric W. Biederman wrote:
>
> > Mike Galbraith <[EMAIL PROTECTED]> writes:
> >
> > > > If you could confirm this by calling swapoff sometime other than
> > > > at reboot time. That might help. Say by running top on the console.
> > >
> > > The thing goes comatose here too. SCHED_RR vmstat doesn't run,
> > > console switch is nogo...
> > >
> > > After running his memory hog, swapoff took 18 seconds. I hacked a
> > > bleeder valve for dead swap pages, and it dropped to 4 seconds..
> > > still utterly comatose for those 4 seconds though.
> >
> > At the top of the while(1) loop in try_to_unuse what happens if you
> > put in:
> >	if (need_resched) schedule();
> > It should be outside all of the locks. It might just be a matter of
> > everything serializing on the SMP locks, and the kernel refusing to
> > preempt itself.
>
> That did it.

Does this improve the swapoff speed or just allow other programs to run at the same time? If it is still slow under that kind of load it would be interesting to know what is taking up all the time.

If it is no longer slow a patch should be made and sent to Linus.

Eric
Re: Break 2.4 VM in five easy steps
LA Walsh <[EMAIL PROTECTED]> writes:
> "Eric W. Biederman" wrote:
> >
> > The hard rule will always be that to cover all pathological cases
> > swap must be greater than RAM. Because in the worst case all RAM
> > will be in the swap cache. That this is more than just the worst
> > case in 2.4 is problematic. I.e. in the worst case:
> >	Virtual Memory = RAM + (swap - RAM).
>
> Hmmm... so my 512M laptop only really has 256M? Um... I regularly run
> more than 256M of programs. I don't want it to swap -- it's a
> special, weird condition if I do start swapping. I don't want to
> waste 1G of HD (5%) for something I never want to use. IRIX runs just
> fine with swap < RAM.

In Irix, your Virtual Memory = RAM + swap.

> Seems like the Linux kernel requires more swap than other old OS's
> (SunOS3: virtual mem = min(mem, swap)). I *thought* I remember that
> restriction being lifted in SunOS4 when they upgraded the VM. Even
> though I worked there for 6 years, that was 6 years ago...

There are certain scenarios where you can't avoid virtual mem = min(RAM, swap). Which is what I was trying to say (bad formula). What happens is that pages get referenced evenly enough and quickly enough that you simply cannot reuse the on-disk pages. Basically, in the worst case all of RAM is pretty much in flight doing I/O. This is true of all paging systems.

However, just because in the worst case virtual mem = min(RAM, swap) is no reason other cases should use that much swap. If you are doing a lot of swapping it is more efficient to plan on mem = min(RAM, swap) as well, because frequently you can save on I/O operations by simply reusing the existing swap page.

> > You can't improve the worst case. We can improve the worst case
> > that many people are facing.
>
> Other OS's don't have this pathological 'worst case' scenario. Even
> my Windows [vm]box seems to operate fine with swap < RAM; virtual
> space closely approximates physical + disk memory.

It's a theoretical worst case and they all have it.
In practice it is very hard to find a workload where practically every page in the system is close to the I/O point, however.

Except for removing pages that aren't used, paging with swap < RAM is not useful. Simply removing pages that aren't in active use but might possibly be used someday is a common case, so it is worth supporting.

> > It's worth complaining about. It is also worth digging into and
> > finding out what the real problem is. I have a hunch that this
> > whole conversation on swap sizes being irritating is hiding the
> > real problem.
>
> Okay, admission of ignorance. When we speak of "swap space", is this
> term inclusive of both demand-paging space and
> swap-out-entire-programs space, or one or another?

Linux has no method to swap out an entire program, so when I speak of swapping I'm actually thinking of paging.

Eric
Re: Break 2.4 VM in five easy steps
On 6 Jun 2001, Eric W. Biederman wrote:
> Mike Galbraith <[EMAIL PROTECTED]> writes:
>
> > > If you could confirm this by calling swapoff sometime other than at
> > > reboot time. That might help. Say by running top on the console.
> >
> > The thing goes comatose here too. SCHED_RR vmstat doesn't run,
> > console switch is nogo...
> >
> > After running his memory hog, swapoff took 18 seconds. I hacked a
> > bleeder valve for dead swap pages, and it dropped to 4 seconds..
> > still utterly comatose for those 4 seconds though.
>
> At the top of the while(1) loop in try_to_unuse what happens if you
> put in:
>	if (need_resched) schedule();
> It should be outside all of the locks. It might just be a matter of
> everything serializing on the SMP locks, and the kernel refusing to
> preempt itself.

That did it.

-Mike
Re: Break 2.4 VM in five easy steps
On 06 Jun 2001 20:34:49 -0400, Mike A. Harris wrote:
> On Wed, 6 Jun 2001, Derek Glidden wrote:
>
> > > Derek> overwhelmed. On the system I'm using to write this, with
> > > Derek> 512MB of RAM and 512MB of swap, I run two copies of this
> > >
> > > Please see the following message on the kernel mailing list,
> > >
> > > 3086: Linus 2.4.0 notes are quite clear that you need at least
> > > twice RAM of swap
> > > Message-Id: <[EMAIL PROTECTED]>
> >
> > Yes, I'm aware of this.
> >
> > However, I still believe that my original problem report is a BUG.
> > No matter how much swap I have, or don't have, and how much is or
> > isn't being used, running "swapoff" and forcing the VM subsystem to
> > reclaim unused swap should NOT cause my machine to feign death for
> > several minutes.
> >
> > I can easily take 256MB out of this machine, and then I *will* have
> > twice as much swap as RAM and I can still cause the exact same
> > behaviour.
> >
> > It's a bug, and no number of times saying "You need twice as much
> > swap as RAM" will change that fact.
>
> Precisely. Saying 8x RAM doesn't change it either. Sometime next week
> I'm going to purposefully put a new 60Gb disk in on a separate
> controller as pure swap on top of 256Mb of RAM. My guess is after
> bootup and login, I'll have 48Gb of stuff in swap "just in case".

Mike and others, I am getting tired of your comments. Sheesh.

The various developers who actually work on the VM have already acknowledged the issues and are exploring fixes, including at least one patch that already exists. It seems clear that the uproar from the people who are having trouble with the new VM's handling of swap space has been heard, and folks are going to fix these problems. It may not happen today or tomorrow, but soon. What the heck else do you want?

Making inflammatory remarks about the current situation does nothing to help get the problems fixed; it just wastes our time and bandwidth.
So please, if you have new facts to offer that will help us characterize and understand these VM issues better, or discover new problems, feel free to share them. But if you just want to rant, I, for one, would rather you didn't.

Miles
Re: Break 2.4 VM in five easy steps
On Wed, 6 Jun 2001, Derek Glidden wrote:
> > Derek> overwhelmed. On the system I'm using to write this, with
> > Derek> 512MB of RAM and 512MB of swap, I run two copies of this
> >
> > Please see the following message on the kernel mailing list,
> >
> > 3086: Linus 2.4.0 notes are quite clear that you need at least twice
> > RAM of swap
> > Message-Id: <[EMAIL PROTECTED]>
>
> Yes, I'm aware of this.
>
> However, I still believe that my original problem report is a BUG. No
> matter how much swap I have, or don't have, and how much is or isn't
> being used, running "swapoff" and forcing the VM subsystem to reclaim
> unused swap should NOT cause my machine to feign death for several
> minutes.
>
> I can easily take 256MB out of this machine, and then I *will* have
> twice as much swap as RAM and I can still cause the exact same
> behaviour.
>
> It's a bug, and no number of times saying "You need twice as much swap
> as RAM" will change that fact.

Precisely. Saying 8x RAM doesn't change it either. Sometime next week I'm going to purposefully put a new 60Gb disk in on a separate controller as pure swap on top of 256Mb of RAM. My guess is after bootup and login, I'll have 48Gb of stuff in swap "just in case".

--
Mike A. Harris - Linux advocate - Open Source advocate
Opinions and viewpoints expressed are solely my own.
Re: Break 2.4 VM in five easy steps
On Wed, 6 Jun 2001, android wrote:
> associated with that mindset that made Microsoft such a [fill in the
> blank]. As for the 2.4 VM problem, what are you doing with your
> machine that's making it use up so much memory? I have several
> processes running on mine all the time, including a slew in X, and I
> have yet to see significant swap activity.

Try _compiling_ XFree86. Watch the machine nosedive.

--
Mike A. Harris - Linux advocate - Open Source advocate
Opinions and viewpoints expressed are solely my own.
Re: Break 2.4 VM in five easy steps
On Wed, 6 Jun 2001, Dr S.M. Huen wrote:
> > For large memory boxes, this is ridiculous. Should I have 8GB of
> > swap?
>
> Do I understand you correctly? ECC grade SDRAM for your 8GB server
> costs £335 per GB as 512MB sticks even at today's silly prices
> (Crucial). Ultra160 SCSI costs £8.93/GB as 73GB drives.

Linux is all about technical correctness and doing the job properly. It isn't about "there is a bug in the kernel, but that is ok because an 8Gb swapfile only costs $2". Why are half the people here trying to hide behind this "disk space is cheap" argument? If we rely on that, then Linux sucks.

The problem IMHO is widely acknowledged by those who matter as an official BUG, and that is that. It is also widely acknowledged by those who can fix the problem that it will be fixed in time. So technically speaking, the kernel has a widely known bug/misfeature, which is acknowledged by core kernel developers as needing fixing, and which will get fixed at some point. Saying it is a non-issue due to the cost of hardware resources is just plain Microsoft attitude and holds absolutely zero technical merit. It *IS* an issue, because it is making Linux suck and is causing REAL WORLD PROBLEMS. The "use 2x RAM" advice is nothing more than a bandaid workaround, so don't claim that it is the proper fix due to big wallet size.

I have 2.2 doing a software build that takes 40 minutes with 256Mb of RAM and 1G of swap. The same build on 2.4 takes 60 minutes. That is 4x RAM for swap. Lowering the swap down to 2x RAM makes no difference in the numbers; down at 1x RAM the 2.4 build slows down horrendously, and dropping the swap to 20Mb makes it die completely in 2.4. 2.4 is fine for a firewall or certain other applications, but regardless of the amount of swap, I'll take the 40-minute build using 2.2 over the 60-minute build using 2.4 any day.

This is the real world. And no, cost isn't an issue to me.
Putting another 80Gb drive in this box for swap isn't going to help the work get done any faster.

--
Mike A. Harris - Linux advocate - Open Source advocate
Opinions and viewpoints expressed are solely my own.
Re: Break 2.4 VM in five easy steps
[EMAIL PROTECTED] (Alexander Viro) wrote on 06.06.01 in <[EMAIL PROTECTED]>:
> On Wed, 6 Jun 2001, Sean Hunter wrote:
>
> > This is completely bogus. I am not saying that I can't afford the
> > swap. What I am saying is that it is completely broken to require
> > this amount of swap given the boundaries of efficient use.
>
> Funny. I can count many ways in which 4.3BSD, SunOS{3,4} and post-4.4
> BSD systems I've used were broken, but I've never thought that the
> swap == 2*RAM rule was one of them.

As a "will break without" rule, I'd consider a kernel with that property completely unsuitable for production use. I certainly don't remember thinking of it as more than a recommendation back when I used commercial Unices (SysV-something).

MfG Kai
Re: Break 2.4 VM in five easy steps
At 11:27 pm +0100 6/6/2001, android wrote:
> > > I'd be happy to write a new routine in assembly
> >
> > I sincerely hope you're joking.
> >
> > It's the algorithm that needs fixing, not the implementation of
> > that algorithm. Writing in assembler? Hope you're proficient at
> > writing in x86, PPC, 68k, MIPS (several varieties), ARM, SPARC, and
> > whatever other architectures we support these days. And you darn
> > well better hope every other kernel hacker is as proficient as
> > that, to be able to read it.
>
> As for the algorithm, I'm sure that whatever method is used to handle
> page swapping, it has to comply with the kernel's memory management
> scheme already in place. That's why I would need the details so that
> I wouldn't create more problems than already present.

Have you actually been following this thread? The algorithm has been discussed and at least one alternative brought forward.

--
from:	Jonathan "Chromatix" Morton
mail:	[EMAIL PROTECTED]  (not for attachments)

The key to knowledge is not to rely on people to teach you it.

GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
Re: Break 2.4 VM in five easy steps
On 06 Jun 2001 15:27:57 -0700, android wrote:

> >I sincerely hope you're joking.
>
> I realize that assembly is platform-specific. Being that I use the IA32
> class machine, that's what I would write for. Others who use other
> platforms could do the deed for their native language.

no, look at the code. it is not going to benefit from assembly (assuming you can even implement it cleanly in assembly). its basically an iteration of other function calls.

doing a new implementation in assembly for each platform is not feasible, anyhow. this is the sort of thing that needs to be uniform.

this really has nothing to do with the "iron" of the computer -- its a loop to check and free swap pages. assembly will not provide benefit.

-- 
Robert M. Love
[EMAIL PROTECTED]
[EMAIL PROTECTED]
Re: Break 2.4 VM in five easy steps
hi,

I have a problem with kswapd: it suddenly takes 98% CPU and crashes my server, I don't know why. I have a Linux 2.2.17 kernel, Debian distro.

if anyone can help me ... thx ;)

Antoine
Re: Break 2.4 VM in five easy steps
> >I'd be happy to write a new routine in assembly
>
> I sincerely hope you're joking.
>
> It's the algorithm that needs fixing, not the implementation of that
> algorithm. Writing in assembler? Hope you're proficient at writing in
> x86, PPC, 68k, MIPS (several varieties), ARM, SPARC, and whatever other
> architectures we support these days. And you darn well better hope every
> other kernel hacker is as proficient as that, to be able to read it.

I realize that assembly is platform-specific. Being that I use the IA32 class machine, that's what I would write for. Others who use other platforms could do the deed for their native language.

As for the algorithm, I'm sure that whatever method is used to handle page swapping, it has to comply with the kernel's memory management scheme already in place. That's why I would need the details so that I wouldn't create more problems than already present.

Being that most users are on the IA32 platform, I'm sure they wouldn't reject an assembly solution to this problem. As for kernel acceptance, that's an issue for the political eggheads. Not my forte. :-)

-- Ted
Re: Break 2.4 VM in five easy steps
>I'd be happy to write a new routine in assembly

I sincerely hope you're joking.

It's the algorithm that needs fixing, not the implementation of that algorithm. Writing in assembler? Hope you're proficient at writing in x86, PPC, 68k, MIPS (several varieties), ARM, SPARC, and whatever other architectures we support these days. And you darn well better hope every other kernel hacker is as proficient as that, to be able to read it.

IOW, no chance.

-- 
from: Jonathan "Chromatix" Morton
mail: [EMAIL PROTECTED] (not for attachments)

The key to knowledge is not to rely on people to teach you it.

GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
Re: Break 2.4 VM in five easy steps
"Eric W. Biederman" wrote:

> The hard rule will always be that to cover all pathological cases swap
> must be greater than RAM. Because in the worst case all RAM will be
> in the swap cache. That this is more than just the worst case in 2.4
> is problematic. I.e. In the worst case:
> Virtual Memory = RAM + (swap - RAM).

Hmmm... so my 512M laptop only really has 256M? Um... I regularly run more than 256M of programs. I don't want it to swap -- it's a special, weird condition if I do start swapping. I don't want to waste 1G of HD (5%) for something I never want to use. IRIX runs just fine with swap < RAM.

> You can't improve the worst case. We can improve the worst case that
> many people are facing.

---
Other OS's don't have this pathological 'worst case' scenario. Even my Windows [vm]box seems to operate fine with swap < RAM.

> It's worth complaining about. It is also worth digging into and find
> out what the real problem is. I have a hunch that this whole
> conversation on swap sizes being irritating is hiding the real
> problem.

---
Okay, admission of ignorance. When we speak of "swap space", is this term inclusive of both demand paging space and swap-out-entire-programs space, or one or another?

-linda

-- 
The above thoughts and      | They may have nothing to do with
writings are my own.        | the opinions of my employer. :-)
L A Walsh                   | Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]           | Voice: (650) 933-5338
Re: Break 2.4 VM in five easy steps
>Is anybody interested in making "swapoff()" better? Please speak up..
>
>		Linus

I'd be happy to write a new routine in assembly, if I had a clue as to how the VM algorithm works in Linux. What should swapoff do if all physical memory is in use? How does the swapping algorithm balance against cache memory? Can someone point me to where I can find the exact details of the VM mechanism in Linux?

Thanks!

-- Ted
Re: Break 2.4 VM in five easy steps
In article <[EMAIL PROTECTED]>, Derek Glidden <[EMAIL PROTECTED]> wrote:
>
>After reading the messages to this list for the last couple of weeks and
>playing around on my machine, I'm convinced that the VM system in 2.4 is
>still severely broken.

Now, this may well be true, but what you actually demonstrated is that "swapoff()" is extremely (and I mean _EXTREMELY_) inefficient, to the point that it can certainly be called broken.

It got worse in 2.4.x not so much due to any generic VM worseness, as due to the fact that the much more persistent swap cache behaviour in 2.4.x just exposes the fundamental inefficiencies of "swapoff()" more clearly. I don't think the swapoff() algorithm itself has changed, it's just that the algorithm was always exponential, I think (and because of the persistent swap cache, the "n" in the algorithm became much bigger). So this is really a separate problem from the general VM balancing issues.

Go and look at the "try_to_unuse()" logic, and wince.

I'd love to have somebody look a bit more at swap-off. It may well be, for example, that swap-off does not correctly notice dead swap-pages at all - somebody should verify that it doesn't try to read in and "try_to_unuse()" dead swap entries. That would make the inefficiency show up even more clearly.

(Quick look gives the following: right now try_to_unuse() in mm/swapfile.c does something like

	lock_page(page);
	if (PageSwapCache(page))
		delete_from_swap_cache_nolock(page);
	UnlockPage(page);

	read_lock(&tasklist_lock);
	for_each_task(p)
		unuse_process(p->mm, entry, page);
	read_unlock(&tasklist_lock);
	shmem_unuse(entry, page);

	/* Now get rid of the extra reference to the
	   temporary page we've been using. */
	page_cache_release(page);

and we should trivially notice that if the page count is 1, it cannot be mapped in any process, so we should maybe add something like

	lock_page(page);
	if (PageSwapCache(page))
		delete_from_swap_cache_nolock(page);
	UnlockPage(page);

+	if (page_count(page) == 1)
+		goto nothing_to_do;

	read_lock(&tasklist_lock);
	for_each_task(p)
		unuse_process(p->mm, entry, page);
	read_unlock(&tasklist_lock);
	shmem_unuse(entry, page);
+
+ nothing_to_do:
	/* Now get rid of the extra reference to the
	   temporary page we've been using. */
	page_cache_release(page);

which should (assuming I got the page count thing right - I've obviously not tested the above change) make sure that we don't spend tons of time on dead swap pages.

Somebody interested in trying the above add? And looking for other more obvious bandaid fixes. It won't "fix" swapoff per se, but it might make it bearable and bring it to the 2.2.x levels.

The _real_ fix is to really make "swapoff()" work the other way around - go through each process and look for swap entries in the page tables _first_, and bring all entries for that device in sanely, and after everything is brought in just drop all the swap cache pages for that device. The current swapoff() thing is really a quick hack that has lived on since early 1992, with quick hacks to make it work with the big VM changes that have happened since.

That would make swapoff be O(n) in VM size (and you can easily do some further micro-optimizations at that time by avoiding shared mappings with backing store and other things that cannot have swap info involved).

Is anybody interested in making "swapoff()" better? Please speak up..

		Linus
Re: Break 2.4 VM in five easy steps
On Wednesday 06 June 2001 20:27, Eric W. Biederman wrote:

> The hard rule will always be that to cover all pathological cases
> swap must be greater than RAM. Because in the worst case all RAM
> will be in the swap cache.

Could you explain in very simple terms how the worst case comes about?

-- Daniel
Re: Break 2.4 VM in five easy steps
Mike Galbraith wrote:
>
> Can you try the patch below to see if it helps? If you watch
> with vmstat, you should see swap shrinking after your test.
> Let it shrink a while and then see how long swapoff takes.
> Under a normal load, it'll munch a handful of them at least
> once a second and keep them from getting annoying. (theory;)

Hi Mike,

I'll give that patch a spin this evening after work when I have time to patch and recompile the kernel.

-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Re: Break 2.4 VM in five easy steps
Mike Galbraith <[EMAIL PROTECTED]> writes:

> On 6 Jun 2001, Eric W. Biederman wrote:
>
> > Derek Glidden <[EMAIL PROTECTED]> writes:
> >
> > > The problem I reported is not that 2.4 uses huge amounts of swap but
> > > that trying to recover that swap off of disk under 2.4 can leave the
> > > machine in an entirely unresponsive state, while 2.2 handles identical
> > > situations gracefully.
> >
> > The interesting thing from other reports is that it appears to be kswapd
> > using up CPU resources. Not the swapout code at all. So it appears
> > to be a fundamental VM issue. And calling swapoff is just a good way
> > to trigger it.
> >
> > If you could confirm this by calling swapoff sometime other than at
> > reboot time. That might help. Say by running top on the console.
>
> The thing goes comatose here too. SCHED_RR vmstat doesn't run, console
> switch is nogo...
>
> After running his memory hog, swapoff took 18 seconds. I hacked a
> bleeder valve for dead swap pages, and it dropped to 4 seconds.. still
> utterly comatose for those 4 seconds though.

At the top of the while(1) loop in try_to_unuse what happens if you put in:

	if (need_resched)
		schedule();

It should be outside all of the locks. It might just be a matter of everything serializing on the SMP locks, and the kernel refusing to preempt itself.

Eric
Re: Break 2.4 VM in five easy steps
"Eric W. Biederman" wrote:
>
> Derek Glidden <[EMAIL PROTECTED]> writes:
>
> > The problem I reported is not that 2.4 uses huge amounts of swap but
> > that trying to recover that swap off of disk under 2.4 can leave the
> > machine in an entirely unresponsive state, while 2.2 handles identical
> > situations gracefully.
>
> The interesting thing from other reports is that it appears to be kswapd
> using up CPU resources. Not the swapout code at all. So it appears
> to be a fundamental VM issue. And calling swapoff is just a good way
> to trigger it.
>
> If you could confirm this by calling swapoff sometime other than at
> reboot time. That might help. Say by running top on the console.

That's exactly what my original test was doing. I think it was Jeffrey Baker complaining about "swapoff" at reboot. See my original post that started this thread and follow the "five easy steps." :)

I'm sucking down a lot of swap, although not all that's available - which is something I am specifically trying to avoid, since I wanted to stress the VM/swap recovery procedure, not "out of RAM and swap" memory pressure - and then running 'swapoff' from an xterm or a console.

The problem with being able to see what's eating up CPU resources is that the whole machine stops responding for me to tell. Consoles stop updating, the X display freezes, keyboard input is locked out, etc. As far as anyone can tell, for several minutes, the whole machine is locked up. (Except, strangely enough, the machine will still respond to ping.) I've tried running 'top' to see what task is taking up all the CPU time, but the system hangs before it shows anything meaningful. I have been able to tell that it hits 100% "system" utilization very quickly though.

I did notice that the first thing sys_swapoff() does is call lock_kernel() ... so if sys_swapoff() takes a long time, I imagine things will get very unresponsive quickly. (But I'm not intimately familiar with the various kernel locks, so I don't know what granularity/atomicity/whatever lock_kernel() enforces.)

-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
#!/usr/bin/perl -w
$_='while(read+STDIN,$_,2048){$a=29;$b=73;$c=142;$t=255;@t=map
{$_%16or$t^=$c^=($m=(11,10,116,100,11,122,20,100)[$_/16%8])&110;
$t^=(72,@z=(64,72,$a^=12*($_%16-2?0:$m&17)),$b^=$_%64?12:0,@z)
[$_%8]}(16..271);if((@a=unx"C*",$_)[20]&48){$h=5;$_=unxb24,join
"",@b=map{xB8,unxb8,chr($_^$a[--$h+84])}@ARGV;s/...$/1$&/;$d=
unxV,xb25,$_;$e=256|(ord$b[4])<<9|ord$b[3];$d=$d>>8^($f=$t&($d
>>12^$d>>4^$d^$d/8))<<17,$e=$e>>8^($t&($g=($q=$e>>14&7^$e)^$q*
8^$q<<6))<<9,$_=$t[$_]^(($h>>=8)+=$f+(~$g&$t))for@a[128..$#a]}
print+x"C*",@a}';s/x/pack+/g;eval
usage: qrpff 153 2 8 105 225 < /mnt/dvd/VOB_FILENAME \
 | extract_mpeg2 | mpeg2dec -
http://www.eff.org/ http://www.opendvd.org/
http://www.cs.cmu.edu/~dst/DeCSS/Gallery/
Re: Break 2.4 VM in five easy steps
On 6 Jun 2001, Eric W. Biederman wrote:

> Derek Glidden <[EMAIL PROTECTED]> writes:
> >
> > The problem I reported is not that 2.4 uses huge amounts of swap but
> > that trying to recover that swap off of disk under 2.4 can leave the
> > machine in an entirely unresponsive state, while 2.2 handles identical
> > situations gracefully.
> >
> The interesting thing from other reports is that it appears to be kswapd
> using up CPU resources. Not the swapout code at all. So it appears
> to be a fundamental VM issue. And calling swapoff is just a good way
> to trigger it.
>
> If you could confirm this by calling swapoff sometime other than at
> reboot time. That might help. Say by running top on the console.

The thing goes comatose here too. SCHED_RR vmstat doesn't run, console switch is nogo...

After running his memory hog, swapoff took 18 seconds. I hacked a bleeder valve for dead swap pages, and it dropped to 4 seconds.. still utterly comatose for those 4 seconds though.

	-Mike
Re: Break 2.4 VM in five easy steps
>Furthermore, I am not demanding anything, much less "priority fixing"
>for this bug. Its my personal opinion that this is the most critical bug
>in the 2.4 series, and if I had the time and skill, this is what I would
>be working on. Because I don't have the time and skill, I am perfectly
>happy to wait until those that do fix the problem. To say it isn't a
>problem because I can buy more disk is nonsense, and its that sort of
>thinking that leads to constant need to upgrade hardware in the
>proprietary OS world.
>
>Sean

This would reflect the Microsoft way of programming: if there's a bug in the system, don't fix it, but upgrade your hardware. Why do you think the requirements for Windows are so great? Most of their code is very inefficient. I'm sure they programmed their kernel in Visual Basic. The worst part is that they get paid to do this! I program in Linux because I don't want to be associated with that mindset that made Microsoft such a [fill in the blank].

As for the 2.4 VM problem, what are you doing with your machine that's making it use up so much memory? I have several processes running on mine all the time, including a slew in X, and I have yet to see significant swap activity.

-- Ted

P.S. My faithful Timex Sinclair from the 80's never had swap :-)
Re: Break 2.4 VM in five easy steps
On Tue, 5 Jun 2001, Derek Glidden wrote:

> After reading the messages to this list for the last couple of weeks and
> playing around on my machine, I'm convinced that the VM system in 2.4 is
> still severely broken. ...

Hi,

Can you try the patch below to see if it helps? If you watch with vmstat, you should see swap shrinking after your test. Let it shrink a while and then see how long swapoff takes. Under a normal load, it'll munch a handful of them at least once a second and keep them from getting annoying. (theory;)

	-Mike

--- linux-2.4.5.ac5/mm/vmscan.c.org	Sat Jun  2 07:37:16 2001
+++ linux-2.4.5.ac5/mm/vmscan.c	Wed Jun  6 18:29:02 2001
@@ -1005,6 +1005,53 @@
 	return ret;
 }
 
+int deadswap_reclaim(unsigned int priority)
+{
+	struct list_head * page_lru;
+	struct page * page;
+	int maxscan = nr_active_pages >> priority;
+	int nr_reclaim = 0;
+
+	/* Take the lock while messing with the list... */
+	spin_lock(&pagemap_lru_lock);
+	while (maxscan-- > 0 && (page_lru = active_list.prev) != &active_list) {
+		page = list_entry(page_lru, struct page, lru);
+
+		/* Wrong page on list?! (list corruption, should not happen) */
+		if (!PageActive(page)) {
+			printk("VM: refill_inactive, wrong page on list.\n");
+			list_del(page_lru);
+			nr_active_pages--;
+			continue;
+		}
+
+		if (PageSwapCache(page) &&
+		    (page_count(page) - !!page->buffers) == 1 &&
+		    swap_count(page) == 1) {
+			if (page->buffers || TryLockPage(page)) {
+				ClearPageReferenced(page);
+				ClearPageDirty(page);
+				page->age = 0;
+				deactivate_page_nolock(page);
+			} else {
+				page_cache_get(page);
+				spin_unlock(&pagemap_lru_lock);
+				delete_from_swap_cache_nolock(page);
+				spin_lock(&pagemap_lru_lock);
+				UnlockPage(page);
+				page_cache_release(page);
+			}
+			nr_reclaim++;
+			continue;
+		}
+		list_del(page_lru);
+		list_add(page_lru, &active_list);
+	}
+	spin_unlock(&pagemap_lru_lock);
+
+	return nr_reclaim;
+}
+
 DECLARE_WAIT_QUEUE_HEAD(kreclaimd_wait);
 /*
  * Kreclaimd will move pages from the inactive_clean list to the
@@ -1027,7 +1074,7 @@
 	 * We sleep until someone wakes us up from
 	 * page_alloc.c::__alloc_pages().
 	 */
-	interruptible_sleep_on(&kreclaimd_wait);
+	interruptible_sleep_on_timeout(&kreclaimd_wait, HZ);
 
 	/*
 	 * Move some pages from the inactive_clean lists to
@@ -1051,6 +1098,7 @@
 		}
 		pgdat = pgdat->node_next;
 	} while (pgdat);
+	deadswap_reclaim(4);
 	}
 }
Re: Break 2.4 VM in five easy steps
Derek Glidden <[EMAIL PROTECTED]> writes:

> The problem I reported is not that 2.4 uses huge amounts of swap but
> that trying to recover that swap off of disk under 2.4 can leave the
> machine in an entirely unresponsive state, while 2.2 handles identical
> situations gracefully.

The interesting thing from other reports is that it appears to be kswapd using up CPU resources. Not the swapout code at all. So it appears to be a fundamental VM issue. And calling swapoff is just a good way to trigger it.

If you could confirm this by calling swapoff sometime other than at reboot time. That might help. Say by running top on the console.

Eric
Re: Break 2.4 VM in five easy steps
On Wed, 06 Jun 2001, Dr S.M. Huen wrote:

> The whole screaming match is about whether a drastic degradation on using
> swap with less than the 2*RAM swap specified by the developers should lead
> one to conclude that a kernel is "broken".

I would argue that any system that performs substantially worse with swap == 1xRAM than a system with swap == 0xRAM is fundamentally broken.

It seems that with today's 2.4.x kernel, people running programs totalling LESS THAN their physical DRAM are having swap problems. They should not even be using 1 byte of swap.

The whole point of swapping pages is to give you more memory to execute programs. If I want to execute 140MB of programs+kernel on a system with 128MB of RAM, I should be able to do the job effectively with ANY amount of "total memory" exceeding 140MB - not some hokey 128MB RAM + 256MB swap just because the kernel is too fscked up to deal with a small swap file.

-- 
/***
 ** Mark Salisbury | Mercury Computer Systems
 ***/
Re: Break 2.4 VM in five easy steps
"Eric W. Biederman" wrote:
>
> > Or are you saying that if someone is unhappy with a particular
> > situation, they should just keep their mouth shut and accept it?
>
> It's worth complaining about. It is also worth digging into and find
> out what the real problem is. I have a hunch that this whole
> conversation on swap sizes being irritating is hiding the real
> problem.

I totally agree with this, and want to reiterate that the original problem I posted has /nothing/ to do with the "swap == 2*RAM" issue.

The problem I reported is not that 2.4 uses huge amounts of swap but that trying to recover that swap off of disk under 2.4 can leave the machine in an entirely unresponsive state, while 2.2 handles identical situations gracefully.

I'm annoyed by 2.4's "requirement" of too much swap, but I consider that less a bug and more a severe design flaw.

-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
#!/usr/bin/perl -w
$_='while(read+STDIN,$_,2048){$a=29;$b=73;$c=142;$t=255;@t=map
{$_%16or$t^=$c^=($m=(11,10,116,100,11,122,20,100)[$_/16%8])&110;
$t^=(72,@z=(64,72,$a^=12*($_%16-2?0:$m&17)),$b^=$_%64?12:0,@z)
[$_%8]}(16..271);if((@a=unx"C*",$_)[20]&48){$h=5;$_=unxb24,join
"",@b=map{xB8,unxb8,chr($_^$a[--$h+84])}@ARGV;s/...$/1$&/;$d=
unxV,xb25,$_;$e=256|(ord$b[4])<<9|ord$b[3];$d=$d>>8^($f=$t&($d
>>12^$d>>4^$d^$d/8))<<17,$e=$e>>8^($t&($g=($q=$e>>14&7^$e)^$q*
8^$q<<6))<<9,$_=$t[$_]^(($h>>=8)+=$f+(~$g&$t))for@a[128..$#a]}
print+x"C*",@a}';s/x/pack+/g;eval
usage: qrpff 153 2 8 105 225 < /mnt/dvd/VOB_FILENAME \
 | extract_mpeg2 | mpeg2dec -
http://www.eff.org/ http://www.opendvd.org/
http://www.cs.cmu.edu/~dst/DeCSS/Gallery/
Re: Break 2.4 VM in five easy steps
On Wed, 6 Jun 2001, Kurt Roeckx wrote:

> On Wed, Jun 06, 2001 at 10:57:57AM +0100, Dr S.M. Huen wrote:
> > On Wed, 6 Jun 2001, Sean Hunter wrote:
> > >
> > > For large memory boxes, this is ridiculous. Should I have 8GB of swap?
> > >
> > Do I understand you correctly?
> > ECC grade SDRAM for your 8GB server costs £335 per GB as 512MB sticks even
> > at today's silly prices (Crucial). Ultra160 SCSI costs £8.93/GB as 73GB
> > drives.
>
> Maybe you really should reread the statements people made about
> this before.

I think you might do with a more careful quoting or reading of the thread yourself before casting such aspersions. I did not recommend swap use. I argued that it was not reasonable to reject a 2*RAM swap requirement on cost grounds. There are those who do not think this argument adequate because of grounds other than hardware cost (e.g. retrofitting existing farms, laptops with zillions of OSes etc.)

> That swap = 2 * RAM is just a guideline, you really should look
> at what applications you run, and how much memory they use. If you
> choose your RAM so that all applications can always be in memory
> at all times, there is no need for swap. If they can't be, the
> rule might help you.

I think the whole argument of the thread is against you here. It seems that if you do NOT provide 2*RAM you get into trouble much earlier than you expect (a few argue that even if you do, you get trouble). If it were just a guideline that gracefully degraded your performance the other lot wouldn't be screaming. The whole screaming match is about whether a drastic degradation on using swap with less than the 2*RAM swap specified by the developers should lead one to conclude that a kernel is "broken".

To conclude, this is not a hypothetical argument about whether to operate completely in core. There's not a person on LKML who doesn't know running in RAM is better than running swapping. It is one where users do swap but allocate a size smaller than that recommended and are adversely affected. It is about whether a kernel that reacts this way could be regarded as stable. Answe
Re: Break 2.4 VM in five easy steps
Derek Glidden <[EMAIL PROTECTED]> writes:

> John Alvord wrote:
> >
> > On Wed, 06 Jun 2001 11:31:28 -0400, Derek Glidden
> > <[EMAIL PROTECTED]> wrote:
> >
> > >I'm beginning to be amazed at the Linux VM hackers' attitudes regarding
> > >this problem. I expect this sort of behaviour from academics - ignoring
> > >real actual problems being reported by real actual people really and
> > >actually experiencing and reporting them because "technically" or
> > >"theoretically" they "shouldn't be an issue" or because "the literature
> > >[documentation] says otherwise" - but not from this group.
> >
> > There have been multiple comments that a fix for the problem is
> > forthcoming. Is there some reason you have to keep talking about it?
>
> Because there have been many more comments that "The rule for 2.4 is
> 'swap == 2*RAM' and that's the way it is" and "disk space is cheap -
> just add more" than there have been "this is going to be fixed", which is
> extremely discouraging and doesn't instill me with all sorts of
> confidence that this problem is being taken seriously.

The hard rule will always be that to cover all pathological cases swap must be greater than RAM, because in the worst case all RAM will be in the swap cache. That this is more than just the worst case in 2.4 is problematic. I.e. in the worst case:

	Virtual Memory = RAM + (swap - RAM)

You can't improve the worst case. We can improve the worst case that many people are facing.

> Or are you saying that if someone is unhappy with a particular
> situation, they should just keep their mouth shut and accept it?

It's worth complaining about. It is also worth digging into and finding out what the real problem is. I have a hunch that this whole conversation on swap sizes being irritating is hiding the real problem.

Eric
Re: Break 2.4 VM in five easy steps
On Wednesday, 06 June 2001, at 10:19:30 +0200, Xavier Bestel wrote:

> On 05 Jun 2001 23:19:08 -0400, Derek Glidden wrote:
> > On Wed, Jun 06, 2001 at 12:16:30PM +1000, Andrew Morton wrote:
> [...]
> Did you try to put twice as much swap as you have RAM ? (e.g. add a 512M
> swapfile to your box)

I'm not a kernel guru, nor can I even try to understand how an operating system's memory management is designed or behaves. But I have some questions and thoughts:

1. Is swap = 2xRAM a design issue, or just a recommendation to get best results based on the current VM subsystem's status?

2. Wouldn't performance drop quickly when the VM starts to swap processes/pages to disk, instead of keeping them in RAM? Maybe having a couple of GB worth of processes on disk is not very wise.

3. Shouldn't an ideal VM manage swap space as an extension of the system's RAM (of course, taking into account that RAM is much faster than HD, and nothing should be on swap if there is room enough in RAM)?

4. Wouldn't you say that "adding more swap" (maybe 2xRAM is a recommendation, maybe a temporary fix, maybe a design decision) is the M$-way of fixing things? If there is a _real_ need for more swap to get a well-behaving system, let's add swap. But we shouldn't hide inner design and/or implementation problems under the "cheap multigigabyte disks" argument.

5. AFAIK, kernel developers are well aware of current 2.4.x problems in some areas. I don't think insisting on certain problems without providing ideas, testing, or support, and limiting oneself to just blaming the authors, is the best way to go. Maybe kernel hackers are the most interested of all in fixing all these issues ASAP.

Just some thoughts from someone unable to write C code and help fix this mess ;).

-- 
José Luis Domingo López
Linux Registered User #189436 Debian GNU/Linux Potato (P166 64 MB RAM)

jdomingo EN internautas PUNTO org => Spam? Face the consequences
jdomingo AT internautas DOT org => Spam at your own risk
Re: Break 2.4 VM in five easy steps
On Wed, Jun 06, 2001 at 10:57:57AM +0100, Dr S.M. Huen wrote: > On Wed, 6 Jun 2001, Sean Hunter wrote: > > > > > For large memory boxes, this is ridiculous. Should I have 8GB of swap? > > > > Do I understand you correctly? > ECC grade SDRAM for your 8GB server costs £335 per GB as 512MB sticks even > at today's silly prices (Crucial). Ultra160 SCSI costs £8.93/GB as 73GB > drives. Maybe you really should reread the statements people made about this before. One of them was that if you're not using swap in 2.2, you won't need any in 2.4 either. 2.4 will merely use more swap in the cases where it does use some. It now works more like other UNIX variants, where the rule of thumb is swap = 2 * RAM. That swap = 2 * RAM is just a guideline; you really should look at what applications you run and how much memory they use. If you choose your RAM so that all applications can be in memory at all times, there is no need for swap. If they can't be, the rule might help you. I think someone said that the swap should be large enough to hold all the applications that are running, in case you want to use swap at all. Disk may be a lot cheaper than RAM, but it's also a lot slower. Kurt
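The sizing rule Kurt describes is simple arithmetic; as a hedged sketch (the 512 MB RAM figure is only an illustration, not a value from this thread):

```shell
#!/bin/sh
# Sketch of the "swap = 2 * RAM" guideline discussed above.
# ram_mb is a made-up example value, not a recommendation.
ram_mb=512
swap_mb=$((2 * ram_mb))
echo "RAM: ${ram_mb} MB -> suggested swap: ${swap_mb} MB"
```

As the thread keeps stressing, this is a rule of thumb: if the working set of your applications always fits in RAM, no swap is strictly required.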
Re: Break 2.4 VM in five easy steps
On Wed, Jun 06, 2001 at 06:48:32AM -0400, Alexander Viro wrote: > On Wed, 6 Jun 2001, Sean Hunter wrote: > > > This is completely bogus. I am not saying that I can't afford the swap. > > What I am saying is that it is completely broken to require this amount > > of swap given the boundaries of efficient use. > > Funny. I can count many ways in which 4.3BSD, SunOS{3,4} and post-4.4 BSD > systems I've used were broken, but I've never thought that swap==2*RAM rule > was one of them. > > Not that being more kind on swap would be a bad thing, but that rule for > amount of swap is pretty common. ISTR similar for (very old) SCO, so it's > not just BSD world. How are modern Missed'em'V variants in that respect, BTW? Although I don't have any swap-trouble myself, what I think most people are having problems with is not that Linux doesn't have the "you-dont-need-2xRAM-size-swap-if-you-swap-at-all feature", but that it lost it in 2.4. -- Linux 2.4.5-ac9 #5 Wed Jun 6 18:30:24 CEST 2001
Re: Break 2.4 VM in five easy steps
On Wed, 6 Jun 2001, Alexander Viro wrote: > On Wed, 6 Jun 2001, Sean Hunter wrote: > > > This is completely bogus. I am not saying that I can't afford the swap. > > What I am saying is that it is completely broken to require this amount > > of swap given the boundaries of efficient use. > > Funny. I can count many ways in which 4.3BSD, SunOS{3,4} and post-4.4 BSD > systems I've used were broken, but I've never thought that swap==2*RAM rule > was one of them. > > Not that being more kind on swap would be a bad thing, but that rule for > amount of swap is pretty common. ISTR similar for (very old) SCO, so it's > not just BSD world. How are modern Missed'em'V variants in that respect, BTW? frequently when building out a solaris web farm you have to just bite it and throw away half your disk for swap that will never be used. it's got pessimistic memory allocation by default. you can do something with mmap() to get an optimistic allocation, but i didn't trust making this change to apache when i was involved with a farm like this... i didn't want to be debugging any potential low memory problems. -dean
Re: Break 2.4 VM in five easy steps
On 6 Jun 2001, Eric W. Biederman wrote: > "Jeffrey W. Baker" <[EMAIL PROTECTED]> writes: > > > On Tue, 5 Jun 2001, Derek Glidden wrote: > > > > > > > > After reading the messages to this list for the last couple of weeks and > > > playing around on my machine, I'm convinced that the VM system in 2.4 is > > > still severely broken. > > > > > > This isn't trying to test extreme low-memory pressure, just how the > > > system handles recovering from going somewhat into swap, which is a real > > > day-to-day problem for me, because I often run a couple of apps that > > > most of the time live in RAM, but during heavy computation runs, can go > > > a couple hundred megs into swap for a few minutes at a time. Whenever > > > that happens, my machine always starts acting up afterwards, so I > > > started investigating and found some really strange stuff going on. > > > > I reboot each of my machines every week, to take them offline for > > intrusion detection. I use 2.4 because I need advanced features of > > iptables that ipchains lacks. Because the 2.4 VM is so broken, and > > because my machines are frequently deeply swapped, they can sometimes take > > over 30 minutes to shutdown. They hang of course when the shutdown rc > > script turns off the swap. The first few times this happened I assumed > > they were dead. > > Interesting. Is it constant disk I/O? Or constant CPU utilization. > In any case you should be able to comment that line out of your shutdown > rc script and be in perfectly good shape. Well I can't exactly run top(1) at shutdown time, but the disks aren't running at all. Either the system is using the CPUs, or it is blocked waiting for something to happen. You're right about swapoff, we removed it from our shutdown script.
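The workaround agreed on above (dropping the swapoff call from the shutdown script) might look like the following sketch; the script path and surrounding lines are hypothetical, since every distribution lays out its rc scripts differently:

```shell
#!/bin/sh
# Hypothetical excerpt from a shutdown rc script; real locations vary
# (e.g. /etc/rc.d/rc.0 or /etc/init.d/halt, depending on distribution).

# On a deeply-swapped 2.4 box, "swapoff -a" forces every swapped page
# back into RAM and can block for many minutes, so it is simply
# commented out; swap state is discarded at power-off anyway.
#swapoff -a

msg="skipping swapoff at shutdown"
echo "$msg"
```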
Re: Break 2.4 VM in five easy steps
On Wed, 6 Jun 2001, Dr S.M. Huen wrote: > If you can afford 4GB RAM, you certainly can afford 8GB swap. this is a completely crap argument. you should study the economics of managing a farm of thousands of machines some day. when you do this, you'll also learn to consider the power requirements (8W+ per 3.5" disk) which you need to bring to each rack, supply backup UPS/generator power for, and exhaust through your air conditioning for each of these useless swap disks. plus you'll also learn to consider the wages for the unlucky person who has to go around to every box in a farm, open it up, and install another disk. plus you'll learn that the time this person spent installing new disks wasn't spent installing new systems, which means you couldn't bring as many customers on line this month, which means you may not make revenue targets. plus you'll learn that every time you open a box that's been in production for a while, there's a small, but noticeable, chance that it won't reboot. so your normal monthly failure rate will go from the 2% range up to the 5% range. -dean
Re: Break 2.4 VM in five easy steps
Richard Gooch wrote: > > Daniel Phillips writes: > > On Wednesday 06 June 2001 10:54, Sean Hunter wrote: > > > > > > Did you try to put twice as much swap as you have RAM ? (e.g. add a > > > > 512M swapfile to your box) > > > > This is what Linus recommended for 2.4 (swap = 2 * RAM), saying > > > > that anything less won't do any good: 2.4 overallocates swap even > > > > if it doesn't use it all. So in your case you just have enough swap > > > > to map your RAM, and nothing to really swap your apps. > > > > > > For large memory boxes, this is ridiculous. Should I have 8GB of > > > swap? > > Sure. It's cheap. If you don't mind slumming it, go and buy a 20 GB > IDE drive for US$65. I know RAM has gotten a lot cheaper lately (US$66 > for a 512 MiB PC133 DIMM), but it's still far more expensive. If you > can afford 4 GiB of RAM, you can definitely afford 8 GiB of swap. For me, the problem is not the money. If I have a system that needs 4GB of RAM, it is highly unlikely that I would ever want to be running this machine with 8GB of swap active. However, I may be willing to tolerate 1GB of swapping before paging to disk slowed things down too much. This is the exact scenario I had when dealing with a large Sun machine running Oracle and some other stuff. Oracle was dedicated large amounts of RAM, but if I wanted to run a quick, memory-intensive program too (and at the moment performance wasn't all that big of a deal), then using some swap was OK. So, I too cast my vote for the 2*RAM requirement being odious and in need of fixing!! It could be a suggestion, but I would consider that if not following the suggestion caused more than a 10% slowdown, then things are still broken; optimally, it should work like 2.2 does (in other words, I don't notice, and don't particularly care, how much swap per RAM I need, just how much total RAM-like-stuff I need.)
Thanks, Ben -- Ben Greear <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear
Re: Break 2.4 VM in five easy steps
John Alvord wrote: > > On Wed, 06 Jun 2001 11:31:28 -0400, Derek Glidden > <[EMAIL PROTECTED]> wrote: > > > > >I'm beginning to be amazed at the Linux VM hackers' attitudes regarding > >this problem. I expect this sort of behaviour from academics - ignoring > >real actual problems being reported by real actual people really and > >actually experiencing and reporting them because "technically" or > >"theoretically" they "shouldn't be an issue" or because the "literature" > >[documentation] says otherwise - but not from this group. > > There have been multiple comments that a fix for the problem is > forthcoming. Is there some reason you have to keep talking about it? Because there have been many more comments that "The rule for 2.4 is 'swap == 2*RAM' and that's the way it is" and "disk space is cheap - just add more" than there have been "this is going to be fixed", which is extremely discouraging and doesn't instill me with all sorts of confidence that this problem is being taken seriously. Or are you saying that if someone is unhappy with a particular situation, they should just keep their mouth shut and accept it?
-- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- #!/usr/bin/perl -w $_='while(read+STDIN,$_,2048){$a=29;$b=73;$c=142;$t=255;@t=map {$_%16or$t^=$c^=($m=(11,10,116,100,11,122,20,100)[$_/16%8])&110; $t^=(72,@z=(64,72,$a^=12*($_%16-2?0:$m&17)),$b^=$_%64?12:0,@z) [$_%8]}(16..271);if((@a=unx"C*",$_)[20]&48){$h=5;$_=unxb24,join "",@b=map{xB8,unxb8,chr($_^$a[--$h+84])}@ARGV;s/...$/1$&/;$d= unxV,xb25,$_;$e=256|(ord$b[4])<<9|ord$b[3];$d=$d>>8^($f=$t&($d >>12^$d>>4^$d^$d/8))<<17,$e=$e>>8^($t&($g=($q=$e>>14&7^$e)^$q* 8^$q<<6))<<9,$_=$t[$_]^(($h>>=8)+=$f+(~$g&$t))for@a[128..$#a]} print+x"C*",@a}';s/x/pack+/g;eval usage: qrpff 153 2 8 105 225 < /mnt/dvd/VOB_FILENAME \ | extract_mpeg2 | mpeg2dec - http://www.eff.org/http://www.opendvd.org/ http://www.cs.cmu.edu/~dst/DeCSS/Gallery/
Re: Break 2.4 VM in five easy steps
On Wed, 06 Jun 2001 11:31:28 -0400, Derek Glidden <[EMAIL PROTECTED]> wrote: > >I'm beginning to be amazed at the Linux VM hackers' attitudes regarding >this problem. I expect this sort of behaviour from academics - ignoring >real actual problems being reported by real actual people really and >actually experiencing and reporting them because "technically" or >"theoretically" they "shouldn't be an issue" or because the "literature" >[documentation] says otherwise - but not from this group. There have been multiple comments that a fix for the problem is forthcoming. Is there some reason you have to keep talking about it? John alvord
Re: Break 2.4 VM in five easy steps
OK, Linus said that if I use swap, it should be at least twice as much as RAM. There will be much more discussion about it; to me this constraint is a very, very bad idea. Have you ever thought about diskless workstations? Swapping over a network sounds ugly. Nevertheless, my question is: what happens if I plan to use no swap? I have enough memory installed for my purposes, and every swapping operation can do only one thing: slow down the system. Is there a different behaviour if I completely disable swap? greetings Christian Bornträger
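On Christian's question: if the working set genuinely always fits in RAM, running with no swap at all is a workable configuration, since the kernel then simply has nowhere to page anonymous memory out to. A small sketch (using the /proc/meminfo field names, which should hold on 2.4 and later kernels) for checking whether a machine is actually swapless:

```shell
#!/bin/sh
# Sketch: report whether any swap is configured, via /proc/meminfo.
# SwapTotal is reported in kB; 0 means the machine runs swapless.
swap_total_kb=$(awk '/^SwapTotal:/ {print $2}' /proc/meminfo)
if [ "${swap_total_kb:-0}" -eq 0 ]; then
    echo "no swap configured"
else
    echo "swap configured: ${swap_total_kb} kB"
fi
```

On most systems `swapon -s` gives the same information per device; the /proc approach is used here only because it is easy to script.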
Re: Break 2.4 VM in five easy steps
> Funny. I can count many ways in which 4.3BSD, SunOS{3,4} and post-4.4 BSD > systems I've used were broken, but I've never thought that swap==2*RAM rule > was one of them. Yes, but Linux isn't 4.3BSD, SunOS or post-4.4 BSD. Not to mention, all the other OSes I've had experience using *don't* break severely if you don't follow the "swap==2*RAM" rule. Except Linux 2.4. > Not that being more kind on swap would be a bad thing, but that rule for > amount of swap is pretty common. ISTR similar for (very old) SCO, so it's > not just BSD world. How are modern Missed'em'V variants in that respect, BTW? Yes, but that has traditionally been one of the big BENEFITS of Linux and other UNIXes. As Sean Hunter said, "Virtual memory is one of the killer features of unix." Linux has *never* in the past REQUIRED me to follow that rule, which is a big reason I use it in so many places. Take an example mentioned by someone on the list already: a laptop. I have two laptops that run Linux. One has a 4GB disk, one has a 12GB disk. Both disks are VERY full of data and both machines get pretty heavy use. It's a fact that I just bumped one laptop (with 256MB of swap configured) from 128MB to 256MB of RAM. Does this mean that if I want to upgrade to the 2.4 kernel on that machine I now have to back up all that data, repartition the drive and restore everything, just so I can fastidiously follow the "swap == 2*RAM" rule, else the 2.4 VM subsystem will break? Bollocks, to quote yet another participant in this silly discussion. I'm beginning to be amazed at the Linux VM hackers' attitudes regarding this problem. I expect this sort of behaviour from academics - ignoring real actual problems being reported by real actual people really and actually experiencing and reporting them because "technically" or "theoretically" they "shouldn't be an issue", or because the "literature" [documentation] says otherwise - but not from this group.
-- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Re: Break 2.4 VM in five easy steps
Daniel Phillips writes: > On Wednesday 06 June 2001 10:54, Sean Hunter wrote: > > > > Did you try to put twice as much swap as you have RAM ? (e.g. add a > > > 512M swapfile to your box) > > > This is what Linus recommended for 2.4 (swap = 2 * RAM), saying > > > that anything less won't do any good: 2.4 overallocates swap even > > > if it doesn't use it all. So in your case you just have enough swap > > > to map your RAM, and nothing to really swap your apps. > > > > For large memory boxes, this is ridiculous. Should I have 8GB of > > swap? Sure. It's cheap. If you don't mind slumming it, go and buy a 20 GB IDE drive for US$65. I know RAM has gotten a lot cheaper lately (US$66 for a 512 MiB PC133 DIMM), but it's still far more expensive. If you can afford 4 GiB of RAM, you can definitely afford 8 GiB of swap. > And laptops with big memories and small disks. That's not that common, though. Usually you get far more disc than RAM on a laptop, just as with a desktop. Regards, Richard Permanent: [EMAIL PROTECTED] Current: [EMAIL PROTECTED]
Re: Break 2.4 VM in five easy steps
Helge Hafting wrote: > > The drive is inactive because it isn't needed, the machine is > running loops on data in memory. And it is unresponsive because > nothing else is scheduled, maybe "swapoff" is easier to implement I don't quite get what you're saying. If the system becomes unresponsive because the VM swap recovery parts of the kernel are interfering with the kernel scheduler then that's also bad, because there absolutely *are* other processes that should be getting time, like the console windows/shells at which I'm logged in. If they aren't getting it specifically because the VM is preventing them from receiving execution time, then that's another bug. > when processes cannot try to allocate more or touch pages > while it runs. "swapoff" isn't something you normally do often, > so it don't have to be nice. I'm not familiar enough with the swapping bits of the kernel code, so I could be totally wrong, but turning off a swap file/partition should just call the same parts of the VM subsystem that would normally try to recover swap space under memory pressure. Using "swapoff" to force this behaviour should just force it to happen manually rather than when memory pressure is high enough. Which means that if that's the normal behaviour of the VM subsystem when memory pressure gets high and it needs to recover unused pages from swap - i.e. the machine stops running - then that's still very broken behaviour, no matter what instigated the occurrence. > Still, I find it strange that swapoff should take much more time, > even if you can get 2.2 to have the same amount in swap. So do I. Hence the original report.
-- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-