from:"Jonathan Morton"

Re: VM Requirement Document - v0.0

2001-06-28 Thread Jonathan Morton


>There is a simple change in strategy that will fix up the updatedb case quite
>nicely, it goes something like this: a single access to a page (e.g., reading
>it) isn't enough to bring it to the front of the LRU queue, but accessing it
>twice or more is.  This is being looked at.

Say, when a page is created due to a page fault, page->age is set to 
zero instead of whatever it is now.  Then, on the first access, it is 
incremented to one.  All accesses where page->age was previously zero 
cause it to be incremented to one, and subsequent accesses where 
page->age is non-zero cause a doubling rather than an increment. 
This gives a nice heavy priority boost to frequently-accessed pages...
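
A minimal sketch of the rule described above, not actual kernel code. The names struct page_sketch, page_fault_in() and page_touch() are hypothetical, and the cap AGE_MAX_SKETCH is an assumption added only so that repeated doubling saturates instead of overflowing.

```c
/* Hypothetical stand-in for the kernel's struct page; only the age matters here. */
struct page_sketch {
    unsigned int age;
};

/* Assumed upper bound so that repeated doubling saturates instead of overflowing. */
#define AGE_MAX_SKETCH 64u

/* Page created by a page fault: start at zero. */
static void page_fault_in(struct page_sketch *p)
{
    p->age = 0;
}

/* One access to the page: 0 -> 1 on the first touch, doubling afterwards. */
static void page_touch(struct page_sketch *p)
{
    if (p->age == 0)
        p->age = 1;
    else if (p->age > AGE_MAX_SKETCH / 2)
        p->age = AGE_MAX_SKETCH;
    else
        p->age *= 2;
}
```

Under this sketch a page read once (the updatedb case) ends up at age 1, while a page touched four times reaches 8.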

>Note that we don't actually use a LRU queue, we use a more efficient
>approximation called aging, so the above is not a recipe for implementation.

Maybe it is, but in a slightly lateral manner as above.

-- 
--
from: Jonathan "Chromatix" Morton
mail: [EMAIL PROTECTED]  (not for attachments)
website:  http://www.chromatix.uklinux.net/vnc/
geekcode: GCS$/E dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$
   V? PS PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
tagline:  The key to knowledge is not to rely on people to teach you it.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: What are the VM motivations??

2001-06-24 Thread Jonathan Morton


>The conclusion of most of this discussion is in my FREENIX
>paper, which can be found at http://www.surriel.com/lectures/.

Aha...  that paper answers a lot of the questions I had about how 
things work.  I seem to remember asking some of them, too, and didn't 
get an answer...  :P

Re: Thrashing WITHOUT swap.

2001-06-24 Thread Jonathan Morton


>Now my question is how can it be
>thrashing with swap explicitly turned off?

Easy.  All applications are themselves swap space - the binary is 
merely memory-mapped onto the executable file.  When the system gets 
low on memory, the only thing it can do is purge some binary pages, 
and then repeatedly page them back in from the original executable 
file.

If you added a little bit of swap, it would be able to page out some 
idle data rather than binaries, and would probably avoid thrashing.
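
To illustrate what "memory-mapped onto the executable file" means, here is a rough sketch of a read-only, file-backed mapping made with the standard POSIX calls open(), fstat() and mmap(). The helper names map_file_readonly() and unmap_file() are hypothetical, and the real program loader does considerably more than this.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Map a file read-only, roughly as happens for program text.
 * The pages are backed by the file itself, so under memory pressure the
 * kernel can drop them without writing anything out and fault them back
 * in from the file later.
 * Returns the mapping (length in *len_out), or NULL on failure.
 */
static const unsigned char *map_file_readonly(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    void *addr = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (addr == MAP_FAILED)
        return NULL;

    *len_out = (size_t)st.st_size;
    return addr;
}

/* Release a mapping obtained from map_file_readonly(). */
static void unmap_file(const unsigned char *addr, size_t len)
{
    munmap((void *)addr, len);
}
```

Nothing in such a mapping ever needs swap space: discarding a page costs nothing, but touching it again costs a disk read, which is exactly the repeated paging described above.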

Re: temperature standard - global config option?

2001-06-21 Thread Jonathan Morton


>>>  Only the truly stupid would assume accuracy from decimal places.
>>
>>  Well then, tell all the teachers in this world that they're stupid, and tell
>>  everyone who learnt from them as well.
>
>*All*?
>
>>  I'm in high school (gd. 11, junior)
>>  and my physics teacher is always screaming at us for putting too many
>>  decimal places or having them inconsistent.
>
>Ok, *yours*.
>
>But yours is not all. I certainly don't remember ever seeing that attitude 
>in school.
>
>And yes, this behaviour *is* stupid. Someone who thinks like that should
>never be allowed to become a science teacher.

*cough*

I've been taught by every Maths, Engineering and Physics 
teacher/lecturer I've encountered to write down significant figures 
consistent with the precision of the value.  So blindly writing down 
a value of 59.42886726469 ±2°C is obviously ludicrous, even if that's 
what my calculator gives me.  I should instead write 59 ±2°C, since 
that is the most precision I can possibly know it to.  With some 
advanced measuring techniques it *may* be acceptable to write 59.43 
±2°C *at most*, and then only if you really know why you need the 
extra information.
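
As a small worked illustration of that rule, the sketch below rounds a value to the decimal place of its uncertainty. The helper name round_to_uncertainty() is hypothetical, and it assumes the simplest convention of keeping one digit of the uncertainty.

```c
/*
 * Round `value` to the decimal place of the leading digit of `uncertainty`.
 * Hypothetical helper, for illustration only; a non-positive uncertainty
 * leaves the value unchanged.
 */
static double round_to_uncertainty(double value, double uncertainty)
{
    double step = 1.0;

    if (uncertainty <= 0.0)
        return value;

    /* Find the power of ten at or just below the uncertainty. */
    while (step > uncertainty)
        step /= 10.0;
    while (step * 10.0 <= uncertainty)
        step *= 10.0;

    /* Round half away from zero to a whole number of steps. */
    double steps = value / step;
    long long whole = (long long)(steps < 0 ? steps - 0.5 : steps + 0.5);
    return (double)whole * step;
}
```

With an uncertainty of 2, the calculator's 59.42886726469 comes back as 59.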

The UK education system is one of the better ones available, and the 
above philosophy is consistently held throughout it.  I'd be well 
advised not to argue, especially since it's common sense.

Re: [OT] Threads, inelegance, and Java

2001-06-21 Thread Jonathan Morton


>>  I have seen school projects with interfaces done in java (to be 'portable')
>>  and you could go to have a coffee while a menu pulled down.
>
>Yeah, but the slowness there comes from the phrase "school project" and not
>the phrase "done in java".  I've seen menuing interfaces on a 1 mhz commodore
>64 that refreshed faster than the screen retrace, and I've WRITTEN java
>programs that calculated animated mathematical function plots point by point
>in realtime on a 486.

Sure, but the Commodore had highly-optimised code working for it.  I 
think I'm a long way beyond the "school project" stage, but I've had 
REAL difficulty getting Java to perform well.

As one of the assignments in my 1st-year CompSci course, we got to 
write a program which checked for balanced braces in a source file. 
We were supposed to do this in Java - at the time, the Blackdown 
1.1.8 JVM was current.  I and a classmate used a 160K C source file 
from a MUD as a test load.  He ran it multiple times on the Solaris 
system provided for 1st-year programming, I ran it on my Linux 486.

My first effort was a nice "clean" OOP-friendly implementation which 
made heavy use of Java's nice-looking String operators.  As anyone 
who has worked with Java will be able to guess, this was also an 
extremely slow and inefficient implementation.  I recall it took 
upwards of 2 minutes to parse that source file and confirm that it 
had properly-paired braces.  This on a machine which now handles DNS, 
e-mail, webcaching, webserving, and sometimes gateway duties for my 
other machines.

Much work later, I garnered an implementation that used more C-like 
operators and looked much messier, but ran 6 times quicker.  Most of 
the work went into eliminating object creation and destruction, which 
tickled Java's extremely slow garbage collection.  I still don't 
fully understand why free() has no equivalent in Java.  My classmate 
got his version running even faster, but I don't know how he managed 
it.

Then I quickly hacked up a C version of the same program using the 
same algorithm, and compiled it using GCC.  It immediately ran 6 
times quicker still, and consumed less than a 50th of the memory.

With 28Mb RAM, I could only run 4 copies of the Java program in 
parallel before the machine started thrashing, but the C version got 
to 20 copies before the terminal simply got too sluggish to start any 
more (the machine was not thrashing, it was just under severe load). 
All 20 copies completed in less time than a single instance of the 
original Java implementation.

Incidentally, I've used VB as well, and it's even worse.  I couldn't 
get a P75 to drive a stepper-motor at more than 4 steps per second, 
and that was hardly a complex algorithm.  Given that I learned basic 
programming techniques using BBC BASIC on a 2MHz 6502, I *know* 
that's pathetic even for interpreted BASIC.  When an ARM610 at 40MHz 
running interpreted BASIC can outperform highly-optimised 16-bit x86 
assembly on a 486SX/40, you know Acorn got their interpreter done 
better than M$ did.

>>  Until java can be efficiently compiled, it is no more than a toy.
>
>I haven't played with Jikes.

Nor have I.  But frankly, I don't care.  Neither C, nor C++, nor Java 
make good beginner's languages.  The former two are efficient and 
safe if handled with some care.  The latter is safe but not efficient 
even in an expert's hands.

>I still
>had the GOOD bits of C++ syntax without having to worry about conflicting
>virtual base classes.

H...  a well-designed C++ system doesn't have to worry about that 
either.  C++'s features are only bad if misused - it's an expert's 
language for crying out loud.

>>  See above. Traversing a list of objects to draw is not time consuming,
>>  implementing a zbuffer or texturing is. Try to implement a zbuffer in java.
>
>I'll top that, I tried to implement "deflate" in java 1.0.  (I was porting
>info-zip to java when java 1.1 came out.)
>
>Yeah, the performance sucked.  But the performance of IBM's OS/2 java 1.0 jdk
>sucked compared to anything anybody's using today (even without JIT).

That reminds me...  allocating a two-dimensional array in Java is a 
*real* *pain*.  You have to declare the darn thing as an array of 
arrays, and then allocate that array of arrays explicitly, and then 
loop through that bloody array and allocate each subarray 
individually!  The alternative is to allocate a one-dimensional array 
and use what amounts to heavy pointer arithmetic, which can't be 
cheap on the CPU.
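
The one-dimensional alternative can be sketched as follows. It is written in C, where the same flat layout is common; the names struct grid_sketch, grid_init(), grid_at() and grid_free() are hypothetical. The index calculation comes to one multiply and one add per access.

```c
#include <stdlib.h>

/* Hypothetical 2-D grid stored as one contiguous allocation. */
struct grid_sketch {
    size_t rows, cols;
    int *cells; /* rows * cols elements, row-major, zero-initialised */
};

/* Allocate the whole grid in a single call; returns 0 on success, -1 on failure. */
static int grid_init(struct grid_sketch *g, size_t rows, size_t cols)
{
    g->cells = calloc(rows * cols, sizeof *g->cells);
    if (g->cells == NULL)
        return -1;
    g->rows = rows;
    g->cols = cols;
    return 0;
}

/* Element (r, c) lives at offset r * cols + c. No bounds checking in this sketch. */
static int *grid_at(struct grid_sketch *g, size_t r, size_t c)
{
    return &g->cells[r * g->cols + c];
}

static void grid_free(struct grid_sketch *g)
{
    free(g->cells);
    g->cells = NULL;
}
```

Whether that arithmetic is expensive depends on the compiler and runtime, so treat the sketch as a description of the layout rather than a performance claim.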

>>  The problem with java is that people try to use it as a general purpose
>>  programming language, and it is not efficient. It can be used to organize
>>  your program and to interface to low-level libraries written in C. But
>>  do not try to implement any fast path in java.
>
>I once wrote an equation parser that took strings, substituted values for
>variables via string search and replace, and performed the calculation the
>string described.  It did this for every x pixel in a 300 pixel or so

Re: The latest Microsoft FUD. This time from BillG, himself.

2001-06-20 Thread Jonathan Morton


>You can scream all you want that "it isn't free software" but the fact
>of the matter is that you all scream that and then go do your slides for
>your Linux talks in PowerPoint.

Or AppleWorks (Mac), in my case.  Or, if I wanted to be flashy, I'd 
make the slides up in CorelXARA (which originated on the Acorn and 
would probably run under WINE today) and move them to 
GraphicConvertor (Mac) for display.  I daresay it's possible to do 
all that under Linux, but I haven't found such readily-available 
solutions staring me in the face yet.

Incidentally, you don't need a flashy presentation to make an impact. 
I won a prize this month largely based on a presentation I did - the 
content was king, the slides were white-on-black text, and I 
stammered my way through the actual presentation (I'm not good at 
public speaking).  The close runner-up had done a big flashy 
PowerPoint presentation, was better at public speaking, but hadn't 
researched his material quite so thoroughly.

I use Linux for programming and servers.  I still use my Macs for 
regular day-to-day workstation duty.  That's the status quo, and it 
will only change slowly and with much effort.

Re: Client receives TCP packets but does not ACK

2001-06-18 Thread Jonathan Morton


>>>>  Btw: can the application somehow ask the tcp/ip stack what was
>>>>  actually acked?  (ie. how many bytes were acked).
>>>
>>>  no, but it's not necessarily a useful number anyhow -- because it's
>>>  possible that the remote end ACKd bytes but the ACK never arrives.  so you
>>>  can get into a situation where the remote application has the entire
>>>  message but the local application doesn't know.  the only way to solve
>>>  this is above the TCP layer.  (message duplicate elimination using a
>>>  unique id.)
>>
>>  No, because if the ACK doesn't reach the sending machine, the sender
>>  will retry the data until it does get an ACK.
>
>if the network goes down in between, the sender may never get the ACK.
>the sender will see a timeout eventually.  the receiver may already be
>done with the connection and closed it and never see the error.  if it
>were a protocol such as SMTP then the sender would retry later, and the
>result would be a duplicate message.  (which you can eliminate above the
>TCP layer using unique ids.)

But, if the sender does not attempt to close the socket until the ACK 
returns, then the receiver will see an unfinished connection and 
(hopefully) realise that the message is unsafe and not attempt to 
send it.

With SMTP, the last piece of data is a QUIT anyway, which occurs 
after the end-of-message marker - once the QUIT is sent and/or 
received, both ends know that the connection is finished with and 
will close the socket independently of each other.  If the network 
disappears before the QUIT comes along, the receiver should be 
discarding messages and the sender retrying later.

Re: Client receives TCP packets but does not ACK

2001-06-18 Thread Jonathan Morton


>>  Btw: can the application somehow ask the tcp/ip stack what was
>>  actually acked?  (ie. how many bytes were acked).
>
>no, but it's not necessarily a useful number anyhow -- because it's
>possible that the remote end ACKd bytes but the ACK never arrives.  so you
>can get into a situation where the remote application has the entire
>message but the local application doesn't know.  the only way to solve
>this is above the TCP layer.  (message duplicate elimination using a
>unique id.)

No, because if the ACK doesn't reach the sending machine, the sender 
will retry the data until it does get an ACK.  So the sender always 
has information about some amount of data which is guaranteed to have 
arrived at the other end.  The receiver might know about this sooner, 
but that's simply a function of network latency.

The fundamental problem, if I understand right, is that some stacks 
allow packets indicating closing of a connection (FIN) to arrive 
before the actual data at the end of the connection does.  The only 
workaround I can think of for this is for the closing stack to wait 
until all sent data has been ACKed before sending the FIN.  The ACK 
may, of course, never arrive, but that's what round-trip timeouts are 
for.
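
The workaround above lives inside the stack. A related technique is available to applications, and it is a different thing: send your own end-of-data with shutdown(), then wait for the peer's end-of-file before closing. The sketch below uses only standard POSIX socket calls; the function name close_after_peer_eof() is hypothetical, it assumes the peer reads to end-of-file before closing its side, and it does not retry on EINTR.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Finish sending on a connected stream socket, then wait for the peer to
 * close its side before closing ours. Returns 0 on a clean end-of-file
 * from the peer, -1 on error. Data still arriving from the peer is
 * read and discarded.
 */
static int close_after_peer_eof(int fd)
{
    char scratch[256];
    ssize_t n;
    int result = 0;

    /* Queue our FIN behind any data already written; no more sends after this. */
    if (shutdown(fd, SHUT_WR) != 0)
        result = -1;

    /* read() returning 0 means the peer has closed its side. */
    while (result == 0 && (n = read(fd, scratch, sizeof scratch)) != 0) {
        if (n < 0)
            result = -1;
    }

    close(fd);
    return result;
}
```

A read() that fails or never returns end-of-file tells the sender that delivery was not confirmed, which is the case where a timeout and a later retry come in.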

Re: Client receives TCP packets but does not ACK

2001-06-18 Thread Jonathan Morton


   Btw: can the aplication somehow ask the tcp/ip stack what was 
actualy acked?
  (ie. how many bytes were acked).

no, but it's not necessarily a useful number anyhow -- because it's
possible that the remote end ACKd bytes but the ACK never arrives.  so you
can get into a situation where the remote application has the entire
message but the local application doesn't know.  the only way to solve
this is above the TCP layer.  (message duplicate elimination using an
unique id.)

No, because if the ACK doesn't reach the sending machine, the sender 
will retry the data until it does get an ACK.  So the sender always 
has information about some amount of data which is guaranteed to have 
arrived at the other end.  The receiver might know about this sooner, 
but that's simply a function of network latency.

The fundamental problem, if I understand right, is that some stacks 
allow packets indicating closing of a connection (FIN) to arrive 
before the actual data at the end of the connection does.  The only 
workaround I can think of for this is for the closing stack to wait 
until all sent data has been ACKed before sending the FIN.  The ACK 
may, of course, never arrive, but that's what round-trip timeouts are 
for.
-- 
--
from: Jonathan Chromatix Morton
mail: [EMAIL PROTECTED]  (not for attachments)
website:  http://www.chromatix.uklinux.net/vnc/
geekcode: GCS$/E dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$
   V? PS PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
tagline:  The key to knowledge is not to rely on people to teach you it.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Client receives TCP packets but does not ACK

2001-06-18 Thread Jonathan Morton


>>>> Btw: can the application somehow ask the TCP/IP stack what was
>>>> actually acked?  (i.e. how many bytes were acked).
>>>
>>> no, but it's not necessarily a useful number anyhow -- because it's
>>> possible that the remote end ACKd bytes but the ACK never arrives.  so you
>>> can get into a situation where the remote application has the entire
>>> message but the local application doesn't know.  the only way to solve
>>> this is above the TCP layer.  (message duplicate elimination using a
>>> unique id.)
>>
>> No, because if the ACK doesn't reach the sending machine, the sender
>> will retry the data until it does get an ACK.
>
>if the network goes down in between, the sender may never get the ACK.
>the sender will see a timeout eventually.  the receiver may already be
>done with the connection and closed it and never see the error.  if it
>were a protocol such as SMTP then the sender would retry later, and the
>result would be a duplicate message.  (which you can eliminate above the
>TCP layer using unique ids.)

But, if the sender does not attempt to close the socket until the ACK 
returns, then the receiver will see an unfinished connection and 
(hopefully) realise that the message is unsafe and not attempt to 
send it.

With SMTP, the last piece of data is a QUIT anyway, which occurs 
after the end-of-message marker - once the QUIT is sent and/or 
received, both ends know that the connection is finished with and 
will close the socket independently of each other.  If the network 
disappears before the QUIT comes along, the receiver should be 
discarding messages and the sender retrying later.
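
A rough sketch of the receiver behaviour described above, combined with the
unique-id idea from the quoted text.  This is a toy model, not real SMTP: the
class and method names are made up for illustration, and committing only on
QUIT follows the description in this message rather than the SMTP
specification.

```python
class Receiver:
    """Toy mail receiver: hold messages until a clean QUIT, drop duplicates."""

    def __init__(self):
        self.delivered = {}  # message id -> body, safely committed
        self.pending = {}    # seen on the current connection, not yet committed

    def message(self, msg_id, body):
        # End-of-message marker seen: hold the message, do not commit yet.
        self.pending[msg_id] = body

    def quit(self):
        # Clean QUIT: commit what was held, skipping ids already delivered.
        for msg_id, body in self.pending.items():
            self.delivered.setdefault(msg_id, body)
        self.pending.clear()

    def connection_lost(self):
        # Network vanished before QUIT: discard; the sender retries later.
        self.pending.clear()


receiver = Receiver()
receiver.message("id-1", "hello")
receiver.connection_lost()            # first attempt is thrown away
receiver.message("id-1", "hello")
receiver.quit()                       # retry succeeds
receiver.message("id-1", "hello")
receiver.quit()                       # a later duplicate is ignored
print(receiver.delivered)
```

The unique id only matters in the last step; if both ends really do agree on
whether the QUIT happened, the discard-and-retry path alone is enough.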

Re: Clock drift on Transmeta Crusoe

2001-06-12 Thread Jonathan Morton


>> clock drift of a few minutes per day.

That's about 0.1%.  It may be relatively large compared to tolerances of
hardware clocks, but it's realistically tiny.  It certainly compares
favourably with mkLinux on my PowerBook 5300, which usually drifts by
several hours per day regardless of actual load.
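
For reference, the arithmetic behind that percentage can be checked in a few
lines.  The three-minute figure is only an assumed reading of "a few minutes";
the original report gives no exact number.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def drift_fraction(drift_seconds_per_day):
    """Clock drift expressed as a fraction of elapsed time."""
    return drift_seconds_per_day / SECONDS_PER_DAY


# 0.1% of a day is 86.4 seconds, a little under a minute and a half.
print(f"{drift_fraction(86.4):.3%}")
# "A few minutes" taken as three minutes (assumption).
print(f"{drift_fraction(3 * 60):.3%}")
# "Several hours" taken as three hours (assumption).
print(f"{drift_fraction(3 * 3600):.1%}")
```

Either way the drift is a fraction of a percent, against something over ten
percent for a clock losing hours per day.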

The drift might be caused by something masking interrupts for too long, too
often, considering you state that the hardware clock remains comparatively
well-synced.  As another poster suggests, the framebuffer may be to blame.

--
from: Jonathan "Chromatix" Morton
mail: [EMAIL PROTECTED]  (not for attachments)

The key to knowledge is not to rely on people to teach you it.

GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)


Re: what is using memory?

2001-06-11 Thread Jonathan Morton


>My box has
>
>320280K
>
>from /proc/meminfo
>
> 17140 buffer
>123696 cache
> 32303 free
>
>leaving unaccounted
>
>123627K

This is your processes' memory, the inode and dentry caches, and possibly
some extra kernel memory which may be allocated after boot time.  It is
*very* much accounted for.
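
The subtraction itself is easy to reproduce.  The sketch below uses the
figures exactly as quoted, all taken to be in kilobytes; the function name is
made up for illustration.  On those figures the remainder comes to 147141K
rather than the 123627K in the quoted post, which does not change the
explanation.

```python
def remainder_kb(total_kb, buffers_kb, cached_kb, free_kb):
    """Memory that is neither buffers, page cache nor free, in kilobytes."""
    return total_kb - (buffers_kb + cached_kb + free_kb)


# Figures as quoted above.
print(remainder_kb(320280, 17140, 123696, 32303))
```

Whatever the exact number, that remainder is where process memory and the
kernel's own allocations live.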

Re: VM Report was:Re: Break 2.4 VM in five easy steps

2001-06-08 Thread Jonathan Morton


>> On the subject of Mike Galbraith's kernel compilation test, how much
>> physical RAM does he have for his machine, what type of CPU is it, and what
>> (approximate) type of device does he use for swap?  I'll see if I can
>> partially duplicate his results at this end.  So far all my tests have been
>> done with a fast CPU - perhaps I should try the P166/MMX or even try
>> loading linux-pmac onto my 8100.
>
>It's a PIII/500 with one ide disk.

...with how much RAM?  That's the important bit.

Re: VM Report was:Re: Break 2.4 VM in five easy steps

2001-06-08 Thread Jonathan Morton


[ Re-entering discussion after too long a day and a long sleep... ]

>> There is the problem in terms of some people want pure interactive
>> performance, while others are looking for throughput over all else,
>> but those are both extremes of the spectrum.  Though I suspect
>> raw throughput is the less wanted (in terms of numbers of systems)
>> than keeping interactive response good during VM pressure.
>
>And this raises a very very important point: raw throughput wins
>enterprise-like benchmarks, and the enterprise people are the ones who pay
>most of the hackers here. (including me and Rik)

Very true.  Interactivity is also much harder to measure than throughput.
The question is, what is interactivity (from the kernel's perspective)?  It
usually means small(ish) processes with intermittent working-set and CPU
requirements.  These types of process can safely be swapped out when not
immediately in use, but the kernel has to be able to page them in quite
quickly when needed.  Doing that under heavy load is far from trivial.

It can also mean multimedia applications with a continuous (maybe small)
working set, a continuous but not 100% CPU usage, and the special property
that the user WILL notice if this process gets swapped out even briefly.
mpg123 and XMMS fall into this category, and I sometimes tried running
these alongside my compilation tests to see how they fared.  I think I had
it going fairly well towards the end, with mpg123 stuttering relatively
rarely and briefly while VM load was high.

On the subject of Mike Galbraith's kernel compilation test, how much
physical RAM does he have for his machine, what type of CPU is it, and what
(approximate) type of device does he use for swap?  I'll see if I can
partially duplicate his results at this end.  So far all my tests have been
done with a fast CPU - perhaps I should try the P166/MMX or even try
loading linux-pmac onto my 8100.

Re: VM Report was:Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Jonathan Morton

At 12:29 am +0100 8/6/2001, Shane Nay wrote:
>(VM report at Marcelo Tosatti's request.  He has mentioned that rather than
>complaining about the VM that people mention what their experiences were.  I
>have tried to do so in the way that he asked.)

>> By performance you mean interactivity or throughput?
>
>Interactivity.  I don't have any throughput needs to speak of.
>
>I just ran a barrage of tests on my machine, and the smallest it would ever
>make the cache was 16M, it would prefer to kill processes rather than make
>the cache smaller than that.

http://www.chromatix.uklinux.net/linux-patches/vm-update-2.patch

Try this.  I can't guarantee it's SMP-safe yet (I'm leaving the gurus to
that, but they haven't told me about any errors in the past hour so I'm
assuming they aren't going to find anything glaringly wrong...), but you
might like to see if your performance improves with it.  It also fixes the
OOM-killer bug, which you refer to above.

Some measurements, from my own box (1GHz Athlon, 256Mb RAM):

For the following benchmarks, physical memory availability was reduced
according to the parameter in the left column.  The benchmark is the
wall-clock time taken to compile MySQL.

mem=    2.4.5     earlier tweaks   now
48M     8m30s     6m30s            5m58s
32M     unknown   2h15m            12m34s

The following was performed with all 256Mb RAM available.  This is
compilation of MySQL using make -j 15.

kernel:      2.4.5   now
time:        6m30s   6m15s
peak swap:   190M    70M

For the following test, the 256Mb swap partition on my IDE drive was
disabled and replaced with a 1Gb swapfile on my Ultra160 SCSI drive.  This
is compilation of MySQL using make -j 20.

kernel:      2.4.5   now
time:        7m20s   6m30s
peak swap:   370M    254M

Draw your own conclusions.  :)
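
For anyone who would rather see ratios than compare the timings by eye, a
small sketch that converts the durations above to seconds and divides them.
The helper names are made up for illustration; the inputs are the figures
from the tables.

```python
import re

UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def to_seconds(text):
    """Convert a duration such as '2h15m' or '12m34s' to seconds."""
    parts = re.findall(r"(\d+)([hms])", text)
    return sum(int(count) * UNIT_SECONDS[unit] for count, unit in parts)


def speedup(before, after):
    """How many times faster 'after' is than 'before'."""
    return to_seconds(before) / to_seconds(after)


# mem=32M: earlier tweaks against the current patch.
print(round(speedup("2h15m", "12m34s"), 1))
# mem=48M: stock 2.4.5 against the current patch.
print(round(speedup("8m30s", "5m58s"), 2))
# make -j 20 with the SCSI swapfile: stock 2.4.5 against the current patch.
print(round(speedup("7m20s", "6m30s"), 2))
```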

Re: Background scanning change on 2.4.6-pre1

2001-06-07 Thread Jonathan Morton


>> > This is going to make all pages have age 0 on an idle system after some
>> > time (the old code from Rik which has been replaced by this code tried to
>> > avoid that)
>
>There's another reason why I think the patch may be ok even without any
>added logic: not only does it simplify the code and remove an illogical
>heuristic, but there is nothing that really says that "age 0" is
>necessarily very bad.

Here's my take on it.  The point of ageing is twofold - to age down pages
that aren't in use, and to age up pages that *are* in use.  So, pages that
are in use will remain with high ages even when background scanning is
being done, and pages that aren't in use will decay to zero age.

I can't see what's wrong with that.  When we need more memory, it's a Very
Good Thing to know that most of the pages in the system haven't been
accessed in yonks - we know exactly which ones we want to throw out first.
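
A toy model of that argument, not the kernel's actual code: the constants, the
halving rule and the page names are all assumptions chosen for illustration.
Referenced pages are aged up on each scan, everything else decays towards
zero, and the coldest pages are the first candidates for eviction.

```python
AGE_ADVANCE = 3  # assumed boost for a page that was referenced
AGE_MAX = 64     # assumed ceiling on page age


def scan(ages, referenced):
    """One background scan: age up referenced pages, age down the rest."""
    for page in ages:
        if page in referenced:
            ages[page] = min(ages[page] + AGE_ADVANCE, AGE_MAX)
        else:
            ages[page] //= 2  # decays to zero after a few idle scans
    return ages


def eviction_order(ages):
    """Pages sorted coldest first."""
    return sorted(ages, key=lambda page: ages[page])


ages = {"editor": 8, "mp3 player": 8, "old build output": 8}
for _ in range(5):
    scan(ages, referenced={"editor", "mp3 player"})

print(ages)
print(eviction_order(ages)[0])
```

In this model, a page sitting at age zero on an idle system does no harm; it
simply marks itself as the first thing to go when memory is needed.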

Re: [PATCH] Reap dead swap cache earlier v2

2001-06-07 Thread Jonathan Morton


>> >As suggested by Linus, I've cleaned the reapswap code to be contained
>> >inside an inline function. (yes, the if statement is really ugly)
>>
>> I can't seem to find the patch which adds this behaviour to the background
>> scanning.
>
>I've just sent Linus a patch to free swap cache pages at the time we free
>the last pte. (requested by himself)
>
>With it applied we should get the old behaviour back again.
>
>I can put it on my webpage if you wish.

Just copy it to me so I can replace the dead-swap hacks you introduced earlier.

Re: [PATCH] Reap dead swap cache earlier v2

2001-06-07 Thread Jonathan Morton


>As suggested by Linus, I've cleaned the reapswap code to be contained
>inside an inline function. (yes, the if statement is really ugly)

I can't seem to find the patch which adds this behaviour to the background
scanning.  Can someone point me to it?

Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Jonathan Morton


At 11:27 pm +0100 6/6/2001, android wrote:
>> >I'd be happy to write a new routine in assembly
>>
>>I sincerely hope you're joking.
>>
>>It's the algorithm that needs fixing, not the implementation of that
>>algorithm.  Writing in assembler?  Hope you're proficient at writing in
>>x86, PPC, 68k, MIPS (several varieties), ARM, SPARC, and whatever other
>>architectures we support these days.  And you darn well better hope every
>>other kernel hacker is as proficient as that, to be able to read it.

>As for the algorithm, I'm sure that
>whatever method is used to handle page swapping, it has to comply with
>the kernel's memory management scheme already in place. That's why I would
>need the details so that I wouldn't create more problems than already present.

Have you actually been following this thread?  The algorithm has been
discussed and at least one alternative brought forward.

Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Jonathan Morton


>I'd be happy to write a new routine in assembly

I sincerely hope you're joking.

It's the algorithm that needs fixing, not the implementation of that
algorithm.  Writing in assembler?  Hope you're proficient at writing in
x86, PPC, 68k, MIPS (several varieties), ARM, SPARC, and whatever other
architectures we support these days.  And you darn well better hope every
other kernel hacker is as proficient as that, to be able to read it.

IOW, no chance.

Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Jonathan Morton


>> > Did you try to put twice as much swap as you have RAM ? (e.g. add a
>> > 512M swapfile to your box)
>> > This is what Linus recommended for 2.4 (swap = 2 * RAM), saying
>> > that anything less won't do any good: 2.4 overallocates swap even
>> > if it doesn't use it all. So in your case you just have enough swap
>> > to map your RAM, and nothing to really swap your apps.
>>
>> For large memory boxes, this is ridiculous.  Should I have 8GB of
>> swap?
>
>And laptops with big memories and small disks.

Strongly agree.  I have a PowerBook G3 with 320Mb RAM.  The 18Gb HD is
shared between a total of 4 operating systems.  I haven't got space to put
2/3rds of a Gb of swap on it - in fact I use only 128Mb of swap under
Linux, and don't usually have a problem.
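
The numbers behind the complaint, as a quick sketch.  The function is
hypothetical and simply applies the quoted "swap = 2 * RAM" recommendation;
the 4096 figure assumes the "8GB of swap" remark refers to a 4GB machine.

```python
def recommended_swap_mb(ram_mb, factor=2):
    """Swap size under the 'swap = factor * RAM' rule of thumb, in megabytes."""
    return ram_mb * factor


# The PowerBook: 320 MB of RAM would call for 640 MB, roughly 2/3 of a GB.
print(recommended_swap_mb(320))
# A 4 GB server (assumed) would call for 8 GB.
print(recommended_swap_mb(4096) / 1024)
```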

MacOS X uses whatever disk space it needs, from the volumes currently
mounted.  MacOS 9.0.4 is configured to run totally without swap.  Windoze
95 is configured to run in its usual bloated way, from a total of about
1Gb of virtual HD.

I'm glad to report that with the new fixes being worked on at present, swap
usage is relatively minimalist under the test loads I am able to subject my
Athlon to.  With mem=32M, compiling MySQL goes 65Mb into swap at maximum,
during compilation of a particularly massive C++ file.  Compilation takes
2h15m under these conditions, which is a little slow but that's what
happens when a system starts thrashing heavily.

With mem=48M, compilation completes in about 6m30s, which compares well
with the 5-minute "best case" compile time with unrestricted memory
available.  I didn't check the total swap usage on that run, but it was
less than the 65Mb used with mem=32M.  After the monster file had
completed, the swap balance was largely restored within a few files'
compilation - something which doesn't happen with stock 2.4.x.

With mem=32M, I can sensibly load XFree86 v4, KDE 1.2, XMMS, a webcam app
and Netscape 4.6.  XMMS glitches occasionally (not often, and not
particularly seriously) as I switch between 1600x1200x24bpp virtual
desktops, and swapping gets heavy at times, but the system is essentially
usable and avoids thrashing.  This weekend, I'll treat a friend with an
ageing Cyrix machine to the patches and see if she notices the difference -
the answer will probably be yes.

It remains to be seen how industrial-sized applications fare with the
changes, but I strongly suspect that any reaction will be positive rather
than negative.  Industrial applications *should* be running as if no swap
was available, in any case...

--
from: Jonathan "Chromatix" Morton
mail: [EMAIL PROTECTED]  (not for attachments)
big-mail: [EMAIL PROTECTED]
uni-mail: [EMAIL PROTECTED]

The key to knowledge is not to rely on people to teach you it.

Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/

-----BEGIN GEEK CODE BLOCK-----
Version 3.12
GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
-----END GEEK CODE BLOCK-----


Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Jonathan Morton


>I am waiting patiently for the bug to be fixed. However, it is a real
>embarrassment that we can't run this "stable" kernel in production yet
>because something as fundamental as this is so badly broken.

Rest assured that a fix is in the works.  I'm already seeing a big
improvement in behaviour on my Athlon (256Mb RAM, but testing using mem=32M
and mem=48M), and I strongly believe that we're making progress here.
Maybe some of the more significant improvements will find their way into
2.4.6.

Re: 2.4.5 VM

2001-06-06 Thread Jonathan Morton


> On a side question: does Linux support swap-files in addition to
>swap-partitions? Even if that has a performance penalty, when the system
>is swapping performance is dead anyway.

Yes.  Simply use mkswap and swapon/off on a regular file instead of a
partition device.  I don't notice any significant performance penalty (a
swapfile on a SCSI disk is faster than a swap-partition on an IDE disk),
although you'd be advised to attempt to keep the file unfragmented.
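
A minimal sketch of the swap-file setup described above, written as a Python helper that only builds the command lines rather than running them (they need root). The path `/swapfile` and the 128 MB size are hypothetical; `dd`, `chmod`, `mkswap`, `swapon` and `swapoff` are the standard tools, and writing the file in one sequential pass is just one plausible way to keep it unfragmented.

```python
import shlex


def swapfile_commands(path, size_mb):
    """Return the commands that would turn a regular file into swap space."""
    return [
        # Write real zeroes instead of creating a sparse file: a swap file
        # must not contain holes, and one sequential write gives the
        # filesystem its best chance of allocating contiguous blocks.
        ["dd", "if=/dev/zero", f"of={path}", "bs=1M", f"count={size_mb}"],
        ["chmod", "600", path],  # swap contents should not be world-readable
        ["mkswap", path],        # write the swap signature into the file
        ["swapon", path],        # start swapping to it
    ]


def teardown_commands(path):
    """Return the command that stops swapping to the file."""
    return [["swapoff", path]]


if __name__ == "__main__":
    # Hypothetical path and size, printed for inspection only.
    for cmd in swapfile_commands("/swapfile", 128):
        print(shlex.join(cmd))
```

Printing the commands instead of executing them keeps the sketch safe to run as an ordinary user.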

--
from: Jonathan "Chromatix" Morton
mail: [EMAIL PROTECTED]  (not for attachments)
big-mail: [EMAIL PROTECTED]
uni-mail: [EMAIL PROTECTED]

The key to knowledge is not to rely on people to teach you it.

Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/

-BEGIN GEEK CODE BLOCK-
Version 3.12
GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
-END GEEK CODE BLOCK-


Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Jonathan Morton


>It seems bizarre that a 4GB machine with a working set _far_ lower than that
>should be dying from OOM and swapping itself to death, but that's life in 2.4
>land.

I posted a fix for the OOM problem long ago, and it didn't get integrated
(even after I sent Alan a separated-out version from the larger patch it
was embedded in).  I'm going to re-introduce it soon, and hope that it gets
a better hearing this time.

--
from: Jonathan "Chromatix" Morton
mail: [EMAIL PROTECTED]  (not for attachments)
big-mail: [EMAIL PROTECTED]
uni-mail: [EMAIL PROTECTED]

The key to knowledge is not to rely on people to teach you it.

Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/

-BEGIN GEEK CODE BLOCK-
Version 3.12
GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
-END GEEK CODE BLOCK-


Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Jonathan Morton


>>> Did you try to put twice as much swap as you have RAM? (e.g. add a
>>> 512M swapfile to your box)
>>> This is what Linus recommended for 2.4 (swap = 2 * RAM), saying
>>> that anything less won't do any good: 2.4 overallocates swap even
>>> if it doesn't use it all. So in your case you just have enough swap
>>> to map your RAM, and nothing to really swap your apps.
>>
>> For large memory boxes, this is ridiculous.  Should I have 8GB of
>> swap?
>
>And laptops with big memories and small disks.

Strongly agree.  I have a PowerBook G3 with 320MB RAM.  The 18GB HD is
shared between a total of 4 operating systems.  I haven't got space to put
2/3rds of a GB of swap on it - in fact I use only 128MB of swap under
Linux, and don't usually have a problem.

MacOS X uses whatever disk space it needs, from the volumes currently
mounted.  MacOS 9.0.4 is configured to run totally without swap.  Windoze
95 is configured to run in its usual bloated way, from a total of about
1GB of virtual HD.

I'm glad to report that with the new fixes being worked on at present, swap
usage is relatively minimalist under the test loads I am able to subject my
Athlon to.  With mem=32M, compiling MySQL goes 65MB into swap at maximum,
during compilation of a particularly massive C++ file.  Compilation takes
2h15m under these conditions, which is a little slow but that's what
happens when a system starts thrashing heavily.

With mem=48M, compilation completes in about 6m30s, which compares well
with the 5-minute best case compile time with unrestricted memory
available.  I didn't check the total swap usage on that run, but it was
less than the 65MB used with mem=32M.  After the monster file had
completed, the swap balance was largely restored within a few files'
compilation - something which doesn't happen with stock 2.4.x.
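
To put the three timings side by side, here is a small worked calculation. It only uses the figures quoted above ("about 6m30s" and the "5-minute best case" are approximate), and the variable names are illustrative.

```python
def minutes(h=0, m=0, s=0):
    """Convert hours/minutes/seconds to minutes."""
    return h * 60 + m + s / 60


best = minutes(m=5)         # unrestricted memory, best case
mem48 = minutes(m=6, s=30)  # mem=48M
mem32 = minutes(h=2, m=15)  # mem=32M, thrashing heavily

slowdown_48 = mem48 / best  # how many times slower than the best case
slowdown_32 = mem32 / best

print(f"mem=48M: {slowdown_48:.1f}x the best-case time")
print(f"mem=32M: {slowdown_32:.1f}x the best-case time")
```

On these numbers, mem=48M costs roughly 30% over the best case, while mem=32M is 27 times slower, which is the gap between light swapping and thrashing.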

With mem=32M, I can sensibly load XFree86 v4, KDE 1.2, XMMS, a webcam app
and Netscape 4.6.  XMMS glitches occasionally (not often, and not
particularly seriously) as I switch between 1600x1200x24bpp virtual
desktops, and swapping gets heavy at times, but the system is essentially
usable and avoids thrashing.  This weekend, I'll treat a friend with an
ageing Cyrix machine to the patches and see if she notices the difference -
the answer will probably be yes.

It remains to be seen how industrial-sized applications fare with the
changes, but I strongly suspect that any reaction will be positive rather
than negative.  Industrial applications *should* be running as if no swap
was available, in any case...

--
from: Jonathan "Chromatix" Morton
mail: [EMAIL PROTECTED]  (not for attachments)
big-mail: [EMAIL PROTECTED]
uni-mail: [EMAIL PROTECTED]

The key to knowledge is not to rely on people to teach you it.

Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/

-BEGIN GEEK CODE BLOCK-
Version 3.12
GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
-END GEEK CODE BLOCK-


Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Jonathan Morton


>I'd be happy to write a new routine in assembly

I sincerely hope you're joking.

It's the algorithm that needs fixing, not the implementation of that
algorithm.  Writing in assembler?  Hope you're proficient at writing in
x86, PPC, 68k, MIPS (several varieties), ARM, SPARC, and whatever other
architectures we support these days.  And you darn well better hope every
other kernel hacker is as proficient as that, to be able to read it.

IOW, no chance.

--
from: Jonathan "Chromatix" Morton
mail: [EMAIL PROTECTED]  (not for attachments)

The key to knowledge is not to rely on people to teach you it.

GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)


Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Jonathan Morton


At 11:27 pm +0100 6/6/2001, android wrote:
>>>I'd be happy to write a new routine in assembly
>>
>>I sincerely hope you're joking.
>>
>>It's the algorithm that needs fixing, not the implementation of that
>>algorithm.  Writing in assembler?  Hope you're proficient at writing in
>>x86, PPC, 68k, MIPS (several varieties), ARM, SPARC, and whatever other
>>architectures we support these days.  And you darn well better hope every
>>other kernel hacker is as proficient as that, to be able to read it.
>
>As for the algorithm, I'm sure that
>whatever method is used to handle page swapping, it has to comply with
>the kernel's memory management scheme already in place. That's why I would
>need the details so that I wouldn't create more problems than already present.

Have you actually been following this thread?  The algorithm has been
discussed and at least one alternative brought forward.

--
from: Jonathan "Chromatix" Morton
mail: [EMAIL PROTECTED]  (not for attachments)

The key to knowledge is not to rely on people to teach you it.

GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)


Re: is a kernel panic supposed to happen if root fs is on a SCSIdisk and SCSI support is compiled in as module?

2001-06-02 Thread Jonathan Morton

At 12:17 am +0100 3/6/2001, M.N. wrote:
>Basically, that's the question. I compiled my kernel with the SCSI AIC7xxx.o
>driver as a module, and then when it booted up, it panicked. I thought it was
>some sort of a kernel bug, but it didn't really seem that way when I
>recompiled the kernel with SCSI support built into the kernel itself
>(monolithically).  I'm just curious, does a _panic_ necessarily mean that
>the kernel needs fixing, or can a panic be a result of something that the
>user forgot to do which was required in order to avoid that panic?

A kernel panic happens whenever it finds itself in a situation which is
impossible or impractical to fix.  In your case, it needed the SCSI module
in order to load the root FS.  But the SCSI module is itself located on the
root FS.  Catch 22, so panic.  If you'd read the module documentation,
you'd have known about this beforehand, but chalk this up to experience
(aka. RTFM!).
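
The catch-22 can be sketched as a tiny decision function. This is an illustrative model, not kernel code: `boot_outcome` and its return strings are hypothetical, the `y`/`m`/`n` values mirror kernel config choices, and the sketch deliberately ignores any mechanism for loading modules before the root filesystem is mounted.

```python
def boot_outcome(root_driver, driver_config):
    """Model what happens when the kernel tries to mount the root fs.

    driver_config maps a driver name to 'y' (built in), 'm' (module)
    or 'n' (not built); a missing entry is treated as 'n'.
    """
    setting = driver_config.get(root_driver, "n")
    if setting == "y":
        return "mount root"
    if setting == "m":
        # The module file lives on the root filesystem, which cannot be
        # mounted until the module is loaded: no way forward, so panic.
        return "panic: driver module is on the unmounted root fs"
    return "panic: no driver for the root device"


if __name__ == "__main__":
    print(boot_outcome("aic7xxx", {"aic7xxx": "m"}))  # the case in question
    print(boot_outcome("aic7xxx", {"aic7xxx": "y"}))  # after rebuilding
```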

So, a kernel panic usually means it's a configuration error OR hardware
failure OR (rarely) a kernel bug.  Most often, kernel bugs are marked by an
OOPS or BUG message splashing all over the console and the system log.

HTH,

--
from: Jonathan "Chromatix" Morton
mail: [EMAIL PROTECTED]  (not for attachments)
big-mail: [EMAIL PROTECTED]
uni-mail: [EMAIL PROTECTED]

The key to knowledge is not to rely on people to teach you it.

Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/

-BEGIN GEEK CODE BLOCK-
Version 3.12
GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
-END GEEK CODE BLOCK-

Re: Plain 2.4.5 VM

2001-05-30 Thread Jonathan Morton


>The page aging logic does seem fragile as heck.  You never know how
>many folks are aging pages or at what rate.  If aging happens too fast,
>it defeats the garbage identification logic and you rape your cache. If
>aging happens too slowly.. sigh.

Then it sounds like the current algorithm is totally broken and needs
replacement.  If it's impossible to make a system stable by choosing the
right numbers, the system needs changing, not the numbers.  I think that's
pretty much what we're being taught in Control Engineering.  :)

Not having studied the code too closely, it sounds as though there are half
a dozen different "clocks" running for different types of memory, and each
one runs at a different speed and is updated at a different time.
Meanwhile, the paging-out is done on the assumption that all the clocks are
(at least roughly) in sync.  Makes sense, right?  (not!)

I think it's worthwhile to think of the page/buffer caches as having a
working set of their own - if they are being heavily used, they should get
more memory than if they are only lightly used.  The important point to get
right is to ensure that the "clocks" used for each memory area remain in
sync - they don't have to measure real time, just be consistent and
fine-grained.
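
One way to picture "clocks kept in sync" is a toy userspace model in which every memory area ages its pages against a single shared logical clock, so the result does not depend on how often each area happens to be scanned. This is only a sketch under stated assumptions: `AgingClock`, `Area`, the halve-per-tick decay and the `boost` value are all hypothetical and are not taken from vmscan.c.

```python
class AgingClock:
    """One logical clock shared by every memory area (not real time)."""

    def __init__(self):
        self.tick = 0

    def advance(self):
        self.tick += 1


class Area:
    """A memory area whose page ages decay against the shared clock."""

    def __init__(self, clock):
        self.clock = clock
        self.pages = {}  # page name -> (age, tick at which age was stored)

    def _decayed(self, name):
        age, last = self.pages.get(name, (0, self.clock.tick))
        # Halve once per shared tick elapsed since the age was stored.
        return age >> (self.clock.tick - last)

    def touch(self, name, boost=4):
        """Record an access: apply pending decay, then add the boost."""
        self.pages[name] = (self._decayed(name) + boost, self.clock.tick)

    def scan(self):
        """Apply pending decay to every page, however often this is called."""
        for name in list(self.pages):
            self.pages[name] = (self._decayed(name), self.clock.tick)

    def age(self, name):
        return self._decayed(name)


if __name__ == "__main__":
    clock = AgingClock()
    busy, lazy = Area(clock), Area(clock)
    busy.touch("p")
    lazy.touch("p")
    for _ in range(2):
        clock.advance()
        busy.scan()  # scanned every tick; lazy is never scanned
    print(busy.age("p"), lazy.age("p"))
```

Because decay is computed from elapsed shared ticks rather than from the number of scans, an area scanned every tick and one never scanned agree on a page's age.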

I'm working on some relatively small changes to vmscan.c which should help
improve the behaviour without upsetting the balance too much.  Watch this
space...

--
from: Jonathan "Chromatix" Morton
mail: [EMAIL PROTECTED]  (not for attachments)
big-mail: [EMAIL PROTECTED]
uni-mail: [EMAIL PROTECTED]

The key to knowledge is not to rely on people to teach you it.

Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/

-BEGIN GEEK CODE BLOCK-
Version 3.12
GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
-END GEEK CODE BLOCK-


Re: Please help me fill in the blanks.

2001-05-26 Thread Jonathan Morton


>> * Live Upgrade
>
>LOBOS will let one Linux kernel boot another, but that requires a boot
>step, so it is not a live upgrade.  so, no, afaik

If you build nearly everything (except, obviously, what you need to boot) as
modules, you can unload modules, build new versions, and reload them.  So,
you could say that partial support for "live upgrades" is included.

It works, too - I unloaded my OV511 driver a few weeks ago, copied the
source for the new one in, built it, and re-inserted it.  Same goes for the
DRM module a couple of weeks before that.  Now, the machine in question
gets rebooted fairly often in any case, but those were things I *didn't*
have to reboot for.

--
from: Jonathan "Chromatix" Morton
mail: [EMAIL PROTECTED]  (not for attachments)
big-mail: [EMAIL PROTECTED]
uni-mail: [EMAIL PROTECTED]

The key to knowledge is not to rely on people to teach you it.

Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/

-BEGIN GEEK CODE BLOCK-
Version 3.12
GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
-END GEEK CODE BLOCK-


Re: [RFC][PATCH] Re: Linux 2.4.4-ac10

2001-05-23 Thread Jonathan Morton


>Time to hunt around for a 386 or 486 which is limited to such
>a small amount of RAM ;)

I've got an old knackered 486DX/33 with 8MB RAM (in 30-pin SIMMs, woohoo!),
a flat CMOS battery, a 2GB Maxtor HD that needs a low-level format every
year, and no case.  It isn't running anything right now...

--
from: Jonathan "Chromatix" Morton
mail: [EMAIL PROTECTED]  (not for attachments)
big-mail: [EMAIL PROTECTED]
uni-mail: [EMAIL PROTECTED]

The key to knowledge is not to rely on people to teach you it.

Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/

-BEGIN GEEK CODE BLOCK-
Version 3.12
GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
-END GEEK CODE BLOCK-


Re: Background to the argument about CML2 design philosophy

2001-05-21 Thread Jonathan Morton


>If you run into a case where you have a config which would work, but
>CML2 doesn't let you, why don't you fix the grammar instead of saying
>CML2 is wrong?  Let's not confuse these two issues as well.

Strongly agree.  Especially since I'm pushing for an explicit recognition
of the difference between a hard dependency and a soft derivation.  The
latter can be overridden safely by someone who knows what he's after.  The
former will cause a miscompile.

--
from: Jonathan "Chromatix" Morton
mail: [EMAIL PROTECTED]  (not for attachments)
big-mail: [EMAIL PROTECTED]
uni-mail: [EMAIL PROTECTED]

The key to knowledge is not to rely on people to teach you it.

Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/

-BEGIN GEEK CODE BLOCK-
Version 3.12
GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
-END GEEK CODE BLOCK-


Re: Background to the argument about CML2 design philosophy

2001-05-21 Thread Jonathan Morton


>> order to hold down ruleset complexity and simplify the user
>> experience.  The cost of deciding that the answer to that question is
>
>The user experience can be simplified by a NOVICE/EASY/SANE_DEFAULTS
>option, and perhaps a HACKER option for the really strange
>but _theoretically_ ok stuff.

Having now briefly looked at the language constructs first-hand, I can see
two ways to go about this:

1) Have a HACKER symbol which unsuppresses the "unusual" options, and
suppresses the "generalised" ones (like: "build all the sound drivers for
my hardware, as modules").  This is kinda how it would be implemented in
CML1, cf. EXPERIMENTAL.

2) Have a HACKERS submenu system which contains all the derivations that
could *possibly* be un-defaulted, and allow our intrepid hacker to
explicitly force each to a value or leave unset.  Leaving unset means the
derivation holds, forcing it to a value will explicitly enable or disable
the option along with any hard dependencies.  Head this submenu system with
a big banner that says "FOR EXPERTS ONLY", or suppress it using an
"Experts" switch.

Is there already a language construct to support the difference between a
"derivation" and a "dependency"?  Yes, it's the difference between "unless
FOO==n default BAR==y" and "require FOO!=n implies BAR==y" respectively (or
something to that effect; if my syntax is wrong, flame gently please!).
With that in mind, the front-end UI could implement Option 2 easily, by
presenting a mode which automatically collects defaulted but otherwise
hidden symbols, and reveals them to the user when set to "hacker" mode.
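
The derivation/dependency distinction can be sketched in a few lines of Python. This is not CML2 syntax or its real implementation: `resolve`, the rule tuples and the FOO/BAR symbols are hypothetical stand-ins for the two rule forms mentioned above.

```python
def foo_enabled(config):
    """Condition shared by both example rules: FOO is not 'n'."""
    return config.get("FOO", "n") != "n"


def resolve(explicit, derivations, requirements):
    """Build a config from explicit user choices plus soft and hard rules.

    derivations:  (condition, symbol, default) - soft, only fills gaps.
    requirements: (condition, symbol, value)   - hard, violations are errors.
    """
    config = dict(explicit)
    for condition, symbol, default in derivations:
        if condition(config) and symbol not in config:
            config[symbol] = default  # an explicit choice always wins
    for condition, symbol, value in requirements:
        if condition(config) and config.get(symbol) != value:
            raise ValueError(f"{symbol} must be {value!r}")
    return config


if __name__ == "__main__":
    soft = [(foo_enabled, "BAR", "y")]  # "unless FOO==n default BAR==y"
    hard = [(foo_enabled, "BAR", "y")]  # "require FOO!=n implies BAR==y"
    print(resolve({"FOO": "y"}, soft, []))              # BAR derived as 'y'
    print(resolve({"FOO": "y", "BAR": "n"}, soft, []))  # override is allowed
    try:
        resolve({"FOO": "y", "BAR": "n"}, [], hard)
    except ValueError as err:
        print("rejected:", err)                         # hard rule refuses it
```

A "hacker" mode in a front end would then amount to listing the symbols that only soft rules have filled in and letting the user set them explicitly.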

I'm going to assume for now that CML2 saves two files - one storing a
relatively small number of symbols (which is strictly limited to those
explicitly set by the user), and one containing the full set for
consumption by the Makefiles.  If this is the case, then if a "hacker" type
switches something explicitly then it'll stay there even if the default
changes for that option in a future kernel.  Meanwhile, Aunt Tillie gets
the changed default option applied with no extra effort.  "make oldconfig"
might as well be a thing of the past for certain purposes, although it
should still be kept as a way of reminding people what the new options are.

Incidentally, in this scenario, if we have "enable driver for device FOOBAR
[NEW] [y/m/N]:" then pressing Return should *not* mark the symbol as
"explicitly set"; it should be left alone because the "user accepted the
default".  If they pressed "N", then that has the same effect but is saved
explicitly for future kernels, regardless of any change to the default for
that option.
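
A minimal sketch of the two-file idea, assuming (as the text does) that explicit choices are stored separately from the full symbol set. The function names and the FOOBAR symbol are hypothetical; the point is only that pressing Return records nothing, so a later change of default reaches that user, while an explicit "n" sticks.

```python
def answer(explicit, symbol, reply):
    """Record a prompt reply; an empty reply (Return) accepts the default."""
    if reply != "":
        explicit[symbol] = reply  # saved, and kept across kernel versions
    return explicit


def full_config(explicit, defaults):
    """Explicit choices win; everything else follows the current defaults."""
    merged = dict(defaults)
    merged.update(explicit)
    return merged


if __name__ == "__main__":
    tillie = answer({}, "FOOBAR", "")   # pressed Return
    hacker = answer({}, "FOOBAR", "n")  # typed N explicitly

    old_defaults = {"FOOBAR": "n"}
    new_defaults = {"FOOBAR": "y"}      # default changes in a later kernel

    print(full_config(tillie, old_defaults), full_config(hacker, old_defaults))
    print(full_config(tillie, new_defaults), full_config(hacker, new_defaults))
```

Under the old defaults both users get the same result; under the new ones only the explicit "n" survives.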

Hope this makes sense and I'm not being a stark raving loonie...

--
from: Jonathan "Chromatix" Morton
mail: [EMAIL PROTECTED]  (not for attachments)
big-mail: [EMAIL PROTECTED]
uni-mail: [EMAIL PROTECTED]

The key to knowledge is not to rely on people to teach you it.

Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/

-BEGIN GEEK CODE BLOCK-
Version 3.12
GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
-END GEEK CODE BLOCK-


Re: Background to the argument about CML2 design philosophy

2001-05-20 Thread Jonathan Morton


>1. The Mac derivations were half-right.  The MAC_SCC one is good but Macs
>can have either of two different SCSI controllers.  I fixed that with help
>from Ray Knight, who maintains the 68K Mac port.

If I understand the "philosophy" correctly, it is still possible to specify
additional cards for those Macs which have PCI slots.  If so, absolutely
fine - I can shove my Adaptec 19160 into my G4 and have it work just as
well as it currently does in my Athlon.

One caveat though - not all Macs have SCSI controllers, and not all that do
even have one of the two standard ones.  The "Blue and White G3", the iMac,
the PowerBook G3 "Firewire" and later models on these lines all lack a
built-in SCSI controller, but many could have one added via PCI or CardBus
slots.  The PowerBooks 5300 and 190 (and possibly others) use some
non-standard P.O.S. hanging off the NuBus, which even mkLinux doesn't know
how to drive.  However, in these situations, disabling the extra drivers is
usually not critical unless you're running low on space somewhere.  At that
point, you enable the "Are you insane?" module outlined below...

>3. The MVME derivations are correct *if* (and only if) you agree to ignore
>the possibility that somebody could want to ignore the onboard hardware,
>plug outboard disk or Ethernet cards into the VME-bus connector, and
>do something like running SCSI-over-ATAPI to the outboard device.
>(I missed these possibilities when I wrote the derivations.)

...and then someone else mentioned the possibility of f*x0r3d hardware.  In
this case, I would say this *isn't* a kernel-configuration issue but one of
being able to disable the drivers for the malfunctioning hardware.  If I
have a PCMCIA controller that reboots the machine as soon as the driver
tries to switch it on, I should be able to send a command-line parameter to
the kernel which tells it to switch it off *at run-time*.  And iff I'm
running with hardware which is so f*x0r3d that that isn't enough, I'd
either fix the hardware or hack the config files.  I don't see the problem.

I think the MVME derivations are *perfectly* sensible - if the reference
board and most (read: virtually all) derivatives have those features, turn
them on by all means.  To satisfy some others, you might want to say "Hey,
these guys might want to *explicitly turn off* some of this stuff" - so
provide an option under "Are you insane?" which presents all the "derived"
symbols and allows the hackers to manually turn stuff off.

Re: CML2 design philosophy heads-up

2001-05-18 Thread Jonathan Morton


>> Aunt Tillie doesn't even know what a kernel is, nor does she want
>> to. I think it's fair to assume that people who configure and
>> compile their own kernel (as opposed to using the distribution
>> supplied ones) know what they are doing.
>
>I'd like to break these assumptions.  Or at the very least see how far
>they can be bent.  I know this sounds crazy to a lot of hackers, but
>I think there's a certain amount of unhelpful elitism and self-puffery
>in the "kernels are hard to configure and they *should* be hard to
>configure" attitude.  Let's give Aunt Tillie a chance to surprise us.

Not everyone falls into the "expert user" and "Aunt Tillie" categories.
It's a *very* big grey area.  If some semi-computer-literate user (i.e. some
friends of mine!) wants to upgrade their kernel so they have access to
newer hardware (such as a cheap USB webcam), it should be made as simple as
possible for them.  CML1 doesn't handle that very well; I'd like to see
its replacement do better.

So, the first questions should be along the lines of "Do you have
(approximately) one of these standard configurations?" starting with "x86
PC", "Apple PowerMac" and other sensible defaults - followed by "none of
the above".  Then later on, things like "Do you have SCSI?" followed by
"What type of SCSI card(s)?".  And under IDE configuration, we have "Do you
want IDE-SCSI emulation (useful for CD-writers and such)?" which turns on
SCSI without any of the card drivers.
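
The SCSI part of that question flow could be sketched like this in C.
The struct and field names are made up for illustration and do not
correspond to real configuration symbols:

```c
#include <stdbool.h>

/* Hypothetical answers collected from the user. */
struct answers {
    bool has_scsi_card;   /* "Do you have SCSI?" */
    bool want_ide_scsi;   /* "Do you want IDE-SCSI emulation?" */
};

/* Hypothetical settings derived from those answers. */
struct derived {
    bool scsi_core;           /* generic SCSI layer is built */
    bool ask_scsi_card_type;  /* show "What type of SCSI card(s)?" */
};

/* IDE-SCSI emulation pulls in the SCSI layer on its own, but the
 * card-driver question only appears if the user really has a card. */
static struct derived derive(struct answers a)
{
    struct derived d;
    d.scsi_core = a.has_scsi_card || a.want_ide_scsi;
    d.ask_scsi_card_type = a.has_scsi_card;
    return d;
}
```

So a user with a CD-writer on IDE and no SCSI card gets the SCSI layer
enabled and never sees the list of card drivers.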

The above strategy, if extended properly, would allow novice users to get
*something* which worked, more easily.  More advanced users could then
fiddle with settings they knew about, and experiment.  Those who *really*
know what they're up to can create a wholly customised setup by choosing
"none of the above", right at the beginning.

As for the language CML2 is written in, surely C would work just as well as
Python if the config-ruleset file is in a known format.  GCC is required
for the kernel to build; I don't see why anything else should be required
simply to configure it.

Re: page_launder() bug

2001-05-08 Thread Jonathan Morton


>That said, anyone who doesn't understand the former should probably
>get some more C experience before commenting on others' code...

I understood it, but it looked very much like a typo.

Re: page_launder() bug

2001-05-06 Thread Jonathan Morton


>-   page_count(page) == (1 + !!page->buffers));

Two inversions in a row?  I'd like to see that made more explicit,
otherwise it looks like a bug to me.  Of course, if it IS a bug...
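
For reference, a small C sketch of what the double negation does and what
a more explicit spelling might look like.  `struct page_stub` is a
stand-in with only the one field used in the quoted line, not the
kernel's real `struct page`, and both function names are hypothetical:

```c
#include <stddef.h>

/* Stand-in for the kernel structure; only the field from the quoted line. */
struct page_stub {
    void *buffers;
};

/* !p is 1 for NULL and 0 otherwise, so !!p is 1 for any non-NULL
 * pointer and 0 for NULL.  The sum is therefore 1 or 2. */
static int expected_count(const struct page_stub *page)
{
    return 1 + !!page->buffers;
}

/* The same value written out explicitly. */
static int expected_count_explicit(const struct page_stub *page)
{
    return (page->buffers != NULL) ? 2 : 1;
}
```

Both functions return the same result for every input; the second just
makes the intent visible without relying on the idiom.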

Re: Athlon and fast_page_copy: What's it worth ? :)

2001-05-05 Thread Jonathan Morton

At 3:41 pm +0100 5/5/2001, Alan Cox wrote:
>> My wild guess is that with the "faster" code, the K7 is avoiding loading
>> cache lines just to write them out again, and is just writing tons of data.
>> The PPC G4 - and perhaps even the G3 - performs a similar trick
>> automatically, without special assembly...
>
>X86 has done that since the K5 era.
>
>No the main thing that the mmx copier does is to read and write in 64bit
>wide chunks

Just for the record, this can be done on any PPC, by using the FPU
registers (which are much faster than x86 FPU).  AltiVec can do 128-bit
wide transfers.

>and then more importantly to prefetch pending data.

That's a tougher one.  AltiVec (in the G4) can do this, but I suspect it
can be emulated using the pipeline on earlier PowerPCs, by queueing up a
line of FPU load instructions and then a queue of FPU saves.  However, the
601 and 603 don't have a superscalar FPU, though I wonder if that would
actually affect a simple load/store operation.
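
To make the two ideas concrete (wide transfers plus prefetching ahead),
here is a portable C sketch.  It is not the MMX copier and it does not use
FPU or AltiVec registers; it only shows the shape of the loop.  The
4096-byte page and 64-byte cache line are assumptions, and
`__builtin_prefetch` is a GCC/Clang builtin that acts purely as a hint:

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES 4096   /* assumed page size */
#define LINE_BYTES 64     /* assumed cache line size */

/* Copy one page in 64-bit chunks, asking for the next cache line of the
 * source while the current one is being copied. */
static void copy_page_wide(uint64_t *dst, const uint64_t *src)
{
    const size_t words = PAGE_BYTES / sizeof(uint64_t);
    const size_t per_line = LINE_BYTES / sizeof(uint64_t);

    for (size_t i = 0; i < words; i += per_line) {
        if (i + per_line < words)
            __builtin_prefetch(src + i + per_line, 0, 0);  /* read, low reuse */
        for (size_t j = 0; j < per_line; j++)
            dst[i + j] = src[i + j];
    }
}
```

Whether the hint helps at all depends on the CPU and memory system, so
any real gain would have to be measured.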

This is rapidly getting offtopic, though...

Re: Athlon and fast_page_copy: What's it worth ? :)

2001-05-05 Thread Jonathan Morton


At 7:20 am +0100 5/5/2001, Mark Hahn wrote:
>On Fri, 4 May 2001, Seth Goldberg wrote:
>
>> Hi,
>>
>>   Before I go any further with this investigation, I'd like to get an
>> idea
>> of how much of a performance improvement the K7 fast_page_copy will give
>> me.
>> Can someone suggest the best benchmark to test the speed of this
>> routine?
>
>Arjan van de Ven did the code, and he wrote a little test harness.
>I've hacked it a bit (http://brain.mcmaster.ca/~hahn/athlon.c);
>on my duron/600, kt133, pc133 cas2, it looks like this:
>
>clear_page by 'normal_clear_page' took 7221 cycles (324.6 MB/s)
>clear_page by 'slow_zero_page'    took 7232 cycles (324.1 MB/s)
>clear_page by 'fast_clear_page'   took 6110 cycles (383.6 MB/s)
>clear_page by 'faster_clear_page' took 2574 cycles (910.6 MB/s)
>
>copy_page by 'normal_copy_page'   took 7224 cycles (324.4 MB/s)
>copy_page by 'slow_copy_page'     took 7223 cycles (324.5 MB/s)
>copy_page by 'fast_copy_page'     took 4662 cycles (502.7 MB/s)
>copy_page by 'faster_copy'        took 2746 cycles (853.5 MB/s)
>copy_page by 'even_faster'        took 2802 cycles (836.5 MB/s)
>
>70% faster!


On my Athlon 1GHz, Abit KT7, PC133 set to "Turbo" (not quite sure what the
actual CAS rating is, but it works):
[chromi@beryllium compsci]$ ./athlon
1000.047 MHz
clear_page by 'normal_clear_page' took 10769 cycles (362.7 MB/s)
clear_page by 'slow_zero_page'    took 10349 cycles (377.5 MB/s)
clear_page by 'fast_clear_page'   took 10868 cycles (359.4 MB/s)
clear_page by 'faster_clear_page' took 4345 cycles (899.1 MB/s)

copy_page by 'normal_copy_page'   took 11242 cycles (347.5 MB/s)
copy_page by 'slow_copy_page'     took 11245 cycles (347.4 MB/s)
copy_page by 'fast_copy_page'     took 7951 cycles (491.3 MB/s)
copy_page by 'faster_copy'        took 4765 cycles (819.7 MB/s)
copy_page by 'even_faster'        took 4955 cycles (788.3 MB/s)
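
The MB/s figures can be reproduced from the cycle counts and the reported
clock.  The sketch below assumes each call handles one 4096-byte page and
that "MB" means 2^20 bytes; both are inferred from the numbers rather
than taken from athlon.c, and the function name is hypothetical:

```c
#define PAGE_BYTES   4096.0      /* assumed bytes handled per call */
#define BYTES_PER_MB 1048576.0   /* assumed: 2^20 bytes per "MB" */

/* Convert a cycle count for one page into MB/s on a CPU running at mhz. */
static double page_rate_mb_s(double cycles, double mhz)
{
    double seconds = cycles / (mhz * 1e6);
    return PAGE_BYTES / seconds / BYTES_PER_MB;
}
```

For example, 10769 cycles at 1000.047 MHz comes out at about 362.7 MB/s
and 4345 cycles at about 899.1 MB/s, matching the first and last
clear_page lines, so the "faster" clear is roughly 2.5 times the normal
one on this machine.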

My wild guess is that with the "faster" code, the K7 is avoiding loading
cache lines just to write them out again, and is just writing tons of data.
The PPC G4 - and perhaps even the G3 - performs a similar trick
automatically, without special assembly...

Perhaps the IWILL m/board doesn't like this behaviour, and somehow assumes
that all written cachelines have been read beforehand.  I heard of some
m/boards - particularly those with more than 3 DIMM slots - using a "helper
chip" to boost the signal to the last slot or two, so maybe that is a
problem?  How many slots does the IWILL have?

Re: DISCOVERED! Cause of Athlon/VIA KX133 Instability

2001-05-03 Thread Jonathan Morton


>> I'm using an Abit KT7 board (KT133) and my new 1GHz T'bird (running 50-60°C
>> in a warm room) is giving me no trouble.  This is with the board and RAM
>> pushed as fast as it will go without actually overclocking anything...  and
>> yes, I do have Athlon/K7 optimisations turned on in my kernel (2.4.3).
>>
>
>  I wonder if the KT133A (which is what the IWILL KK266 is based on)
>differences could be a source of the problem.  My FSB is at plain old
>100 MHz since I have regular PC100 SDRAM.  Overclocked, or not, I get the
>same results.  I, too, had an ABIT KA7[-RAID] and it was rock solid.  So
>much for "if it's not broke, don't fix it" -- I should have listened to my
>gf, but that's the life of an upgrader ;)...  In general the IWILL got
>great reviews at a number of reliable hardware review sites, and hey, it
>doesn't lock up in windows ;) (ok don't flame me for that ;)).

Maybe, but the IWILL board is the only one we've heard about problems with.
The Abit KT7A (which also has the KT133A chipset and is otherwise identical
to the KT7) would appear to run smoothly, although I don't actually *have*
one of those.  Probably the Windows drivers turn off some feature of the
IWILL board which is known to be flaky.

I suggest setting *everything* in the BIOS to the "most conservative"
settings and seeing if the problem persists.  If so, then it can't be a
hardware-speed-limitation problem, and there's clearly something we have to
turn off "manually".  Also, try running memtest86 and see if that is
capable of triggering the problem.

Re: DISCOVERED! Cause of Athlon/VIA KX133 Instability

2001-05-02 Thread Jonathan Morton


>> the only general issue is that kx133 systems seem to be difficult
>> to configure for stability.  ugly things like tweaking Vio.
>> there's no implication that has anything to do with Linux, though.
>
>
>When I reported my problem a couple weeks back another fellow
>said he and several others on the list had the same problem,
>and as far as I can tell it is *only* with the IWILL boards.
>When I compiled with k7 optimizations I'd get all kinds of oopses
>and panics and never fully boot.  They were different every time.
>When any of the lesser optimizations are used I have no problems.
>My memory is one 256MB Corsair PC150 dimm, CPU is a Thunderbird 850,
>and mobo is an IWILL KK266 (KT133A).  The CPU runs between 35°C
>and 40°C.

I'm using an Abit KT7 board (KT133) and my new 1GHz T'bird (running 50-60°C
in a warm room) is giving me no trouble.  This is with the board and RAM
pushed as fast as it will go without actually overclocking anything...  and
yes, I do have Athlon/K7 optimisations turned on in my kernel (2.4.3).

Out of interest, what FSB are you using for your machine?  I understand the
difference between the KT133 and the KT133A is that the latter supports a
266MHz FSB for the Athlon rather than 200MHz.  Since your CPU is running
cool, I doubt you've managed to accidentally o/c it, but nevertheless this
is a possibility...

The 266MHz FSB does require considerably higher standards in board
construction though, so it could be that IWILL have managed to do a shoddy
job on that end.


Re: OOM stupidity

2001-04-29 Thread Jonathan Morton


>Where is a patch to allow the sensible OOM I had in prior kernels?
>(cause this crap is getting pitched)

I gave Alan a patch to fix the problem where the OOM activates too early
(eg. when there's still plenty of swap and buffer memory to eat).  I don't
know whether this made it into the mainstream kernel, but from the sound of
it, it didn't.

I also did some work on the OOM killer itself (so that it tries to be more
intelligent about *what* it kills), and I'm fairly certain that didn't get
accepted.

If you like, I can post a patch containing these two fixes.


Re: [patch] swap-speedup-2.4.3-A1, massive swapping speedup

2001-04-23 Thread Jonathan Morton


>There seems to be one more reason, take a look at the function
>read_swap_cache_async() in swap_state.c, around line 240:
>
>/*
> * Add it to the swap cache and read its contents.
> */
>lock_page(new_page);
>add_to_swap_cache(new_page, entry);
>rw_swap_page(READ, new_page, wait);
>return new_page;
>
>Here we add an "empty" page to the swap cache and use the
>page lock to protect people from reading this non-up-to-date
>page.

How about reversing the order of the calls - ie. add the page to the cache
only when it's been filled?  That would fix the race.


Re: Kernel 2.5 Workshop RealVideo streams -- next time, please get better audio.

2001-04-17 Thread Jonathan Morton


>>I like this idea quite a bit.  It would probably not
>>be terribly expensive to rent/buy the required equipment,
>>it would be easy to use and would not be terribly disruptive
>>to the preceedings.
>
>Just to keep this on topic... the real question is what would be
>the best way to interface this sound system into the Linux
>kernel?
>
>;o)

Not a problem.  :)  Simply fit a machine with several ALSA-compatible
soundcards with mic-level inputs and use it as the recording medium.
Actually, I forget - do OSS-type soundcard drivers handle multiple cards
sensibly too?


Re: OOM killer WORKS for a change!

2001-04-13 Thread Jonathan Morton


>I just ran netscape which for some reason or another went totally
>whacky and gobbled RAM.  It has done this before and made the box
>totally unuseable in 2.2.17-2.2.19 befor the kernel killed 90% of
>my running apps before getting the right one.  This time, it
>OOM'd and killed Netscape and I got control back instantly.  This
>is with 2.4.2.  I hope this is a good sign!

Maybe, but 2.4.2 and 2.4.3 are still using the "old" killer algorithms
which can behave erratically.  I haven't looked at 2.2.x OOM killers at
all, so I don't know how they compare.  At some point in the near future, I
want to separate my patches out so they can receive individual attention
and hopefully get applied.

BTW, on this subject, if anyone sent me a mail which I haven't replied to,
I probably never got it due to e-mail problems with my ISP.  If it's still
relevant, please resend.


Re: [QUESTION] init/main.c

2001-04-13 Thread Jonathan Morton


>ticks = jiffies; while (ticks == jiffies); ticks = jiffies; ?

jiffies is updated by an interrupt routine, I think.


Re: bug in float on Pentium

2001-04-13 Thread Jonathan Morton


> double x = 5483.99;
> float y = 5483.99;

>5483.99
>5483.990234

Well, duh.  Floats are less accurate than doubles, so what?  Read your C
textbook again.
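
For anyone who wants to see it rather than look it up, here is a small
sketch.  A float keeps a 24-bit significand, so 5483.99 is rounded to the
nearest value a float can hold, and that is what gets printed.  The helper
names are mine, chosen for the example only.

```c
#include <stdio.h>

/* Absolute error introduced by storing a double value in a float. */
static double float_rounding_error(double value)
{
	float narrowed = (float)value;	/* rounds to the nearest float */
	double diff = (double)narrowed - value;

	return diff < 0 ? -diff : diff;
}

/* Print the same value as held by each type, six decimals each. */
static void show_both(double value)
{
	float narrowed = (float)value;

	printf("double: %.6f\n", value);
	printf("float:  %.6f\n", (double)narrowed);
}
```

Calling show_both(5483.99) should reproduce the second figure quoted
above on the float line.  A value such as 0.5 is exactly representable in
both types, so float_rounding_error(0.5) is zero.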


[PATCH] Revised memory-management stuff (was: OOM killer)

2001-04-04 Thread Jonathan Morton


The attached patch applies to 2.4.3 and should address the most serious
concerns surrounding OOM and low-memory situations for most people.  A
summary of the patch contents follows:

MAJOR: OOM killer now only activates when truly out of memory, ie. when
buffer and cache memory has already been eaten down to the bone.

MEDIUM: The allocation mechanism will now only allow processes to reserve
memory if there is sufficient memory remaining *and* the process is not
already hogging RAM.  IOW, if the allocating process is already 4x the
size of the remaining free memory, reservation of more memory (by fork(),
malloc() or related calls) will fail.
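
The rule can be sketched roughly as follows.  This is an illustration
only, not the code from the patch: may_reserve() and its parameters are
hypothetical, all sizes are in pages, and I am assuming that "remaining
free memory" is measured before the new request is granted and that a
process at exactly 4x is refused.

```c
#include <stdbool.h>

/*
 * Hypothetical admission check (sizes in pages).
 *   request    - pages the process wants to reserve
 *   proc_size  - pages the process already occupies
 *   free_pages - pages still unreserved system-wide
 */
static bool may_reserve(unsigned long request,
			unsigned long proc_size,
			unsigned long free_pages)
{
	if (request > free_pages)
		return false;	/* not enough memory remaining */

	/* proc_size >= 4 * free_pages, written so it cannot overflow */
	if (proc_size / 4 >= free_pages)
		return false;	/* process is already hogging RAM */

	return true;
}
```

With 1000 pages free, a 3999-page process may still reserve 10 pages
while a 4000-page process may not, which leaves room for small
processes to start.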

MEDIUM: The OOM killer algorithm has been reworked to be a little more
intelligent by default, and also now allows the sysadmin to specify PIDs
and/or process names which should be left untouched.  Simply echo a
space-delimited list of PIDs and/or process names into
/proc/sys/vm/oom-no-kill, and the OOM killer will ignore all processes
matching any entry in the list until only they and init remain.  Init (as
PID 1 or as a root process named "init") is now always
ignored.  TODO: make certain parameters of the OOM killer configurable.
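
To make the list format concrete, here is a user-space sketch of how such
a list could be matched against a process.  It is not taken from the
patch; oom_protected() is a hypothetical helper, and treating every
all-digit token as a PID (never as a name) is my assumption.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return true if pid or name matches a token in a space-delimited
 * list such as "1234 sshd 42".
 */
static bool oom_protected(const char *list, long pid, const char *name)
{
	const char *p = list;

	while (*p) {
		size_t len;
		char *end;
		long value;

		p += strspn(p, " \t\n");	/* skip separators */
		len = strcspn(p, " \t\n");	/* length of this token */
		if (len == 0)
			break;

		value = strtol(p, &end, 10);
		if (end == p + len) {
			/* whole token parsed as a number: compare as PID */
			if (value == pid)
				return true;
		} else if (strlen(name) == len &&
			   strncmp(p, name, len) == 0) {
			return true;	/* token is a process name */
		}
		p += len;
	}
	return false;
}
```

Names are compared whole, so an entry "sshd" does not protect a process
called "ssh".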

W-I-P: The memory-accounting code from an old 2.3.99 patch has been
re-introduced, but is in sore need of debugging.  It can be activated by
echoing a negative number into /proc/sys/vm/overcommit_memory - but do
this at your own risk.  Interested kernel hackers should alter the
"#define VM_DEBUG 0" to 1 in include/linux/mm.h to view lots of debugging
and warning messages.  I have seen the memory-accounting code attempt to
"free" blocks of memory exceeding 2GB which had never been allocated,
while running gcc.  The sanity-check code detects these anomalies and
attempts to correct for them, but this isn't good...
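
The kind of sanity check meant here can be sketched like this.  It is a
simplified stand-in, not the accounting code itself: reserved_pages,
account_reserve() and account_release() are names invented for the
example, and clamping is just one possible way to "correct" an
over-large release.

```c
#include <stdio.h>

static unsigned long reserved_pages;	/* hypothetical running total */

static void account_reserve(unsigned long pages)
{
	reserved_pages += pages;
}

/* Release pages; returns how many were actually released. */
static unsigned long account_release(unsigned long pages)
{
	if (pages > reserved_pages) {
		/* Releasing more than was ever reserved: report and clamp. */
		fprintf(stderr,
			"accounting: release of %lu pages, only %lu reserved\n",
			pages, reserved_pages);
		pages = reserved_pages;
	}
	reserved_pages -= pages;	/* cannot wrap below zero now */
	return pages;
}
```

Without the clamp, an unsigned counter would wrap around to a huge value
and every later reservation check would be wrong.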

SIDE EFFECT: All parts of the kernel which can change the total amount of
VM (eg. by adding/removing swap) should now call
vm_invalidate_totalmem() to notify the VM about this.  A new function
vm_total() now reports the total amount of VM available.  The total VM and
the amount of reserved memory are now available from /proc/meminfo.
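
The invalidate call makes sense if the total is cached rather than
recomputed on every query.  The sketch below shows that pattern under
that assumption; it is not the implementation from the patch, and the
sketch_ names, the two counters and their starting values are all made
up.

```c
#include <stdbool.h>

/* Hypothetical inputs, in pages. */
static unsigned long ram_pages = 65536;
static unsigned long swap_pages = 131072;

static unsigned long cached_total;
static bool cached_total_valid;

/* Anything that changes ram_pages or swap_pages calls this afterwards. */
static void sketch_invalidate_totalmem(void)
{
	cached_total_valid = false;
}

/* Total VM in pages, recomputed only after an invalidation. */
static unsigned long sketch_vm_total(void)
{
	if (!cached_total_valid) {
		cached_total = ram_pages + swap_pages;
		cached_total_valid = true;
	}
	return cached_total;
}
```

If swap is added and the invalidate call is forgotten, the function keeps
returning the stale total, which is why every such code path has to make
the call.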



diff -rBU 5 linux-2.4.3/fs/exec.c linux-oom/fs/exec.c
--- linux-2.4.3/fs/exec.c   Thu Mar 22 09:26:18 2001
+++ linux-oom/fs/exec.c Tue Apr  3 09:32:07 2001
@@ -386,23 +386,31 @@
 }
 
 static int exec_mmap(void)
 {
 	struct mm_struct * mm, * old_mm;
+	struct task_struct * tsk = current;
+	unsigned long reserved = 0;
 
-	old_mm = current->mm;
+	old_mm = tsk->mm;
 	if (old_mm && atomic_read(&old_mm->mm_users) == 1) {
+		/* Keep old stack reservation */
 		mm_release();
 		exit_mmap(old_mm);
 		return 0;
 	}
 
+	reserved = vm_enough_memory(tsk->rlim[RLIMIT_STACK].rlim_cur >>
+		PAGE_SHIFT);
+	if(!reserved)
+		return -ENOMEM;
+
 	mm = mm_alloc();
 	if (mm) {
-		struct mm_struct *active_mm;
+		struct mm_struct *active_mm = tsk->active_mm;
 
-		if (init_new_context(current, mm)) {
+		if (init_new_context(tsk, mm)) {
 			mmdrop(mm);
 			return -ENOMEM;
 		}
 
 		/* Add it to the list of mm's */
@@ -424,10 +432,12 @@
 			return 0;
 		}
 		mmdrop(active_mm);
 		return 0;
 	}
+
+	vm_release_memory(reserved);
 	return -ENOMEM;
 }
 
 /*
  * This function makes sure the current process has its own signal table,
diff -rBU 5 linux-2.4.3/fs/proc/proc_misc.c linux-oom/fs/proc/proc_misc.c
--- linux-2.4.3/fs/proc/proc_misc.c Fri Mar 23 11:45:28 2001
+++ linux-oom/fs/proc/proc_misc.c   Tue Apr  3 09:32:27 2001
@@ -173,11 +173,13 @@
 "HighTotal:%8lu kB\n"
 "HighFree: %8lu kB\n"
 "LowTotal: %8lu kB\n"
 "LowFree:  %8lu kB\n"
 "SwapTotal:%8lu kB\n"
-"SwapFree: %8lu kB\n",
+"SwapFree:  %8lu kB\n"
+"VMTotal:   %8lu kB\n"
+"VMReserved:%8lu kB\n",
 K(i.totalram),
 K(i.freeram),
 K(i.sharedram),
 K(i.bufferram),
 K(atomic_read(&page_cache_size)),
@@ -188,11 +190,13 @@
 K(i.totalhigh),
 K(i.freehigh),
 K(i.totalram-i.totalhigh),
 K(i.freeram-i.freehigh),
 K(i.totalswap),
-K(i.freeswap));
+K(i.freeswap),
+K(vm_total()), 
+K(vm_reserved));
 
return proc_calc_metrics(page, start, off, count, eof, len);
 #undef B
 #undef K
 }
diff -rBU 5 linux-2.4.3/include/linux/mm.h linux-oom/include/linux/mm.h


Revised memory-management stuff (was: OOM killer)

2001-03-31 Thread Jonathan Morton


There's clearly been lots of discussion about OOM (and memory management in
general) over the last week, so it looks like it's time to summarise it and
work out the solution that's actually going to find its way into the
kernel.

Issue 1:
The OOM killer was activating too early.  I have a 4-line fix for
this problem, which has already appeared on the list.  Maybe I should
forward a copy directly to Alan and/or Linus.

Issue 2:
Applications are not warned when memory is running low, either in
terms of reserved or allocated memory.  I have implemented an improvement
on this state of affairs, which makes memory reservation (whether by fork
or malloc type operations) fail for applications which are larger than 4
times the unallocated space available.  This also applies to reserved
memory, but the memory-accounting code needs debugging before this will
work reliably.  The reason for stopping large processes short of the hard
OOM line is so that smaller (mostly interactive) processes can still be
started and run reliably.

I will probably need some help with debugging the memory-accounting code,
since it goes into bits of the kernel I know nothing (rather than "very
little") about.

Some posters suggested SIGDANGER, a feature from AIX, to warn processes
when the system became dangerously low on memory.  Other posters pointed
out some disadvantages of SIGDANGER, which however (thankfully) only apply
when SIGDANGER is used in isolation.  For example, a malicious process
designed to reserve memory within its SIGDANGER handler could be thwarted
by malloc() simply failing cleanly as above.  If the process had already
reserved memory and merely attempted to allocate it (by accessing it), the
non-memory-overcommit code could defeat it by guaranteeing that the
reserved memory was already available to be allocated.  Without the
non-memory-overcommit code, the OOM killer would be triggered - but with
the improved algorithm I came up with as promised, the effects would be
less severe on average (and most likely kill the malicious process in
preference to a valuable batch job or system daemon).

I have not implemented SIGDANGER, but I don't see any reason why it
shouldn't be implemented.  Certain implementation details will need some
care.

Issue 3:
The OOM killer was frequently killing the "wrong" process.  I have
developed an improved badness selector, and devised a possible means of
specifying "don't touch" PIDs at runtime.  PID 1 is never selected for
killing.  I am debating whether to allow selection of *any* process
labelled "init" and running as root for the chop, since one of the "unusual
but frequently encountered" scenarios is for a second init to be running
during an install or recovery procedure.  This might make its way in as an
optional feature.
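
The selection step, leaving aside how badness is scored, can be sketched
as below.  This is a simplified illustration and not the code I have
been testing: struct candidate, its fields and pick_victim() are invented
for the example, and the fallback for the case where only protected
processes remain is left out.

```c
#include <stddef.h>

struct candidate {
	long pid;
	unsigned long badness;	/* higher means a better candidate to kill */
	int dont_touch;		/* nonzero if marked "don't touch" at runtime */
};

/* Return the index of the chosen victim, or -1 if nothing is eligible. */
static int pick_victim(const struct candidate *c, size_t n)
{
	int best = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (c[i].pid == 1 || c[i].dont_touch)
			continue;	/* never PID 1, never a protected PID */
		if (best < 0 || c[i].badness > c[best].badness)
			best = (int)i;
	}
	return best;
}
```

Note that PID 1 is skipped even when it has the highest score, and that
a list containing only PID 1 yields no victim at all.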

Issue 4:
Memory overcommit.  I totally agree with those posters who point
out that there are situations where this is a Bad Thing, specifically in
mission-critical environments.  However, for the "average" system, I still
quite firmly believe it has some advantages.  Since the
non-memory-overcommit code needs a fair amount of debugging (after I
hacksawed it in to fit the latest kernels), I hope the solutions to the
first 3 issues are sufficient to satisfy most people for the time being.

Issue 5:
VM balancing needs a *lot* of work.  During my exercising of the
memory-management code, I noticed that memory-hogging applications could
completely stall the machine, even when there is a lot of physical RAM
available.  I'm considering some simple algorithms to help alleviate this -
these generally amount to a variation on the "suspend some processes when
thrashing" theory.  I'll need to think about these for a bit though, and
try to implement them when I have time.

Expect to see patches (containing the fixes mentioned above) on the list soon.


Revised memory-management stuff (was: OOM killer)

2001-03-31 Thread Jonathan Morton


There's clearly been lots of discussion about OOM (and memory management in
general) over the last week, so it looks like it's time to summarise it and
work out the solution that's actually going to find it's way into the
kernel.

Issue 1:
The OOM killer was activating too early.  I have a 4-line fix for
this problem, which has already appeared on the list.  Maybe I should
forward a copy directly to Alan and/or Linus.

Issue 2:
Applications are not warned when memory is running low, either in
terms of reserved or allocated memory.  I have implemented an improvement
on this state of affairs, which makes memory reservation (whether by fork
or malloc type operations) fail for applications which are larger than 4
times the unallocated space available.  This also applies to reserved
memory, but the memory-accounting code needs debugging before this will
work reliably.  The reason for stopping large processes short of the hard
OOM line is so that smaller (mostly interactive) processes can still be
started and run reliably.

I will probably need some help with debugging the memory-accounting code,
since it goes into bits of the kernel I know nothing (rather than "very
little") about.

Some posters suggested SIGDANGER, a feature from AIX, to warn processes
when the system became dangerously low on memory.  Other posters pointed
out some disadvantages of SIGDANGER, which however (thankfully) only apply
when SIGDANGER is used in isolation.  For example, a malicious process
designed to reserve memory within its SIGDANGER handler could be thwarted
by malloc() simply failing cleanly as above.  If the process had already
reserved memory and merely attempted to allocate it (by accessing it), the
non-memory-overcommit code could defeat it by guaranteeing that the
reserved memory was already available to be allocated.  Without the
non-memory-overcommit code, the OOM killer would be triggered - but with
the improved algorithm I came up with as promised, the effects would be
less severe on average (and most likely kill the malicious process in
preference to a valuable batch job or system daemon).

I have not implemented SIGDANGER, but I don't see any reason why it
shouldn't be implemented.  Certain implementation details will need some
care.

Issue 3:
The OOM killer was frequently killing the "wrong" process.  I have
developed an improved badness selector, and devised a possible means of
specifying "don't touch" PIDs at runtime.  PID 1 is never selected for
killing.  I am debating whether to allow selection of *any* process
labelled "init" and running as root for the chop, since one of the "unusual
but frequently encountered" scenarios is for a second init to be running
during an install or recovery procedure.  This might make its way in as an
optional feature.
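
As a rough illustration of the "don't touch" idea, assuming the protected
PIDs are held in a plain array (the text does not say how they are stored or
set at runtime, and the function name is hypothetical):

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical filter applied before a process is scored for killing.
 * protected_pids/n describe a runtime-configured "don't touch" list.
 */
static bool oom_may_select(pid_t pid, const pid_t *protected_pids, size_t n)
{
    if (pid == 1)                 /* PID 1 is never selected */
        return false;

    for (size_t i = 0; i < n; i++)
        if (protected_pids[i] == pid)
            return false;         /* explicitly protected */

    return true;
}
```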

Issue 4:
Memory overcommit.  I totally agree with those posters who point
out that there are situations where this is a Bad Thing, specifically in
mission-critical environments.  However, for the "average" system, I still
quite firmly believe it has some advantages.  Since the
non-memory-overcommit code needs a fair amount of debugging (after I
hacksawed it in to fit the latest kernels), I hope the solutions to the
first 3 issues are sufficient to satisfy most people for the time being.

Issue 5:
VM balancing needs a *lot* of work.  During my exercising of the
memory-management code, I noticed that memory-hogging applications could
completely stall the machine, even when there is a lot of physical RAM
available.  I'm considering some simple algorithms to help alleviate this -
these generally amount to a variation on the "suspend some processes when
thrashing" theory.  I'll need to think about these for a bit though, and
try to implement them when I have time.

Expect to see patches (containing the fixes mentioned above) on the list soon.

--
from: Jonathan "Chromatix" Morton

Re: Ideas for the oom problem

2001-03-27 Thread Jonathan Morton


I'm going to be gentle here and try to point out where your suggestions are
flawed...

>a. don't kill any task with a uid < 100

Suppose your system daemon springs a leak?  It will have to be killed
eventually; however, system daemons can sensibly be given a little "grace".
Also, the UIDs used by a system daemon vary from system to system.

>b. if uid between 100 to 500 or CAP-SYS equivalent enabled
>   set it too a lower priority, so if it is at fault it will happen
>slower
>giving more time before the system collapses

Not slowly enough.  When your system is thrashing, the CPU is the resource
under least pressure, so "nice" values and priorities have virtually zero
effect.  In any case, under OOM conditions the system has *already*
collapsed and we *have* to kill something for the system to keep running.

>c.  if a task is nice'd then immediately put the task too sleep, and schedule
>all code / data too be swapped out, or thrown away as appropiate. do not
>reschedule the task too continue until memory is available

In OOM conditions there is no swap space left to do what you suggest.  This
is a sensible solution for when thrashing is the only problem...

>d. kill any normal user interactive tasks that is started during a memory
>crisis.

Define "memory crisis".  However, this is a relatively sensible solution.

>allocate a pool of memory at system start up that is too be released to the
>memory pool when the system is in a memory crisis. This will reduce system
>swapping, and allow the system too stablize slightly

One of my patches already tries to do this, in a way.  It doesn't yet
provide a hard barrier, but it does prevent applications from hogging the
entire memory on the system (at least, without expending some effort into
it).

>report any task asking for large pool of memory while the system is in
>oom crisis. if uid > 500 and was started from an interactive shell it should
>be killed.

See above.  malloc() fails, which tells the application there is no more
memory in the system.  A well-written application will respond to this and
use more memory-conservative techniques.  A poorly-written application will
segfault.  End Of Problem.  Now to make memory accounting work properly so
these tests are reliable...
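
For illustration, one way a well-written application might respond to
malloc() failing: retry with a smaller working buffer instead of
dereferencing NULL.  The function name and the halving strategy are
assumptions for this sketch, and it only helps on a system where malloc()
really does fail rather than overcommit.

```c
#include <stdlib.h>

/*
 * Try to allocate `want` bytes; on failure keep halving the request until
 * it would drop below `min`.  The size actually obtained is stored in *got.
 * Returns NULL (and *got == 0) if even the minimum could not be had.
 */
static void *alloc_with_fallback(size_t want, size_t min, size_t *got)
{
    for (size_t n = want; n >= min && n > 0; n /= 2) {
        void *p = malloc(n);
        if (p != NULL) {
            *got = n;
            return p;
        }
    }
    *got = 0;
    return NULL;
}
```

The caller then sizes its work to *got, for example by processing the input
in more, smaller passes.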

>when the crisis is ended, re-adquire the memory pool for later usage.

It is never given up, except when it is needed by the kernel itself (eg. to
swap in pages or (in the absence of true memory accounting) to provide COW
space).

>Prong 3 providing  information about oom crisis too user land
>
>create /proc/vm/oom_crisis this would be readonly file owned by root it would
>report if the system is in crisis and the uid of any process that is asking
>for large amounts of ram while the system
>is in crisis.

This kind of information is already available using /proc - applications
just have to look in the right places.
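
One such place is /proc/meminfo.  A small sketch of reading a field from it,
assuming the "Name:  value kB" line format; the helper name is made up for
this example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return the value in kB of a /proc/meminfo field such as "MemFree",
 * or -1 if the file or the field cannot be read.
 */
static long meminfo_kb(const char *field)
{
    FILE *f = fopen("/proc/meminfo", "r");
    char line[256];
    long value = -1;
    size_t len = strlen(field);

    if (f == NULL)
        return -1;

    while (fgets(line, sizeof line, f) != NULL) {
        /* Match "<field>:" exactly at the start of the line. */
        if (strncmp(line, field, len) == 0 && line[len] == ':') {
            if (sscanf(line + len + 1, "%ld", &value) != 1)
                value = -1;
            break;
        }
    }
    fclose(f);
    return value;
}
```

An application could poll meminfo_kb("MemFree") and meminfo_kb("SwapFree")
and back off before it hits the limit.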

>create a SIGDANGER handler that is sent out too all tasks that have
>registered a handler when the kernel enters oom_kill, give these tasks a high
>priority access too system resources.

This is a fairly good idea, why does it look so familiar?  :)  SIGDANGER
would be sent to all processes when memory availability goes below a
threshold, ie. when there is still enough memory left to handle the
situation.  The default handler would be a no-op, preserving compatibility.
However, the notion of "high priority access to resources" is not currently
feasible (or necessary).
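
From userspace, handling such a signal would look roughly like the sketch
below.  Linux does not define SIGDANGER, so SIGUSR1 stands in for it here
purely so that the example compiles; the handler and function names are
hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>

/* Assumption: no SIGDANGER on this system, so borrow SIGUSR1. */
#ifndef SIGDANGER
#define SIGDANGER SIGUSR1
#endif

/* Set by the handler, checked by the main loop. */
static volatile sig_atomic_t low_memory = 0;

static void on_danger(int signo)
{
    (void)signo;
    low_memory = 1;     /* only set a flag; free caches outside the handler */
}

/* Register the handler; returns 0 on success, -1 on failure. */
static int install_danger_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_danger;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGDANGER, &sa, NULL);
}
```

A process that never installs a handler would see the no-op default, which
is what keeps existing programs working.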

>this would enable user land programs too deal with the situation with out
>continuous polling free ram/swap. They could email/page sysadmin and user
>about the crisis and add additional swap resources and kill any know  non
>essential tasks. and probe system for possible broken tasks, such as
>netscape-common tasks not connected too netscape client, at least i have been
>known too find these when netscape crashes.

Interesting applications for this signal.  However, this is entirely a
userspace issue as to what to do with the signal - the kernel's job is to
provide it (if we decide to, that is).

--
from: Jonathan "Chromatix" Morton

Re: OOM killer???

2001-03-27 Thread Jonathan Morton


>If we use my OOM killer API, this patch would be a module and
>could have module parameters to select that.
>
>Johnathan: I URGE you to apply my patch before adding OOM killer
>   stuff. What's wrong with it, that you cannot use it? ;-)
>
>It is easy to add configurables to a module and play with them
>WITHOUT recompiling.

Thanks for reminding me - I'll look into it on the plane and see what I can
do with it.

>e.g. My important matlab calculation, which runs in user mode
>should not be killed. But killing a local webserver, which serves
>my help system is ok (because I will not loose work, and might
>get it over the net, if there is a problem).
>
>So as Rik stated: The OOM killer cannot suit all people, so it
>has to be configurable, to be OOM kill, not overkill ;-)

Yes, configurability is probably a very good idea.  However, it would be
best to include a good set of general parameters in the kernel itself, so
the set of average systems needs as little tweaking as possible.  One
cannot expect every sysadmin to be familiar with these arcane (and rarely
actually used) parameters, so being able to select "server", "batch",
"workstation", "embedded" and so on would help massively.

--
from: Jonathan "Chromatix" Morton

RE: [PATCH] OOM handling

2001-03-27 Thread Jonathan Morton


>> relative ages.  The major flaw in my code is that a sufficiently
>> long-lived
>> process becomes virtually immortal, even if it happens to spring a serious
>> leak after this time - the flaw in yours is that system processes
>
>I think this could easily be fixed if you'd 'chop off' the runtime at a
>certain point:
>
>if(runtime > something_big)
>   runtime = something_big;
>
>This would of course need some tuning. The only thing i don't like about
>this is that it's a kind of 'magical value', but i suppose it's not a very
>good idea to make this configurable, right?

Configurable is good, but right now I'm considering alternative (but
reasonably similar) algorithms.  If I can come up with something that works
reasonably well under all the scenarios I can think up - which is quite a
range - then configurable options may not be necessary.  In any case, other
work I'm doing should make OOM a thing of the past on most systems, since
malloc() and other memory-reservation calls will normally fail before OOM
happens.
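
For reference, the clamp suggested in the quoted text amounts to something
like this.  RUN_TIME_CAP is a placeholder for "something_big"; its value is
arbitrary here and would need the tuning mentioned above.

```c
/* Placeholder cap, in the same units as run_time; the value is arbitrary. */
#define RUN_TIME_CAP 256UL

/* Stop very old processes from earning unlimited protection. */
static unsigned long clamp_run_time(unsigned long run_time)
{
    return run_time > RUN_TIME_CAP ? RUN_TIME_CAP : run_time;
}
```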

It might just happen that totally different algorithms apply best to
different usage patterns, and I can put in some logic to try and detect
these patterns as needed, selecting the most appropriate algorithm.  An
embedded system is very different from a large batch-computation system,
and likewise for an Internet server, multiuser host, or single-user
workstation.  Internet servers come in different sizes, too - the 486 NAT
and web proxy differs considerably from the dedicated mail/web/database
server.

What would really help me is if a number of people with boxen under each of
the above loads could send me a "snapshot" of their system, under normal
load, containing the following info:

- General usage pattern description, in plain English
- Physical and swap memory: total sizes and current utilisation, in MB
- System uptime in days
- Summary of processes running at that instant, including for each process:
    - Approximate UID range
    - SIZE (not RSS, I want total size)
    - CPU time (with separate user and system totals if possible)
    - run time

Generalisations would probably be helpful - I don't expect to receive a
list of 500 emacs and bash processes, but indications of the distribution
of the above values for sensible groupings of processes would be valuable.
Of course, if you group processes, include information on how many processes
you're grouping.  :)

For your security and protection, it would probably not be wise to indicate
the hostname or IP address(es) of the systems you profile in this manner.
You may, however, wish to invent codenames for the machines in case it
becomes necessary to refer to specific cases.  Profiles can be sent to me
at <[EMAIL PROTECTED]>, please include the string [SNAPSHOT] in the
subject for easy identification.

--
from: Jonathan "Chromatix" Morton

Re: [PATCH] OOM handling

2001-03-27 Thread Jonathan Morton


>> >> Of course, I realised that.  Actually, what the code does is take an
>> >> initial badness factor (the memory usage), then divide it using goodness
>> >> factors (some based on time, some purely arbitrary), both of which can be
>> >> considered dimensionless.  Also, at the end, the absolute value is not
>> >> considered - we simply look at the biggest one and kill it.  All
>> >> "denormalisation" does is scale all the values, it doesn't affect
>>which one
>> >> actually turns out biggest.
>> >
>> >So you should realize as well that the actual code implementing this
>> >all is by no means numerically stable...
>>
>> It probably isn't, no.  I'll take another look at it and do some dry runs
>> sometime, and see whether they come out as I expect.
>
>Well the output depends heavly on the actual memsize of the process,
>which IMHO isn't a good value for choosing killing candidates...
>Second there is the problem that it's not possible to wight
>the goodness values against each other. The unit
>remaining is Bit/sqr(seconds). Try to get a grasp on this.
>Please have a look at my patch. The function I'm using
>there is a simply wighted sum of two process parameters.

I just ran the following test case through my (Saturday) version of the code:

80MB Oracle process
1 hour CPU time
1 week uptime
UID = 50

The result was less than 1, which means Oracle (or virtually any other
process with an hour of CPU time and a week's uptime) would not get killed.
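
The arithmetic behind that test case can be reproduced in userspace.  This
is a model, not the kernel code: it assumes 4 KiB pages (so 80MB is 20480
pages), assumes the score starts as a page count, leaves out the nice and
root adjustments, and adds zero-divisor guards that are not part of the
Saturday patch.

```c
/* Floor of the square root, written to avoid overflow. */
static unsigned long isqrt(unsigned long x)
{
    unsigned long r = 0;

    while (r + 1 <= x / (r + 1))
        r++;
    return r;
}

/*
 * Model of the score: pages / sqrt(cpu_time) / run_time, halved for
 * UIDs below 100.  Times are passed in seconds and converted to the
 * 8-second and 1024-second units used by the code.
 */
static unsigned long badness_model(unsigned long pages, unsigned long cpu_secs,
                                   unsigned long run_secs, unsigned int uid)
{
    unsigned long cpu_time = cpu_secs / 8;
    unsigned long run_time = run_secs / 1024;
    unsigned long root = isqrt(cpu_time);
    unsigned long points = pages;

    if (root > 0)               /* guard added for this sketch */
        points /= root;
    if (run_time > 0)           /* guard added for this sketch */
        points /= run_time;
    if (uid < 100)
        points /= 2;
    return points;
}
```

For the case above, badness_model(20480, 3600, 604800, 50) works out as
20480 / 21 = 975, then 975 / 590 = 1, then 1 / 2 = 0 in integer arithmetic,
which matches "less than 1".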

You're perfectly right about the numerical stability argument, though.
Integers are notoriously granular, so maybe an increase in resolution is
justified.  There's also an issue where an almost-new process (with
run_time under 1024 seconds) would be given infinitely large badness - that
needs fixing.  Jiffy wrap is worth accounting for, too.  The comments
accompanying the code are completely wrong - cpu_time is in units of 8
seconds, and run_time is in units of 1024 seconds, NOT seconds and minutes
as described.
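
On the jiffy wrap point, the usual trick is to subtract in unsigned
arithmetic, which stays correct across a single wrap of the counter.  A
small sketch, assuming a 32-bit counter (the function name is hypothetical):

```c
#include <stdint.h>

/*
 * Elapsed ticks between start and now.  Unsigned subtraction is modular,
 * so the result is right even if the counter wrapped once in between.
 */
static uint32_t jiffies_elapsed(uint32_t now, uint32_t start)
{
    return now - start;
}
```

For example, a process started 5 ticks before the wrap and sampled 5 ticks
after it reports 10 elapsed ticks rather than a huge or negative value.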

HOWEVER, I just took a look at your patch from Sunday.  I have very serious
concerns about this, which I will try to explain below:

First, your code uses a hard and arbitrary priority level.  This is
arranged such that if the "bad process" (which I use as a euphemism to
indicate a runaway memory hog) is in any class other than "normal", all
"normal" processes MUST exit before the "bad process" will even be
considered.  As a test case:

Suppose you're running Sendmail as uid 25, which puts it in the "system"
class.  This is a multiuser system and there are a lot of interactive,
unprivileged users present.  You are also running RPC services as "service"
class, using UIDs between 100-500.  Now suppose that Sendmail springs a big
memory leak and swamps the available memory, causing OOM - Sendmail is now
the "bad process" I mentioned earlier.  The sysadmin isn't watching the
system closely enough to kill Sendmail manually, and in any case the system
is thrashing so hard he wouldn't be able to log in quickly.

With your code, all the interactive users would be systematically thrown
off the system (losing all their work - SIGKILL is not kind) and the RPC
services would be shut down.  Depending on the relative ages of Sendmail
and other system services, other essential system daemons may also be shut
down (since your code does not take memory usage into account).  Finally,
Sendmail itself is killed and the problem goes away.

In the same scenario, my version of the code would probably kill Sendmail
relatively early in the sequence, since it is the one hogging all the RAM.
A few of the larger interactive processes might get killed, depending on
relative ages.  The major flaw in my code is that a sufficiently long-lived
process becomes virtually immortal, even if it happens to spring a serious
leak after this time - the flaw in yours is that system processes have *too
high* priority relative to others, *right from the beginning*.  Both
problems need addressing before either of our algorithms can be considered
acceptable.
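
The difference between the two approaches in the Sendmail scenario can be
shown with a deliberately simplified model.  Neither function is either
patch: one ranks purely by class, the other purely by size, and the struct,
the class names and the page counts are invented for the example.

```c
#include <stddef.h>

enum proc_class { CLASS_NORMAL = 0, CLASS_SERVICE = 1, CLASS_SYSTEM = 2 };

struct proc {
    const char *name;
    enum proc_class cls;
    unsigned long pages;
};

/* Strict class ordering: the lowest class goes first, size is ignored. */
static const struct proc *pick_by_class(const struct proc *p, size_t n)
{
    const struct proc *victim = NULL;

    for (size_t i = 0; i < n; i++)
        if (victim == NULL || p[i].cls < victim->cls)
            victim = &p[i];
    return victim;
}

/* Size-driven selection: the biggest consumer goes first. */
static const struct proc *pick_by_size(const struct proc *p, size_t n)
{
    const struct proc *victim = NULL;

    for (size_t i = 0; i < n; i++)
        if (victim == NULL || p[i].pages > victim->pages)
            victim = &p[i];
    return victim;
}
```

Given a leaking "system" sendmail at 60000 pages, a "service" RPC daemon at
300 and a "normal" shell at 400, pick_by_class() chooses the shell while
pick_by_size() chooses sendmail.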

Oh and BTW, I think Bit/sqr(seconds) is a perfectly acceptable unit for
"badness".  Think about it - it increases with pigginess and decreases with
longevity.  I really don't see a problem with it per se.

I'm going to be travelling tomorrow, so I've moved my VM work onto my
PowerBook and will consider OOM-kill-selection algorithms and
memory-accounting while I fly.  See you on the other side of the ocean, and
hopefully the fresh Canadian air will help me think about this clearly.

--
from: Jonathan "Chromatix" Morton

Re: OOM killer???

2001-03-27 Thread Jonathan Morton


>Plase change to 100 to 500 - this would make it consistant with
>the useradd command, which starts adding new users at the UID 500

Depends on which distribution you're using.  In my experience, almost all
the really important stuff happens below 100.  In any case, the
OOM-kill-selection algorithm in this patch is *not* final.  See my
accompanying mail.

--
from: Jonathan "Chromatix" Morton

Re: OOM killer???

2001-03-27 Thread Jonathan Morton


>Out of Memory: Killed process 117 (sendmail).
>
>What we did to run it out of memory, I don't know. But I do know that
>it shouldn't be killing one process more than once... (the process
>should not exist after one try...)

This is a known bug in the Out-of-Memory handler, where it does not count the buffer 
and cache memory as "free" (it should), causing premature OOM killing.  It is, 
however, normal for the OOM killer to attempt to kill a process more than once - it 
takes a few scheduler cycles for the SIGKILL to actually reach the process and take 
effect.

Also, it probably shouldn't have killed Sendmail, since that is usually a 
long-running, low-UID (and important) process.  The OOM-kill selector is another thing 
that wants fixing, and my patch contains a *very rough* beginning to this.

The following patch should solve your problem for now, until a more detailed fix 
(which also clears up many other problems) is available in the stable kernel.

Alan and/or Linus may wish to apply this patch too...

(excerpt from my original patch from Saturday follows)

--- start ---
diff -u linux-2.4.1.orig/mm/oom_kill.c linux/mm/oom_kill.c
--- linux-2.4.1.orig/mm/oom_kill.c  Tue Nov 14 18:56:46 2000
+++ linux/mm/oom_kill.c Sat Mar 24 20:35:20 2001
@@ -76,7 +76,9 @@
run_time = (jiffies - p->start_time) >> (SHIFT_HZ + 10);

points /= int_sqrt(cpu_time);
-   points /= int_sqrt(int_sqrt(run_time));
+
+   /* Long-running processes are *very* important, so don't take the 4th root */
+   points /= run_time;

/*
 * Niced processes are most likely less important, so double
@@ -93,6 +95,10 @@
p->uid == 0 || p->euid == 0)
points /= 4;

+   /* Much the same goes for processes with low UIDs */
+   if(p->uid < 100 || p->euid < 100)
+ points /= 2;
+
/*
 * We don't want to kill a process with direct hardware access.
 * Not only could that mess up the hardware, but usually users
@@ -192,12 +198,20 @@
 int out_of_memory(void)
 {
struct sysinfo swp_info;
+   long free;

/* Enough free memory?  Not OOM. */
-   if (nr_free_pages() > freepages.min)
+   free = nr_free_pages();
+   if (free > freepages.min)
+   return 0;
+
+   if (free + nr_inactive_clean_pages() > freepages.low)
return 0;

-   if (nr_free_pages() + nr_inactive_clean_pages() > freepages.low)
+   /* Buffers and caches can be freed up (Jonathan "Chromatix" Morton) */
+   free += atomic_read(&buffermem_pages);
+   free += atomic_read(&page_cache_size);
+   if (free > freepages.low)
return 0;

/* Enough swap space left?  Not OOM. */
--- end ---
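
The out_of_memory() part of the patch can be modelled in userspace as
follows.  The struct and its field names are stand-ins for the kernel
counters (nr_free_pages(), nr_inactive_clean_pages(), the buffer and page
cache sizes, freepages.min and freepages.low); the swap check that follows
in the real function is left out.

```c
#include <stdbool.h>

/* Stand-ins for the kernel counters; all values are page counts. */
struct vm_counters {
    long free_pages;
    long inactive_clean;
    long buffer_pages;
    long cache_pages;
    long min;           /* freepages.min */
    long low;           /* freepages.low */
};

/* Returns true only if none of the "not OOM" tests in the patch passes. */
static bool memory_exhausted(const struct vm_counters *vm)
{
    long free = vm->free_pages;

    /* Enough free memory?  Not OOM. */
    if (free > vm->min)
        return false;

    if (free + vm->inactive_clean > vm->low)
        return false;

    /* Buffers and caches can be freed up, so count them too. */
    free += vm->buffer_pages;
    free += vm->cache_pages;
    if (free > vm->low)
        return false;

    return true;
}
```

With 10 free pages, min = 20 and low = 40, the old test would already be
heading for OOM; with 1000 pages of buffers and cache the new test returns
false instead.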

--
from: Jonathan "Chromatix" Morton

Re: URGENT : System hands on "Freeing unused kernel memory: "

2001-03-27 Thread Jonathan Morton


>> I have 2 ideas:
>> * glibc corrupted
>> * did you downgrade the cpu?
>
>These happen frequently to me (when compiling and installing a
>new glibc)
>But in this case you would have other messages (IIRC something
>like
>respawn too fast).
>Thus the problem is not this!

How about running memtest86 - could be that a RAM module blew up or worked
loose and caused the initial crash and this misbehaviour both at once.

--
from: Jonathan "Chromatix" Morton

Re: [PATCH] OOM handling

2001-03-27 Thread Jonathan Morton


>> >> Of course, I realised that.  Actually, what the code does is take an
>> >> initial badness factor (the memory usage), then divide it using goodness
>> >> factors (some based on time, some purely arbitrary), both of which can be
>> >> considered dimensionless.  Also, at the end, the absolute value is not
>> >> considered - we simply look at the biggest one and kill it.  All
>> >> "denormalisation" does is scale all the values, it doesn't affect
>>which one
>> >> actually turns out biggest.
>> >
>> >So you should realize as well that the actual code implementing this
>> >all is by no means numerically stable...
>>
>> It probably isn't, no.  I'll take another look at it and do some dry runs
>> sometime, and see whether they come out as I expect.
>
>Well the output depends heavly on the actual memsize of the process,
>which IMHO isn't a good value for choosing killing candidates...
>Second there is the problem that it's not possible to wight
>the goodness values against each other. The unit
>remaining is Bit/sqr(seconds). Try to get a grasp on this.
>Please have a look at my patch. The function I'm using
>there is a simply wighted sum of two process parameters.

I just ran the following test case through my (Saturday) version of the code:

80MB Oracle process
1 hour CPU time
1 week uptime
UID = 50

The result was less than 1, which means Oracle (or virtually any other
process with an hour of CPU time and a week's uptime) would not get killed.

You're perfectly right about the numerical stability argument, though.
Integers are notoriously granular, so maybe an increase in resolution is
justified.  There's also an issue where an almost-new process (with
run_time under 1024 seconds) would be given infinitely large badness - that
needs fixing.  Jiffy wrap is worth accounting for, too.  The comments
accompanying the code are completely wrong - cpu_time is in units of 8
seconds, and run_time is in units of 1024 seconds, NOT seconds and minutes
as described.

HOWEVER, I just took a look at your patch from Sunday.  I have very serious
concerns about this, which I will try to explain below:

First, your code uses a hard and arbitrary priority level.  This is
arranged such that if the "bad process" (which I use as a euphemism to
indicate a runaway memory hog) is in any class other than "normal", all
"normal" processes MUST exit before the "bad process" will even be
considered.  As a test case:

Suppose you're running Sendmail as uid 25, which puts it in the "system"
class.  This is a multiuser system and there are a lot of interactive,
unprivileged users present.  You are also running RPC services as "service"
class, using UIDs between 100-500.  Now suppose that Sendmail springs a big
memory leak and swamps the available memory, causing OOM - Sendmail is now
the "bad process" I mentioned earlier.  The sysadmin isn't watching the
system closely enough to kill Sendmail manually, and in any case the system
is thrashing so hard he wouldn't be able to log in quickly.

With your code, all the interactive users would be systematically thrown
off the system (losing all their work - SIGKILL is not kind) and the RPC
services would be shut down.  Depending on the relative ages of Sendmail
and other system services, other essential system daemons may also be shut
down (since your code does not take memory usage into account).  Finally,
Sendmail itself is killed and the problem goes away.

In the same scenario, my version of the code would probably kill Sendmail
relatively early in the sequence, since it is the one hogging all the RAM.
A few of the larger interactive processes might get killed, depending on
relative ages.  The major flaw in my code is that a sufficiently long-lived
process becomes virtually immortal, even if it happens to spring a serious
leak after this time - the flaw in yours is that system processes have *too
high* priority relative to others, *right from the beginning*.  Both
problems need addressing before either of our algorithms can be considered
acceptable.
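
The difference between the two approaches in the Sendmail scenario can be
sketched as follows.  This is an illustration only: the struct, the class
names and both selection functions are hypothetical, and the tie-break
inside a class (youngest first) is an assumption rather than something
taken from either patch.

```c
#include <stddef.h>

/* Hypothetical priority classes, lowest priority first. */
enum proc_class { CLASS_NORMAL = 0, CLASS_SERVICE = 1, CLASS_SYSTEM = 2 };

struct proc_info {
    const char *name;
    enum proc_class cls;
    unsigned long size_mb;  /* total size */
    unsigned long age_s;    /* seconds since start */
};

/* Strict class ordering: the lowest class always goes first and the
 * youngest process within that class is chosen; size is ignored. */
static const struct proc_info *pick_by_class(const struct proc_info *p,
                                             size_t n)
{
    const struct proc_info *victim = NULL;
    for (size_t i = 0; i < n; i++) {
        if (victim == NULL || p[i].cls < victim->cls ||
            (p[i].cls == victim->cls && p[i].age_s < victim->age_s))
            victim = &p[i];
    }
    return victim;
}

/* Size-driven selection: the biggest process is chosen, whatever its class. */
static const struct proc_info *pick_by_size(const struct proc_info *p,
                                            size_t n)
{
    const struct proc_info *victim = NULL;
    for (size_t i = 0; i < n; i++) {
        if (victim == NULL || p[i].size_mb > victim->size_mb)
            victim = &p[i];
    }
    return victim;
}
```

Given a leaking Sendmail in the "system" class next to small interactive
processes, pick_by_class() keeps returning interactive processes until none
are left, whereas pick_by_size() returns Sendmail on the first call.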

Oh and BTW, I think Bit/sqr(seconds) is a perfectly acceptable unit for
"badness".  Think about it - it increases with pigginess and decreases with
longevity.  I really don't see a problem with it per se.

I'm going to be travelling tomorrow, so I've moved my VM work onto my
PowerBook and will consider OOM-kill-selection algorithms and
memory-accounting while I fly.  See you on the other side of the ocean, and
hopefully the fresh Canadian air will help me think about this clearly.

--
from: Jonathan "Chromatix" Morton
mail: [EMAIL PROTECTED]  (not for attachments)
big-mail: [EMAIL PROTECTED]
uni-mail: [EMAIL PROTECTED]

The key to knowledge is not to rely on people to teach you it.

Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/

-BEGIN GEEK CODE BLOCK-
Version 3.12
GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)

RE: [PATCH] OOM handling

2001-03-27 Thread Jonathan Morton


 relative ages.  The major flaw in my code is that a sufficiently
 long-lived
 process becomes virtually immortal, even if it happens to spring a serious
 leak after this time - the flaw in yours is that system processes

I think this could easily be fixed if you'd 'chop off' the runtime at a
certain point:

if (runtime > something_big)
   runtime = something_big;

This would of course need some tuning. The only thing i don't like about
this is that it's a kind of 'magical value', but i suppose it's not a very
good idea to make this configurable, right?
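
A rough sketch of what the cap buys, assuming a badness of the form size /
cpu_time / run_time with cpu_time in units of 8 seconds and run_time in
units of 1024 seconds.  The function name, the cap parameter and the
divisor clamp are hypothetical, and the cap value in the note below is only
an example, not a tuned number.

```c
#include <stdint.h>

/* Clamp a divisor so that a zero never reaches the division. */
static uint32_t at_least_one(uint32_t v)
{
    return v ? v : 1;
}

/*
 * Hypothetical badness with a run-time cap.
 * cap_seconds == 0 means "no cap", i.e. the uncapped behaviour.
 */
static uint32_t capped_badness(uint32_t size, uint32_t cpu_seconds,
                               uint32_t run_seconds, uint32_t cap_seconds)
{
    /* "Chop off" the run time so old processes stop gaining protection. */
    if (cap_seconds != 0 && run_seconds > cap_seconds)
        run_seconds = cap_seconds;

    return size / at_least_one(cpu_seconds >> 3)
                / at_least_one(run_seconds >> 10);
}
```

For a process of 230400 size units with an hour of CPU time and a year of
run time, the uncapped score truncates to 0, so it can never be selected.
With a one-day cap (86400 seconds) the same process scores 6 and becomes
eligible again if it springs a leak.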

Configurable is good, but right now I'm considering alternative (but
reasonably similar) algorithms.  If I can come up with something that works
reasonably well under all the scenarios I can think up - which is quite a
range - then configurable options may not be necessary.  In any case, other
work I'm doing should make OOM a thing of the past on most systems, since
malloc() and other memory-reservation calls will normally fail before OOM
happens.

It might just happen that totally different algorithms apply best to
different usage patterns, and I can put in some logic to try and detect
these patterns as needed, selecting the most appropriate algorithm.  An
embedded system is very different from a large batch-computation system,
and likewise for an Internet server, multiuser host, or single-user
workstation.  Internet servers come in different sizes, too - the 486 NAT
and web proxy differs considerably from the dedicated mail/web/database
server.

What would really help me is if a number of people with boxen under each of
the above loads could send me a "snapshot" of their system, under normal
load, containing the following info:

- General usage pattern description, in plain English
- Physical and swap memory: total sizes and current utilisation, in MB
- System uptime in days
- Summary of processes running at that instant, including for each process:
    - Approximate UID range
    - SIZE (not RSS, I want total size)
    - CPU time (with separate user and system totals if possible)
    - run time

Generalisations would probably be helpful - I don't expect to receive a
list of 500 emacs and bash processes, but indications of the distribution
of the above values for sensible groupings of processes would be valuable.
Of course, if you group processes, include information on how many processes
you're grouping.  :)

For your security and protection, it would probably not be wise to indicate
the hostname or IP address(es) of the systems you profile in this manner.
You may, however, wish to invent codenames for the machines in case it
becomes necessary to refer to specific cases.  Profiles can be sent to me
at [EMAIL PROTECTED], please include the string [SNAPSHOT] in the
subject for easy identification.




Re: OOM killer???

2001-03-27 Thread Jonathan Morton


If we use my OOM killer API, this patch would be a module and
could have module parameters to select that.

Jonathan: I URGE you to apply my patch before adding OOM killer
   stuff. What's wrong with it, that you cannot use it? ;-)

It is easy to add configurables to a module and play with them
WITHOUT recompiling.

Thanks for reminding me - I'll look into it on the plane and see what I can
do with it.

e.g. My important matlab calculation, which runs in user mode,
should not be killed. But killing a local webserver, which serves
my help system, is ok (because I will not lose work, and might
get it over the net, if there is a problem).

So as Rik stated: The OOM killer cannot suit all people, so it
has to be configurable, to be OOM kill, not overkill ;-)

Yes, configurability is probably a very good idea.  However, it would be
best to include a good set of general parameters in the kernel itself, so
the set of average systems needs as little tweaking as possible.  One
cannot expect every sysadmin to be familiar with these arcane (and rarely
actually used) parameters, so being able to select "server", "batch",
"workstation", "embedded" and so on would help massively.




Re: Ideas for the oom problem

2001-03-27 Thread Jonathan Morton


I'm going to be gentle here and try to point out where your suggestions are
flawed...

a. don't kill any task with a uid < 100

Suppose your system daemon springs a leak?  It will have to be killed
eventually, however system daemons can sensibly be given a little "grace".
Also, the UIDs used by a system daemon vary from system to system.

b. if uid is between 100 and 500, or CAP-SYS equivalent is enabled,
   set it to a lower priority, so if it is at fault it will happen slower,
   giving more time before the system collapses

Not slowly enough.  When your system is thrashing, the CPU is the resource
under least pressure, so "nice" values and priorities have virtually zero
effect.  In any case, under OOM conditions the system has *already*
collapsed and we *have* to kill something for the system to keep running.

c.  if a task is nice'd then immediately put the task to sleep, and schedule
all code / data to be swapped out, or thrown away as appropriate. do not
reschedule the task to continue until memory is available

In OOM conditions there is no swap space left to do what you suggest.  This
is a sensible solution for when thrashing is the only problem...

d. kill any normal user interactive task that is started during a memory
crisis.

Define "memory crisis".  However, this is a relatively sensible solution.

allocate a pool of memory at system start up that is to be released to the
memory pool when the system is in a memory crisis. This will reduce system
swapping, and allow the system to stabilize slightly

One of my patches already tries to do this, in a way.  It doesn't yet
provide a hard barrier, but it does prevent applications from hogging the
entire memory on the system (at least, without expending some effort into
it).

report any task asking for a large pool of memory while the system is in
oom crisis. if uid > 500 and was started from an interactive shell it should
be killed.

See above.  malloc() fails, which tells the application there is no more
memory in the system.  A well-written application will respond to this and
use more memory-conservative techniques.  A poorly-written application will
segfault.  End Of Problem.  Now to make memory accounting work properly so
these tests are reliable...
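
As a minimal sketch of the "well-written application" case: check the
result of malloc() and retry with a smaller request instead of
dereferencing a NULL pointer.  The helper name and the halving strategy are
assumptions chosen for illustration; an application might equally switch
algorithms or spill to disk.

```c
#include <stdlib.h>

/*
 * Try to allocate `want` bytes.  If malloc() fails, halve the request
 * until it succeeds or would fall below `min` bytes.  The size actually
 * obtained is stored in *got (0 on failure).
 */
static void *alloc_with_fallback(size_t want, size_t min, size_t *got)
{
    if (min == 0)
        min = 1;  /* guarantees the loop terminates once n reaches 0 */

    for (size_t n = want; n >= min; n /= 2) {
        void *p = malloc(n);
        if (p != NULL) {
            *got = n;
            return p;
        }
    }

    *got = 0;
    return NULL;  /* caller must handle this, not segfault */
}
```

This only helps if malloc() really does return NULL when memory runs out,
which is exactly why the memory accounting has to be reliable first.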

when the crisis is ended, reacquire the memory pool for later usage.

It is never given up, except when it is needed by the kernel itself (eg. to
swap in pages or (in the absence of true memory accounting) to provide COW
space).

Prong 3: providing information about oom crisis to user land

create /proc/vm/oom_crisis; this would be a read-only file owned by root. it
would report if the system is in crisis and the uid of any process that is
asking for large amounts of ram while the system is in crisis.

This kind of information is already available using /proc - applications
just have to look in the right places.
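
For instance, an application could pull free memory and free swap figures
out of /proc/meminfo.  The sketch below assumes the "Key:   value kB" line
format and works on text already read into memory; the function name is
made up for the example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Find a line of the form "Key:   12345 kB" in meminfo-style text and
 * return the number, or -1 if the key is missing or malformed.
 */
static long meminfo_value_kb(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long value;
            if (sscanf(line + klen + 1, "%ld", &value) == 1)
                return value;
            return -1;
        }
        /* Advance to the character after the next newline, if any. */
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return -1;
}
```

A caller would read /proc/meminfo into a buffer with fopen() and fread(),
NUL-terminate it, and then ask for keys such as "MemFree" or "SwapFree".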

create a SIGDANGER handler that is sent out to all tasks that have
registered a handler when the kernel enters oom_kill, give these tasks a high
priority access to system resources.

This is a fairly good idea, why does it look so familiar?  :)  SIGDANGER
would be sent to all processes when memory availability goes below a
threshold, ie. when there is still enough memory left to handle the
situation.  The default handler would be a no-op, preserving compatibility.
However, the notion of "high priority access to resources" is not currently
feasible (or necessary).
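
From the application side, handling such a signal might look like the
sketch below.  Linux does not define SIGDANGER, so SIGUSR1 is used here
purely as a stand-in; the macro, the flag and the function names are all
hypothetical.

```c
#include <signal.h>

/* Stand-in only: Linux has no SIGDANGER, so borrow SIGUSR1 for the demo. */
#define SIGDANGER_STANDIN SIGUSR1

/* Set from the handler, polled by the main loop. */
static volatile sig_atomic_t low_memory_warning = 0;

/* Keep the handler trivial: just record that the warning arrived. */
static void on_danger(int signo)
{
    (void)signo;
    low_memory_warning = 1;
}

/* Register the handler; returns 0 on success, -1 on failure. */
static int install_danger_handler(void)
{
    return signal(SIGDANGER_STANDIN, on_danger) == SIG_ERR ? -1 : 0;
}
```

The main loop would check low_memory_warning at a convenient point and
then drop caches or flush state.  Note that the default action for SIGUSR1
is to terminate the process, whereas the proposed SIGDANGER would default
to a no-op, so the stand-in differs from the proposal in that respect.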

this would enable user land programs to deal with the situation without
continuously polling free ram/swap. They could email/page the sysadmin and
user about the crisis, add additional swap resources, kill any known
non-essential tasks, and probe the system for possible broken tasks, such as
netscape-common tasks not connected to a netscape client, at least i have been
known to find these when netscape crashes.

Interesting applications for this signal.  However, this is entirely a
userspace issue as to what to do with the signal - the kernel's job is to
provide it (if we decide to, that is).



