from:"Sean Hunter"

Re: [PATCH] User chroot

2001-06-28 Thread Sean Hunter


On Wed, Jun 27, 2001 at 04:55:56PM -0400, Albert D. Cahalan wrote:
> ln /dev/zero /tmp/zero
> ln /dev/hda ~/hda
> ln /dev/mem /var/tmp/README

None of these (of course) work if you use mount options to restrict device
nodes on those filesystems.
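
As a rough illustration of the point above (a sketch, not something from the patch under discussion): a filesystem mounted with the `nodev` option does not honour device nodes, so a hard link to /dev/mem placed there is harmless. The helper name `mounts_allowing_devices` and the sample mount table are hypothetical; the input is assumed to be in the /proc/mounts format.

```python
def mounts_allowing_devices(mounts_text):
    """Return mount points whose options do not include 'nodev'.

    mounts_text is assumed to be in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    exposed = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = fields[1], fields[3].split(",")
        if "nodev" not in options:
            exposed.append(mount_point)
    return exposed


# Hypothetical mount table, for illustration only.
SAMPLE = """\
/dev/hda1 / ext2 rw 0 0
/dev/hda5 /tmp ext2 rw,nosuid,nodev 0 0
/dev/hda6 /home ext2 rw,nodev 0 0
/dev/hda7 /var/tmp ext2 rw 0 0
"""

print(mounts_allowing_devices(SAMPLE))
```

On a live Linux box the same function could be fed the contents of /proc/mounts; any user-writable mount point it reports is one where the links quoted above would still work.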

Sean
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.2.x series and mm

2001-06-28 Thread Sean Hunter


On Wed, Jun 27, 2001 at 05:27:11PM +0100, Alan Cox wrote:
> > I'm fairly sure it is the file buffers, as apache is already
> > reniced to 20, it has at most 50 processes, and each process is
> > limited to about 1.5MB in size via ulimit.
> 
> nice won't help you; it controls scheduling priority. Similarly, a ulimit just
> ensures that no apache process goes mad and eats lots of memory (good idea
> but not helpful here). If your working set (and that's the bit that matters)
> really is exceeding memory by a fair bit then
> 
> a)Add more RAM - that is the real optimal approach
> b)Make the processes smaller (eg switch to thttpd from www.acme.com)
> c)Speed up the I/O throughput relative to CPU speed
>   - eg the 2.2 IDE UDMA patches

It may also be worth considering 

d)  Reduce the number of Apache processes so they fit nicely in RAM
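
A back-of-the-envelope sketch of option (d), using the figures from the quoted report (50 processes of about 1.5 MB each). The 64 MB budget and the helper name `max_processes` are assumptions made up for the example, and the estimate ignores pages shared between processes, so it errs on the cautious side.

```python
def max_processes(ram_for_apache_mb, per_process_mb):
    """Largest process count whose combined size stays within the RAM budget."""
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    return int(ram_for_apache_mb // per_process_mb)


# Figures from the quoted report: 50 processes capped at about 1.5 MB each.
total_mb = 50 * 1.5
print(total_mb)  # 75.0

# Hypothetical budget: 64 MB of RAM left over for Apache.
print(max_processes(64, 1.5))  # 42
```

The resulting number is the sort of value one would then put in Apache's MaxClients setting.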

Sean

Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Sean Hunter

On Wed, Jun 06, 2001 at 10:57:57AM +0100, Dr S.M. Huen wrote:
> On Wed, 6 Jun 2001, Sean Hunter wrote:
> 
> > 
> > For large memory boxes, this is ridiculous.  Should I have 8GB of swap?
> > 
> 
> Do I understand you correctly?
> ECC grade SDRAM for your 8GB server costs £335 per GB as 512MB sticks even
> at today's silly prices (Crucial). Ultra160 SCSI costs £8.93/GB as 73GB
> drives.
> 
> It will cost you 19x as much to put the RAM in as to put the
> developer's recommended amount of swap space to back up that RAM.  The
> developers gave their reasons for this design some time ago and if the
> ONLY problem was that it required you to allocate more swap, why should
> it be a priority item to fix it for those that refuse to do so?   By all
> means fix it urgently where it doesn't work when used as advised but
> demanding priority to fixing a problem encountered when a user refuses to
> use it in the manner specified seems very unreasonable.  If you can afford
> 4GB RAM, you certainly can afford 8GB swap.
> 

This is completely bogus. I am not saying that I can't afford the swap.
What I am saying is that it is completely broken to require this amount
of swap given the boundaries of efficient use. 

This is only one of several things which make the 2.4 VM suck for large,
small or medium machines at the moment. Until we have a working VM, 2.4
can't possibly go into production on my site on these machines.

A working VM would have several differences from what we have in my
opinion, among which are:
- It wouldn't require 8GB of swap on my large boxes
- It wouldn't suffer from the "bounce buffer" bug on my
  large boxes
- It wouldn't cause the disk drive on my laptop to be
  _constantly_ in use even when all I have done is spawned a
  shell session and have no large apps or daemons running.
- It wouldn't kill things saying it was OOM unless it was OOM.

Furthermore, I am not demanding anything, much less "priority fixing"
for this bug. It's my personal opinion that this is the most critical bug
in the 2.4 series, and if I had the time and skill, this is what I would
be working on. Because I don't have the time and skill, I am perfectly
happy to wait until those that do fix the problem. To say it isn't a
problem because I can buy more disk is nonsense, and it's that sort of
thinking that leads to the constant need to upgrade hardware in the
proprietary OS world.

Sean

Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Sean Hunter

On Wed, Jun 06, 2001 at 11:16:27AM +0200, Xavier Bestel wrote:
> On 06 Jun 2001 09:54:31 +0100, Sean Hunter wrote:
> > > This is what Linus recommended for 2.4 (swap = 2 * RAM), saying that
> > > anything less won't do any good: 2.4 overallocates swap even if it
> > > doesn't use it all. So in your case you just have enough swap to map
> > > your RAM, and nothing to really swap your apps.
> > > 
> > 
> > For large memory boxes, this is ridiculous.  Should I have 8GB of swap?
> 
> Life is tough. I guess if you have 4GB RAM, you'd be better having no
> swap at all. Or, yes, at least 8GB.
> Or just wait for this bug to be fixed. But be patient.

This is just pure bollocks.  Virtual memory is one of the killer features of
unix. It would be a strange admission to say that our "advanced" 2.4
kernel is so advanced that now you can't use virtual memory at all on
large machines. Needing 8GB of swap to prevent a box from committing
suicide when it has a working set of less than 512M is crazy.

I am waiting patiently for the bug to be fixed. However, it is a real
embarrassment that we can't run this "stable" kernel in production yet
because something as fundamental as this is so badly broken.

Sean

Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Sean Hunter


On Wed, Jun 06, 2001 at 10:19:30AM +0200, Xavier Bestel wrote:
> On 05 Jun 2001 23:19:08 -0400, Derek Glidden wrote:
> > On Wed, Jun 06, 2001 at 12:16:30PM +1000, Andrew Morton wrote:
> > > "Jeffrey W. Baker" wrote:
> > > > 
> > > > Because the 2.4 VM is so broken, and
> > > > because my machines are frequently deeply swapped,
> > > 
> > > The swapoff algorithms in 2.2 and 2.4 are basically identical.
> > > The problem *appears* worse in 2.4 because it uses lots
> > > more swap.
> > 
> > I disagree with the terminology you're using.  It *is* worse in 2.4,
> > period.  If it only *appears* worse, then if I encounter a situation
> > where a 2.2 box has utilized as much swap as a 2.4 box, I should see the
> > same results.  Yet this happens not to be the case. 
> 
> Did you try to put twice as much swap as you have RAM ? (e.g. add a 512M
> swapfile to your box)
> This is what Linus recommended for 2.4 (swap = 2 * RAM), saying that
> anything less won't do any good: 2.4 overallocates swap even if it
> doesn't use it all. So in your case you just have enough swap to map
> your RAM, and nothing to really swap your apps.
> 

For large memory boxes, this is ridiculous.  Should I have 8GB of swap?

Sean

Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Sean Hunter

On Tue, Jun 05, 2001 at 09:42:26PM -0400, Russell Leighton wrote:
> 
> I also need some 2.4 features and can't really go to 2.2.
> I would have to agree that the VM is too broken for production...looking
> forward to the work that (hopefully) will be in 2.4.6 to resolve these issues.
> 

Boring to do a "me too", but "me too".  We have four big production Oracle
servers that could use 2.4.  However, the test server we have put 2.4 on has
no end of ridiculous VM and OOM problems.

It seems bizarre that a 4GB machine with a working set _far_ lower than that
should be dying from OOM and swapping itself to death, but that's life in 2.4
land.

Sean

Re: Linux scalability?

2001-05-21 Thread Sean Hunter

Yup.  The problem is that you're trying to measure scalability in performance
of an i/o-bound task by comparing a machine with greater i/o resource but less
processing power with one with greater processing but poorer i/o.  Surprisingly
enough, the one with the best i/o wins.  This isn't really a fair comparison
between the two platforms.

If you put the same disk array on both machines and got the same results, then
you'd have a point.

My point was that in the real world having this configuration for a webserver
is unlikely to be sensible at all.

Sean

On Sat, May 19, 2001 at 10:31:01AM +0200, Sasi Peter wrote:
> On Fri, 18 May 2001, Sean Hunter wrote:
> 
> > Why would you want to run a web server with 8 processors rather than four
> > webservers with 2 each?
> 
> As you might already know, after the interviews with Mingo I assumed that a
> major portion of the achievements was enabled by the 2.4 scalability
> enhancements. That is why I wrote to LKML, to ask about the 2.4
> scalability, and whether anybody out there could tell us about the linux
> kernel's scalability, possibly compared to W2k scalability...
> 
> -- 
> SaPE - Peter, Sasi - mailto:[EMAIL PROTECTED] - http://sape.iq.rulez.org/

Re: Linux scalability?

2001-05-18 Thread Sean Hunter


Why would you want to run a web server with 8 processors rather than four
webservers with 2 each?

Sean

On Fri, May 18, 2001 at 09:24:48AM +0200, Sasi Peter wrote:
> Hi!
> 
> I am just writing an essay, and have mentioned TUX as a performance and
> scalability linearity record holder, referencing the specweb99
> website summary page:
> 
> http://www.spec.org/osg/web99/results/web99.html
> 
> However, taking a closer look, it turns out, that the above statement
> holds true only for 1 and 2 processor machines. Scalability already
> suffers at 4 processors, and at 8 processors, TUX 2.0 (7500) gets beaten
> by IIS 5.0 (8001), and these were measured on the same kind of box!
> 
> How come TUX is so good at the low end (1 and 2 CPUs), and scales this
> badly?
> 
> -- 
> SaPE - Peter, Sasi - mailto:[EMAIL PROTECTED] - http://sape.iq.rulez.org/

Re: just-in-time debugging?

2001-05-01 Thread Sean Hunter


My approach is something like the others.  I developed a small wrapper to catch
unaligned traps on alpha.  What it does is run a program in gdb with some
specified arguments (it also sets up so that the process gets a SIGBUS when it
does an unaligned access, but that's probably not relevant here).

In any case, it's available by anonymous ftp at ftp://uncarved.com/unaligned.c
if you're interested...
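
For anyone who only wants the general shape of such a wrapper, here is a minimal sketch. It is not the contents of unaligned.c; it just builds a command line that starts a program under gdb and runs it immediately, assuming a gdb that accepts `-ex` and `--args`. The program name and arguments are hypothetical.

```python
import subprocess


def gdb_command(program, args):
    """Build an argv that starts `program` under gdb and runs it at once.

    gdb stops and gives a prompt when the program receives a signal such as
    SIGTRAP, SIGSEGV or SIGBUS, so execution can be inspected and resumed.
    """
    return ["gdb", "-ex", "run", "--args", program, *args]


def run_under_gdb(program, args):
    """Launch the debugger interactively and return its exit status."""
    return subprocess.call(gdb_command(program, args))


# Hypothetical program and arguments, for illustration only.
print(gdb_command("./myapp", ["--verbose", "input.dat"]))
```

With this arrangement the application can raise SIGTRAP on itself when an assertion fails and gdb will stop at that point, which is close to what the original question asks for.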

Sean

On Sat, Apr 28, 2001 at 09:17:10PM +0100, Tony Hoyle wrote:
> Is there a way (kernel or userspace... doesn't matter) that gdb/ddd
> could be invoked when a program is about
> to dump core, or perhaps on a certain signal (that the app could deliver
> to itself when required).  The latter case
> is what I need right now, as I have to debug an app that breaks
> seemingly randomly & I need to halt when
> certain assertions fail.  Core dumps aren't much use as you can't resume
> them, otherwise I'd just force a segfault
> or something.
> 
> I had a look at the do_coredump stuff and it looks like it could be
> altered to call gdb in the same way that
> modprobe gets called by kmod... however I don't sufficiently know the
> code to work out whether it'd work properly
> or not.  
> 
> A patch to glibc would perhaps be better, but I know that code even
> less!
> 
> Something like responding to SIGTRAP would probably be ideal.
> 
> Tony
> 
> -- 
> 
> "Two weeks before due date, the programmers work 22 hour days cobbling an
>  application from... (apparently) one programmer bashing his face into the
>  keyboard." -- Dilbert
> 
> [EMAIL PROTECTED] http://www.nothing-on.tv

Re: linux and high volume web sites

2001-05-01 Thread Sean Hunter


Also make sure you aren't suffering database lock contention from MySQL.  This
causes very fast context switching on the database server, which is then
typically unable to do useful work even though its load average is not high.
"vmstat" is useful here.
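
The figure to watch is the `cs` (context switches per second) column in vmstat output. The same counter is exposed as the `ctxt` line in /proc/stat, so a rough sketch of the measurement looks like this; the function names and the sample counts are made up for the example.

```python
import time


def context_switches(stat_text):
    """Extract the cumulative context-switch count from /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no ctxt line found")


def switches_per_second(before, after, seconds):
    """Average context switches per second between two cumulative counts."""
    return (after - before) / seconds


def sample_rate(interval=1.0, path="/proc/stat"):
    """Measure the live rate on a Linux box (not called in this sketch)."""
    with open(path) as f:
        first = context_switches(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = context_switches(f.read())
    return switches_per_second(first, second, interval)


# Hypothetical samples taken five seconds apart.
print(switches_per_second(context_switches("ctxt 1000000"),
                          context_switches("ctxt 1250000"), 5))
```

A rate that jumps sharply while throughput drops and the load average stays modest would be consistent with the lock contention described above.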

Sean

On Sat, Apr 28, 2001 at 01:55:01PM -0700, Tim Moore wrote:
> David Lang wrote:
> > 
> > watch the resonate heartbeat and see if it is getting lost in the network
> > traffic (the resonate logs will show missing heartbeat packets). think
> > seriously of setting the resonate stuff to run at a higher priority so
> > that it doesn't get behind.
> > 
> > depending on how high your network traffic is, seriously look at putting in
> > a second nic and switch to move the NFS traffic off the network that has
> > the internet traffic and heartbeat.
> > 
> > I had the same problem with central dispatch a couple years ago when first
> > implementing it. the exact details of the problem that I ran into should
> > have been fixed by now (mostly having to do with large number of virtual
> > IP addresses) but the symptoms were the same.
> 
> In addition to the above make sure there's enough bandwidth to the filer
> (eg- good switches, multiple ethernets).
> 
> Consider moving to 2.2.19.  Significant VM changes after 2.2.19pre3 which
> could account for the freezes.
> 
> rgds,
> tim.
> 
> > > I have a high volume web site under linux :
> > > kernel is 2.2.17
> > > hardware is 5 bi-PIII 700Mhz / 512Mb, eepro100
> > > all server are diskless (nfs on an netapp filer) except for tmp and swap
> > >
> > > dispatch is done by the Resonate product
> > >
> > > web server is apache+php (something like 400 processes), database
> > > backend is a mysql on the same hardware
> > >
> > > in high volume from time to time machines are "freezing" then after a
> > > few seconds they "reappear" and response timne is
> > >
> > >
> > > how can I investigate all these problems ?

Re: [PATCH] Single user linux

2001-04-24 Thread Sean Hunter


On Tue, Apr 24, 2001 at 07:44:17PM +0700, [EMAIL PROTECTED] wrote:
> with multi-user concept, conceptually there should be an
> administrator to create account, grant permission, etc.
> no my sister doesn't want that. i bet there are billions of
> people not willing to learn how to use a computer, they just
> want to use it.

So they buy Macs.  <- This is not a joke or a criticism.  My wife is a happy
and contented ignorant mac user.  

[snippage]

> so what the hell is transmeta doing with mobile linux (midori).
> is it going to teach multi-user thing to tablet owners?
> surely mortals expect midori to behave like their pc. lets say
> on redhat, they have to login as root to access their files,
> they don't even know what a root is!
> 
> lets break unix mind for a while, and give everyone a chance
> to use linux.
> 

If you wanted to do this, the correct place would be to alter your pam config,
but then again, if you knew the slightest thing about unix, you'd know that.

Sean

Re: [PATCH] pedantic code cleanup - am I wasting my time with this?

2001-04-23 Thread Sean Hunter


On Mon, Apr 23, 2001 at 05:26:27PM +0200, Jesper Juhl wrote:
> All the above does is to remove the last comma from 3 enumeration lists. 
> I know that gcc has no problem with that, but to be strictly correct the 
> last entry should not have a trailing comma.
> 

Sadly not.  This isn't a gcc thing: ANSI says that trailing comma is ok (K&R
Second edition, A8.7 - pg 218 &219 in my copy)
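
For reference, here is a minimal sketch of the two places a trailing comma can turn up. As far as I can tell, A8.7 in K&R2 is the section on initializers, where the trailing comma is explicitly allowed; the enumerator-list form in the patch is accepted by gcc and was written into the standard with C99. The names below (`enum colour`, `primes`, `count_primes`) are made up for illustration and have nothing to do with the patch.

```c
#include <stddef.h>

/* Trailing comma after the last enumerator: accepted by gcc,
 * and standard from C99 onwards. */
enum colour {
    RED,
    GREEN,
    BLUE,
};

/* Trailing comma after the last initializer: allowed by ANSI C. */
static const int primes[] = { 2, 3, 5, 7, };

/* The trailing comma does not add an element. */
int count_primes(void)
{
    return (int)(sizeof primes / sizeof primes[0]);
}
```

Building this with `gcc -pedantic` under an older standard setting should show which of the two forms that standard objects to.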

Sean

Re: OOM killer???

2001-03-29 Thread Sean Hunter

On Thu, Mar 29, 2001 at 01:01:54PM +0200, Guest section DW wrote:
> [Never use planes where the company's engineers spend their
> time designing algorithms for selecting which passenger
> must be thrown out when the plane is overloaded.]

This is (as far as I can see) a fantastically specious argument.  A plane is
designed to function in an entirely constrained mode of operation and in an
entirely well-understood and circumscribed problem space, whereas a linux host
is a general-purpose device which can be used for many different applications.

The reason the aero engineers don't need to select a passenger to throw out
when the plane is overloaded is simply that the plane operators do not allow
the plane to become overloaded.  If I put 100 people in a trilander it may
take off, but won't fly, and will probably crash.  The plane's designers don't
have to do anything about that; we do.

Furthermore, why do you suppose an aeroplane has more than one altimeter,
artificial horizon and compass?  Do you think it's because they are unable to
make one of each that is reliable?  Or do you think it's because they are
concerned about what happens if one fails _however unlikely that is_?

In fact, aeroplane engineers do design in ways of mitigating the effects of all
kinds of failures, including lessening the impact of a crash (directly
analogous to our OOM killer).  For example, they provide means of jettisoning
fuel prior to crash landing to attempt to minimise explosions.

Risk management is about lessening impact as well as lessening probability.  If
something is important, you don't only make it work as well as you can, you
mitigate the effect of failure.  A reliable system is not just a strong belt,
it is belt, braces, suspenders and bicycle clips.

I have seen the OOM killer in operation three times on our production servers.
In each case it kept the machine alive in the face of hostile runaway
processes.  I don't want to see things killed, but if that is the only way to
keep the host alive, I vote to keep it alive.

When I'm on a plane, I want more than one engine _and_ lifejackets.

Sean

Re: Disturbing news..

2001-03-28 Thread Sean Hunter


On Wed, Mar 28, 2001 at 06:08:15AM -0600, Jesse Pollard wrote:
> Sure - very simple. If the execute bit is set on a file, don't allow
> ANY write to the file. This does modify the permission bits slightly
> but I don't think it is an unreasonable thing to have.
> 

Are we not then in the somewhat zen-like state of having an "rm" which can't
"rm" itself without needing to be made non-executable so that it can't execute?

Sean

Re: binfmt_script and ^M

2001-03-06 Thread Sean Hunter



I propose
/proc/sys/kernel/im_too_lame_to_learn_how_to_use_the_most_basic_of_unix_tools_so_i_want_the_kernel_to_be_filled_with_crap_to_disguise_my_ineptitude

Any support?

Sean

On Tue, Mar 06, 2001 at 02:45:51PM -, Laramie Leavitt wrote:
> > Andreas Schwab wrote:
> > > Paul Flinders <[EMAIL PROTECTED]> writes:
> > > |> Andreas Schwab wrote:
> > > |>
> > > |> > This [isspace('\r') == 1] has no significance here.  The
> > right thing to
> > > |>
> > > |> > look at is $IFS, which does not contain \r by default.
> > The shell only splits
> > > |>
> > > |> > words by "IFS whitespace", and the kernel should be
> > consistent with it:
> > > |> >
> > > |> > $ echo -e 'ls foo\r' | sh
> > > |> > ls: foo: No such file or directory
> > > |>
> > > |> The problem with that argument is that #! can be applied
> > > |> to more than just shells which understand $IFS, so which environment
> > > |> variable does the kernel pick?
> > >
> > > The kernel should use the same default value of IFS as the Bourne shell,
> > > ie. the same value you'll get with /bin/sh -c 'echo "$IFS"'.  This is
> > > independent of any settings in the environment.
> > >
> > > |> It's a difficult one - logically white space should
> > terminate the interpreter
> > >
> > > No, IFS-whitespace delimits arguments in the Bourne shell.
> >
> > Way back whenever processing #! was moved from the
> > shell to the kernel** this argument would have made sense -
> > today I'm not so sure.
> >
> > But I'm quite happy for the kernel to use just space and
> > tab if it wishes, or anything else for that matter but it _is_
> > confusing that the error code doesn't distinguish problems
> > with the script from problems with the interpreter.
> >
> > **Did linux ever rely on the shell for this?
> 
> Maybe the correct answer would be to create a proc entry for this.
> That would allow the user to decide what is whitespace on his machine,
> since nobody here appears to agree.
>
> User:  hmm... Wonder what happens if I do the following
>%cat '$#! \n\t\r' > /proc/whitespace
> later, % config.sh : Error file not found.
> Oops, bug report... ;-)
> 
> Laramie
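
To make the disagreement in the quoted thread concrete, here is a small userspace sketch of a "#!" parser that treats only space and tab as delimiters, which is one of the options discussed above. It is not the kernel's binfmt_script code; `shebang_interp` is a hypothetical helper written for this illustration. With a DOS-style line ending, the `\r` stays glued to the interpreter name, which is why the exec then fails with "file not found".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the interpreter path from a "#!" line
 * into out.  Only ' ' and '\t' separate words; '\n' ends the line.
 * Returns 0 on success, -1 if the line is not a script header or
 * the name does not fit in out. */
int shebang_interp(const char *line, char *out, size_t outlen)
{
    size_t n;

    if (strncmp(line, "#!", 2) != 0)
        return -1;
    line += 2;
    line += strspn(line, " \t");     /* skip blanks after "#!" */
    n = strcspn(line, " \t\n");      /* note: '\r' is NOT a delimiter */
    if (n == 0 || n >= outlen)
        return -1;
    memcpy(out, line, n);
    out[n] = '\0';
    return 0;
}
```

Adding `\r` to the `strcspn` set is all it would take to accept such scripts in this sketch; whether the kernel should do so is the question the thread is arguing about.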

Re: 2.2 -> 2.4: /proc/net/tcp 10x slower ?

2001-02-26 Thread Sean Hunter


The identd wot I wrote is still fast as anything on 2.4 :)

As you can see from this teeny sample of my ident log, I take just a little
over 1/100th of a second to respond (on average). :)

2001-02-25 16:18:35.714731500 Q [194.75.152.225] - [32907, 25]
2001-02-25 16:18:35.726085500 A [194.75.152.225] - 
[9a0c62e79c0df893bb96dd74/3a99305b/b0164] for [32907, 25] UID [506]
2001-02-26 09:41:02.535514500 Q [195.92.249.252] - [33363, 21]
2001-02-26 09:41:02.548884500 A [195.92.249.252] - 
[8c0babd7b8ab6830b7092839/3a9a24ae/8454c] for [33363, 21] UID [500]

By the way, the intention of my ident server was not to be fast, but just to be
a little simpler and less over-engineered than pidentd, and not to give out any
site-specific information (uid's etc).  The speed was a bonus.

Sean


On Mon, Feb 26, 2001 at 03:12:01PM +0100, Sven Rudolph wrote:
> Usually identd's on Linux parse /proc/net/tcp.
> 
> When migrating from Linux 2.2.17 to 2.4.2 identd became much slower.
> 
> I traced it back to the point where /proc/net/tcp is read.
> 
> On the same slightly loaded system:
> 
> 2.2.17 $ time cat /proc/net/tcp >/dev/null
> real    0m0.004s
> user    0m0.000s
> sys     0m0.010s
>
> (Or sometimes 0.000s due to granularity)
>
> 2.4.2 $ time cat /proc/net/tcp >/dev/null
> real    0m0.083s
> user    0m0.000s
> sys     0m0.080s
> 
> 
> Is this expected? Or is there a more efficient interface that identd
> should use?
> 
>   Sven
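
For readers who have not looked at what an identd does with /proc/net/tcp, here is a rough sketch of parsing one data line to recover the two ports and the owning uid. It assumes the usual column order (sl, local_address, rem_address, st, tx_queue:rx_queue, tr:tm->when, retrnsmt, uid, ...). `parse_tcp_line` is a hypothetical helper and is not taken from pidentd or from the ident server mentioned above.

```c
#include <stdio.h>

/* Hypothetical helper: pull the local port, remote port and uid out
 * of one data line of /proc/net/tcp.  Addresses and the other
 * columns are skipped.  Returns 0 on success, -1 if the line does
 * not look like a data line (for example the header line). */
int parse_tcp_line(const char *line, unsigned *lport, unsigned *rport,
                   unsigned *uid)
{
    int n = sscanf(line,
                   "%*d: %*8[0-9A-Fa-f]:%x %*8[0-9A-Fa-f]:%x "
                   "%*x %*x:%*x %*x:%*x %*x %u",
                   lport, rport, uid);

    return n == 3 ? 0 : -1;
}
```

A daemon would call this for each line until the port pair from the ident query matches. That linear scan is cheap; the cost being measured in the quoted message is the kernel generating the file in the first place.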

Re: random PID generation

2001-02-23 Thread Sean Hunter


I have already written a 2.2 implementation which does not suffer from these
problems.  It was rejected because Alan Cox (and others) felt it only provided
security through obscurity.

Sean

On Fri, Feb 23, 2001 at 11:40:37PM +0800, Matt Johnston wrote:
> OpenBSD has a working implementation, might be worth looking at???
> 
> Cheers,
> Matt Johnston.
> 
> On Fri, 23 Feb 2001 23:34, Heusden, Folkert van wrote:
> > >> My code runs through the whole task_list to see if a chosen pid is
> > >> already in use or not.
> > >
> > > But it doesn't check for a recently used PID. Lets say your system is
> > > exhausting 1000 PIDs/second, and that there is a window of 20ms between
> > > you determining which PID to send to, and the recipient process
> > > receiving it.
> >
> > Ah, I get your point. Good point :o)
> >
> > I was thinking: I could split the PIDs up in 2...16383 and 16384-32767 and
> > then switch between them when a process ends? nah, that doesn't help it.
> > hmmm.
> > I think random increments (instead of last_pid+1) would be the best thing
> >

Re: Alpha: bad unaligned access handling

2001-02-14 Thread Sean Hunter

On Wed, Feb 14, 2001 at 03:38:33PM -0200, Carlos Carvalho wrote:
> Sean Hunter ([EMAIL PROTECTED]) wrote on 14 February 2001 17:26:
>  >This is an application problem, not a kernel one.  You need to upgrade your
>  >netkit.
> 
> Yes, I was quite confident of this. However, unaligned traps are a
> frequent problem with alphas. For a looong time we had zsh produce
> lots of it, to the point of making it unusable. Strangely, the problem
> disappeared without changing anything in zsh. It was either a library
> or kernel problem.

Definitely library, I'd think.

> 
>  >P.S. I wrote a small wrapper to aid in the debugging of unaligned
>  >traps, which I'll send to anyone who's interested.
> 
> I'd like it!
> 

OK, my alpha is a sick bunny at the moment, so I'll have to wait until I get
home (so I can see why I can't ssh to it).  What the wrapper does is set some
settings so your program gets SIGBUS when it generates an unaligned trap, and
then runs your program in gdb so gdb helpfully stops at the line which
generated the trap.  It goes without saying you need to build the program in
question with debugging symbols so that you see the code.

You then need to fix the unaligned access.  This sometimes requires real alpha
guruhood (which I do not possess, but Richard Henderson or Michal Jaegermann do,
if you need advice), but sometimes simply requires adding __attribute__
((__unaligned__)) to a struct member in a c file.
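
A different and more portable fix than the attribute route described above is to copy the bytes through memcpy rather than dereferencing a misaligned pointer. The sketch below shows that alternative; `load_u32_unaligned` and `store_u32_unaligned` are names invented for this example.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address that may not be 4-byte
 * aligned.  memcpy has no alignment requirement, so this avoids
 * the trap that *(uint32_t *)p could raise on Alpha. */
uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof v);
    return v;
}

/* Write a 32-bit value to a possibly misaligned address. */
void store_u32_unaligned(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

This is the sort of change that suits code picking fields out of a byte buffer, such as a received packet, at arbitrary offsets.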

Sean

Re: Alpha: bad unaligned access handling

2001-02-14 Thread Sean Hunter


On Wed, Feb 14, 2001 at 03:11:17PM -0200, Carlos Carvalho wrote:
> Jan-Benedict Glaw ([EMAIL PROTECTED]) wrote on 14 February 2001 15:48:
>  >With my currently installed ping (netkit-ping 0.10-6 from Debian Woody)
>  >I get unaligned accesses:
>  >
>  >ping(15953): unaligned trap at 0001200030e4: 000120026b34 29 1
>  >ping(15953): unaligned trap at 000120003110: 000120026b2c 29 2
>  >
>  >The worse part is: they seem to be handled The Wrong Way:
>  >
>  >[jbglaw@air:/home/jbglaw] $> ping -c 1 localhost
>  >PING localhost (127.0.0.1): 56 data bytes
>  >64 bytes from 127.0.0.1: icmp_seq=0 ttl=255 time=13.8 ms
>  >wrong data byte #8 should be 0x8 but was 0xdc
>  >c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 25 26 27 28 29 2a 2b
>  >2c 2d 2e 2f 0 0 0 0 0 0 0 0 0 0 0 0
>  >
>  >--- localhost ping statistics ---
>  >1 packets transmitted, 1 packets received, 0% packet loss
>  >round-trip min/avg/max = 13.8/13.8/13.8 ms
>  >
>  >
>  >This is on a NoName Alpha box, running 2.4.0-test8-pre1 (with very good
>  >uptimes), but I think 2.4.2-pre2 would do the same (wrong) things as
>  >arch/alpha/kernel/traps.c wasn't really changed since ages...
> 
> I also get these, with 2.2.18pre5 (plus some Andrea patches) and
> vanilla 2.2.19pre10 on a SMP UP2000.

This is an application problem, not a kernel one.  You need to upgrade your
netkit.

Sean

P.S.  I wrote a small wrapper to aid in the debugging of unaligned traps, which
I'll send to anyone who's interested.

Re: PCI-PCI bridges mess in 2.4.x

2000-11-10 Thread Sean Hunter


On Thu, Nov 09, 2000 at 04:31:24PM -0700, Michal Jaegermann wrote:
> On Thu, Nov 09, 2000 at 11:33:47AM -0500, Wakko Warner wrote:
> > > It was posted to lkml, so no link (except if you want to dig through
> > > lkml mail archives).
> > 
> > It booted but then it oops'ed before userland I belive.  I tried it this
> > morning and didn't have much time.  It did find the scsi controller (which
> > is across the bridge) and the drives attached so it does appear to be
> > working.
> 
> Looks so far that I am the worst off.  If I am trying to boot with
> a root on a SCSI device then either a controller is misdetected,
> or goes into an infinite "abort/reset" loop, or it does not initialize
> properly and disks are not found.  This is a non-exclusive, logical,
> "or". :-)

<metoo>Me too!</metoo>

Exact same symptoms on my ruffian.

Sean

Re: PCI-PCI bridges mess in 2.4.x

2000-11-08 Thread Sean Hunter


Hi Richard.

I'm _very_ keen to try this (my Alpha won't boot 2.4 at the mo), however I
think the attachments faery has been playing tricks again.

Do you have a patch relative to 2.4.0-test10?

Sean

On Wed, Nov 08, 2000 at 01:39:31AM -0800, Richard Henderson wrote:
> [ For l-k, the issue is that pci-pci bridges and the devices behind
>   them are not initialized properly.  There are a number of Alphas
>   whose built-in scsi controllers are behind such a bridge preventing
>   these machines from booting at all.  Ivan provided an initial 
>   patch to solve this issue.  ]
> 
> I've not gotten a chance to try this on the rawhide yet,
> but I did give it a whirl on my up1000, which does have
> an agp bridge that acts like a pci bridge.
> 
> Notable changes from your patch:
> 
>   * Use kmalloc, not vmalloc.  (ouch!)
>   * Replace cropped found_vga detection code.
>   * Handle bridges with empty I/O (or MEM) ranges.
>   * Collect the proper width of the bus range.
> 
> 
> r~

Content-Description: diff vs bridges-2.4.0t10
> diff -rup linux/drivers/pci/setup-bus.c 2.4.0-11-1/drivers/pci/setup-bus.c
> --- linux/drivers/pci/setup-bus.c Wed Nov  8 01:24:16 2000
> +++ 2.4.0-11-1/drivers/pci/setup-bus.c Wed Nov  8 01:04:17 2000
> @@ -20,7 +20,7 @@
>  #include <linux/errno.h>
>  #include <linux/ioport.h>
>  #include <linux/cache.h>
> -#include <linux/vmalloc.h>
> +#include <linux/slab.h>
>  
>  
>  #define DEBUG_CONFIG 1
> @@ -56,31 +56,50 @@ pbus_assign_resources_sorted(struct pci_
>   mem_reserved += 32*1024*1024;
>   continue;
>   }
> +
> + if (dev->class >> 8 == PCI_CLASS_DISPLAY_VGA)
> + found_vga = 1;
> +
>   pdev_sort_resources(dev, &head_io, IORESOURCE_IO);
>   pdev_sort_resources(dev, &head_mem, IORESOURCE_MEM);
>   }
> +
>   for (list = head_io.next; list;) {
>   res = list->res;
>   idx = res - &list->dev->resource[0];
> - if (pci_assign_resource(list->dev, idx) == 0)
> + if (pci_assign_resource(list->dev, idx) == 0
> + && ranges->io_end < res->end)
>   ranges->io_end = res->end;
>   tmp = list;
>   list = list->next;
> - vfree(tmp);
> + kfree(tmp);
>   }
>   for (list = head_mem.next; list;) {
>   res = list->res;
>   idx = res - &list->dev->resource[0];
> - if (pci_assign_resource(list->dev, idx) == 0)
> + if (pci_assign_resource(list->dev, idx) == 0
> + && ranges->mem_end < res->end)
>   ranges->mem_end = res->end;
>   tmp = list;
>   list = list->next;
> - vfree(tmp);
> + kfree(tmp);
>   }
> +
>   ranges->io_end += io_reserved;
>   ranges->mem_end += mem_reserved;
> +
> + /* ??? How to turn off a bus from responding to, say, I/O at
> +all if there are no I/O ports behind the bus?  Turning off
> +PCI_COMMAND_IO doesn't seem to do the job.  So we must
> +allow for at least one unit.  */
> + if (ranges->io_end == ranges->io_start)
> + ranges->io_end += 1;
> + if (ranges->mem_end == ranges->mem_start)
> + ranges->mem_end += 1;
> +
>   ranges->io_end = ROUND_UP(ranges->io_end, 4*1024);
>   ranges->mem_end = ROUND_UP(ranges->mem_end, 1024*1024);
> +
>   return found_vga;
>  }
>  
> diff -rup linux/drivers/pci/setup-res.c 2.4.0-11-1/drivers/pci/setup-res.c
> --- linux/drivers/pci/setup-res.c Wed Nov  8 01:24:16 2000
> +++ 2.4.0-11-1/drivers/pci/setup-res.c Wed Nov  8 00:21:13 2000
> @@ -22,10 +22,10 @@
>  #include <linux/errno.h>
>  #include <linux/ioport.h>
>  #include <linux/cache.h>
> -#include <linux/vmalloc.h>
> +#include <linux/slab.h>
>  
>  
> -#define DEBUG_CONFIG 0
> +#define DEBUG_CONFIG 1
>  #if DEBUG_CONFIG
>  # define DBGC(args) printk args
>  #else
> @@ -146,7 +146,7 @@ pdev_sort_resources(struct pci_dev *dev,
>   if (ln)
>   size = ln->res->end - ln->res->start;
>   if (r->end - r->start > size) {
> - tmp = vmalloc(sizeof(*tmp));
> + tmp = kmalloc(sizeof(*tmp), GFP_KERNEL);
>   tmp->next = ln;
>   tmp->res = r;
>   tmp->dev = dev;
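
The empty-range handling in the setup-bus.c hunk above can be tried out in user space.  What follows is only a sketch of the arithmetic, not the kernel code: the ROUND_UP definition is an assumption (the usual power-of-two round-up; the patch does not show it), and close_window() is a hypothetical helper name.

```c
#include <assert.h>

/* Assumed definition (not shown in the patch): round x up to a multiple
 * of a, where a must be a power of two. */
#define ROUND_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Hypothetical helper mirroring the patch's logic for one window:
 * if nothing was assigned behind the bridge (end == start), allow for
 * at least one unit, then round the end up to the window granularity
 * (4*1024 for I/O, 1024*1024 for memory in the patch). */
static unsigned long close_window(unsigned long start, unsigned long end,
                                  unsigned long align)
{
    assert(align != 0 && (align & (align - 1)) == 0); /* power of two */

    if (end == start)
        end += 1;
    return ROUND_UP(end, align);
}
```

Without the "+= 1" step, an empty range that already starts on an aligned address would round to itself and leave a zero-sized window; with it, the window always spans at least one aligned unit.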



Re: Loadavg calculation

2000-11-06 Thread Sean Hunter


Sorry, I know this is a little left-field, but how about redesigning your
process so that instead of using a load_avg, you start all your calculations
from a single server on each node?  It could queue up incoming calculations,
and fork a child to do each one.

Of course, it would catch a signal when the child died, so you'd immediately
know when to start up another calculation.  If you liked, it could check the
one-minute load avg from time to time to see what would be a friendly level of
calculations overall, and adjust the number of concurrent child processes
accordingly.

The timing, however, would still come from a signal, and would thus be
instantaneous.
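
A minimal sketch of that idea in C, assuming POSIX signals.  The names dispatch(), run_calculation() and MAX_CHILDREN are made up for illustration, and the "calculation" is just a short sleep; a real server would pull jobs off its queue instead of counting them.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define MAX_CHILDREN 4  /* hypothetical cap on concurrent calculations */

static volatile sig_atomic_t child_exited = 0;

static void on_sigchld(int signo)
{
    (void)signo;
    child_exited = 1;   /* only set a flag; reaping happens in the main loop */
}

/* Hypothetical stand-in for one calculation: sleep briefly, then exit. */
static void run_calculation(int job)
{
    struct timespec ts;

    ts.tv_sec = 0;
    ts.tv_nsec = 10L * 1000 * 1000 * (job % 3 + 1);
    nanosleep(&ts, NULL);
    _exit(0);
}

/* Run total_jobs jobs, at most MAX_CHILDREN at a time.
 * Returns the number of children reaped. */
static int dispatch(int total_jobs)
{
    struct sigaction sa;
    sigset_t block, old;
    int started = 0, running = 0, finished = 0;

    assert(total_jobs >= 0);

    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGCHLD, &sa, NULL);

    /* Keep SIGCHLD blocked except while waiting in sigsuspend(), so a
     * child cannot exit unnoticed between the flag check and the wait. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);

    while (finished < total_jobs) {
        /* Fill every free slot straight away. */
        while (running < MAX_CHILDREN && started < total_jobs) {
            pid_t pid = fork();

            if (pid < 0) {
                perror("fork");
                exit(EXIT_FAILURE);
            }
            if (pid == 0)
                run_calculation(started);   /* child never returns */
            started++;
            running++;
        }

        /* Sleep until a child dies; no polling, no load average. */
        while (!child_exited)
            sigsuspend(&old);
        child_exited = 0;

        /* One SIGCHLD may stand for several exits, so reap them all. */
        while (running > 0 && waitpid(-1, NULL, WNOHANG) > 0) {
            running--;
            finished++;
        }
    }

    sigprocmask(SIG_SETMASK, &old, NULL);
    return finished;
}
```

Calling dispatch(6) from main() starts four children at once and launches each of the remaining two as soon as a slot frees up.  MAX_CHILDREN is fixed here; the occasional look at the one-minute load average would simply adjust that limit.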

Or am I being totally dumb?

Sean

On Sun, Nov 05, 2000 at 07:55:40AM -0500, [EMAIL PROTECTED] wrote:
> 
> I'm working on a project at work that is using Linux to run some very
> math-intensive calculations.   One of the things we do is use the 1-minute
> loadavg to determine how busy the machine is and whether we can fire off
> another program to do more calculations.  However, there's a problem with that.
> 
> Because it's a 1 minute load average, there's quite a bit of lag time from
> when 1 program finishes until the loadavg goes down below a threshold for
> our control mechanism to fire off another program.
> 
> Let me give an example (all on a 1-cpu PC)
> 
> HH:MM:SS
> 00:00:00  fire off 4 programs 
> 00:01:00  loadavg goes up to 4
> 00:01:30  3 of the 4 programs finish; loadavg still at 4
> 00:02:20  load avg goes down to 1, below our threshold
> 00:02:21  we fire off 3 more programs.
> 
> We'd like to reduce that almost 50 second lag time.  Is it possible, in
> user-space, to duplicate the loadavg calculation period, say to a 15
> second load average, using the information in /proc?
> 
> The other option we looked at, besides using loadavg, was using idle pct%,
> but, if I read the source for top right, that involves reading the entire
> process table to calculate clock ticks used and then figuring out how many
> weren't used.
> 
> Ideas, opinions welcome.  Yes, I read the list, so either respond direct
> to me, or to the list.
> 
> [EMAIL PROTECTED] (Robert A. Yetman)
> 
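
On the question of a shorter averaging period: the kernel's load average is an exponentially decayed average sampled every 5 seconds, so the same update can be repeated in user space with a faster decay.  The sketch below makes several assumptions: the decay constants are precomputed values of e^(-5/60) and e^(-5/15), the function names are made up, and the run count is taken from the fourth field of /proc/loadavg (the "running/total" pair), which includes the reading process itself and, unlike the kernel's own figure, leaves out tasks in uninterruptible sleep.

```c
#include <assert.h>
#include <stdio.h>

/* Per-sample decay for a 5-second sampling interval (precomputed). */
#define DECAY_60S 0.920044  /* approx. e^(-5/60), the 1-minute average */
#define DECAY_15S 0.716531  /* approx. e^(-5/15), a 15-second average  */

/* One update step: load' = load * decay + running * (1 - decay). */
static double ema_step(double load, double running, double decay)
{
    return load * decay + running * (1.0 - decay);
}

/* Count how many 5-second samples pass before the average drops below
 * threshold, given a constant number of running tasks. */
static int samples_until_below(double load, double running,
                               double decay, double threshold)
{
    int steps = 0;

    assert(decay > 0.0 && decay < 1.0);
    assert(running < threshold);    /* otherwise it never gets there */

    while (load >= threshold) {
        load = ema_step(load, running, decay);
        steps++;
    }
    return steps;
}

/* Read the number of currently runnable tasks from a file laid out like
 * /proc/loadavg ("a b c running/total lastpid").  Returns -1 on error. */
static int read_running(const char *path)
{
    FILE *f = fopen(path, "r");
    int running = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%*f %*f %*f %d/", &running) != 1)
        running = -1;
    fclose(f);
    return running;
}
```

Starting from an average of 4 with one program still running and a threshold of 1.5 (a made-up figure), samples_until_below() gives 22 samples (110 seconds) with DECAY_60S and 6 samples (30 seconds) with DECAY_15S.  A monitor would call read_running("/proc/loadavg") every 5 seconds and feed the result to ema_step().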

Re: 2.4.0-test10 Sluggish After Load

2000-11-01 Thread Sean Hunter


On Wed, Nov 01, 2000 at 11:10:46AM -0600, matthew wrote:
> On Wed, 1 Nov 2000, Sean Hunter wrote:
> 
> > Pardon my speculations (if I am wrong), but isn't this an oracle question?  
> 
> 
> It could be.
> 
> 
> > Isn't oracle killing the server by trying to clean up 1800 connections all at
> > once?  When they're all connected, most of the work is done by one or two
> > oracle processes, but when you kill your ddos thing, all of the oracle
> > listeners (of which there is one per connection), steam in and try to clean up.
> 
> 
> Yes, but the factor that drove me to the list was that it's been > 400
> load average for 10 hours now.  Even if Oracle tried to clean up 1800
> connections at once, would it take this long?  That's not rhetorical, as
> the answer may well be "yes".
> 

Yup.  What seems to have happened is that waking up 1800 processes at once has
caused the box to thrash so hard it is taking ages for any one process to get
enough scheduler time to clean itself up and exit.

I guess we may need a thrash preventer that slows things down enough for each
process to get a healthy bite of the cherry.

Sean

> 
> > I thought oracle had an internal connection limit (on our servers it is set to
> > 440 connections), anyways.
> 
> 
> This is set in the init.ora.  I jacked it up to allow > 2000 connections.
> 
> Matthew
> 
> 