Re: 2.6.21-rc3-mm1 RSDL results

2007-03-13 Thread Mark Lord

Con Kolivas wrote:
> On Wednesday 14 March 2007 05:21, Mark Lord wrote:
> > Con Kolivas wrote:
> > > Can you try the new version of RSDL. Assuming it doesn't oops on you it
> > > has some accounting bugfixes which may have been biting you.
> >
> > Retesting today with 2.6.21-rc3-git7 + 2.6.21-rc3-sched-rsdl-0.30.patch.
> >
> > Still not pleasant to use the GUI with a kernel build (-j1 or -j2)
> > happening unless the build is manually "nice'd".
> >
> > Also, accounting looks weird in top(1).
> >
> > With a 100% busy machine, top will show something like this :
> >
> > top - 14:20:11 up 10:22,  1 user,  load average: 2.65, 2.80, 2.18
> > Tasks: 134 total,   4 running, 128 sleeping,   0 stopped,   2 zombie
> > Cpu(s): 68.7% us,  6.7% sy, 24.7% ni,  0.0% id,  0.0% wa,  0.0% hi,  0.0% si
> > Mem:   2076964k total,  2002560k used,    74404k free,   148924k buffers
> > Swap:  2409740k total,      244k used,  2409496k free,  1448876k cached
> >
> >   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> >  1824 root  36  10 11748 7244 1936 R  4.0  0.3   0:00.12 cc1
> >  1845 root  31   0  8080 5272 1412 R  1.7  0.3   0:00.05 cc1
> >  4139 root  20   0  176m  35m 6860 S  1.3  1.7  18:59.35 Xorg
> > 29381 root  20   0 33712  16m  12m R  1.0  0.8   0:27.24 konsole
> >     3 root  20   0     0    0    0 S  0.3  0.0   0:00.49 events/0
> >  1529 root  20   0  2556 1460  752 S  0.3  0.1   0:00.05 make
> > 14623 root  20   0  2200 1144  860 R  0.3  0.1   0:00.89 top
> >     1 root  20   0  1568  532  464 S  0.0  0.0   0:00.22 init
> >     2 root  39  19     0    0    0 S  0.0  0.0   0:00.01 ksoftirqd/0
> >     4 root  20   0     0    0    0 S  0.0  0.0   0:00.00 khelper
> >     5 root  20   0     0    0    0 S  0.0  0.0   0:00.00 kthread
> >
> > Mmm.. I wonder where all of that 100% CPU went to.. the busiest tasks
> > are only showing up as 4.0% and 1.7% (when in fact they are using near
> > 100%).
>
> Nothing ever looks like it stays running for very long. That would be enough
> to account for this sort of top picture.


Sorry, I just don't buy that one.  This was a 2-second sampling interval in top.
top(1) is a program that has to work, so if this scheduler breaks it like this,
then we need to understand and fix top(1) or the scheduler.
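
For reference, the arithmetic behind the %CPU column can be sketched as follows. This is a minimal illustration, assuming top(1) derives %CPU from the change in a task's cumulative utime+stime ticks over the sampling interval; the function name percent_cpu and the USER_HZ value of 100 are assumptions made for the example, not taken from the procps source.

```python
def percent_cpu(ticks_before, ticks_after, interval_s, user_hz=100):
    """%CPU for one sampling interval, from cumulative utime+stime ticks.

    user_hz is the assumed number of accounting ticks per second.
    """
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    return 100.0 * (ticks_after - ticks_before) / (user_hz * interval_s)


if __name__ == "__main__":
    # A task that existed for the whole 2-second interval and ran for 1.9 s of it.
    print(percent_cpu(1000, 1190, 2.0))
    # A short-lived cc1 first seen in this interval, with 0:00.12 of total CPU time.
    print(percent_cpu(0, 12, 2.0))
```

Under that assumption, a cc1 that has accumulated only 0:00.12 of CPU time in total cannot show more than 6% over a 2-second interval, however busy the machine is. Whether that accounts for the whole picture above is the open question in this exchange.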


> What HZ are you running? Do you usually run two makes at different nice levels?


This was HZ=1000, with NO_HZ.  And, no, not normally different nice levels.
Here I was just trying to keep the machine usable while building a couple of 
things.

Keep at it.  Someday this might be good enough for mainline,
but right now the stock scheduler beats it for my desktop (notebook) loads.

Cheers
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-13 Thread Con Kolivas
On Wednesday 14 March 2007 05:21, Mark Lord wrote:
> Con Kolivas wrote:
> > Can you try the new version of RSDL. Assuming it doesn't oops on you it
> > has some accounting bugfixes which may have been biting you.
>
> Retesting today with 2.6.21-rc3-git7 + 2.6.21-rc3-sched-rsdl-0.30.patch.
>
> Still not pleasant to use the GUI with a kernel build (-j1 or -j2)
> happening unless the build is manually "nice'd".
>
> Also, accounting looks weird in top(1).
>
> With a 100% busy machine, top will show something like this :
> > top - 14:20:11 up 10:22,  1 user,  load average: 2.65, 2.80, 2.18
> > Tasks: 134 total,   4 running, 128 sleeping,   0 stopped,   2 zombie
> > Cpu(s): 68.7% us,  6.7% sy, 24.7% ni,  0.0% id,  0.0% wa,  0.0% hi,  0.0% si
> > Mem:   2076964k total,  2002560k used,    74404k free,   148924k buffers
> > Swap:  2409740k total,      244k used,  2409496k free,  1448876k cached
> >
> >   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> >  1824 root  36  10 11748 7244 1936 R  4.0  0.3   0:00.12 cc1
> >  1845 root  31   0  8080 5272 1412 R  1.7  0.3   0:00.05 cc1
> >  4139 root  20   0  176m  35m 6860 S  1.3  1.7  18:59.35 Xorg
> > 29381 root  20   0 33712  16m  12m R  1.0  0.8   0:27.24 konsole
> >     3 root  20   0     0    0    0 S  0.3  0.0   0:00.49 events/0
> >  1529 root  20   0  2556 1460  752 S  0.3  0.1   0:00.05 make
> > 14623 root  20   0  2200 1144  860 R  0.3  0.1   0:00.89 top
> >     1 root  20   0  1568  532  464 S  0.0  0.0   0:00.22 init
> >     2 root  39  19     0    0    0 S  0.0  0.0   0:00.01 ksoftirqd/0
> >     4 root  20   0     0    0    0 S  0.0  0.0   0:00.00 khelper
> >     5 root  20   0     0    0    0 S  0.0  0.0   0:00.00 kthread
>
> Mmm.. I wonder where all of that 100% CPU went to.. the busiest tasks
> are only showing up as 4.0% and 1.7% (when in fact they are using near
> 100%).

Nothing ever looks like it stays running for very long. That would be enough 
to account for this sort of top picture. What HZ are you running? Do you 
usually run two makes at different nice levels?

Thanks.

-- 
-ck


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-13 Thread Mark Lord

Con Kolivas wrote:
> Can you try the new version of RSDL. Assuming it doesn't oops on you it has
> some accounting bugfixes which may have been biting you.


Retesting today with 2.6.21-rc3-git7 + 2.6.21-rc3-sched-rsdl-0.30.patch.

Still not pleasant to use the GUI with a kernel build (-j1 or -j2) happening
unless the build is manually "nice'd".

Also, accounting looks weird in top(1).
With a 100% busy machine, top will show something like this :


top - 14:20:11 up 10:22,  1 user,  load average: 2.65, 2.80, 2.18
Tasks: 134 total,   4 running, 128 sleeping,   0 stopped,   2 zombie
Cpu(s): 68.7% us,  6.7% sy, 24.7% ni,  0.0% id,  0.0% wa,  0.0% hi,  0.0% si
Mem:   2076964k total,  2002560k used,    74404k free,   148924k buffers
Swap:  2409740k total,      244k used,  2409496k free,  1448876k cached

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1824 root  36  10 11748 7244 1936 R  4.0  0.3   0:00.12 cc1
 1845 root  31   0  8080 5272 1412 R  1.7  0.3   0:00.05 cc1
 4139 root  20   0  176m  35m 6860 S  1.3  1.7  18:59.35 Xorg
29381 root  20   0 33712  16m  12m R  1.0  0.8   0:27.24 konsole
    3 root  20   0     0    0    0 S  0.3  0.0   0:00.49 events/0
 1529 root  20   0  2556 1460  752 S  0.3  0.1   0:00.05 make
14623 root  20   0  2200 1144  860 R  0.3  0.1   0:00.89 top
    1 root  20   0  1568  532  464 S  0.0  0.0   0:00.22 init
    2 root  39  19     0    0    0 S  0.0  0.0   0:00.01 ksoftirqd/0
    4 root  20   0     0    0    0 S  0.0  0.0   0:00.00 khelper
    5 root  20   0     0    0    0 S  0.0  0.0   0:00.00 kthread


Mmm.. I wonder where all of that 100% CPU went to.. the busiest tasks
are only showing up as 4.0% and 1.7% (when in fact they are using near 100%).

Cheers
  




Re: 2.6.21-rc3-mm1 RSDL results

2007-03-11 Thread Con Kolivas
On Sunday 11 March 2007 23:38, James Cloos wrote:
> |> See:
> |> http://webcvs.freedesktop.org/mesa/Mesa/src/mesa/drivers/dri/r200/r200_ioctl.c?revision=1.37&view=markup
>
> OK.
>
> Mesa is in git, now, but that still applies.  The gitweb url is:
>
> http://gitweb.freedesktop.org/?p=mesa/mesa.git
>
> and for the version of the above file in the master branch:
>
> http://gitweb.freedesktop.org/?p=mesa/mesa.git;a=blob;f=src/mesa/drivers/dri/r200/r200_ioctl.c
>
> The recursive grep(1) on mesa shows:
>
> ,[grep -r sched_yield mesa]
> | mesa/mesa/src/mesa/drivers/dri/r300/radeon_ioctl.c:   sched_yield();
> | mesa/mesa/src/mesa/drivers/dri/i915tex/intel_batchpool.c:  sched_yield();
> | mesa/mesa/src/mesa/drivers/dri/i915tex/intel_batchbuffer.c:  sched_yield();
> | mesa/mesa/src/mesa/drivers/dri/common/vblank.h:#include/* for sched_yield() */
> | mesa/mesa/src/mesa/drivers/dri/common/vblank.h:#include/* for sched_yield() */
> | mesa/mesa/src/mesa/drivers/dri/common/vblank.h:  sched_yield();   \
> | mesa/mesa/src/mesa/drivers/dri/unichrome/via_ioctl.c:  sched_yield();
> | mesa/mesa/src/mesa/drivers/dri/i915/intel_ioctl.c:   sched_yield();
> | mesa/mesa/src/mesa/drivers/dri/r200/r200_ioctl.c:   sched_yield();
> `
>
> Thanks for the heads up.  I must've grep(1)ed the xorg subdir rather
> than the parent dir, and so missed mesa.

I just wonder what the heck all these will do to testing when using any of 
these drivers. Whether or not we do no yield, mild yield or full blown 
expiration yield, somehow or other I can't get over the feeling that if the 
code relies on yield() we can't really trust them to be meaningful cpu 
scheduler tests. This means most 3d apps out there that aren't using binary 
drivers, whether they be (fscking) glxgears, audio app visualisations or 
what...
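
To make the concern concrete, here is a minimal sketch contrasting a wait loop that polls with sched_yield() against one that blocks until signalled. It is written in Python for brevity and is not taken from any of the drivers listed above; the helper names wait_by_yielding, wait_by_blocking and demo are hypothetical, and os.sched_yield() is assumed to be available (Linux and other Unix-like systems).

```python
import os
import threading
import time


def wait_by_yielding(flag, timeout_s=1.0):
    """Poll `flag` in a loop, calling sched_yield() between checks.

    Returns (flag_was_set, number_of_yields). The waiter stays runnable,
    so how often it runs depends on the scheduler's yield semantics.
    """
    deadline = time.monotonic() + timeout_s
    spins = 0
    while not flag.is_set():
        if time.monotonic() > deadline:
            return False, spins
        os.sched_yield()  # how long we stay off the CPU is up to the scheduler
        spins += 1
    return True, spins


def wait_by_blocking(flag, timeout_s=1.0):
    """Sleep until `flag` is set; the waiter is not runnable while it waits."""
    return flag.wait(timeout_s)


def demo():
    """Run both waits against a flag that is set after roughly 50 ms."""
    first = threading.Event()
    threading.Timer(0.05, first.set).start()
    ok_yield, spins = wait_by_yielding(first)

    second = threading.Event()
    threading.Timer(0.05, second.set).start()
    ok_block = wait_by_blocking(second)
    return ok_yield, ok_block, spins
```

Both waits complete, but the number of yields in the first one varies with the scheduler and the load, which is the property that makes yield-based code awkward as a scheduler benchmark.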

-- 
-ck


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-11 Thread James Cloos
|> See:
|> http://webcvs.freedesktop.org/mesa/Mesa/src/mesa/drivers/dri/r200/r200_ioctl.c?revision=1.37&view=markup

OK.

Mesa is in git, now, but that still applies.  The gitweb url is:

http://gitweb.freedesktop.org/?p=mesa/mesa.git

and for the version of the above file in the master branch:

http://gitweb.freedesktop.org/?p=mesa/mesa.git;a=blob;f=src/mesa/drivers/dri/r200/r200_ioctl.c

The recursive grep(1) on mesa shows:

,[grep -r sched_yield mesa]
| mesa/mesa/src/mesa/drivers/dri/r300/radeon_ioctl.c:   sched_yield();
| mesa/mesa/src/mesa/drivers/dri/i915tex/intel_batchpool.c:  sched_yield();
| mesa/mesa/src/mesa/drivers/dri/i915tex/intel_batchbuffer.c:  sched_yield();
| mesa/mesa/src/mesa/drivers/dri/common/vblank.h:#include/* for sched_yield() */
| mesa/mesa/src/mesa/drivers/dri/common/vblank.h:#include/* for sched_yield() */
| mesa/mesa/src/mesa/drivers/dri/common/vblank.h:  sched_yield();   \
| mesa/mesa/src/mesa/drivers/dri/unichrome/via_ioctl.c:  sched_yield();
| mesa/mesa/src/mesa/drivers/dri/i915/intel_ioctl.c: sched_yield();
| mesa/mesa/src/mesa/drivers/dri/r200/r200_ioctl.c:   sched_yield();
`

Thanks for the heads up.  I must've grep(1)ed the xorg subdir rather
than the parent dir, and so missed mesa.

-JimC
-- 
James Cloos <[EMAIL PROTECTED]> OpenPGP: 1024D/ED7DAEA6


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-10 Thread Con Kolivas
On Sunday 11 March 2007 10:34, Con Kolivas wrote:
> On Sunday 11 March 2007 05:21, Mark Lord wrote:
> > Con Kolivas wrote:
> > > On Saturday 10 March 2007 05:07, Mark Lord wrote:
> > >> Mmm.. when it's good, it's *really* good.
> > >> My desktop feels snappier and all of that.
> > >
> > >..
> > >
> > >> But when it's bad, it stinks.
> > >> Like when a "make -j2" kernel rebuild is happening in a background
> > >> window
> > >
> > > And that's bad. When you say "it stinks" is it more than 3 times
> > > slower? It should be precisely 3 times slower under that load (although
> > > low cpu using things like audio won't be affected by running 3 times
> > > slower). If it feels like much more than that much slower, there is a
> > > bug there somewhere.
> >
> > Scrolling windows is incredibly jerky, and very very sluggish
> > when images are involved (eg. a large web page in firefox).
> >
> > > As another reader suggested, how does it run with the compile 'niced'?
> > > How does it perform with make (without a -j number).
> >
> > Yes, it behaves itself when the "make -j2" is nice'd.
> >
> > >> This is on a Pentium-M 760 single-core, w/2GB SDRAM (notebook).
> > >
> > > What HZ are you running? Are you running a Beryl desktop?
> >
> > HZ==1000, NO_HZ, Kubuntu Dapper Drake distro, ATI X300 open-source X.org
> > driver.
>
> Can you try the new version of RSDL. Assuming it doesn't oops on you it has
> some accounting bugfixes which may have been biting you.

Oh I just checked the mesa repo for that driver as well. It seems the r300 
drivers have sched_yield in them as well, but not all components. You may be 
getting bitten by this too.

http://webcvs.freedesktop.org/mesa/Mesa/src/mesa/drivers/dri/r300/radeon_ioctl.c?revision=1.14&view=markup

I don't really know what the radeon and other models are so I'm not sure if it 
applies to your hardware; I just did a random search through the r300 
directory.

-- 
-ck


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-10 Thread Con Kolivas
On Sunday 11 March 2007 05:21, Mark Lord wrote:
> Con Kolivas wrote:
> > On Saturday 10 March 2007 05:07, Mark Lord wrote:
> >> Mmm.. when it's good, it's *really* good.
> >> My desktop feels snappier and all of that.
> >
> >..
> >
> >> But when it's bad, it stinks.
> >> Like when a "make -j2" kernel rebuild is happening in a background
> >> window
> >
> > And that's bad. When you say "it stinks" is it more than 3 times slower?
> > It should be precisely 3 times slower under that load (although low cpu
> > using things like audio won't be affected by running 3 times slower). If
> > it feels like much more than that much slower, there is a bug there
> > somewhere.
>
> Scrolling windows is incredibly jerky, and very very sluggish
> when images are involved (eg. a large web page in firefox).
>
> > As another reader suggested, how does it run with the compile 'niced'?
> > How does it perform with make (without a -j number).
>
> Yes, it behaves itself when the "make -j2" is nice'd.
>
> >> This is on a Pentium-M 760 single-core, w/2GB SDRAM (notebook).
> >
> > What HZ are you running? Are you running a Beryl desktop?
>
> HZ==1000, NO_HZ, Kubuntu Dapper Drake distro, ATI X300 open-source X.org
> driver.

Can you try the new version of RSDL. Assuming it doesn't oops on you it has 
some accounting bugfixes which may have been biting you.

Thanks
-- 
-ck


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-10 Thread Con Kolivas
On Sunday 11 March 2007 04:01, James Cloos wrote:
> > "Con" == Con Kolivas <[EMAIL PROTECTED]> writes:
>
> Con> It's sad that sched_yield is still in our graphics card drivers ...
>
> I just did a recursive grep(1) on my mirror of the freedesktop git
> repos for sched_yield.  This only checked the master branches as I
> did not bother to script up something to clone each, check out all
> branches in turn, and grep(1) each possibility.
>
> The output is just:
> :; grep -r sched_yield FDO/xorg
>
> FDO/xorg/xserver/hw/kdrive/via/viadraw.c: sched_yield();
> FDO/xorg/driver/xf86-video-glint/src/pm2_video.c:if (sync) /* sched_yield? */
>
> Is there something else I should grep(1) for?  If not, it looks as
> if sched_yield(2) has been evicted from the drivers.

See:

http://webcvs.freedesktop.org/mesa/Mesa/src/mesa/drivers/dri/r200/r200_ioctl.c?revision=1.37&view=markup

-- 
-ck


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-10 Thread Mark Lord

Con Kolivas wrote:
> On Saturday 10 March 2007 05:07, Mark Lord wrote:
> > Mmm.. when it's good, it's *really* good.
> > My desktop feels snappier and all of that.
>
> ..
>
> > But when it's bad, it stinks.
> > Like when a "make -j2" kernel rebuild is happening in a background window
>
> And that's bad. When you say "it stinks" is it more than 3 times slower? It
> should be precisely 3 times slower under that load (although low cpu using
> things like audio won't be affected by running 3 times slower). If it feels
> like much more than that much slower, there is a bug there somewhere.

Scrolling windows is incredibly jerky, and very very sluggish
when images are involved (eg. a large web page in firefox).

> As another reader suggested, how does it run with the compile 'niced'? How does
> it perform with make (without a -j number).

Yes, it behaves itself when the "make -j2" is nice'd.

> > This is on a Pentium-M 760 single-core, w/2GB SDRAM (notebook).
>
> What HZ are you running? Are you running a Beryl desktop?

HZ==1000, NO_HZ, Kubuntu Dapper Drake distro, ATI X300 open-source X.org
driver.

Cheers


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-10 Thread James Cloos
> "Con" == Con Kolivas <[EMAIL PROTECTED]> writes:

Con> It's sad that sched_yield is still in our graphics card drivers ...

I just did a recursive grep(1) on my mirror of the freedesktop git
repos for sched_yield.  This only checked the master branches as I
did not bother to script up something to clone each, check out all
branches in turn, and grep(1) each possibility.

The output is just:

:; grep -r sched_yield FDO/xorg
FDO/xorg/xserver/hw/kdrive/via/viadraw.c:   sched_yield();
FDO/xorg/driver/xf86-video-glint/src/pm2_video.c:if (sync) /* sched_yield? */

Is there something else I should grep(1) for?  If not, it looks as
if sched_yield(2) has been evicted from the drivers.

-JimC
-- 
James Cloos <[EMAIL PROTECTED]> OpenPGP: 1024D/ED7DAEA6


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Con Kolivas
On Saturday 10 March 2007 13:26, Matt Mackall wrote:
> On Sat, Mar 10, 2007 at 01:20:22PM +1100, Con Kolivas wrote:
> > Progress at last! And without any patches! Well those look very
> > reasonable to me. Especially since -j5 is a worst case scenario.
>
> Well that's with a noyield patch and your sched_tick fix.
>
> > But would you say it's still _adequate_ with ccache considering you
> > only have 1/6th cpu left for X? With and without ccache it's quite a
> > different workload so they will behave differently.
>
> No, I don't think 1/6th is being left for X in the ccache case so I
> think there's a bug lurking here. My memload, execload, and forkload
> test cases did better even with X niced.
>
> To confirm, I've just run 15 instances of memload with unniced Xorg
> and it performs better than make -j 5 with ccache.
>
> If I have some time tomorrow, I'll try to do a straight -mm1 to mm2
> comparison with different loads.

Great, thanks very much for all that. I've found a few subtle bugs in the 
process and some that haven't made it to the list either. I'll respin a set 
of patches against -mm2 with the changes shortly.

Thanks!

-- 
-ck


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Matt Mackall
On Sat, Mar 10, 2007 at 01:20:22PM +1100, Con Kolivas wrote:
> Progress at last! And without any patches! Well those look very reasonable to 
> me. Especially since -j5 is a worst case scenario.

Well that's with a noyield patch and your sched_tick fix.

> But would you say it's still _adequate_ with ccache considering you
> only have 1/6th cpu left for X? With and without ccache it's quite a
> different workload so they will behave differently.

No, I don't think 1/6th is being left for X in the ccache case so I
think there's a bug lurking here. My memload, execload, and forkload
test cases did better even with X niced.

To confirm, I've just run 15 instances of memload with unniced Xorg
and it performs better than make -j 5 with ccache.

If I have some time tomorrow, I'll try to do a straight -mm1 to mm2
comparison with different loads.

-- 
Mathematics is the supreme nostalgia of our time.


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Con Kolivas
On Saturday 10 March 2007 12:42, Matt Mackall wrote:
> On Sat, Mar 10, 2007 at 12:28:38PM +1100, Con Kolivas wrote:
> > On Saturday 10 March 2007 11:49, Matt Mackall wrote:
> > > On Sat, Mar 10, 2007 at 11:34:26AM +1100, Con Kolivas wrote:
> > > > Ok, so some of the basics then. Can you please give me the output of
> > > > 'top -b' running for a few seconds during the whole affair?
> > >
> > > Here you go:
> > >
> > > http://selenic.com/baseline
> > > http://selenic.com/underload
> > >
> > > This is with 2.6.20+rsdl+tickfix at HZ=250.
> > >
> > > Something I haven't mentioned about my setup is that I'm using ccache.
> > > And it turns out disabling ccache makes a large difference. Going to
> > > switch back to a NO_HZ kernel and see what that looks like.
> >
> > Your X is reniced to -10 so try again with X nice 0 please.
>
> Doh, can't believe I didn't notice that. That's apparently a default
> in Debian/unstable (not sure where to tweak it).

See other email from Kyle on how to dpkg reconfigure. I submitted a bug report 
to debian years ago about this and I presume it was fixed but you've probably 
slowly dist upgraded from an older version and it stayed in your config?

> Reniced: 
>
>             without ccache   with ccache
> make -j 5
>  beryl      good             ok
>  galeon     ok/good          ok
>  mp3        good             good
>  terminal   good             ok
>  mouse      good             ok

Progress at last! And without any patches! Well those look very reasonable to 
me. Especially since -j5 is a worst case scenario.

> We're still left with a big unexplained ccache differential,

But would you say it's still _adequate_ with ccache considering you only have 
1/6th cpu left for X? With and without ccache it's quite a different workload 
so they will behave differently.
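
The "1/6th" figure, and the "3 times slower" estimate for make -j2 elsewhere in this thread, both follow from the same simple model. A minimal sketch, assuming a single CPU, a strictly fair scheduler, all tasks at the same nice level and every task permanently runnable; fair_share and slowdown are hypothetical helper names:

```python
from fractions import Fraction


def fair_share(cpu_hogs, other_tasks=1):
    """CPU fraction each task receives when all tasks are runnable on one CPU."""
    return Fraction(1, cpu_hogs + other_tasks)


def slowdown(cpu_hogs, other_tasks=1):
    """How many times slower a CPU-bound task runs than on an idle machine."""
    return 1 / fair_share(cpu_hogs, other_tasks)


if __name__ == "__main__":
    print(fair_share(5))   # make -j 5 plus X: share left for X
    print(slowdown(2))     # make -j2 plus one desktop task
```

Real workloads only approximate this: compile jobs block on I/O and X is rarely CPU-bound all the time, so the model gives an expected bound rather than a prediction.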

> and a big 
> NO_HZ vs HZ=250 differential.

That part I don't know about. You've only tested the difference with X running 
nice -10. I need to look further at the mechanism for -nice tasks. It should 
be possible to run smoothly even with a -niced X (although that was never my 
intent) so perhaps that's not working properly. I'll look into that.

Thanks!

-- 
-ck


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Kyle Moffett

On Mar 09, 2007, at 20:42:30, Matt Mackall wrote:
> Doh, can't believe I didn't notice that. That's apparently a
> default in Debian/unstable (not sure where to tweak it).


Run this:
[EMAIL PROTECTED]:~# dpkg-reconfigure xserver-xorg

It should ask you if you want to run the X-server at a lower  
(higher?) nice level.


To decrease the minimum-debconf-priority for new package  
installations you can run this:


[EMAIL PROTECTED]:~# dpkg-reconfigure debconf

Change the "Ignore questions with a priority less than: " value to  
"medium" or "low".


Cheers,
Kyle Moffett


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Matt Mackall
On Sat, Mar 10, 2007 at 12:28:38PM +1100, Con Kolivas wrote:
> On Saturday 10 March 2007 11:49, Matt Mackall wrote:
> > On Sat, Mar 10, 2007 at 11:34:26AM +1100, Con Kolivas wrote:
> > > Ok, so some of the basics then. Can you please give me the output of 'top
> > > -b' running for a few seconds during the whole affair?
> >
> > Here you go:
> >
> > http://selenic.com/baseline
> > http://selenic.com/underload
> >
> > This is with 2.6.20+rsdl+tickfix at HZ=250.
> >
> > Something I haven't mentioned about my setup is that I'm using ccache.
> > And it turns out disabling ccache makes a large difference. Going to
> > switch back to a NO_HZ kernel and see what that looks like.
> 
> Your X is reniced to -10 so try again with X nice 0 please.

Doh, can't believe I didn't notice that. That's apparently a default
in Debian/unstable (not sure where to tweak it). Reniced:

            without ccache   with ccache
make -j 5
 beryl      good             ok
 galeon     ok/good          ok
 mp3        good             good
 terminal   good             ok
 mouse      good             ok

We're still left with a big unexplained ccache differential, and a big
NO_HZ vs HZ=250 differential.

-- 
Mathematics is the supreme nostalgia of our time.


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Con Kolivas
On Saturday 10 March 2007 11:49, Matt Mackall wrote:
> On Sat, Mar 10, 2007 at 11:34:26AM +1100, Con Kolivas wrote:
> > Ok, so some of the basics then. Can you please give me the output of 'top
> > -b' running for a few seconds during the whole affair?
>
> Here you go:
>
> http://selenic.com/baseline
> http://selenic.com/underload
>
> This is with 2.6.20+rsdl+tickfix at HZ=250.
>
> Something I haven't mentioned about my setup is that I'm using ccache.
> And it turns out disabling ccache makes a large difference. Going to
> switch back to a NO_HZ kernel and see what that looks like.

Your X is reniced to -10 so try again with X nice 0 please.
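
One way to confirm a process's nice level before rerunning a test, sketched in Python. nice_of is a hypothetical helper, and os.getpriority() is assumed to be available (Unix-like systems only); the PID of the X server would have to be looked up separately.

```python
import os


def nice_of(pid):
    """Return the nice value of the process with this PID (-20..19 on Linux)."""
    return os.getpriority(os.PRIO_PROCESS, pid)


if __name__ == "__main__":
    # Demonstrated on the current process; pass the X server's PID instead.
    print(nice_of(os.getpid()))
```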

-- 
-ck


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Matt Mackall
On Sat, Mar 10, 2007 at 12:02:25PM +1100, Con Kolivas wrote:
> On Saturday 10 March 2007 09:12, Con Kolivas wrote:
> > On Saturday 10 March 2007 08:57, Willy Tarreau wrote:
> > > On Fri, Mar 09, 2007 at 03:39:59PM -0600, Matt Mackall wrote:
> > > > On Sat, Mar 10, 2007 at 08:19:18AM +1100, Con Kolivas wrote:
> > > > > On Saturday 10 March 2007 08:07, Con Kolivas wrote:
> > > > > > On Saturday 10 March 2007 07:46, Matt Mackall wrote:
> 
> > > > 5x memload: good
> > > > 5x execload: good
> > > > 5x forkload: good
> > > > 5 parallel makes: mostly good
> > > > make -j 5: bad
> > > >
> > > > So what's different between makes in parallel and make -j 5? Make's
> > > > job server uses pipe I/O to control how many jobs are running.
> > >
> > > Matt, could you check with plain 2.6.20 + Con's patch ? It is possible
> > > that he added bugs when porting to -mm, or that something in -mm causes
> > > the trouble. Your experience with -mm seems so much different from mine
> > > with mainline, there must be a difference somewhere !
> >
> > Good idea.
> 
> It's all very odd Matt. It really isn't behaving anything like you describe 
> for myself or others. It sounds more like a real bug than what the design 
> would do at all. The only things that are different on yours is Beryl and a 
> different graphics card. When you're comparing to mainline are you 
> comparing -mm1 to -mm2 to ensure something else from -mm isn't responsible? 
> Also have you tried rsdl on 2.6.20 as Willy suggested?

Haven't tried -mm2. So far I've tried 2.6.21-rc2-mm2 (aka 'stock'),
2.6.21-rc3-mm1, and 2.6.20+rsdl.

I also did a test with Metacity and saw the same issues under load. So
I think Beryl is not part of the problem.

Right now it's looking like the problem is caused by ccache. Disabling
ccache with 2.6.21-rc2-mm1+tickfix+noyield lets me run make -j 5
acceptably. So my new column would be:

            RSDL+NO_HZ+tickfix+noyield+noccache
make -j 5
 beryl      ok/good
 galeon     ok/good
 mp3        good
 terminal   good
 mouse      good

So it's about on par with 2.6.20, maybe slightly better.

I suspect ccache lock contention is somehow involved though it doesn't
explain why my 5 independent makes test beats out make -j 5.
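
For readers unfamiliar with the job server mentioned earlier in this thread, the idea of limiting parallelism with tokens passed through a pipe can be sketched as below. This is an illustration of the general technique, not GNU make's implementation (make keeps one implicit slot for itself and coordinates separate processes, whereas this sketch uses threads); run_with_token_pipe and sample_jobs are hypothetical names.

```python
import os
import threading
import time


def run_with_token_pipe(jobs, slots):
    """Run the callables in `jobs`, at most `slots` at a time, gated by pipe tokens.

    Returns (results, peak), where peak is the highest number of jobs
    observed running at the same moment.
    """
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"+" * slots)       # preload one token per slot
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}
    results = []

    def worker(job):
        token = os.read(read_fd, 1)        # blocks until a token is free
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        try:
            value = job()
            with lock:
                results.append(value)
        finally:
            with lock:
                state["active"] -= 1
            os.write(write_fd, token)      # hand the token back to the pool

    threads = [threading.Thread(target=worker, args=(job,)) for job in jobs]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    os.close(read_fd)
    os.close(write_fd)
    return results, state["peak"]


def sample_jobs(count):
    """Build `count` short jobs that sleep briefly and return their index."""
    def make(index):
        def job():
            time.sleep(0.01)
            return index
        return job
    return [make(index) for index in range(count)]
```

Every job start and finish here involves a pipe read or write and a wake-up of a blocked waiter, which is the kind of traffic five independent makes would not generate among themselves.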

-- 
Mathematics is the supreme nostalgia of our time.


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Con Kolivas
On Saturday 10 March 2007 09:12, Con Kolivas wrote:
> On Saturday 10 March 2007 08:57, Willy Tarreau wrote:
> > On Fri, Mar 09, 2007 at 03:39:59PM -0600, Matt Mackall wrote:
> > > On Sat, Mar 10, 2007 at 08:19:18AM +1100, Con Kolivas wrote:
> > > > On Saturday 10 March 2007 08:07, Con Kolivas wrote:
> > > > > On Saturday 10 March 2007 07:46, Matt Mackall wrote:

> > > 5x memload: good
> > > 5x execload: good
> > > 5x forkload: good
> > > 5 parallel makes: mostly good
> > > make -j 5: bad
> > >
> > > So what's different between makes in parallel and make -j 5? Make's
> > > job server uses pipe I/O to control how many jobs are running.
> >
> > Matt, could you check with plain 2.6.20 + Con's patch ? It is possible
> > that he added bugs when porting to -mm, or that something in -mm causes
> > the trouble. Your experience with -mm seems so much different from mine
> > with mainline, there must be a difference somewhere !
>
> Good idea.

It's all very odd Matt. It really isn't behaving anything like you describe 
for myself or others. It sounds more like a real bug than what the design 
would do at all. The only things that are different on yours is Beryl and a 
different graphics card. When you're comparing to mainline are you 
comparing -mm1 to -mm2 to ensure something else from -mm isn't responsible? 
Also have you tried rsdl on 2.6.20 as Willy suggested? I would really love to 
get to the bottom of this as it really shouldn't behave that way under load 
no matter how the load is dished out.

Thanks!

-- 
-ck


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Matt Mackall
On Sat, Mar 10, 2007 at 11:34:26AM +1100, Con Kolivas wrote:
> On Saturday 10 March 2007 09:29, Matt Mackall wrote:
> > On Sat, Mar 10, 2007 at 09:18:05AM +1100, Con Kolivas wrote:
> > > On Saturday 10 March 2007 08:57, Con Kolivas wrote:
> > > > On Saturday 10 March 2007 08:39, Matt Mackall wrote:
> > > > > On Sat, Mar 10, 2007 at 08:19:18AM +1100, Con Kolivas wrote:
> > > > > > On Saturday 10 March 2007 08:07, Con Kolivas wrote:
> > > > > > > On Saturday 10 March 2007 07:46, Matt Mackall wrote:
> > > > > > > > My suspicion is the problem lies in giving too much quanta to
> > > > > > > > newly-started processes.
> > > > > > >
> > > > > > > Ah that's some nice detective work there. Mainline does some
> > > > > > > rather complex accounting on sched_fork including (possibly) a
> > > > > > > whole timer tick which rsdl does not do. make forks off
> > > > > > > continuously so what you say may well be correct. I'll see if I
> > > > > > > can try to revert to the mainline behaviour in sched_fork (which
> > > > > > > was obviously there for a reason).
> > > > > >
> > > > > > Wow! Thanks Matt. You've found a real bug too. This seems to fix
> > > > > > the qemu misbehaviour and bitmap errors so far too! Now can you
> > > > > > please try this to see if it fixes your problem?
> > > > >
> > > > > Sorry, it's about the same. I now suspect an accounting glitch
> > > > > involving pipe wake-ups.
> > > > >
> > > > > 5x memload: good
> > > > > 5x execload: good
> > > > > 5x forkload: good
> > > > > 5 parallel makes: mostly good
> > > > > make -j 5: bad
> > > > >
> > > > > So what's different between makes in parallel and make -j 5? Make's
> > > > > job server uses pipe I/O to control how many jobs are running.
> > > >
> > > > Hmm it must be those deep pipes again then. I removed any quirks
> > > > testing for those from mainline as I suspected it would be ok. Guess
> > > > I'm wrong.
> > >
> > > I shouldn't blame this straight up though if NO_HZ makes it better.
> > > Something else is going wrong... wtf though?
> >
> > Just so we're clear, dynticks has only 'fixed' the single non-parallel
> > make load so far.
> 
> Ok, so some of the basics then. Can you please give me the output of 'top -b' 
> running for a few seconds during the whole affair?

Here you go:

http://selenic.com/baseline
http://selenic.com/underload
 
This is with 2.6.20+rsdl+tickfix at HZ=250.

Something I haven't mentioned about my setup is that I'm using ccache.
And it turns out disabling ccache makes a large difference. Going to
switch back to a NO_HZ kernel and see what that looks like.

-- 
Mathematics is the supreme nostalgia of our time.


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Con Kolivas
On Saturday 10 March 2007 09:29, Matt Mackall wrote:
> On Sat, Mar 10, 2007 at 09:18:05AM +1100, Con Kolivas wrote:
> > On Saturday 10 March 2007 08:57, Con Kolivas wrote:
> > > On Saturday 10 March 2007 08:39, Matt Mackall wrote:
> > > > On Sat, Mar 10, 2007 at 08:19:18AM +1100, Con Kolivas wrote:
> > > > > On Saturday 10 March 2007 08:07, Con Kolivas wrote:
> > > > > > On Saturday 10 March 2007 07:46, Matt Mackall wrote:
> > > > > > > My suspicion is the problem lies in giving too much quanta to
> > > > > > > newly-started processes.
> > > > > >
> > > > > > Ah that's some nice detective work there. Mainline does some
> > > > > > rather complex accounting on sched_fork including (possibly) a
> > > > > > whole timer tick which rsdl does not do. make forks off
> > > > > > continuously so what you say may well be correct. I'll see if I
> > > > > > can try to revert to the mainline behaviour in sched_fork (which
> > > > > > was obviously there for a reason).
> > > > >
> > > > > Wow! Thanks Matt. You've found a real bug too. This seems to fix
> > > > > the qemu misbehaviour and bitmap errors so far too! Now can you
> > > > > please try this to see if it fixes your problem?
> > > >
> > > > Sorry, it's about the same. I now suspect an accounting glitch
> > > > involving pipe wake-ups.
> > > >
> > > > 5x memload: good
> > > > 5x execload: good
> > > > 5x forkload: good
> > > > 5 parallel makes: mostly good
> > > > make -j 5: bad
> > > >
> > > > So what's different between makes in parallel and make -j 5? Make's
> > > > job server uses pipe I/O to control how many jobs are running.
> > >
> > > Hmm it must be those deep pipes again then. I removed any quirks
> > > testing for those from mainline as I suspected it would be ok. Guess
> > > I'm wrong.
> >
> > I shouldn't blame this straight up though if NO_HZ makes it better.
> > Something else is going wrong... wtf though?
>
> Just so we're clear, dynticks has only 'fixed' the single non-parallel
> make load so far.

Ok, so some of the basics then. Can you please give me the output of 'top -b' 
running for a few seconds during the whole affair?

Thanks very much for your testing so far!

-- 
-ck


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Con Kolivas
On Saturday 10 March 2007 10:06, Matt Mackall wrote:
> On Sat, Mar 10, 2007 at 10:02:37AM +1100, Con Kolivas wrote:
> > On Saturday 10 March 2007 09:29, Matt Mackall wrote:
> > > On Sat, Mar 10, 2007 at 09:18:05AM +1100, Con Kolivas wrote:
> > > > On Saturday 10 March 2007 08:57, Con Kolivas wrote:
> > > > > On Saturday 10 March 2007 08:39, Matt Mackall wrote:
> > > > > > So what's different between makes in parallel and make -j 5?
> > > > > > Make's job server uses pipe I/O to control how many jobs are
> > > > > > running.
> > > > >
> > > > > Hmm it must be those deep pipes again then. I removed any quirks
> > > > > testing for those from mainline as I suspected it would be ok.
> > > > > > Guess I'm wrong.
> > > >
> > > > I shouldn't blame this straight up though if NO_HZ makes it better.
> > > > Something else is going wrong... wtf though?
> > >
> > > Just so we're clear, dynticks has only 'fixed' the single non-parallel
> > > make load so far.
> >
> > Ok, back to the pipe idea. Without needing a kernel recompile, can you
> > try running the make -j5 as a SCHED_BATCH task?
>
> Seems the same.
>
> Oddly, nice make -j 5 is better than batch (but not quite up to stock).

Shouldn't be odd. SCHED_BATCH (as Ingo implemented it, which is what I'm trying
to reproduce for RSDL) is meant to give the same CPU share as a normal task at
the same nice level, but without low latency. nice, on the other hand, will
give much less CPU.
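
To check which of the two a running build actually ended up with, the policy
and nice level can be read back from Python. This is only a sketch: `describe`
and `_POLICY_NAMES` are names made up here, and it assumes Linux with
Python 3.3 or later, where the os module exposes these calls.

```python
import os

# Map the numeric policy returned by the kernel to a readable name.
_POLICY_NAMES = {
    os.SCHED_OTHER: "SCHED_OTHER",
    os.SCHED_BATCH: "SCHED_BATCH",
    os.SCHED_IDLE: "SCHED_IDLE",
    os.SCHED_FIFO: "SCHED_FIFO",
    os.SCHED_RR: "SCHED_RR",
}


def describe(pid=0):
    """Return the scheduling policy name and nice value of pid (0 = this process)."""
    policy = os.sched_getscheduler(pid)
    return {
        "policy": _POLICY_NAMES.get(policy, str(policy)),
        "nice": os.getpriority(os.PRIO_PROCESS, pid),
    }
```

Calling `describe(pid)` on one of the cc1 processes should show whether it is
running as SCHED_BATCH at nice 0 or as SCHED_OTHER at a raised nice level.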

-- 
-ck


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Matt Mackall
On Sat, Mar 10, 2007 at 10:02:37AM +1100, Con Kolivas wrote:
> On Saturday 10 March 2007 09:29, Matt Mackall wrote:
> > On Sat, Mar 10, 2007 at 09:18:05AM +1100, Con Kolivas wrote:
> > > On Saturday 10 March 2007 08:57, Con Kolivas wrote:
> > > > On Saturday 10 March 2007 08:39, Matt Mackall wrote:
> > > > > So what's different between makes in parallel and make -j 5? Make's
> > > > > job server uses pipe I/O to control how many jobs are running.
> > > >
> > > > Hmm it must be those deep pipes again then. I removed any quirks
> > > > testing for those from mainline as I suspected it would be ok. Guess
> > > > > I'm wrong.
> > >
> > > I shouldn't blame this straight up though if NO_HZ makes it better.
> > > Something else is going wrong... wtf though?
> >
> > Just so we're clear, dynticks has only 'fixed' the single non-parallel
> > make load so far.
> 
> Ok, back to the pipe idea. Without needing a kernel recompile, can you try 
> running the make -j5 as a SCHED_BATCH task?

Seems the same.

Oddly, nice make -j 5 is better than batch (but not quite up to stock).

-- 
Mathematics is the supreme nostalgia of our time.


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Con Kolivas
On Saturday 10 March 2007 09:29, Matt Mackall wrote:
> On Sat, Mar 10, 2007 at 09:18:05AM +1100, Con Kolivas wrote:
> > On Saturday 10 March 2007 08:57, Con Kolivas wrote:
> > > On Saturday 10 March 2007 08:39, Matt Mackall wrote:
> > > > So what's different between makes in parallel and make -j 5? Make's
> > > > job server uses pipe I/O to control how many jobs are running.
> > >
> > > Hmm it must be those deep pipes again then. I removed any quirks
> > > testing for those from mainline as I suspected it would be ok. Guess
> > > I'm wrong.
> >
> > I shouldn't blame this straight up though if NO_HZ makes it better.
> > Something else is going wrong... wtf though?
>
> Just so we're clear, dynticks has only 'fixed' the single non-parallel
> make load so far.

Ok, back to the pipe idea. Without needing a kernel recompile, can you try 
running the make -j5 as a SCHED_BATCH task?

This wrapper will make it possible:
http://freequaos.host.sk/schedtool/schedtool-1.2.9.tar.bz2

then
schedtool -B -e make -j5

If that helps it gives me something to work with.

-- 
-ck


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Matt Mackall
On Sat, Mar 10, 2007 at 09:18:05AM +1100, Con Kolivas wrote:
> On Saturday 10 March 2007 08:57, Con Kolivas wrote:
> > On Saturday 10 March 2007 08:39, Matt Mackall wrote:
> > > On Sat, Mar 10, 2007 at 08:19:18AM +1100, Con Kolivas wrote:
> > > > On Saturday 10 March 2007 08:07, Con Kolivas wrote:
> > > > > On Saturday 10 March 2007 07:46, Matt Mackall wrote:
> > > > > > My suspicion is the problem lies in giving too much quanta to
> > > > > > newly-started processes.
> > > > >
> > > > > Ah that's some nice detective work there. Mainline does some rather
> > > > > complex accounting on sched_fork including (possibly) a whole timer
> > > > > tick which rsdl does not do. make forks off continuously so what you
> > > > > say may well be correct. I'll see if I can try to revert to the
> > > > > mainline behaviour in sched_fork (which was obviously there for a
> > > > > reason).
> > > >
> > > > Wow! Thanks Matt. You've found a real bug too. This seems to fix the
> > > > qemu misbehaviour and bitmap errors so far too! Now can you please try
> > > > this to see if it fixes your problem?
> > >
> > > Sorry, it's about the same. I now suspect an accounting glitch involving
> > > pipe wake-ups.
> > >
> > > 5x memload: good
> > > 5x execload: good
> > > 5x forkload: good
> > > 5 parallel makes: mostly good
> > > make -j 5: bad
> > >
> > > So what's different between makes in parallel and make -j 5? Make's
> > > job server uses pipe I/O to control how many jobs are running.
> >
> > Hmm it must be those deep pipes again then. I removed any quirks testing
> > > for those from mainline as I suspected it would be ok. Guess I'm wrong.
> 
> I shouldn't blame this straight up though if NO_HZ makes it better. Something 
> else is going wrong... wtf though?

Just so we're clear, dynticks has only 'fixed' the single non-parallel
make load so far.

-- 
Mathematics is the supreme nostalgia of our time.


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Matt Mackall
On Sat, Mar 10, 2007 at 09:12:07AM +1100, Con Kolivas wrote:
> On Saturday 10 March 2007 08:57, Willy Tarreau wrote:
> > On Fri, Mar 09, 2007 at 03:39:59PM -0600, Matt Mackall wrote:
> > > On Sat, Mar 10, 2007 at 08:19:18AM +1100, Con Kolivas wrote:
> > > > On Saturday 10 March 2007 08:07, Con Kolivas wrote:
> > > > > On Saturday 10 March 2007 07:46, Matt Mackall wrote:
> > > > > > My suspicion is the problem lies in giving too much quanta to
> > > > > > newly-started processes.
> > > > >
> > > > > Ah that's some nice detective work there. Mainline does some rather
> > > > > complex accounting on sched_fork including (possibly) a whole timer
> > > > > tick which rsdl does not do. make forks off continuously so what you
> > > > > say may well be correct. I'll see if I can try to revert to the
> > > > > mainline behaviour in sched_fork (which was obviously there for a
> > > > > reason).
> > > >
> > > > Wow! Thanks Matt. You've found a real bug too. This seems to fix the
> > > > qemu misbehaviour and bitmap errors so far too! Now can you please try
> > > > this to see if it fixes your problem?
> > >
> > > Sorry, it's about the same. I now suspect an accounting glitch involving
> > > pipe wake-ups.
> > >
> > > 5x memload: good
> > > 5x execload: good
> > > 5x forkload: good
> > > 5 parallel makes: mostly good
> > > make -j 5: bad
> > >
> > > So what's different between makes in parallel and make -j 5? Make's
> > > job server uses pipe I/O to control how many jobs are running.
> >
> > Matt, could you check with plain 2.6.20 + Con's patch ? It is possible
> > > that he added bugs when porting to -mm, or that something in -mm causes
> > the trouble. Your experience with -mm seems so much different from mine
> > with mainline, there must be a difference somewhere !
> 
> Good idea.

2.6.20+RSDL+tickfix+noyield behaves more or less the same under make
-j5 as 2.6.21-rc3-mm1. A bit worse, perhaps. There's no tickless on
2.6.20, so that could explain that.

-- 
Mathematics is the supreme nostalgia of our time.


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Willy Tarreau
On Sat, Mar 10, 2007 at 09:12:07AM +1100, Con Kolivas wrote:
(...)
> > Matt, could you check with plain 2.6.20 + Con's patch ? It is possible
> > that he added bugs when porting to -mm, or that something in -mm causes
> > the trouble. Your experience with -mm seems so much different from mine
> > with mainline, there must be a difference somewhere !
> 
> Good idea.

OK, so let me summarize :

  plain 2.6.20
  + http://www.kernel.org/pub/linux/kernel/v2.6/patch-2.6.20.2.bz2
  + http://ck.kolivas.org/patches/staircase-deadline/sched-rsdl-0.26.patch
  + http://marc.theaimsgroup.com/?l=linux-kernel&m=117347544926731&q=raw

should be a good starting point.

> > Con, is your patch necessary for mainline patch too ? I see that it
> > should apply, but sometimes -mm may justify changes.
> 
> Yes it will be necessary for the mainline patch too.

OK Thanks Con.

Best regards,
Willy



Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Con Kolivas
On Saturday 10 March 2007 08:57, Con Kolivas wrote:
> On Saturday 10 March 2007 08:39, Matt Mackall wrote:
> > On Sat, Mar 10, 2007 at 08:19:18AM +1100, Con Kolivas wrote:
> > > On Saturday 10 March 2007 08:07, Con Kolivas wrote:
> > > > On Saturday 10 March 2007 07:46, Matt Mackall wrote:
> > > > > My suspicion is the problem lies in giving too much quanta to
> > > > > newly-started processes.
> > > >
> > > > Ah that's some nice detective work there. Mainline does some rather
> > > > complex accounting on sched_fork including (possibly) a whole timer
> > > > tick which rsdl does not do. make forks off continuously so what you
> > > > say may well be correct. I'll see if I can try to revert to the
> > > > mainline behaviour in sched_fork (which was obviously there for a
> > > > reason).
> > >
> > > Wow! Thanks Matt. You've found a real bug too. This seems to fix the
> > > qemu misbehaviour and bitmap errors so far too! Now can you please try
> > > this to see if it fixes your problem?
> >
> > Sorry, it's about the same. I now suspect an accounting glitch involving
> > pipe wake-ups.
> >
> > 5x memload: good
> > 5x execload: good
> > 5x forkload: good
> > 5 parallel makes: mostly good
> > make -j 5: bad
> >
> > So what's different between makes in parallel and make -j 5? Make's
> > job server uses pipe I/O to control how many jobs are running.
>
> Hmm it must be those deep pipes again then. I removed any quirks testing
> for those from mainline as I suspected it would be ok. Guess I'm wrong.

I shouldn't blame this straight up though if NO_HZ makes it better. Something 
else is going wrong... wtf though?

-- 
-ck


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Con Kolivas
On Saturday 10 March 2007 08:57, Willy Tarreau wrote:
> On Fri, Mar 09, 2007 at 03:39:59PM -0600, Matt Mackall wrote:
> > On Sat, Mar 10, 2007 at 08:19:18AM +1100, Con Kolivas wrote:
> > > On Saturday 10 March 2007 08:07, Con Kolivas wrote:
> > > > On Saturday 10 March 2007 07:46, Matt Mackall wrote:
> > > > > My suspicion is the problem lies in giving too much quanta to
> > > > > newly-started processes.
> > > >
> > > > Ah that's some nice detective work there. Mainline does some rather
> > > > complex accounting on sched_fork including (possibly) a whole timer
> > > > tick which rsdl does not do. make forks off continuously so what you
> > > > say may well be correct. I'll see if I can try to revert to the
> > > > mainline behaviour in sched_fork (which was obviously there for a
> > > > reason).
> > >
> > > Wow! Thanks Matt. You've found a real bug too. This seems to fix the
> > > qemu misbehaviour and bitmap errors so far too! Now can you please try
> > > this to see if it fixes your problem?
> >
> > Sorry, it's about the same. I now suspect an accounting glitch involving
> > pipe wake-ups.
> >
> > 5x memload: good
> > 5x execload: good
> > 5x forkload: good
> > 5 parallel makes: mostly good
> > make -j 5: bad
> >
> > So what's different between makes in parallel and make -j 5? Make's
> > job server uses pipe I/O to control how many jobs are running.
>
> Matt, could you check with plain 2.6.20 + Con's patch ? It is possible
> that he added bugs when porting to -mm, or that something in -mm causes
> the trouble. Your experience with -mm seems so much different from mine
> with mainline, there must be a difference somewhere !

Good idea.

> Con, is your patch necessary for mainline patch too ? I see that it
> should apply, but sometimes -mm may justify changes.

Yes it will be necessary for the mainline patch too.

> Best regards,
> Willy

-- 
-ck


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Willy Tarreau
On Fri, Mar 09, 2007 at 03:39:59PM -0600, Matt Mackall wrote:
> On Sat, Mar 10, 2007 at 08:19:18AM +1100, Con Kolivas wrote:
> > On Saturday 10 March 2007 08:07, Con Kolivas wrote:
> > > On Saturday 10 March 2007 07:46, Matt Mackall wrote:
> > > > My suspicion is the problem lies in giving too much quanta to
> > > > newly-started processes.
> > >
> > > Ah that's some nice detective work there. Mainline does some rather
> > > complex accounting on sched_fork including (possibly) a whole timer
> > > tick which rsdl does not do. make forks off continuously so what you
> > > say may well be correct. I'll see if I can try to revert to the
> > > mainline behaviour in sched_fork (which was obviously there for a
> > > reason).
> > 
> > Wow! Thanks Matt. You've found a real bug too. This seems to fix the qemu
> > misbehaviour and bitmap errors so far too! Now can you please try this to
> > see if it fixes your problem?
> 
> Sorry, it's about the same. I now suspect an accounting glitch involving
> pipe wake-ups.
> 
> 5x memload: good
> 5x execload: good
> 5x forkload: good
> 5 parallel makes: mostly good
> make -j 5: bad
> 
> So what's different between makes in parallel and make -j 5? Make's
> job server uses pipe I/O to control how many jobs are running.

Matt, could you check with plain 2.6.20 + Con's patch ? It is possible
that he added bugs when porting to -mm, or that something in -mm causes
the trouble. Your experience with -mm seems so much different from mine
with mainline, there must be a difference somewhere !

Con, is your patch necessary for mainline patch too ? I see that it
should apply, but sometimes -mm may justify changes.

Best regards,
Willy



Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Con Kolivas
On Saturday 10 March 2007 08:39, Matt Mackall wrote:
> On Sat, Mar 10, 2007 at 08:19:18AM +1100, Con Kolivas wrote:
> > On Saturday 10 March 2007 08:07, Con Kolivas wrote:
> > > On Saturday 10 March 2007 07:46, Matt Mackall wrote:
> > > > My suspicion is the problem lies in giving too much quanta to
> > > > newly-started processes.
> > >
> > > Ah that's some nice detective work there. Mainline does some rather
> > > complex accounting on sched_fork including (possibly) a whole timer
> > > tick which rsdl does not do. make forks off continuously so what you
> > > say may well be correct. I'll see if I can try to revert to the
> > > mainline behaviour in sched_fork (which was obviously there for a
> > > reason).
> >
> > Wow! Thanks Matt. You've found a real bug too. This seems to fix the qemu
> >  misbehaviour and bitmap errors so far too! Now can you please try this
> > to see if it fixes your problem?
>
> Sorry, it's about the same. I now suspect an accounting glitch involving
> pipe wake-ups.
>
> 5x memload: good
> 5x execload: good
> 5x forkload: good
> 5 parallel makes: mostly good
> make -j 5: bad
>
> So what's different between makes in parallel and make -j 5? Make's
> job server uses pipe I/O to control how many jobs are running.

Hmm it must be those deep pipes again then. I removed any quirks testing for 
those from mainline as I suspected it would be ok. Guess I'm wrong.

-- 
-ck


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Matt Mackall
On Sat, Mar 10, 2007 at 08:19:18AM +1100, Con Kolivas wrote:
> On Saturday 10 March 2007 08:07, Con Kolivas wrote:
> > On Saturday 10 March 2007 07:46, Matt Mackall wrote:
> > > My suspicion is the problem lies in giving too much quanta to
> > > newly-started processes.
> >
> > Ah that's some nice detective work there. Mainline does some rather complex
> > accounting on sched_fork including (possibly) a whole timer tick which rsdl
> > does not do. make forks off continuously so what you say may well be
> > correct. I'll see if I can try to revert to the mainline behaviour in
> > sched_fork (which was obviously there for a reason).
> 
> Wow! Thanks Matt. You've found a real bug too. This seems to fix the qemu
>  misbehaviour and bitmap errors so far too! Now can you please try this to see
>  if it fixes your problem?

Sorry, it's about the same. I now suspect an accounting glitch involving
pipe wake-ups.

5x memload: good
5x execload: good
5x forkload: good
5 parallel makes: mostly good
make -j 5: bad

So what's different between makes in parallel and make -j 5? Make's
job server uses pipe I/O to control how many jobs are running.
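
To make the difference concrete, here is a toy token-pipe job limiter in the
same spirit. It is only a sketch under simplifying assumptions: it uses
threads rather than processes, it preloads one token per slot (GNU make, as
far as I know, preloads one fewer than the -j value because each make already
holds an implicit slot), and `run_with_token_pipe` and `fake_compile` are
names invented for the example.

```python
import os
import threading
import time


def fake_compile():
    # Stand-in for a short compile job.
    time.sleep(0.01)


def run_with_token_pipe(jobs, slots):
    """Run the callables in jobs, at most `slots` at once, gated by a pipe.

    Returns the highest number of jobs observed running simultaneously.
    """
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"+" * slots)  # preload one token per slot
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}

    def worker(job):
        token = os.read(read_fd, 1)  # sleeps in the pipe until a token is free
        try:
            with lock:
                state["running"] += 1
                state["peak"] = max(state["peak"], state["running"])
            job()
        finally:
            with lock:
                state["running"] -= 1
            os.write(write_fd, token)  # returning the token wakes one waiter

    threads = [threading.Thread(target=worker, args=(job,)) for job in jobs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    os.close(read_fd)
    os.close(write_fd)
    return state["peak"]
```

Every job start here is a pipe read and every job end is a pipe write that
wakes a sleeper, which is the wake-up pattern five independent makes never
generate.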

-- 
Mathematics is the supreme nostalgia of our time.


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Matt Mackall
On Fri, Mar 09, 2007 at 02:46:24PM -0600, Matt Mackall wrote:
> A priori, this load should be manageable by RSDL as the interactive
> loads are all pretty small. So I wrote a little Python script that
> basically continuously memcpys some 16MB chunks of memory:
> 
> #!/usr/bin/python
> a = "a" * 16 * 1024 * 1024
> while 1:
>     b = a[1:] + "b"
>     a = b[1:] + "c"
> 
> I've got 1.5G of RAM, so I can run quite a few of these without
> killing my pagecache. This should test whether a) Beryl's actually
> running up against memory bandwidth issues and b) whether "simple"
> static loads work. As you can see, running 5 instances of this script
> leaves me in good shape still. 10 is still in "ok" territory, with top
> showing each getting 9.7-10% of the CPU. 15 starts to feel sluggish.
> 20 the mouse jumps a bit and I got an MP3 skip. 30 is getting pretty
> bad, but still not as bad as the make -j 5 load.
> 
> My suspicion is the problem lies in giving too much quanta to
> newly-started processes.

I've also tried 10+ instances of each of the following:

forkload:
#!/bin/sh
./forkload&

execload:
#!/bin/sh
exec ./execload

And it's quite well-behaved in both cases.

Also, if I run:

for a in 1 2 3 4 5; do mkdir $a; cp .config $a; make O=$a & done

..I get mostly good behavior with some occasional snags. Things run
much better than make -j 5. Unfortunately, that doesn't make my kernel
get built faster.

-- 
Mathematics is the supreme nostalgia of our time.


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Con Kolivas
On Saturday 10 March 2007 08:07, Con Kolivas wrote:
> On Saturday 10 March 2007 07:46, Matt Mackall wrote:
> > My suspicion is the problem lies in giving too much quanta to
> > newly-started processes.
>
> Ah that's some nice detective work there. Mainline does some rather complex
> accounting on sched_fork including (possibly) a whole timer tick which rsdl
> does not do. make forks off continuously so what you say may well be
> correct. I'll see if I can try to revert to the mainline behaviour in
> sched_fork (which was obviously there for a reason).

Wow! Thanks Matt. You've found a real bug too. This seems to fix the qemu
 misbehaviour and bitmap errors so far too! Now can you please try this to see
 if it fixes your problem?

---
 kernel/sched.c |    8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

Index: linux-2.6.21-rc3-mm1/kernel/sched.c
===================================================================
--- linux-2.6.21-rc3-mm1.orig/kernel/sched.c	2007-03-10 08:08:11.0 +1100
+++ linux-2.6.21-rc3-mm1/kernel/sched.c	2007-03-10 08:13:57.0 +1100
@@ -1560,7 +1560,7 @@ int fastcall wake_up_state(struct task_s
 	return try_to_wake_up(p, state, 0);
 }
 
-static void task_expired_entitlement(struct rq *rq, struct task_struct *p);
+static void task_running_tick(struct rq *rq, struct task_struct *p);
 /*
  * Perform scheduler related setup for a newly forked process p.
  * p is forked by current.
@@ -1621,10 +1621,8 @@ void fastcall sched_fork(struct task_str
 		 * left from its timeslice. Taking the runqueue lock is not
 		 * a problem.
 		 */
-		struct rq *rq = __task_rq_lock(current);
-
-		task_expired_entitlement(rq, current);
-		__task_rq_unlock(rq);
+		current->time_slice = 1;
+		task_running_tick(cpu_rq(cpu), current);
 	}
 	local_irq_enable();
 out:

-- 
-ck


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Con Kolivas
On Saturday 10 March 2007 07:46, Matt Mackall wrote:
> Ok, I've now disabled sched_yield (I'm using xorg radeon drivers).

Great.

> So far:
>
>           rc2-mm2   RSDL  RSDL+NO_HZ  RSDL+NO_HZ+no_yield  estimated CPU
> no load
>  beryl    good      good  great       great                ~30% at 600MHz
>  galeon   good      good  good        good                 100% at 600MHz
>  mp3      good      good  good        good                 < 5% at 600MHz
>  terminal good      good  good        good                 ~0
>  mouse    good      good  good        good                 ~0
> make
>  beryl              awful ok          good
>  galeon             bad   ok          good
>  mp3                good  good        good
>  terminal           bad   good        good
>  mouse              bad   good        good
>
>
> good = no problems
> ok = noticeable latency
> bad = hard to use
> awful = completely unusable
>
> By the way, make -j5 is my usual kernel compile because it gives me
> the best wall time on this box.
>
> A priori, this load should be manageable by RSDL as the interactive
> loads are all pretty small. So I wrote a little Python script that
> basically continuously memcpys some 16MB chunks of memory:
>
> #!/usr/bin/python
> a = "a" * 16 * 1024 * 1024
> while 1:
> b = a[1:] + "b"
> a = b[1:] + "c"
>
> I've got 1.5G of RAM, so I can run quite a few of these without
> killing my pagecache. This should test whether a) Beryl's actually
> running up against memory bandwidth issues and b) whether "simple"
> static loads work. As you can see, running 5 instances of this script
> leaves me in good shape still. 10 is still in "ok" territory, with top
> showing each getting 9.7-10% of the CPU. 15 starts to feel sluggish.
> 20 the mouse jumps a bit and I got an MP3 skip. 30 is getting pretty
> bad, but still not as bad as the make -j 5 load.
>
> My suspicion is the problem lies in giving too much quanta to
> newly-started processes.

Ah that's some nice detective work there. Mainline does some rather complex 
accounting on sched_fork including (possibly) a whole timer tick which rsdl 
does not do. make forks off continuously so what you say may well be correct. 
I'll see if I can try to revert to the mainline behaviour in sched_fork 
(which was obviously there for a reason).
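
A toy model shows why the fork-time accounting matters for a load like make.
To the best of my reading of mainline's sched_fork, the child takes half of
the parent's remaining timeslice (rounded up) and the parent keeps the other
half, so forking never creates CPU entitlement out of thin air. The second
policy below, where every child starts with a full slice and the parent is
not charged, is a hypothetical worst case for comparison and not a claim
about what any RSDL version did. `fork_split` and `total_after_forks` are
names invented for the sketch.

```python
def fork_split(parent_slice):
    """Split a timeslice at fork: returns (parent_remainder, child_share)."""
    child = (parent_slice + 1) >> 1  # child gets half, rounded up
    return parent_slice >> 1, child  # parent keeps half, rounded down


def total_after_forks(initial_slice, forks, split):
    """Total outstanding ticks held by a parent and its children after forking.

    split=True  -> each fork divides the parent's remaining slice.
    split=False -> hypothetical: each child gets a fresh full slice for free.
    """
    parent = initial_slice
    children = []
    for _ in range(forks):
        if split:
            parent, child = fork_split(parent)
        else:
            child = initial_slice
        children.append(child)
    return parent + sum(children)
```

With a 10-tick slice and 4 forks, the split policy leaves the family holding
the same 10 ticks it started with, while the free-slice policy leaves it
holding 50, which is the kind of advantage a continuously forking make could
accumulate over an interactive task.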

-- 
-ck


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Matt Mackall
On Sat, Mar 10, 2007 at 07:15:38AM +1100, Con Kolivas wrote:
> How odd. I would have thought that if an interaction was to occur it would 
> have been without the new feature. Clearly what you describe without NO_HZ is 
> not the expected behaviour with RSDL. I wonder what went wrong. Are you on 
> 100HZ on that laptop? While I expect 100HZ should be ok, it might just not 
> be... My laptop is about the same performance and works fine with 100HZ under 
> load of all sorts BUT I don't have Beryl (which I would have thought swayed 
> things in the opposite direction also).

Note I also did a test with metacity and got more or less the same
results.

-- 
Mathematics is the supreme nostalgia of our time.


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Matt Mackall
On Sat, Mar 10, 2007 at 07:26:15AM +1100, Con Kolivas wrote:
> > How odd. I would have thought that if an interaction was to occur it would
> > have been without the new feature. Clearly what you describe without NO_HZ
> > is not the expected behaviour with RSDL. I wonder what went wrong. Are you
> > on 100HZ on that laptop? While I expect 100HZ should be ok, it might just
> > not be... My laptop is about the same performance and works fine with 100HZ
> > under load of all sorts BUT I don't have Beryl (which I would have thought
> > swayed things in the opposite direction also).

HZ=250

> Oh and can you grep dmesg for:
> Scheduler bitmap error

Nope, sorry.

-- 
Mathematics is the supreme nostalgia of our time.


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Matt Mackall
On Fri, Mar 09, 2007 at 07:39:05PM +1100, Con Kolivas wrote:
> On Friday 09 March 2007 19:20, Matt Mackall wrote:
> > And I've just rebooted with NO_HZ and things are greatly improved. At
> > idle, Beryl effects are silky smooth (possibly better than stock) and
> > shows less load. Under 'make', Beryl is still responsive as is Galeon.
> > No sign of lagging mouse or typing.
> >
> > Under make -j 5, things are intermittent. Galeon scrolling is
> > sometimes still responsive, but Beryl, terminals and mouse still drag
> > quite a bit.
> 
> I just replied before you sent this one out I think our messages passed each 
> other across the ocean somewhere. I don't quite get what combination of 
> factors you're saying here caused great improvement. Was it enabling NO_HZ on 
> mainline cpu scheduler or disabling NO_HZ or on RSDL?

Ok, I've now disabled sched_yield (I'm using xorg radeon drivers).

So far:

            rc2-mm2  RSDL   RSDL+NO_HZ  RSDL+NO_HZ+no_yield  estimated CPU
no load
 beryl      good     good   great       great                ~30% at 600MHz
 galeon     good     good   good        good                 100% at 600MHz
 mp3        good     good   good        good                 < 5% at 600MHz
 terminal   good     good   good        good                 ~0
 mouse      good     good   good        good                 ~0
make
 beryl               awful  ok          good
 galeon              bad    ok          good
 mp3                 good   good        good
 terminal            bad    good        good
 mouse               bad    good        good
make -j2
 beryl               awful              bad/ok
 metacity                               bad/ok  <- it's not beryl-specific
 galeon              bad                bad/ok
 mp3                 good               good
 terminal            bad                bad/ok
 mouse               bad                bad/ok
make -j5
 beryl      ok       awful  awful       awful/bad
 galeon     ok       bad    bad         bad
 mp3        good     good   good        a couple skips
 terminal   ok       bad    bad         bad
 mouse      good     bad    bad         bad
memload x5
 beryl                                  ok/good
 galeon                                 ok/good
 mp3                                    good
 terminal                               ok/good
 mouse                                  ok/good


good = no problems
ok = noticeable latency
bad = hard to use
awful = completely unusable

By the way, make -j5 is my usual kernel compile because it gives me
the best wall time on this box. 

A priori, this load should be manageable by RSDL as the interactive
loads are all pretty small. So I wrote a little Python script that
basically continuously memcpys some 16MB chunks of memory:

#!/usr/bin/python
# Each pass copies the 16MB string several times (slice, then concatenate).
a = "a" * 16 * 1024 * 1024
while 1:
    b = a[1:] + "b"
    a = b[1:] + "c"

I've got 1.5G of RAM, so I can run quite a few of these without
killing my pagecache. This should test whether a) Beryl's actually
running up against memory bandwidth issues and b) whether "simple"
static loads work. As you can see, running 5 instances of this script
leaves me in good shape still. 10 is still in "ok" territory, with top
showing each getting 9.7-10% of the CPU. 15 starts to feel sluggish.
20 the mouse jumps a bit and I got an MP3 skip. 30 is getting pretty
bad, but still not as bad as the make -j 5 load.
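
The script above targets the Python 2 of the time and never terminates. A minimal Python 3 sketch of the same copy loop, bounded so that it can be timed or run a fixed number of times, might look like this. The function name memload and its parameters are made up for illustration and are not from the original mail:

```python
def memload(iterations, size=16 * 1024 * 1024):
    """Repeatedly copy a `size`-byte buffer; return the final buffer length."""
    a = b"a" * size
    for _ in range(iterations):
        # The slice copies size-1 bytes; the concatenation copies them again
        # into a new size-byte object, so each line moves roughly 2*size bytes.
        b = a[1:] + b"b"
        a = b[1:] + b"c"
    return len(a)


if __name__ == "__main__":
    # Ten passes over the default 16MB buffer.
    print(memload(iterations=10))
```

Starting several copies of this in parallel would approximate the "memload xN" rows, with the caveat that the bounded loop exits instead of running until killed.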

My suspicion is the problem lies in giving too much quanta to
newly-started processes.

-- 
Mathematics is the supreme nostalgia of our time.


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Con Kolivas
On Saturday 10 March 2007 07:15, Con Kolivas wrote:
> On Saturday 10 March 2007 05:27, Matt Mackall wrote:
> > On Fri, Mar 09, 2007 at 07:39:05PM +1100, Con Kolivas wrote:
> > > On Friday 09 March 2007 19:20, Matt Mackall wrote:
> > > > And I've just rebooted with NO_HZ and things are greatly improved. At
> > > > idle, Beryl effects are silky smooth (possibly better than stock) and
> > > > shows less load. Under 'make', Beryl is still responsive as is
> > > > Galeon. No sign of lagging mouse or typing.
> > > >
> > > > Under make -j 5, things are intermittent. Galeon scrolling is
> > > > sometimes still responsive, but Beryl, terminals and mouse still drag
> > > > quite a bit.
> > >
> > > I just replied before you sent this one out I think our messages passed
> > > each other across the ocean somewhere. I don't quite get what
> > > combination of factors you're saying here caused great improvement. Was
> > > it enabling NO_HZ on mainline cpu scheduler or disabling NO_HZ or on
> > > RSDL?
> >
> > Turning on NO_HZ on RSDL greatly improved it. I have not tried NO_HZ
> > on mainline. The first test was with NO_HZ=n, the second was with
> > NO_HZ=y.
>
> How odd. I would have thought that if an interaction was to occur it would
> have been without the new feature. Clearly what you describe without NO_HZ
> is not the expected behaviour with RSDL. I wonder what went wrong. Are you
> on 100HZ on that laptop? While I expect 100HZ should be ok, it might just
> not be... My laptop is about the same performance and works fine with 100HZ
> under load of all sorts BUT I don't have Beryl (which I would have thought
> swayed things in the opposite direction also).

Oh and can you grep dmesg for:
Scheduler bitmap error

If that occurs it's not performing properly. A subtle bug that's busting my 
chops to try and track down.

-- 
-ck
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Con Kolivas
On Saturday 10 March 2007 05:07, Mark Lord wrote:
> Mmm.. when it's good, it's *really* good.
> My desktop feels snappier and all of that.
>
> No noticeable jerkiness of windows/scrolling,
> which I *do* observe with the stock scheduler.

That's good.

> But when it's bad, it stinks.
> Like when a "make -j2" kernel rebuild is happening in a background window

And that's bad. When you say "it stinks" is it more than 3 times slower? It
should be precisely 3 times slower under that load (although low cpu using
things like audio won't be affected by running 3 times slower). If it feels
much slower than that, there is a bug there somewhere. As
another reader suggested, how does it run with the compile 'niced'? How does
it perform with make (without a -j number)?

> This is on a Pentium-M 760 single-core, w/2GB SDRAM (notebook).

What HZ are you running? Are you running a Beryl desktop?

> JADP (Just Another Data Point).

Appreciated, thanks.

> Mark

-- 
-ck


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Con Kolivas
On Saturday 10 March 2007 05:27, Matt Mackall wrote:
> On Fri, Mar 09, 2007 at 07:39:05PM +1100, Con Kolivas wrote:
> > On Friday 09 March 2007 19:20, Matt Mackall wrote:
> > > And I've just rebooted with NO_HZ and things are greatly improved. At
> > > idle, Beryl effects are silky smooth (possibly better than stock) and
> > > shows less load. Under 'make', Beryl is still responsive as is Galeon.
> > > No sign of lagging mouse or typing.
> > >
> > > Under make -j 5, things are intermittent. Galeon scrolling is
> > > sometimes still responsive, but Beryl, terminals and mouse still drag
> > > quite a bit.
> >
> > I just replied before you sent this one out I think our messages passed
> > each other across the ocean somewhere. I don't quite get what combination
> > of factors you're saying here caused great improvement. Was it enabling
> > NO_HZ on mainline cpu scheduler or disabling NO_HZ or on RSDL?
>
> Turning on NO_HZ on RSDL greatly improved it. I have not tried NO_HZ
> on mainline. The first test was with NO_HZ=n, the second was with
> NO_HZ=y.

How odd. I would have thought that if an interaction was to occur it would 
have been without the new feature. Clearly what you describe without NO_HZ is 
not the expected behaviour with RSDL. I wonder what went wrong. Are you on 
100HZ on that laptop? While I expect 100HZ should be ok, it might just not 
be... My laptop is about the same performance and works fine with 100HZ under 
load of all sorts BUT I don't have Beryl (which I would have thought swayed 
things in the opposite direction also).

> As an aside, we should not name config options NO_* or DISABLE_*
> because of the potential for double negation.

Case in point,  I couldn't figure out what you were saying :)

-- 
-ck


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Matt Mackall
On Fri, Mar 09, 2007 at 07:39:05PM +1100, Con Kolivas wrote:
> On Friday 09 March 2007 19:20, Matt Mackall wrote:
> > And I've just rebooted with NO_HZ and things are greatly improved. At
> > idle, Beryl effects are silky smooth (possibly better than stock) and
> > shows less load. Under 'make', Beryl is still responsive as is Galeon.
> > No sign of lagging mouse or typing.
> >
> > Under make -j 5, things are intermittent. Galeon scrolling is
> > sometimes still responsive, but Beryl, terminals and mouse still drag
> > quite a bit.
> 
> I just replied before you sent this one out I think our messages passed each 
> other across the ocean somewhere. I don't quite get what combination of 
> factors you're saying here caused great improvement. Was it enabling NO_HZ on 
> mainline cpu scheduler or disabling NO_HZ or on RSDL?

Turning on NO_HZ on RSDL greatly improved it. I have not tried NO_HZ
on mainline. The first test was with NO_HZ=n, the second was with
NO_HZ=y.

My baseline test was with mainline NO_HZ=y.

As an aside, we should not name config options NO_* or DISABLE_*
because of the potential for double negation.

-- 
Mathematics is the supreme nostalgia of our time.


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Jeffrey Hundstad


Mark Lord wrote:

Mmm.. when it's good, it's *really* good.
My desktop feels snappier and all of that.

No noticeable jerkiness of windows/scrolling,
which I *do* observe with the stock scheduler.

But when it's bad, it stinks.
Like when a "make -j2" kernel rebuild is happening in a background window



Would you please do that same "make -j2" niced.  Tell us how that feels.

--
Jeffrey Hundstad



Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Mark Lord

Mmm.. when it's good, it's *really* good.
My desktop feels snappier and all of that.

No noticeable jerkiness of windows/scrolling,
which I *do* observe with the stock scheduler.

But when it's bad, it stinks.
Like when a "make -j2" kernel rebuild is happening in a background window

This is on a Pentium-M 760 single-core, w/2GB SDRAM (notebook).

JADP (Just Another Data Point).

Cheers

Mark


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Serge Belyshev
William Lee Irwin III <[EMAIL PROTECTED]> writes:

> On Fri, Mar 09, 2007 at 12:07:06PM +0300, Serge Belyshev wrote:
>> If you see sched_yield() when stracing any 3d program, I suggest you
>> to try this bruteforce workaround, which works fine for me,
>> disable sched_yield():
>
> May I suggest LD_PRELOAD of a library consisting of only a nopped
> sched_yield() function in userspace?
>

Sure. This is definitely a clearer way to do it. You just need to put
export LD_PRELOAD=/path/to/your/lib.so somewhere early enough.

cat > yield.c << EOF
int sched_yield (void)
{
        return 0;
}
EOF
gcc yield.c -o yield.so -shared -O2 -fPIC -g
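
Exporting LD_PRELOAD early in the session applies the shim to everything started afterwards. A minimal Python sketch of the narrower alternative, setting LD_PRELOAD only for one launched program, is given below. The helper names preload_env and run_with_shim are hypothetical, and the shim path in the usage note is an assumption; the only facts relied on are that LD_PRELOAD accepts a colon-separated list and that subprocess.run takes an env mapping:

```python
import os
import subprocess


def preload_env(lib_path, base_env=None):
    """Return a copy of the environment with lib_path first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD")
    # Keep any libraries that were already preloaded, after the shim.
    env["LD_PRELOAD"] = f"{lib_path}:{existing}" if existing else lib_path
    return env


def run_with_shim(argv, lib_path):
    """Run argv with the shim preloaded; return the child's exit status."""
    return subprocess.run(argv, env=preload_env(lib_path)).returncode
```

For example, run_with_shim(["glxgears"], "/path/to/your/yield.so") would start a single GL program with sched_yield() nopped out while leaving the rest of the desktop untouched.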


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread William Lee Irwin III
On Fri, Mar 09, 2007 at 12:07:06PM +0300, Serge Belyshev wrote:
> If you see sched_yield() when stracing any 3d program, I suggest you
> to try this bruteforce workaround, which works fine for me,
> disable sched_yield():

May I suggest LD_PRELOAD of a library consisting of only a nopped
sched_yield() function in userspace?


-- wli


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Serge Belyshev
Con Kolivas <[EMAIL PROTECTED]> writes:

> On Friday 09 March 2007 18:53, Matt Mackall wrote:
...
>>
>> With a single non-parallel make running (all in cache, mind you), the
>> system kicks up into just about 100% CPU usage at full speed. Desktop
>> spinning becomes between 10x to 100x slower (from ~30fps to < 1fps).
>> Galeon scrolling pauses for as much as a second. Mouse movement pauses
>> for as much as a second. Typing in terminals lags noticeably.
>>
>> This is not the expected behavior of a fair, low-latency scheduler.
>
> No indeed it does not sound right at all to me either. Last time I
> encountered something like this we traced it and hit sched_yield calls
> somewhere in the graphic pipeline. So first question is, how does mainline
> perform with the same testcase, and second question is umm whatever it is
> that is slow is there a way to trace it to see if it yields?

Matt, some 3d drivers are known to do sched_yield() behind user's back,

(notably dri radeon ones, grep for sched_yield:
http://webcvs.freedesktop.org/mesa/Mesa/src/mesa/drivers/dri/r200/r200_ioctl.c?revision=1.37&view=markup
http://webcvs.freedesktop.org/mesa/Mesa/src/mesa/drivers/dri/r300/radeon_ioctl.c?revision=1.14&view=markup)

thus absolutely killing any desktop interactivity whatsoever.

If you see sched_yield() when stracing any 3d program, I suggest you
try this brute-force workaround (it works fine for me) to
disable sched_yield():


 kernel/sched.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -4285,7 +4285,7 @@ asmlinkage long sys_sched_getaffinity(pi
  * This function yields the current CPU by dropping the priority of current
  * to the lowest priority.
  */
-asmlinkage long sys_sched_yield(void)
+static long sys_sched_yield1(void)
 {
        struct rq *rq = this_rq_lock();
        struct task_struct *p = current;
@@ -4312,6 +4312,11 @@ asmlinkage long sys_sched_yield(void)
        return 0;
 }
 
+asmlinkage long sys_sched_yield(void)
+{
+       return 0;
+}
+
 static void __cond_resched(void)
 {
 #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
@@ -4395,7 +4400,7 @@ EXPORT_SYMBOL(cond_resched_softirq);
 void __sched yield(void)
 {
        set_current_state(TASK_RUNNING);
-       sys_sched_yield();
+       sys_sched_yield1();
 }
 EXPORT_SYMBOL(yield);
 


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Con Kolivas
On Friday 09 March 2007 19:20, Matt Mackall wrote:
> And I've just rebooted with NO_HZ and things are greatly improved. At
> idle, Beryl effects are silky smooth (possibly better than stock) and
> shows less load. Under 'make', Beryl is still responsive as is Galeon.
> No sign of lagging mouse or typing.
>
> Under make -j 5, things are intermittent. Galeon scrolling is
> sometimes still responsive, but Beryl, terminals and mouse still drag
> quite a bit.

I just replied before you sent this one out; I think our messages passed each
other across the ocean somewhere. I don't quite get what combination of
factors you're saying here caused great improvement. Was it enabling NO_HZ on
mainline cpu scheduler or disabling NO_HZ or on RSDL?

-- 
-ck


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Con Kolivas
On Friday 09 March 2007 18:53, Matt Mackall wrote:
> Well then I suppose something must be broken. When my box is idle, I
> can grab my desktop and spin it around and generate less than 25% CPU
> with the CPU stepped all the way down from 1.7GHz to 600MHz (Beryl is
> actually much snappier than many conventional window managers by doing
> just about everything through GL). By comparison, grabbing the Galeon
> scroll bar and wiggling it will generate 100% CPU (still throttled
> though) but remain relatively smooth.
>
> With a single non-parallel make running (all in cache, mind you), the
> system kicks up into just about 100% CPU usage at full speed. Desktop
> spinning becomes between 10x to 100x slower (from ~30fps to < 1fps).
> Galeon scrolling pauses for as much as a second. Mouse movement pauses
> for as much as a second. Typing in terminals lags noticeably.
>
> This is not the expected behavior of a fair, low-latency scheduler.

No indeed it does not sound right at all to me either. Last time I encountered
something like this we traced it and hit sched_yield calls somewhere in the
graphic pipeline. So the first question is, how does mainline perform with the
same testcase, and the second question is, umm, whatever it is that is slow, is
there a way to trace it to see if it yields?

> For reference, this was with HZ=250, PREEMPT, PREEMPT_BKL, and !NO_HZ.

Ah I also wonder if it hasn't broken with NO_HZ. I haven't had a chance to 
even confirm that the code works properly with it, I was only assuming (after 
our last chat). See if turning that off makes a difference?

Thanks for testing!

-- 
-ck


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Matt Mackall
On Fri, Mar 09, 2007 at 01:53:58AM -0600, Matt Mackall wrote:
> On Fri, Mar 09, 2007 at 05:28:03PM +1100, Con Kolivas wrote:
> > On Friday 09 March 2007 16:39, Matt Mackall wrote:
> > > First off, let me say that I think your approach has great promise,
> > > but I'm afraid it doesn't work so well here yet.
> > >
> > > Box is an R51 Thinkpad, 1.7GHz Pentium M. I'm using a make -j 5 as a
> > > test load.
> > >
> > > With 2.6.21-rc2-mm2, I get slightly sluggish response for opening new
> > > terminals, scrolling in Galeon, and a bit jerky behaviour for spinning
> > > Beryl's 3D desktop. Playing MP3s off an sshfs FUSE mount works fine.
> > > Typing across ssh sessions has no noticeable lag. Mouse pointer
> > > movement is smooth.
> > >
> > > With 2.6.21-rc3-mm1, terminals take longer to open, Galeon is
> > > noticeably more sluggish, and Beryl's desktop switching goes from being
> > > jerky to a 5-second agony. Typing in shells, remote or not,
> > > lags noticeably. Mouse pointer is alternately smooth or jerky. But
> > > MP3s still work great!
> > >
> > > Problems persist with make -j 2 and make.
> > 
> > make -j5 sucks you'll get precisely 1/6th cpu for galeon with this
> > scheduler which is perfectly fair and I make no apology for it, nor do I
> > plan to optimise for it. With make (without jobs) you'll still only get
> > 50% cpu so it should be precisely half speed unless you nice it. Does it
> > feel precisely half speed? It's supposed to. This is one of the drawbacks
> > of a perfectly fair approach; its... fair and will need more liberal use
> > of nice.
> 
> Well then I suppose something must be broken. When my box is idle, I
> can grab my desktop and spin it around and generate less than 25% CPU
> with the CPU stepped all the way down from 1.7GHz to 600MHz (Beryl is
> actually much snappier than many conventional window managers by doing
> just about everything through GL). By comparison, grabbing the Galeon
> scroll bar and wiggling it will generate 100% CPU (still throttled
> though) but remain relatively smooth.
> 
> With a single non-parallel make running (all in cache, mind you), the
> system kicks up into just about 100% CPU usage at full speed. Desktop
> spinning becomes between 10x to 100x slower (from ~30fps to < 1fps).
> Galeon scrolling pauses for as much as a second. Mouse movement pauses
> for as much as a second. Typing in terminals lags noticeably.
> 
> This is not the expected behavior of a fair, low-latency scheduler.
> 
> For reference, this was with HZ=250, PREEMPT, PREEMPT_BKL, and !NO_HZ.

And I've just rebooted with NO_HZ and things are greatly improved. At
idle, Beryl effects are silky smooth (possibly better than stock) and
shows less load. Under 'make', Beryl is still responsive as is Galeon.
No sign of lagging mouse or typing.

Under make -j 5, things are intermittent. Galeon scrolling is
sometimes still responsive, but Beryl, terminals and mouse still drag
quite a bit.

-- 
Mathematics is the supreme nostalgia of our time.


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-09 Thread Matt Mackall
On Fri, Mar 09, 2007 at 05:28:03PM +1100, Con Kolivas wrote:
> On Friday 09 March 2007 16:39, Matt Mackall wrote:
> > First off, let me say that I think your approach has great promise,
> > but I'm afraid it doesn't work so well here yet.
> >
> > Box is an R51 Thinkpad, 1.7GHz Pentium M. I'm using a make -j 5 as a
> > test load.
> >
> > With 2.6.21-rc2-mm2, I get slightly sluggish response for opening new
> > terminals, scrolling in Galeon, and a bit jerky behaviour for spinning
> > Beryl's 3D desktop. Playing MP3s off an sshfs FUSE mount works fine.
> > Typing across ssh sessions has no noticeable lag. Mouse pointer
> > movement is smooth.
> >
> > With 2.6.21-rc3-mm1, terminals take longer to open, Galeon is
> > noticeably more sluggish, and Beryl's desktop switching goes from being
> > jerky to a 5-second agony. Typing in shells, remote or not,
> > lags noticeably. Mouse pointer is alternately smooth or jerky. But
> > MP3s still work great!
> >
> > Problems persist with make -j 2 and make.
> 
> make -j5 sucks you'll get precisely 1/6th cpu for galeon with this scheduler 
> which is perfectly fair and I make no apology for it, nor do I plan to 
> optimise for it. With make (without jobs) you'll still only get 50% cpu so it 
> should be precisely half speed unless you nice it. Does it feel precisely 
> half speed? It's supposed to. This is one of the drawbacks of a perfectly 
> fair approach; its... fair and will need more liberal use of nice.

Well then I suppose something must be broken. When my box is idle, I
can grab my desktop and spin it around and generate less than 25% CPU
with the CPU stepped all the way down from 1.7GHz to 600MHz (Beryl is
actually much snappier than many conventional window managers by doing
just about everything through GL). By comparison, grabbing the Galeon
scroll bar and wiggling it will generate 100% CPU (still throttled
though) but remain relatively smooth.

With a single non-parallel make running (all in cache, mind you), the
system kicks up into just about 100% CPU usage at full speed. Desktop
spinning becomes between 10x to 100x slower (from ~30fps to < 1fps).
Galeon scrolling pauses for as much as a second. Mouse movement pauses
for as much as a second. Typing in terminals lags noticeably.

This is not the expected behavior of a fair, low-latency scheduler.

For reference, this was with HZ=250, PREEMPT, PREEMPT_BKL, and !NO_HZ.

-- 
Mathematics is the supreme nostalgia of our time.


Re: 2.6.21-rc3-mm1 RSDL results

2007-03-08 Thread Con Kolivas
On Friday 09 March 2007 16:39, Matt Mackall wrote:
> First off, let me say that I think your approach has great promise,
> but I'm afraid it doesn't work so well here yet.
>
> Box is an R51 Thinkpad, 1.7GHz Pentium M. I'm using a make -j 5 as a
> test load.
>
> With 2.6.21-rc2-mm2, I get slightly sluggish response for opening new
> terminals, scrolling in Galeon, and a bit jerky behaviour for spinning
> Beryl's 3D desktop. Playing MP3s off an sshfs FUSE mount works fine.
> Typing across ssh sessions has no noticeable lag. Mouse pointer
> movement is smooth.
>
> With 2.6.21-rc3-mm1, terminals take longer to open, Galeon is
> noticeably more sluggish, and Beryl's desktop switching goes from being
> jerky to a 5-second agony. Typing in shells, remote or not,
> lags noticeably. Mouse pointer is alternately smooth or jerky. But
> MP3s still work great!
>
> Problems persist with make -j 2 and make.

make -j5 sucks; you'll get precisely 1/6th cpu for galeon with this scheduler
which is perfectly fair and I make no apology for it, nor do I plan to
optimise for it. With make (without jobs) you'll still only get 50% cpu so it
should be precisely half speed unless you nice it. Does it feel precisely
half speed? It's supposed to. This is one of the drawbacks of a perfectly
fair approach; it's... fair and will need more liberal use of nice.
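
The 1/6th and 50% figures follow from dividing one CPU equally among the runnable tasks. A minimal Python sketch of that arithmetic is given below, assuming a single CPU, equal nice levels, and that each make job is always runnable (an idealisation, since real compile jobs also block on I/O and fork short-lived children). The function name fair_share is made up for illustration:

```python
from fractions import Fraction


def fair_share(cpu_hogs, cpus=1):
    """CPU fraction one always-runnable task gets next to cpu_hogs equal peers."""
    runnable = cpu_hogs + 1  # the competing jobs plus the task of interest
    return min(Fraction(1), Fraction(cpus, runnable))


for label, hogs in [("make", 1), ("make -j2", 2), ("make -j5", 5)]:
    share = fair_share(hogs)
    print(f"{label}: share {share}, slowdown x{1 / share}")
```

Under these assumptions a task that wants the whole CPU runs at 1/2 speed beside a plain make and at 1/6 speed beside make -j5, while a task that needs less than its share (an MP3 player, say) should not slow down at all.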

-- 
-ck