Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-21 Thread Bron Gondwana
On Thu, Nov 22, 2007 at 10:51:15AM +1100, Bron Gondwana wrote:
> On Thu, Nov 15, 2007 at 08:32:22AM -0800, Linus Torvalds wrote:
> > If this patch makes a difference, please holler. I think it's the correct 
> > thing to do, but I'm not going to actually commit it without somebody 
> > saying that it makes a difference (and preferably Peter Zijlstra and 
> > Andrew acking it too).
> 
> mmap: mmap call failed: errno: 12 errmsg: Cannot allocate memory
> 
> Yep, that's "fixed" the problem alright!  No way this puppy is
> dirtying 2Gb of memory any more.
> 
> http://linux.brong.fastmail.fm/2007-11-22/bmtest.pl

Alternatively, perhaps I'm just a moron who built the new kernel from a
config file with CONFIG_PAGE_OFFSET=0x8000 set (I hadn't committed that
change because it turned out not to solve the issue it was there for).
That would explain a few things.

[EMAIL PROTECTED] perl]$ free
             total       used       free     shared    buffers     cached
Mem:       4150620    2272284    1878336          0      11212    2066536
-/+ buffers/cache:     194536    3956084
Swap:      2096472          0    2096472

That's more the usage I would expect to see.

Now for the downside.  It works again, but it still runs slowly.  It seems
to hit (and this is totally unscientific, I'm just watching the numbers
scroll by) at about 12 writes rather than 7 writes, but that's still not
fitting the whole file dirty.

I notice that PF_LESS_THROTTLE gets set by nfsd to claim an extra 25%
bonus on the dirty thresholds.  Potentially dcc could use a similar trick
to claim extra space, if that knob were available up in userspace.  I'm
happy to patch dcc as well if I have to; I'm already backporting it, so
adding another little quilt directory and applying a patch is pretty
trivial (must try guilt/stgit one of these days).
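
For reference, the 25% boost in question lives in get_dirty_limits()
itself; in 2.6.23-era mm/page-writeback.c the logic looks roughly like
this (paraphrased from memory, so treat it as a sketch rather than the
exact source):

        /* after background and dirty are computed from the ratios ... */
        tsk = current;
        if (tsk->flags & PF_LESS_THROTTLE || rt_task(tsk)) {
                /* nfsd (and realtime tasks) get ~25% more headroom
                 * before being throttled for dirtying pages */
                background += background / 4;
                dirty += dirty / 4;
        }

Since PF_LESS_THROTTLE is a task flag only ever set inside the kernel,
a userspace process like dcc can't simply opt in to it; getting the same
treatment really would mean a kernel patch or a new knob.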

Bron.




Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-21 Thread Bron Gondwana
On Thu, Nov 15, 2007 at 08:32:22AM -0800, Linus Torvalds wrote:
> On Thu, 15 Nov 2007, Bron Gondwana wrote:
> > 
> > I guess we'll be doing the one-liner kernel mod and testing
> > that then.
> 
> The thing to look at is "get_dirty_limits()" in mm/page-writeback.c, and 
> in this particular case it's the
> 
>   unsigned long available_memory = determine_dirtyable_memory();
> 
> that's going to bite you. In particular, note the
> 
>   x -= highmem_dirtyable_memory(x);
> 
> that we do in determine_dirtyable_memory().
> 
> So in this case, if you basically remove that line, it will allow all of 
> memory to be dirtied (including highmem), and then the background_ratio 
> will work on the whole 6GB.
> 
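
[For context, determine_dirtyable_memory() in kernels of roughly this
vintage looks something like the following - paraphrased, so a sketch
rather than the exact source:

        static unsigned long determine_dirtyable_memory(void)
        {
                unsigned long x;

                x = global_page_state(NR_FREE_PAGES)
                        + global_page_state(NR_INACTIVE)
                        + global_page_state(NR_ACTIVE);
                /* on HIGHMEM configs, subtract the highmem share so the
                 * dirty ratios apply to lowmem only */
                x -= highmem_dirtyable_memory(x);
                return x + 1;   /* never return 0 */
        }

which is why, on a 32-bit PAE box, the percentages end up being applied
to well under 1GB of lowmem rather than to the full 6GB.]
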
> HOWEVER! It's worth noting that we also have some other old legacy cruft 
> there that may interfere with your code. In particular, if you look at the 
> top of "get_dirty_limits()", it *also* does a
> 
>         unmapped_ratio = 100 - ((global_page_state(NR_FILE_MAPPED) +
>                                 global_page_state(NR_ANON_PAGES)) * 100) /
>                                         available_memory;
> 
>         dirty_ratio = vm_dirty_ratio;
>         if (dirty_ratio > unmapped_ratio / 2)
>                 dirty_ratio = unmapped_ratio / 2;
> 
> and that whole "unmapped_ratio" comparison is probably bogus these days, 
> since we now take the mapped dirty pages into account. That code harks 
> back to the days before we did that, and dirty ratios only affected 
> non-mapped pages.
> 
> And in particular, now that I look at it, I wonder if it can even go 
> negative (because "available_memory" may be *smaller* than the 
> NR_FILE_MAPPED|ANON_PAGES sum!).
> 
> We'll fix up a negative value anyway (because of the clamping of 
> dirty_ratio to no less than 5), but the point is that the whole 
> "unmapped_ratio" thing probably doesn't make sense any more, and may well 
> make the dirty_ratio not work for you, because you may have a very small 
> unmapped_ratio that effectively makes all dirty limits always clamp to a 
> very small value.
> 
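
[As a worked example of how that clamp can bite: suppose on a 6GB PAE box
available_memory (lowmem only) is about 220,000 pages while
NR_FILE_MAPPED + NR_ANON_PAGES is about 500,000 pages, most of it sitting
in highmem.  Then

        unmapped_ratio = 100 - (500000 * 100) / 220000 = 100 - 227 = -127

so dirty_ratio is first clamped down to -127/2 and then back up to the
minimum of 5: the box always runs with an effective 5% dirty limit no
matter what vm_dirty_ratio says.  The numbers are made up purely for
illustration.]
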
> So regardless, I think you may want to try the appended patch *first*.
> 
> If this patch makes a difference, please holler. I think it's the correct 
> thing to do, but I'm not going to actually commit it without somebody 
> saying that it makes a difference (and preferably Peter Zijlstra and 
> Andrew acking it too).

mmap: mmap call failed: errno: 12 errmsg: Cannot allocate memory

Yep, that's "fixed" the problem alright!  No way this puppy is
dirtying 2Gb of memory any more.

http://linux.brong.fastmail.fm/2007-11-22/bmtest.pl

That said, pushing the size down to 1700 rather than 2000 in that
file makes it run, and the behaviour matches the 2000 Mb case on
2.6.16.55 rather than 2.6.20.20 or 2.6.23.1 (my other test-case
kernels that happened to be pre-built on that machine).

[EMAIL PROTECTED] ~]$ free
             total       used       free     shared    buffers     cached
Mem:       4149836    2073056    2076780          0      22036    1846096
-/+ buffers/cache:     204924    3944912
Swap:      2096472          0    2096472

That's after running the 1700Mb version.  You can see this machine is our
one remaining 4Gb machine (it's not running any production services, unlike
the 6Gb machine, so it's better for testing).

Anyway - looks like this may be a "good enough" solution for out1 if
it can manage an ~2Gb file with 6Gb of memory available.  I'll test
that later today - but I should drag myself into the office now...

Bron.

(patch left attached below for reference)

> Only *after* testing this change is it probably a good idea to test the 
> real hack of then removing the highmem_dirtyable_memory() thing. 
> 
> Peter? Andrew?
> 
>   Linus
> 
> ---
>  mm/page-writeback.c |    8 --------
>  1 files changed, 0 insertions(+), 8 deletions(-)
> 
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 81a91e6..d55cfca 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -297,20 +297,12 @@ get_dirty_limits(long *pbackground, long *pdirty, long 
> *pbdi_dirty,
>  {
>   int background_ratio;   /* Percentages */
>   int dirty_ratio;
> - int unmapped_ratio;
>   long background;
>   long dirty;
>   unsigned long available_memory = determine_dirtyable_memory();
>   struct task_struct *tsk;
>  
> - unmapped_ratio = 100 - ((global_page_state(NR_FILE_MAPPED) +
> - global_page_state(NR_ANON_PAGES)) * 100) /
> - available_memory;
> -
>   dirty_ratio = vm_dirty_ratio;
> - if (dirty_ratio > unmapped_ratio / 2)
> - dirty_ratio = unmapped_ratio / 2;
> -
>   if (dirty_ratio < 5)
>   dirty_ratio = 5;
>  

Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-21 Thread Jan Engelhardt

On Thu, 15 Nov 2007 13:47:54 -0800 (PST) Linus Torvalds wrote:
>
>But quite frankly, I refuse to even care about anything past that. If 
>you have 12G (or heaven forbid, even more) in your machine, and you 
>can't be bothered to just upgrade to a 64-bit CPU, then quite frankly, 
>*I* personally can't be bothered to care.
>
>If they have that much RAM (and bought it a few years ago when a 64-bit 
>CPU wasn't an option), they can't be poor.
>
>So the _only_ explanation today for 12GB on a 32-bit machine is
> (a) insanity
>or 
> (b) being so lazy as to not bother to upgrade
>

Just around the corner...

$ ftp ftp
Connected to ftp.gwdg.de.
220-
220-Gesellschaft fuer wissenschaftliche Datenverarbeitung mbH Goettingen
220-
220-This is a Linux PC (Dell PE-2650, 2 CPUs P4/2800, 12 GB RAM)
220-running SuSE-Linux-8.2 with SuSE kernel 2.4.20-64GB-SMP.

There is no reason to upgrade the hardware - if it works, then good.
And I am pretty sure that a few 2 GB sticks are cheaper than a big
Opteron (if you only go by price). They sure are now - and probably
were even back then.


Re: size of git repository (was Re: [BUG] New Kernel Bugs)

2007-11-18 Thread Willy Tarreau
On Sun, Nov 18, 2007 at 03:56:11PM +0100, Ingo Molnar wrote:
> 
> * Pavel Machek <[EMAIL PROTECTED]> wrote:
> 
> > On Tue 2007-11-13 12:50:08, Mark Lord wrote:
> > > Ingo Molnar wrote:
> > > >
> > > >for example git-bisect was godsent. I remember that 
> > > >years ago bisection of a bug was a very laborious task 
> > > >so that it was only used as a final, last-ditch 
> > > >approach for really nasty bugs. Today we can 
> > > >autonomously bisect build bugs via a simple shell 
> > > >command around "git-bisect run", without any human 
> > > >interaction! This freed up testing resources 
> > > ..
> > > 
> > > It's only a godsend for the few people who happen to be 
> > > kernel developers
> > > and who happen to already use git.
> > > 
> > > It's a 540MByte download over a slow link for everyone 
> > > else.
> > 
> > Hmmm, clean-cg is 7.7G on my machine, and yes I tried 
> > git-prune-packed. What am I doing wrong?
> 
> "git-repack -a -d" gives me ~220 MB:
> 
>   $ du -s .git
>   222064  .git
> 
> anyone who can download a 43 MB tar.bz2 tarball for a kernel release 
> should be able to afford a _one time_ download size of 250 MB (the size 
> of the current kernel.org git repository). If not, burning a CD or DVD 
> and carrying it home ought to do the trick. Git is very 
> bandwidth-efficient after that point - lots of people behind narrow 
> pipes are using it - it's just the initial clone that takes time. And 
> given all the history and metadata that the git repository carries (full 
> changelogs, annotations, etc.) it's a no-brainer that kernel developers 
> should be using it.
> 
> (and you can shrink the 250 MB further down by using shallow clones, 
> etc.)
> 
> yes, some people complained when distros stopped doing floppy installs. 
> Some people complained when distros stopped doing CD installs. Yes, i've 
> myself done a 250+ MB download over a 56 kbit modem in the past, and 
> while it indeed took overnight to finish, it's very much doable. It's 
> not really qualitatively different from the 1.5 hours a kernel tar.bz2 
> took to download.

Probably we should, once in a while, put up a complete tree in
tar.bz2 format on kernel.org. It would help a lot of people behind small
pipes. I have been encountering problems with git-clone when the link is
unstable: after the smallest error, it erases everything and you have to
retry from the start, which is quite frustrating and expensive.

At least downloading a tar.bz2 over FTP would be easier and a lot more
reliable. Also, people could download it from their workplace and bring
it home.
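
(For what it's worth, a plain HTTP or FTP download can also be resumed
after a dropped connection, e.g. with wget's -c/--continue flag:

        $ wget -c http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.23.tar.bz2

which picks a partial file up where it left off instead of starting over -
something a git-clone of this era cannot do.)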

Willy



Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-18 Thread Bron Gondwana
On Thu, Nov 15, 2007 at 01:14:32PM -0800, Linus Torvalds wrote:
 
Sorry about not replying to this earlier.  I actually got a weekend
away from the computer pretty much last weekend - took the kids
swimming, helped a friend clear dead wood from around her house
before the fire season.  Shocking, I know.
 
> On Thu, 15 Nov 2007, Linus Torvalds wrote:
> > 
> > Unacceptable. We used to do exactly what your patch does, and it got fixed 
> > once. We're not introducing that fundamentally broken concept again.
> 
> Examples of non-broken solutions:
>  (a) always use lowmem sizes (what we do now)
>  (b) always use total mem sizes (sane but potentially dangerous: but the 
>  VM pressure should work! It has serious bounce-buffer issues, though, 
>  which is why I think it's crazy even if it's otherwise consistent)
>  (c) make all dirty counting be *purely* per-bdi, so that everybody can 
>  disagree on what the limits are, but at least they also then use 
>  different counters
> 
> So it's just the "different writers look at the same dirty counts but then 
> interpret it to mean totally different things" that I think is so 
> fundamentally bogus. I'm not claiming that what we do now is the only way 
> to do things, I just don't think your approach is tenable.
> 
> Btw, I actually suspect that while (a) is what we do now, for the specific 
> case that Bron has, we could have a /proc/sys/vm option to just enable 
> (b). So we don't have to have just one consistent model, we can allow odd 
> users (and Bron sounds like one - sorry Bron ;) to just force other, odd, 
> but consistent models.

Hey, if Andrew Morton can tell us we find all the interesting bugs, you
can call me odd.  I've been called worse!

We also run ReiserFS (3 of course, I tried 4 and it et my laptop disk)
on all our production IMAP servers.  Tried ext3 and the performance was
so horrible that our users hated us (and I hated being woken in the
night by things timing out and paging me).  And I'm spending far too
long still writing C thanks to Cyrus having enough bugs to keep me busy
for the rest of my natural life if I don't break and go write my own
IMAP server at some point. *clears throat*

> I'd also like to point out that while the "bounce buffer" issue is not so 
> much a HIGHMEM issue on its own (it's really about the device DMA limits, 
> which are _independent_ of HIGHMEM, of course), the reason HIGHMEM is 
> special is that without HIGHMEM the bounce buffers generally work 
> perfectly fine.
> 
> The problem with HIGHMEM is that it causes various metadata (dentries, 
> inodes, page struct tables etc) to eat up memory "prime real estate" under 
> the same kind of conditions that also dirty a lot of memory. So the reason 
> we disallow HIGHMEM from dirty limits is only *partly* the per-device or 
> mapping DMA limits, and to a large degree the fact that non-highmem memory 
> is special in general, and it is usually the non-highmem areas that are 
> constrained - and need to be protected.

I'm going to finish off writing a decent test case so I can reliably
reproduce the problem first, and then go compile a small set of kernels
with the various patches that have been thrown around here and see if
they solve the problems for me.

Thankfully I don't have the same problem you do, Linus - I don't care if
any particular patch isn't consistent - isn't fair in the general sense
- even "doesn't work for anyone else".  So long as it's stable and it
works on this machine, I'm happy to support it through the next couple
of years until we either get a world-facing 64 bit machine with the
spare capacity to run DCC or we drop DCC.

The only reason to upgrade the kernel there at all is keeping up to date
with security patches, and weighing the relative tradeoffs of backporting
them (or expecting Adrian Bunk to keep doing it for us) against maintaining
a small patch to keep the behaviour of the one thing we like.

And to all of you in this thread (especially Linus and Peter) - thanks
heaps for grabbing on to a throwaway line in an unrelated discussion
and putting in the work to:

a) explain the problem and the cause to me before I put in heaps of
   work tracking it down; and

b) put together some patches for me to test.

A couple of days ago I saw someone post an "Ask Slashdot" asking whether
6 weeks was an appropriate time for a software vendor to get a fix
out to a customer, implying that the customer was an unrealistic
whiner for expecting anyone to be able to do better.  I'll be able to
point to this thread if anyone suggests that you can't get decent
support on Linux!

Bron.


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-18 Thread Bron Gondwana
On Sun, Nov 18, 2007 at 04:13:18PM -0700, Daniel Phillips wrote:
> On Thursday 15 November 2007 14:24, Rob Mueller wrote:
> > > That's my personal opinion, and I realize that some of the
> > > commercial vendors may care about their insane customers'
> > > satisfaction, but I'm simply not interested in insane users. If
> > > they have that much RAM (and bought it a few years ago when a
> > > 64-bit CPU wasn't an option), they can't be poor.
> >
> > From our perspective, the main issue is that on some of these machines
> > we spent quite a bit of money on the big RAM (for its day) + lots of
> > 15k RPM SCSI drives + multi-year support contracts. They're highly IO
> > bound, and barely use 10-20% of their old 2.4GHz Prestonia Xeon CPUs.
> > It's hard to justify junking those machines at < 5 years old.
> >
> > We have a couple of 6G machines and some 8G machines using PAE. On
> > the whole, they actually have been working really well (hmmm, apart
> > from the recent dirty pages issue + reiserfs data=journal leaks +
> > inodes in lowmem limits)
> 
> Junk everything except the 15K drives, you will be glad you did.  Too 
> bad about those multi-year support contracts, hopefully you got a deal 
> on them.

Actually, the 15K drives are the bit we're getting the least use out
of now, since we're moving everything to external SATA units that
are more easily swappable.
 
> Prediction: after these dirty pages issues are gone, there will be more 
> dirty page issues because the notion of dirty page limit is 
> fundamentally broken.  Your smartest recourse is to re-motherboard to a 
> place where the dirty page limit borkage does not hurt as much, and in 
> the process you will get a cheap hardware upgrade.  Everybody will be 
> happy, the sun will come out, the birds will sing.

Or just keep running 2.6.16 where it's all been working quite fine
thanks very much, or maintain a simple patch that rips all that out
since we don't care too much about "fairness" - we only run a couple
of things on that machine and they run fine.

Bron ( going to settle down and really test this stuff to make sure
   we have an acceptable "fix" for us then do it! )


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-18 Thread Daniel Phillips
On Thursday 15 November 2007 14:24, Rob Mueller wrote:
> > That's my personal opinion, and I realize that some of the
> > commercial vendors may care about their insane customers'
> > satisfaction, but I'm simply not interested in insane users. If
> > they have that much RAM (and bought it a few years ago when a
> > 64-bit CPU wasn't an option), they can't be poor.
>
> From our perspective, the main issue is that on some of these machines
> we spent quite a bit of money on the big RAM (for its day) + lots of
> 15k RPM SCSI drives + multi-year support contracts. They're highly IO
> bound, and barely use 10-20% of their old 2.4GHz Prestonia Xeon CPUs.
> It's hard to justify junking those machines at < 5 years old.
>
> We have a couple of 6G machines and some 8G machines using PAE. On
> the whole, they actually have been working really well (hmmm, apart
> from the recent dirty pages issue + reiserfs data=journal leaks +
> inodes in lowmem limits)

Junk everything except the 15K drives; you will be glad you did.  Too
bad about those multi-year support contracts - hopefully you got a deal
on them.

Prediction: after these dirty pages issues are gone, there will be more 
dirty page issues because the notion of dirty page limit is 
fundamentally broken.  Your smartest recourse is to re-motherboard to a 
place where the dirty page limit borkage does not hurt as much, and in 
the process you will get a cheap hardware upgrade.  Everybody will be 
happy, the sun will come out, the birds will sing.

Regards,

Daniel


Re: [BUG] New Kernel Bugs

2007-11-18 Thread Theodore Tso
On Sat, Nov 17, 2007 at 01:20:10PM +0100, Adrian Bunk wrote:
> > But a bisect takes around 7 compiles.
> >...
> 
> I don't understand that number.
> 
> The common case is a regression in -rc1, and a bisection of
> about 7000 commits takes around 13 compiles.

Worst case it would take 13.  In practice I've seen less.  Part of it,
I suspect, is that I'm not starting from the previous 2.6.23 release but
rather from a more recent daily -git13 or -git14 tree.  Part of it may
be that git can sometimes be more efficient by cutting off entire
branches with trial builds, since git history isn't completely linear,
but rather pulls various branches together.
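
(The 13 comes straight out of the binary search: each build/boot halves
the remaining range, so ~7000 commits need ceil(log2(7000)) = 13 steps,
since 2^12 = 4096 < 7000 <= 8192 = 2^13.  Seven compiles would only cover
a range of at most 2^7 = 128 commits.)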

- Ted


Re: size of git repository (was Re: [BUG] New Kernel Bugs)

2007-11-18 Thread Rene Herman

On 18-11-07 15:35, James Bottomley wrote:

>> clean-cg? But failure to run "git repack -a -d" every once in a while?
>
> Actually, the best command is
>
> git gc
>
> which does a repack (into a single pack file rather than an incremental),
> and then removes all the objects now in the pack.  If, like me, you work
> on temporary branches which you keep rebasing, you can add a --prune to
> gc which will erase all unreferenced objects as it packs (use this one
> with care.  I usually never use it but run a git prune -n just to see
> what would be removed, and then run git prune separately if it looks OK).

Thanks for the comment. That indeed managed to shave a few extra bytes off
my already "repack -a -d" packed repo.

Rene.



Re: size of git repository (was Re: [BUG] New Kernel Bugs)

2007-11-18 Thread Ingo Molnar

* Pavel Machek <[EMAIL PROTECTED]> wrote:

> On Tue 2007-11-13 12:50:08, Mark Lord wrote:
> > Ingo Molnar wrote:
> > >
> > >for example git-bisect was godsent. I remember that 
> > >years ago bisection of a bug was a very laborious task 
> > >so that it was only used as a final, last-ditch 
> > >approach for really nasty bugs. Today we can 
> > >autonomously bisect build bugs via a simple shell 
> > >command around "git-bisect run", without any human 
> > >interaction! This freed up testing resources 
> > ..
> > 
> > It's only a godsend for the few people who happen to be 
> > kernel developers
> > and who happen to already use git.
> > 
> > It's a 540MByte download over a slow link for everyone 
> > else.
> 
> Hmmm, clean-cg is 7.7G on my machine, and yes I tried 
> git-prune-packed. What am I doing wrong?

"git-repack -a -d" gives me ~220 MB:

  $ du -s .git
  222064  .git

anyone who can download a 43 MB tar.bz2 tarball for a kernel release 
should be able to afford a _one time_ download size of 250 MB (the size 
of the current kernel.org git repository). If not, burning a CD or DVD 
and carrying it home ought to do the trick. Git is very 
bandwidth-efficient after that point - lots of people behind narrow 
pipes are using it - it's just the initial clone that takes time. And 
given all the history and metadata that the git repository carries (full 
changelogs, annotations, etc.) it's a no-brainer that kernel developers 
should be using it.

(and you can shrink the 250 MB further down by using shallow clones, 
etc.)
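
(a shallow clone of the kernel tree, for instance, would look roughly like

        $ git clone --depth 1 \
                git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git

which fetches only the most recent history instead of all of it, cutting
the initial download down considerably - at the cost of not having the
older history available locally.)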

yes, some people complained when distros stopped doing floppy installs. 
Some people complained when distros stopped doing CD installs. Yes, i've 
myself done a 250+ MB download over a 56 kbit modem in the past, and 
while it indeed took overnight to finish, it's very much doable. It's 
not really qualitatively different from the 1.5 hours a kernel tar.bz2 
took to download.

Ingo


Re: size of git repository (was Re: [BUG] New Kernel Bugs)

2007-11-18 Thread James Bottomley

On Sun, 2007-11-18 at 13:58 +0100, Rene Herman wrote:
> On 18-11-07 13:44, Pavel Machek wrote:
> 
> > On Tue 2007-11-13 12:50:08, Mark Lord wrote:
> 
> >> It's a 540MByte download over a slow link for everyone 
> >> else.
> > 
> > Hmmm, clean-cg is 7.7G on my machine, and yes I tried
> > git-prune-packed. What am I doing wrong?
> 
> clean-cg? But failure to run "git repack -a -d" every once in a while?

Actually, the best command is 

git gc

which does a repack (into a single pack file rather than an incremental),
and then removes all the objects now in the pack.  If, like me, you work
on temporary branches which you keep rebasing, you can add a --prune to
gc which will erase all unreferenced objects as it packs (use this one
with care.  I usually never use it but run a git prune -n just to see
what would be removed, and then run git prune separately if it looks
OK).
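
So the cautious sequence is roughly:

        $ git gc          # repack everything into a single pack
        $ git prune -n    # dry run: list the unreachable objects that would go
        $ git prune       # remove them for real once that list looks sane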

James




Re: size of git repository (was Re: [BUG] New Kernel Bugs)

2007-11-18 Thread Rene Herman

On 18-11-07 13:44, Pavel Machek wrote:

> On Tue 2007-11-13 12:50:08, Mark Lord wrote:
>
>> It's a 540MByte download over a slow link for everyone
>> else.
>
> Hmmm, clean-cg is 7.7G on my machine, and yes I tried
> git-prune-packed. What am I doing wrong?

clean-cg? But failure to run "git repack -a -d" every once in a while?

Rene.


size of git repository (was Re: [BUG] New Kernel Bugs)

2007-11-18 Thread Pavel Machek
On Tue 2007-11-13 12:50:08, Mark Lord wrote:
> Ingo Molnar wrote:
> >
> >for example git-bisect was godsent. I remember that 
> >years ago bisection of a bug was a very laborious task 
> >so that it was only used as a final, last-ditch 
> >approach for really nasty bugs. Today we can 
> >autonomously bisect build bugs via a simple shell 
> >command around "git-bisect run", without any human 
> >interaction! This freed up testing resources 
> ..
> 
> It's only a godsend for the few people who happen to be 
> kernel developers
> and who happen to already use git.
> 
> It's a 540MByte download over a slow link for everyone 
> else.

Hmmm, clean-cg is 7.7G on my machine, and yes I tried
git-prune-packed. What am I doing wrong?
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: [BUG] New Kernel Bugs

2007-11-17 Thread Adrian Bunk
On Fri, Nov 16, 2007 at 02:46:18PM -0500, Theodore Tso wrote:
> On Fri, Nov 16, 2007 at 01:20:16PM -0500, Daniel Barkalow wrote:
> > Compared to getting useful suggestions from a mailing list, especially 
> > before you've gotten anybody's attention? Hours or overnight isn't 
> > particularly long, and doesn't take up much of your time if you've got a 
> > working kernel to use while it's working.
> 
> But a bisect takes around 7 compiles.
>...

I don't understand that number.

The common case is a regression in -rc1, and a bisection of
about 7000 commits takes around 13 compiles.

> - Ted

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Use *poof* for linux-omap (Was: [BUG] New Kernel Bugs)

2007-11-16 Thread Tony Lindgren
Hi Dave,

* David Miller <[EMAIL PROTECTED]> [071114 02:09]:

<snip>

> 
> In fact, *poof*, there it is, [EMAIL PROTECTED] is there and
> available for anyone who wants to use it.

Can you please use your *poof* trick one more time to set up
[EMAIL PROTECTED]?

We (as in the linux-omap community) would like to move from the
subscriber-only list at [EMAIL PROTECTED] to vger as we're
starting to get more patches and comments on LKML. For related
discussion on [EMAIL PROTECTED], see [1].

Regards,

Tony

[1] 
http://linux.omap.com/pipermail/linux-omap-open-source/2007-November/011980.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-16 Thread Theodore Tso
On Fri, Nov 16, 2007 at 01:20:16PM -0500, Daniel Barkalow wrote:
> Compared to getting useful suggestions from a mailing list, especially 
> before you've gotten anybody's attention? Hours or overnight isn't 
> particularly long, and doesn't take up much of your time if you've got a 
> working kernel to use while it's working.

But a bisect takes around 7 compiles.  And even when it takes only an
hour, that's enough time for you to get started working on something
else, and saving all of your context so you can at that point try
booting into a kernel really is quite annoying.  Hence the suggestion
for a way for users to download commonly used snapshot points for
bisect runs.  Yes, it will require some central infrastructure, but if
it allows for more distributed debugging, this would be a good thing.
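
For reference, the manual workflow that such snapshots (or a wrapper
tool) would speed up looks roughly like this; the tags below are only an
example of a good/bad pair:

  git bisect start
  git bisect bad v2.6.24-rc1    # first kernel known to show the problem
  git bisect good v2.6.23       # last kernel known to work
  # build, boot and test the revision git checks out, then report back:
  git bisect good               # or: git bisect bad
  # repeat until git names the first bad commit, then clean up:
  git bisect reset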

  - Ted
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-16 Thread Daniel Barkalow
On Fri, 16 Nov 2007, Romano Giannetti wrote:

> 
> (Cc: trimmed a bit).
> 
> On Thu, 2007-11-15 at 11:19 -0500, Daniel Barkalow wrote:
> > On Thu, 15 Nov 2007, Theodore Tso wrote:
> [...]
> > > A full kernel build with everything selected can take a good 30 minutes or 
> > > more, and that's on a fast dual-core machine with 4gigs of memory and 
> > > 7200rpm disk drives. On a slower, memory limited laptop, doing a single 
> > > kernel build can take more time than the user has patience; multiply 
> > > that by 7 or 8 build and test boots, and it starts to get tiresome.
> > 
> > None of this is going to take as long, 
> 
> Well, the compile phase can. Especially if the first time you try to
> compile the kernel with EXTRAVERSION=`git describe`, which forces almost a
> full rebuild every time...

Compared to getting useful suggestions from a mailing list, especially 
before you've gotten anybody's attention? Hours or overnight isn't 
particularly long, and doesn't take up much of your time if you've got a 
working kernel to use while it's working.

> But the worst problem is that a full recompile, with a distro .config,
> will take hours on my 2.66GHz/CoreDuo/1G ram. Trimming down .config is
> fundamental to be able to bisect effectively, but it's not an easy thing
> > to do for an inexperienced user (and a painful one for all the rest of
> us). 
> 
> What would be an invaluable help would be a tool that generates
> a .config with all the modules and subsystems I am using *now*. Should
> be possible in principle by parsing KConfig and Makefiles and using as
> input the current .config and lsmod... is it possible to map the kernel
> object name to the option enabling it?

I don't think there's anything set up for that, aside from the actual 
build system generating it, and I don't know how hard that would be to 
repurpose for generating a configuration.

-Daniel
*This .sig left intentionally blank*
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-16 Thread Romano Giannetti

(Cc: trimmed a bit).

On Thu, 2007-11-15 at 11:19 -0500, Daniel Barkalow wrote:
> On Thu, 15 Nov 2007, Theodore Tso wrote:
[...]
> > A full kernel build with everything selected can take a good 30 minutes or 
> > more, and that's on a fast dual-core machine with 4gigs of memory and 
> > 7200rpm disk drives. On a slower, memory limited laptop, doing a single 
> > kernel build can take more time than the user has patience; multiply 
> > that by 7 or 8 build and test boots, and it starts to get tiresome.
> 
> None of this is going to take as long, 

Well, the compile phase can. Especially if the first time you try to
compile the kernel with EXTRAVERSION=`git describe`, which forces almost a
full rebuild every time...

But the worst problem is that a full recompile, with a distro .config,
will take hours on my 2.66GHz/CoreDuo/1G ram. Trimming down .config is
fundamental to be able to bisect effectively, but it's not an easy thing
to do for an inexperienced user (and a painful one for all the rest of
us). 

What would be an invaluable help would be a tool that generates
a .config with all the modules and subsystems I am using *now*. Should
be possible in principle by parsing KConfig and Makefiles and using as
input the current .config and lsmod... is it possible to map the kernel
object name to the option enabling it?
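
A rough sketch of part of the answer, assuming a vanilla source tree:
the kernel Makefiles list objects as obj-$(CONFIG_FOO) += foo.o, so
grepping them for the module's object name usually turns up the option
that builds it (the module name below is only an example):

  # which option builds snd-hda-intel.ko? look for its obj- line
  grep -rn "snd-hda-intel" --include=Makefile /usr/src/linux | grep "obj-"

That still doesn't cover options selected indirectly or built-in code,
so it only gets part of the way there.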

Romano 


-- 
Sorry for the disclaimer --- ¡I cannot stop it!



--
This communication contains confidential information. It is for the exclusive 
use of the intended addressee. If you are not the intended addressee, please 
note that any form of distribution, copying or use of this communication or the 
information in it is strictly prohibited by law. If you have received this 
communication in error, please immediately notify the sender by reply e-mail 
and destroy this message. Thank you for your cooperation. 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Alan Cox
> So the _only_ explanation today for 12GB on a 32-bit machine is
>  (a) insanity
> or 
>  (b) being so lazy as to not bother to upgrade
> and in either case, my personal reaction is "I'm *not* crazy, and yes, I'm 
> lazy too, and I can't give a rats *ss about those problems".

12GB-16GB worked well historically, so it's a regression. Above 16GB it's
all utterly mad.

You forgot reason (c) though

(c) 32bit is a tested, approved, certified, etc. environment - essentially
conservatism and paranoia, and it's hard to explain to some of these
people that the right answer really is less RAM or 64bit, especially as
they may already know it but have a 12 month process to prove and certify
a system configuration.

> HIGHMEM was a mistake in the first place. It's one that we can live with, 
> but I refuse to support it more than it needs to be supported. And 12GB is 
> *way* past the end of what is worth supporting.

Highmem to 4GB was sensible. Highmem to 8GB was pushing it.

Alan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-15 Thread J. Bruce Fields
On Thu, Nov 15, 2007 at 01:50:43PM +1100, Neil Brown wrote:
> Virtual Folders.
> 
> I use VM mode in EMACS, but I believe some other mail readers have the
> same functionality.
> I have a virtual folder called "nfs" which shows me all mail in my
> inbox which has the string 'nfs' or 'lockd' in a To, Cc, or Subject
> field.  When I visit that folder, I see all mail about nfs, whether it
> was sent to me personally, or to a relevant list, or to lkml.

Hm (googling around for "mutt" and "virtual folders"): looks like I can
get most of the way there in mutt with some macros based on its "limit"
command:

http://www.tummy.com/journals/entries/jafo_20060303_00

Thanks.--b.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Rob Mueller



> That's my personal opinion, and I realize that some of the commercial
> vendors may care about their insane customers' satisfaction, but I'm
> simply not interested in insane users. If they have that much RAM (and
> bought it a few years ago when a 64-bit CPU wasn't an option), they can't
> be poor.


From our perspective, the main issue is that on some of these machines we 
spent quite a bit of money on big RAM (for its day) + lots of 15k RPM SCSI 
drives + multi-year support contracts. They're highly IO bound, and barely 
use 10-20% of their old 2.4GHz Prestonia Xeon CPUs. It's hard to justify 
junking those machines at less than 5 years old.


We have a couple of 6G machines and some 8G machines using PAE. On the 
whole, they actually have been working really well (hmmm, apart from the 
recent dirty pages issue + reiserfs data=journal leaks + inodes in lowmem 
limits)


Rob

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Linus Torvalds


On Thu, 15 Nov 2007, Chris Friesen wrote:
> 
> We've got some 32-bit 8GB boxes for which both of these would hold true.

Still not enough of a reason for me to care.

Remember - I'm the guy who refused to merge RH's 4G:4G patches because I 
thought they were an unsupportable nightmare.

I care a lot about future supportability, and HIGHMEM is there purely as a 
temporary wart and blip on the screen.

I did acknowledge that others may care more, but the fact is, I suspect 
that it's going to be cheaper to literally buy and ship a new machine to a 
customer than to really "support" it in any other form.

Side note: HIGHMEM64G works perfectly fine with 12GB of RAM under 
*limited* loads. If your customer does certain well-defined and simple 
things that don't put huge and varied loads on the VFS or VM layer, then 
12GB+ is probably fine regardless.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Chris Friesen

Linus Torvalds wrote:


So the _only_ explanation today for 12GB on a 32-bit machine is
 (a) insanity
or 
 (b) being so lazy as to not bother to upgrade
and in either case, my personal reaction is "I'm *not* crazy, and yes, I'm 
lazy too, and I can't give a rats *ss about those problems".


How about...

c) they bought it at the beginning of a project and are stuck with it 
because they aren't getting any more money for hardware


d) they've shipped it to the field and have to support it

We've got some 32-bit 8GB boxes for which both of these would hold true.

Chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Linus Torvalds


On Thu, 15 Nov 2007, Peter Zijlstra wrote:
> 
> But this problem is already an issue, Anton recently had a case where a
> 12GB highmem box locked up due to NTFS running out of lowmem - or
> something like that.

Yeah. I always considered HIGHMEM to just be unusable. It's ok for 
extending to 2-4GB (ie HIGHMEM4G, not 64G), and it's probably borderline 
usable for 4-8G if you are careful.

But quite frankly, I refuse to even care about anything past that. If you 
have 12G (or heaven forbid, even more) in your machine, and you can't be 
bothered to just upgrade to a 64-bit CPU, then quite frankly, *I* 
personally can't be bothered to care.

That's my personal opinion, and I realize that some of the commercial 
vendors may care about their insane customers' satisfaction, but I'm 
simply not interested in insane users. If they have that much RAM (and 
bought it a few years ago when a 64-bit CPU wasn't an option), they can't 
be poor.

So the _only_ explanation today for 12GB on a 32-bit machine is
 (a) insanity
or 
 (b) being so lazy as to not bother to upgrade
and in either case, my personal reaction is "I'm *not* crazy, and yes, I'm 
lazy too, and I can't give a rats *ss about those problems".

HIGHMEM was a mistake in the first place. It's one that we can live with, 
but I refuse to support it more than it needs to be supported. And 12GB is 
*way* past the end of what is worth supporting.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Peter Zijlstra

On Thu, 2007-11-15 at 13:14 -0800, Linus Torvalds wrote:
> 
> On Thu, 15 Nov 2007, Linus Torvalds wrote:
> > 
> > Unacceptable. We used to do exactly what your patch does, and it got fixed 
> > once. We're not introducing that fundamentally broken concept again.
> 
> Examples of non-broken solutions:
>  (a) always use lowmem sizes (what we do now)
>  (b) always use total mem sizes (sane but potentially dangerous: but the 
>  VM pressure should work! It has serious bounce-buffer issues, though, 
>  which is why I think it's crazy even if it's otherwise consistent)
>  (c) make all dirty counting be *purely* per-bdi, so that everybody can 
>  disagree on what the limits are, but at least they also then use 
>  different counters

I think that (c) is doable. If its worth the effort, who knows,
apparently there still are people using 32bit kernels on boxen with
mucho memory.

> So it's just the "different writers look at the same dirty counts but then 
> interpret it to mean totally different things" that I think is so 
> fundamentally bogus. I'm not claiming that what we do now is the only way 
> to do things, I just don't think your approach is tenable.

Agreed, the per mapping thing was utter crap.

> I'd also like to point out that while the "bounce buffer" issue is not so 
> much a HIGHMEM issue on its own (it's really about the device DMA limits, 
> which are _independent_ of HIGHMEM, of course), the reason HIGHMEM is 
> special is that without HIGHMEM the bounce buffers generally work 
> perfectly fine.
> 
> The problem with HIGHMEM is that it causes various metadata (dentries, 
> inodes, page struct tables etc) to eat up memory "prime real estate" under 
> the same kind of conditions that also dirty a lot of memory. So the reason 
> we disallow HIGHMEM from dirty limits is only *partly* the per-device or 
> mapping DMA limits, and to a large degree the fact that non-highmem memory 
> is special in general, and it is usually the non-highmem areas that are 
> constrained - and need to be protected.

But this problem is already an issue, Anton recently had a case where a
12GB highmem box locked up due to NTFS running out of lowmem - or
something like that.

And I think that with the targeted slab reclaim (or slab defrag as its
apparently still called) we can properly fix this side of the problem. I
think Rik was looking into doing so.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Linus Torvalds


On Thu, 15 Nov 2007, Linus Torvalds wrote:
> 
> The problem with HIGHMEM is that it causes various metadata (dentries, 
> inodes, page struct tables etc) to eat up memory "prime real estate" under 
> the same kind of conditions that also dirty a lot of memory. So the reason 
> we disallow HIGHMEM from dirty limits is only *partly* the per-device or 
> mapping DMA limits, and to a large degree the fact that non-highmem memory 
> is special in general, and it is usually the non-highmem areas that are 
> constrained - and need to be protected.

Final note on this (promise): 

I'd really be very interested to hear if the patch I *do* think makes 
sense (ie the removal of the old "unmapped_ratio" logic) actually already 
solves most of Bron's problems.

It may well be that that unmapped_ratio logic effectively undid the system 
configuration changes that Bron has done. It doesn't matter if Bron has

From our sysctl.conf:
# This should help reduce flushing on Cache::FastMmap files
vm.dirty_background_ratio = 50
vm.dirty_expire_centisecs = 9000
vm.dirty_ratio = 80
vm.dirty_writeback_centisecs = 3000

if it turns out that the "unmapped_ratio" logic turns the 80% back down to 
5%.
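
(To put numbers on it: if, say, 90% of the dirtyable memory is file-mapped
or anonymous - which is what a big Cache::FastMmap working set looks like -
then unmapped_ratio = 100 - 90 = 10, and the clamp turns dirty_ratio into
min(80, 10/2) = 5, so the configured 80% never takes effect.)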

It may well be that 80% of the non-highmem memory is plenty good enough! 

Sure, older kernels allowed even more of memory to be dirty (since they 
didn't count dirty mappings at all), but we may have a case where the fact 
that we discount the HIGHMEM stuff isn't the major problem in itself, and 
that the dirty_ratio sysctl should be ok - but just gets screwed over by 
that unmapped_ratio logic.

So Bron, if you can test that patch, I'd love to hear if it matters. It 
may not make any difference (maybe you don't actually trigger the 
unmapped_ratio logic at all), but I think it has the potential for being 
totally broken for you.

People that don't change the dirty_ratio from the default values would 
generally never care, because the default dirty-ratio is *already* so low 
that even if the unmapped_ratio logic triggers, it won't much matter!

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-15 Thread Ben Dooks
On Tue, Nov 13, 2007 at 10:34:37PM +, Russell King wrote:
> On Tue, Nov 13, 2007 at 06:25:16PM +, Alan Cox wrote:
> > > Given the wide range of ARM platforms today, it is utterly idiotic to
> > > expect a single person to be able to provide responses for all ARM bugs.
> > > I for one wish I'd never *VOLUNTEERED* to be a part of the kernel
> > > bugzilla, and really *WISH* I could pull out of that function.
> > 
> > You can. Perhaps that bugzilla needs to point to some kind of
> > [EMAIL PROTECTED] list for the various ARM platform
> > maintainers ?
> 
> That might work - though it would be hard to get all the platform
> maintainers to be signed up to yet another mailing list, I'm sure
> sufficient would do.

As long as it would just be bug reports, I'm sure that most of us
could be persuaded to subscribe. Adding another list for general
discussions is probably not going to be read, the current list
provides more than enough to keep us busy.

-- 
Ben

Q:  What's a light-year?
A:  One-third less calories than a regular year.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Peter Zijlstra

On Thu, 2007-11-15 at 21:59 +0100, Peter Zijlstra wrote:
> On Thu, 2007-11-15 at 12:56 -0800, Linus Torvalds wrote:
> > 
> > On Thu, 15 Nov 2007, Peter Zijlstra wrote:
> > > 
> > > Something like this ought to do I guess. Although my
> > > mapping_is_buffercache() is the ugliest thing. I'm sure that can be done
> > > better.
> > 
> > No, this absolutely sucks.
> 
> Agreed, I was just about to send out an email saying that.

Say all buffer cache users were accounted against default_backing_dev_info, and
we'd give default_backing_dev_info less, that should work out, right?

( I'm not yet clear on if buffer cache already uses
default_backing_dev_info or not, bdget() seems to suggest it does )





-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Linus Torvalds


On Thu, 15 Nov 2007, Linus Torvalds wrote:
> 
> Unacceptable. We used to do exactly what your patch does, and it got fixed 
> once. We're not introducing that fundamentally broken concept again.

Examples of non-broken solutions:
 (a) always use lowmem sizes (what we do now)
 (b) always use total mem sizes (sane but potentially dangerous: but the 
 VM pressure should work! It has serious bounce-buffer issues, though, 
 which is why I think it's crazy even if it's otherwise consistent)
 (c) make all dirty counting be *purely* per-bdi, so that everybody can 
 disagree on what the limits are, but at least they also then use 
 different counters

So it's just the "different writers look at the same dirty counts but then 
interpret it to mean totally different things" that I think is so 
fundamentally bogus. I'm not claiming that what we do now is the only way 
to do things, I just don't think your approach is tenable.

Btw, I actually suspect that while (a) is what we do now, for the specific 
case that Bron has, we could have a /proc/sys/vm option to just enable 
(b). So we don't have to have just one consistent model, we can allow odd 
users (and Bron sounds like one - sorry Bron ;) to just force other, odd, 
but consistent models.

I'd also like to point out that while the "bounce buffer" issue is not so 
much a HIGHMEM issue on its own (it's really about the device DMA limits, 
which are _independent_ of HIGHMEM, of course), the reason HIGHMEM is 
special is that without HIGHMEM the bounce buffers generally work 
perfectly fine.

The problem with HIGHMEM is that it causes various metadata (dentries, 
inodes, page struct tables etc) to eat up memory "prime real estate" under 
the same kind of conditions that also dirty a lot of memory. So the reason 
we disallow HIGHMEM from dirty limits is only *partly* the per-device or 
mapping DMA limits, and to a large degree the fact that non-highmem memory 
is special in general, and it is usually the non-highmem areas that are 
constrained - and need to be protected.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Peter Zijlstra

On Thu, 2007-11-15 at 12:56 -0800, Linus Torvalds wrote:
> 
> On Thu, 15 Nov 2007, Peter Zijlstra wrote:
> > 
> > Something like this ought to do I guess. Although my
> > mapping_is_buffercache() is the ugliest thing. I'm sure that can be done
> > better.
> 
> No, this absolutely sucks.

Agreed, I was just about to send out an email saying that..

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Linus Torvalds


On Thu, 15 Nov 2007, Peter Zijlstra wrote:
> 
> Something like this ought to do I guess. Although my
> mapping_is_buffercache() is the ugliest thing. I'm sure that can be done
> better.

No, this absolutely sucks.

Why?

It's totally unacceptable to have per-mapping notions of how much memory 
we have. We used to do *exactly* that, and it's idiocy.

The reason it's unacceptable idiocy is that it means that two processes 
that access different files will then have *TOTALLY*DIFFERENT* notions of 
what the "dirty limit" is. And as a result, one process will happily write 
lots and lots of dirty stuff and never throttle, and the other process 
will have to throttle all the time - and clean up after the process that 
didn't!

See?

The fact is, because we count dirty pages as one resource, we must also 
have *one* limit.

So this patch is a huge regression. You might not notice it, because if 
everybody writes to the same kind of mapping, nobody will be hurt (they 
all have effectively the same global limit anyway), but you *will* notice 
if you ever have two different values of "highmem".

Unacceptable. We used to do exactly what your patch does, and it got fixed 
once. We're not introducing that fundamentally broken concept again.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Peter Zijlstra

On Thu, 2007-11-15 at 20:40 +0100, Peter Zijlstra wrote:

> As for the highmem part, that was due to buffer cache, and unfortunately
> that is still true. Although maybe we can do something smart with the
> per-bdi stuff.

Something like this ought to do I guess. Although my
mapping_is_buffercache() is the ugliest thing. I'm sure that can be done
better.

Uncompiled, untested

Not-Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]>
---
 mm/page-writeback.c |   28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

Index: linux-2.6/mm/page-writeback.c
===
--- linux-2.6.orig/mm/page-writeback.c
+++ linux-2.6/mm/page-writeback.c
@@ -280,27 +280,28 @@ static unsigned long highmem_dirtyable_m
 #endif
 }
 
-static unsigned long determine_dirtyable_memory(void)
+static unsigned long determine_dirtyable_memory(int highmem)
 {
unsigned long x;
 
x = global_page_state(NR_FREE_PAGES)
+ global_page_state(NR_INACTIVE)
+ global_page_state(NR_ACTIVE);
-   x -= highmem_dirtyable_memory(x);
+   if (!highmem)
+   x -= highmem_dirtyable_memory(x);
return x + 1;   /* Ensure that we never return 0 */
 }
 
 static void
 get_dirty_limits(long *pbackground, long *pdirty, long *pbdi_dirty,
-struct backing_dev_info *bdi)
+struct backing_dev_info *bdi, int highmem)
 {
int background_ratio;   /* Percentages */
int dirty_ratio;
int unmapped_ratio;
long background;
long dirty;
-   unsigned long available_memory = determine_dirtyable_memory();
+   unsigned long available_memory = determine_dirtyable_memory(highmem);
struct task_struct *tsk;
 
unmapped_ratio = 100 - ((global_page_state(NR_FILE_MAPPED) +
@@ -346,6 +347,16 @@ get_dirty_limits(long *pbackground, long
}
 }
 
+static inline int mapping_is_buffercache(struct address_space *mapping)
+{
+   struct super_block *sb = mapping->host->i_sb;
+
+   if (sb && sb->s_bdev && sb->s_bdev->bd_inode->i_mapping != mapping)
+   return 0;
+
+   return 1;
+}
+
 /*
  * balance_dirty_pages() must be called by processes which are generating dirty
  * data.  It looks at the number of dirty pages in the machine and will force
@@ -364,6 +375,7 @@ static void balance_dirty_pages(struct a
unsigned long write_chunk = sync_writeback_pages();
 
struct backing_dev_info *bdi = mapping->backing_dev_info;
+   int highmem = !mapping_is_buffercache(mapping);
 
for (;;) {
struct writeback_control wbc = {
@@ -375,7 +387,7 @@ static void balance_dirty_pages(struct a
};
 
                get_dirty_limits(&background_thresh, &dirty_thresh,
-                                &bdi_thresh, bdi);
+                                &bdi_thresh, bdi, highmem);
bdi_nr_reclaimable = bdi_stat(bdi, BDI_RECLAIMABLE);
bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
if (bdi_nr_reclaimable + bdi_nr_writeback <= bdi_thresh)
@@ -394,7 +406,7 @@ static void balance_dirty_pages(struct a
writeback_inodes();
pages_written += write_chunk - wbc.nr_to_write;
                        get_dirty_limits(&background_thresh, &dirty_thresh,
-                                        &bdi_thresh, bdi);
+                                        &bdi_thresh, bdi, highmem);
}
 
/*
@@ -503,7 +515,7 @@ void throttle_vm_writeout(gfp_t gfp_mask
long dirty_thresh;
 
 for ( ; ; ) {
-               get_dirty_limits(&background_thresh, &dirty_thresh, NULL, NULL);
+               get_dirty_limits(&background_thresh, &dirty_thresh, NULL, NULL, 1);
 
 /*
  * Boost the allowable dirty threshold a bit for page
@@ -546,7 +558,7 @@ static void background_writeout(unsigned
long background_thresh;
long dirty_thresh;
 
-               get_dirty_limits(&background_thresh, &dirty_thresh, NULL, NULL);
+               get_dirty_limits(&background_thresh, &dirty_thresh, NULL, NULL, 1);
if (global_page_state(NR_FILE_DIRTY) +
global_page_state(NR_UNSTABLE_NFS) < background_thresh
&& min_pages <= 0)



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Peter Zijlstra

On Thu, 2007-11-15 at 08:32 -0800, Linus Torvalds wrote:
> 
> On Thu, 15 Nov 2007, Bron Gondwana wrote:
> > 
> > I guess we'll be doing the one-liner kernel mod and testing
> > that then.
> 
> The thing to look at is "get_dirty_limits()" in mm/page-writeback.c, and 
> in this particular case it's the
> 
>   unsigned long available_memory = determine_dirtyable_memory();
> 
> that's going to bite you. In particular, note the
> 
>   x -= highmem_dirtyable_memory(x);
> 
> that we do in determine_dirtyable_memory().
> 
> So in this case, if you basically remove that line, it will allow all of 
> memory to be dirtied (including highmem), and then the background_ratio 
> will work on the whole 6GB.
> 
> HOWEVER! It's worth noting that we also have some other old legacy cruft 
> there that may interfere with your code. In particular, if you look at the 
> top of "get_dirty_limits()", it *also* does a
> 
> unmapped_ratio = 100 - ((global_page_state(NR_FILE_MAPPED) +
> global_page_state(NR_ANON_PAGES)) * 100) /
> available_memory;
> 
> dirty_ratio = vm_dirty_ratio;
> if (dirty_ratio > unmapped_ratio / 2)
> dirty_ratio = unmapped_ratio / 2;
> 
> and that whole "unmapped_ratio" comparison is probably bogus these days, 
> since we now take the mapped dirty pages into account. That code harks 
> back to the days before we did that, and dirty ratios only affected 
> non-mapped pages.
> 
> And in particular, now that I look at it, I wonder if it can even go 
> negative (because "available_memory" may be *smaller* than the 
> NR_FILE_MAPPED|ANON_PAGES sum!).
> 
> We'll fix up a negative value anyway (because of the clamping of 
> dirty_ratio to no less than 5), but the point is that the whole 
> "unmapped_ratio" thing probably doesn't make sense any more, and may well 
> make the dirty_ratio not work for you, because you may have a very small 
> unmapped_ratio that effectively makes all dirty limits always clamp to a 
> very small value.
> 
> So regardless, I think you may want to try the appended patch *first*.
> 
> If this patch makes a difference, please holler. I think it's the correct 
> thing to do, but I'm not going to actually commit it without somebody 
> saying that it makes a difference (and preferably Peter Zijlstra and 
> Andrew acking it too).
> 
> Only *after* testing this change is it probably a good idea to test the 
> real hack of then removing the highmem_dirtyable_memory() thing. 
> 
> Peter? Andrew?

I wondered about that part the other day when I went through the BDI
dirty code due to that iozone thing..

The initial commit states:

commit d90e4590519d196004efbb308d0d47596ee4befe
Author: akpm 
Date:   Sun Oct 13 16:33:20 2002 +

[PATCH] reduce the dirty threshold when there's a lot of mapped

Dirty memory thresholds are currently set by /proc/sys/vm/dirty_ratio.

Background writeout levels are controlled by
/proc/sys/vm/dirty_background_ratio.

Problem is that these levels are hard to get right - they are too
static.  If there is a lot of mapped memory around then the 40%
clamping level causes too much dirty data.  We do lots of scanning in
page reclaim, and the VM generally starts getting into distress.  Extra
swapping, extra page unmapping.

It would be much better to simply tell the caller of write(2) to slow
down - to write out their dirty data sooner, to make those written
pages trivially reclaimable.  Penalise the offender, not the innocent
page allocators.

This patch changes the writer throttling code so that we clamp down
much harder on writers if there is a lot of mapped memory in the
machine.  We only permit memory dirtiers to dirty up to 50% of unmapped
memory before forcing them to clean their own pagecache.

BKrev: 3da9a050Mz7H6VkAR9xo6ongavTMrw

But because dirty mapped pages are no longer special, I'd say the reason
for its existence is gone. So,

Acked-by: Peter Zijlstra <[EMAIL PROTECTED]>

As for the highmem part, that was due to buffer cache, and unfortunately
that is still true. Although maybe we can do something smart with the
per-bdi stuff.

> ---
>  mm/page-writeback.c |8 
>  1 files changed, 0 insertions(+), 8 deletions(-)
> 
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 81a91e6..d55cfca 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -297,20 +297,12 @@ get_dirty_limits(long *pbackground, long *pdirty, long *pbdi_dirty,
>  {
>   int background_ratio;   /* Percentages */
>   int dirty_ratio;
> - int unmapped_ratio;
>   long background;
>   long dirty;
>   unsigned long available_memory = determine_dirtyable_memory();
>   struct task_struct *tsk;
>  
> - unmapped_ratio = 100 - ((global_page_state(NR_FILE_MAPPED) +
> - global_page_state(NR_ANON_PAGES)) * 100) /
> -  

Re: [BUG] New Kernel Bugs

2007-11-15 Thread Daniel Barkalow
On Thu, 15 Nov 2007, Theodore Tso wrote:

> On Wed, Nov 14, 2007 at 06:23:34PM -0500, Daniel Barkalow wrote:
> > I don't see any reason that we couldn't have a tool accessible to Ubuntu 
> > users that does a real "git bisect". Git is really good at being scripted 
> > by fancy GUIs. It should be easy enough to have a drop down with all of 
> > the Ubuntu kernel package releases, where the user selects what works and 
> > what doesn't.
> 
> It's possible users who haven't yet downloaded a git repository have
> to surmount some obstacles that might cause them to lose interest.
> First, they have to download some 190 megs of git repository, and if
> they have a slow link, that can take a while, and then they have to
> build each kernel, which can take a while.

It should be possible for it to clone only the portion that they actually 
care about based on where the known-good version is. It should also (in 
theory, anyway) be possible to put off some amount of the download until 
it's actually going to be relevant.

> A full kernel build with everything selected can take a good 30 minutes or 
> more, and that's on a fast dual-core machine with 4gigs of memory and 
> 7200rpm disk drives. On a slower, memory limited laptop, doing a single 
> kernel build can take more time than the user has patience; multiply 
> that by 7 or 8 build and test boots, and it starts to get tiresome.

None of this is going to take as long, even on a slow link and a slow 
computer, as waiting for a response to a mailing list post. It'd annoy 
users who are specifically waiting for it, but if the interface is that 
the user says "kernel package X didn't work but the current kernel does", 
and it says "I'll let you know when I've got something to test", and the 
user watches a DVD, and afterward finds a message saying there's something 
to test, and tries it, and reports how it went, and the process repeats 
until it narrows it down to a single commit after a couple of days of the 
user getting occasional responses, it's not that different from asking for 
help online.

> And then on top of that there are the issues about whether there is
> enough support for dealing with hitting kernel revisions that fail due
> to other bugs getting merged in during the -rc1 process, etc.

Could have a distro-provided mask of things that aren't worth testing and 
possibly back-ported fixes for revisions in particular ranges.

> I agree that a tool that automated the bisection process and walked
> the user through it would be helpful, but I believe it would be
> possible for us to do better.

That would probably help for giving the user something to try right away. 
I still think that the main cost to the user is the number of times that 
the user has to stop doing stuff to reboot with a kernel to test, whether 
the test kernels are available quickly from the distro site, slowly built 
locally, or slowly as suggested by humans helping online.

-Daniel
*This .sig left intentionally blank*
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Linus Torvalds


On Thu, 15 Nov 2007, Bron Gondwana wrote:
> 
> I guess we'll be doing the one-liner kernel mod and testing
> that then.

The thing to look at is "get_dirty_limits()" in mm/page-writeback.c, and 
in this particular case it's the

unsigned long available_memory = determine_dirtyable_memory();

that's going to bite you. In particular, note the

x -= highmem_dirtyable_memory(x);

that we do in determine_dirtyable_memory().

So in this case, if you basically remove that line, it will allow all of 
memory to be dirtied (including highmem), and then the background_ratio 
will work on the whole 6GB.

HOWEVER! It's worth noting that we also have some other old legacy cruft 
there that may interfere with your code. In particular, if you look at the 
top of "get_dirty_limits()", it *also* does a

unmapped_ratio = 100 - ((global_page_state(NR_FILE_MAPPED) +
global_page_state(NR_ANON_PAGES)) * 100) /
available_memory;

dirty_ratio = vm_dirty_ratio;
if (dirty_ratio > unmapped_ratio / 2)
dirty_ratio = unmapped_ratio / 2;

and that whole "unmapped_ratio" comparison is probably bogus these days, 
since we now take the mapped dirty pages into account. That code harks 
back to the days before we did that, and dirty ratios only affected 
non-mapped pages.

And in particular, now that I look at it, I wonder if it can even go 
negative (because "available_memory" may be *smaller* than the 
NR_FILE_MAPPED|ANON_PAGES sum!).

We'll fix up a negative value anyway (because of the clamping of 
dirty_ratio to no less than 5), but the point is that the whole 
"unmapped_ratio" thing probably doesn't make sense any more, and may well 
make the dirty_ratio not work for you, because you may have a very small 
unmapped_ratio that effectively makes all dirty limits always clamp to a 
very small value.

So regardless, I think you may want to try the appended patch *first*.

If this patch makes a difference, please holler. I think it's the correct 
thing to do, but I'm not going to actually commit it without somebody 
saying that it makes a difference (and preferably Peter Zijlstra and 
Andrew acking it too).

Only *after* testing this change is it probably a good idea to test the 
real hack of then removing the highmem_dirtyable_memory() thing. 

Peter? Andrew?

Linus

---
 mm/page-writeback.c |    8 --------
 1 files changed, 0 insertions(+), 8 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 81a91e6..d55cfca 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -297,20 +297,12 @@ get_dirty_limits(long *pbackground, long *pdirty, long *pbdi_dirty,
 {
int background_ratio;   /* Percentages */
int dirty_ratio;
-   int unmapped_ratio;
long background;
long dirty;
unsigned long available_memory = determine_dirtyable_memory();
struct task_struct *tsk;
 
-   unmapped_ratio = 100 - ((global_page_state(NR_FILE_MAPPED) +
-   global_page_state(NR_ANON_PAGES)) * 100) /
-   available_memory;
-
dirty_ratio = vm_dirty_ratio;
-   if (dirty_ratio > unmapped_ratio / 2)
-   dirty_ratio = unmapped_ratio / 2;
-
if (dirty_ratio < 5)
dirty_ratio = 5;
 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-15 Thread Theodore Tso
On Wed, Nov 14, 2007 at 06:23:34PM -0500, Daniel Barkalow wrote:
> I don't see any reason that we couldn't have a tool accessible to Ubuntu 
> users that does a real "git bisect". Git is really good at being scripted 
> by fancy GUIs. It should be easy enough to have a drop down with all of 
> the Ubuntu kernel package releases, where the user selects what works and 
> what doesn't.

It's possible users who haven't yet downloaded a git repository have
to surmount some obstacles that might cause them to lose interest.
First, they have to download some 190 megs of git repository, and if
they have a slow link, that can take a while, and then they have to
build each kernel, which can take a while.  A full kernel build with
everything selected can take a good 30 minutes or more, and that's on a
fast dual-core machine with 4gigs of memory and 7200rpm disk drives.
On a slower, memory limited laptop, doing a single kernel build can
take more time than the user has patience; multiply that by 7 or 8
build and test boots, and it starts to get tiresome.  

And then on top of that there are the issues about whether there is
enough support for dealing with hitting kernel revisions that fail due
to other bugs getting merged in during the -rc1 process, etc.

I agree that a tool that automated the bisection process and walked
the user through it would be helpful, but I believe it would be
possible for us to do better.

- Ted
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [alsa-devel] [BUG] New Kernel Bugs

2007-11-15 Thread Rene Herman

On 15-11-07 14:00, Jörn Engel wrote:

> And even without mails being held hostage for weeks, every single 
> moderation mail is annoying.  Like the one I'm sure to receive after 
> sending this out.


Certainly. Up to this thread I wasn't actually aware the list was doing that. 
While it might be informative once, getting it each time quickly gets old. 
I don't know if mailman can do anything like it, but I'd suggest anyone running 
a non-subscriber-moderation list configure it to send such messages at most 
once per address in a given period, or some such. And just disable the message 
if it cannot do that.


Fortunately, alsa-devel is (almost) no longer such a list anyway as it's 
moving to vger. Hurrah. David -- thanks.


Rene.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [alsa-devel] [BUG] New Kernel Bugs

2007-11-15 Thread Takashi Iwai
At Thu, 15 Nov 2007 14:17:27 +0100,
Olivier Galibert wrote:
> 
> On Thu, Nov 15, 2007 at 06:59:34AM +0100, Rene Herman wrote:
> > Totally unrelated indeed so why are spouting crap? If the kohab list has a 
> > problem take it up with them but keep ALSA out of it. alsa-devel has only 
> > ever moderated out spam -- nothing else.
> 
> That is incorrect.  Hopefully it is the case now though, since my
> experience of the subject was years ago.

Yeah, it was really years ago that we once switched to the open list.
Funny that people never forget such a thing :)


Takashi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [alsa-devel] [BUG] New Kernel Bugs

2007-11-15 Thread Olivier Galibert
On Thu, Nov 15, 2007 at 06:59:34AM +0100, Rene Herman wrote:
> Totally unrelated indeed so why are spouting crap? If the kohab list has a 
> problem take it up with them but keep ALSA out of it. alsa-devel has only 
> ever moderated out spam -- nothing else.

That is incorrect.  Hopefully it is the case now though, since my
experience of the subject was years ago.

  OG.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [alsa-devel] [BUG] New Kernel Bugs

2007-11-15 Thread Jörn Engel
On Thu, 15 November 2007 13:26:51 +0100, Rene Herman wrote:
> 
> Can you please just shelve this crap? You have a way of knowing that "ALSA 
> will accept you" and that is knowing or assuming that the ALSA project 
> doesn't consist of drooling retards.

Well, my experience with moderation has been that moderated mails are
stuck in some queue for weeks.  Two separate lists, neither of them was
alsa.  If alsa is doing a better job, great.  But it still has to live
with the general reputation of non-subscriber moderation.

> When a project list goes to the difficulty of moderating non-subscribers it 
> has made the explicit choice to _not_ become subscriber only. Then refusing 
> valid non-subscribers after all makes no sense whatsoever. I'm sorry you 
> got your feelings hurt by that other list but it was no doubt an accident; 
> take it up with them.

Been there, done that.  In spite of people not being drooling retards,
the amount of time and effort they invest into either moderation or
improving the ruleset is quite limited.  Problems persist.

And even without mails being held hostage for weeks, every single
moderation mail is annoying.  Like the one I'm sure to receive after
sending this out.

Jörn

-- 
Joern's library part 5:
http://www.faqs.org/faqs/compression-faq/part2/section-9.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [alsa-devel] [BUG] New Kernel Bugs

2007-11-15 Thread Rene Herman

On 15-11-07 13:02, Bron Gondwana wrote:


> I get the same information from both project websites: "moderated for
> non-members, public archives" - no way of knowing that ALSA will accept
> me informing them of something they would be interested in without
> committing to reading or bit-bucketing their list.


Can you please just shelve this crap? You have a way of knowing that "ALSA 
will accept you" and that is knowing or assuming that the ALSA project 
doesn't consist of drooling retards.


When a project list goes to the difficulty of moderating non-subscribers it 
has made the explicit choice to _not_ become subscriber only. Then refusing 
valid non-subscribers after all makes no sense whatsoever. I'm sorry you got 
your feelings hurt by that other list but it was no doubt an accident; take 
it up with them.


Rene.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [alsa-devel] [BUG] New Kernel Bugs

2007-11-15 Thread Bron Gondwana
On Thu, Nov 15, 2007 at 06:59:34AM +0100, Rene Herman wrote:
> On 15-11-07 05:16, Bron Gondwana wrote:
>
>> Totally unrelated - I sent something to the kolab mailing list a couple
>
> [ ... ]
>
>> I'm sure if I had something that I considered worth informing the ALSA 
>> project of, I'd be wary of spending the same effort writing a good post
>> knowing it may be dropped in between the spam by a list moderator just selecting 
>> all and bouncing them.
>
> Totally unrelated indeed so why are spouting crap? If the kohab list has a 
> problem take it up with them but keep ALSA out of it. alsa-devel has only 
> ever moderated out spam -- nothing else.

As an outsider to the list, how do I know what your policy will be
other than "I've been rejected out of hand by someone else's list, 
so my experience is that member only lists aren't willing to listen
to something I have to say unless I make the effort to sign up and
have yet another folder accumulating unread messages".  I don't.

Well, ok - maybe I do here since I've let myself be dragged in to
the debate.  Oops.

I get the same information from both project websites: "moderated
for non-members, public archives" - no way of knowing that ALSA
will accept me informing them of something they would be interested
without committing to reading or bit-bucketing their list.

The alternative is to subscribe just long enough to send something
and then unsubscribe again or cold-email a member and ask them
to pass a message along.  Or post and hope it doesn't get rejected,
not even knowing for a day or so.

Bron.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Bron Gondwana
On Wed, Nov 14, 2007 at 09:53:38PM -0800, Linus Torvalds wrote:
> 
> 
> On Wed, 14 Nov 2007, Linus Torvalds wrote:
> > 
> > So even at 100% dirty limits, it won't let you dirty more than 1GB on the 
> > default 32-bit setup.
> 
> Side note: all of these are obviously still just heuristics. If you really 
> *do* run on a 32-bit kernel, and you want to have the pain, I'm sure you 
> can just disable the dirty limits with a one-liner kernel mod. And if it's 
> useful enough, we can certainly expose flags like that.. Not that I expect 
> that much anybody else will ever care, but it's not like it's wrong to 
> expose the silly heuristics the kernel has to users that have very 
> specific loads.
> 
> That said, I still do hope you aren't actually using HIGHMEM64G. I was 
> really hoping that the people who had enough moolah to buy >4GB of RAM had 
> long since also upgraded to a 64-bit machine ;)

I'm afraid we are, which probably explains it.

We have a bunch of 64 bit machines, but this particular machine
is one of our somewhat more ancient IBM x235 machines.  It's
got stacks of fast SCSI drives and a couple of hyperthreading
Xeons in it.  Very nice machine in its day, and very reliable
which is why we have kept them, even though at 6RU it chews
through disk space.

Unfortunately none of the 64 bit machines are world facing,
and we're running HIGHMEM64G on a bunch of machines both for
consistency value and because we only have one machine left
with only 2Gb.

I guess we'll be doing the one-liner kernel mod and testing
that then.  I'd certainly like to build a test case anyway
so I'm not spending too much time rebooting that machine,
it's also our outbound SMTP gateway.

And I'll keep in mind finding a 64 bit capable machine for
the role when I can.

Thanks for the feedback on this - I'll come back with more
details once we've done some testing, but this sounds likely,
and I don't think DCC is going to change how it works, so
we're stuck supporting it.

Bron.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [alsa-devel] [BUG] New Kernel Bugs

2007-11-15 Thread Olivier Galibert
On Thu, Nov 15, 2007 at 06:59:34AM +0100, Rene Herman wrote:
 Totally unrelated indeed so why are you spouting crap? If the kolab list has a 
 problem take it up with them but keep ALSA out of it. alsa-devel has only 
 ever moderated out spam -- nothing else.

That is incorrect.  Hopefully it is the case now though, since my
experience of the subject was years ago.

  OG.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [alsa-devel] [BUG] New Kernel Bugs

2007-11-15 Thread Takashi Iwai
At Thu, 15 Nov 2007 14:17:27 +0100,
Olivier Galibert wrote:
 
 On Thu, Nov 15, 2007 at 06:59:34AM +0100, Rene Herman wrote:
  Totally unrelated indeed so why are you spouting crap? If the kolab list has a 
  problem take it up with them but keep ALSA out of it. alsa-devel has only 
  ever moderated out spam -- nothing else.
 
 That is incorrect.  Hopefully it is the case now though, since my
 experience of the subject was years ago.

Yeah, it was really years ago that we once switched to the open list.
Funny that people never forget such a thing :)


Takashi
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [alsa-devel] [BUG] New Kernel Bugs

2007-11-15 Thread Rene Herman

On 15-11-07 14:00, Jörn Engel wrote:

And even without mails being held hostage for weeks, every single 
moderation mail is annoying.  Like the one I'm sure to receive after 
sending this out.


Certainly. Up to this thread I wasn't actually aware the list was doing that. 
While it might be informative once, getting it each time quickly gets old. 
Don't know if mailman can do anything like it but I'd suggest anyone running 
a non-subscriber-moderation list configure it to send such messages at most 
once a time-period per address or some such. And just disable the message 
if it cannot do that.


Fortunately, alsa-devel is (almost) no longer such a list anyway as it's 
moving to vger. Hurrah. David -- thanks.


Rene.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-15 Thread Theodore Tso
On Wed, Nov 14, 2007 at 06:23:34PM -0500, Daniel Barkalow wrote:
 I don't see any reason that we couldn't have a tool accessible to Ubuntu 
 users that does a real git bisect. Git is really good at being scripted 
 by fancy GUIs. It should be easy enough to have a drop down with all of 
 the Ubuntu kernel package releases, where the user selects what works and 
 what doesn't.

It's possible users who haven't yet downloaded a git repository have
to surmount some obstacles that might cause them to lose interest.
First, they have to download some 190 megs of git repository, and if
they have a slow link, that can take a while, and then they have to
build each kernel, which can take a while.  A full kernel build with
everything selected can take a good 30 minutes or more, and that's on a
fast dual-core machine with 4gigs of memory and 7200rpm disk drives.
On a slower, memory limited laptop, doing a single kernel build can
take more time than the user has patience; multiply that by 7 or 8
builds and test boots, and it starts to get tiresome.  

And then on top of that there are the issues about whether there is
enough support for dealing with hitting kernel revisions that fail due
to other bugs getting merged in during the -rc1 process, etc.

I agree that a tool that automated the bisection process and walked
the user through it would be helpful, but I believe it would be
possible for us to do better.

- Ted
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Linus Torvalds


On Thu, 15 Nov 2007, Bron Gondwana wrote:
 
 I guess we'll be doing the one-liner kernel mod and testing
 that then.

The thing to look at is get_dirty_limits() in mm/page-writeback.c, and 
in this particular case it's the

unsigned long available_memory = determine_dirtyable_memory();

that's going to bite you. In particular, note the

x -= highmem_dirtyable_memory(x);

that we do in determine_dirtyable_memory().

So in this case, if you basically remove that line, it will allow all of 
memory to be dirtied (including highmem), and then the background_ratio 
will work on the whole 6GB.

HOWEVER! It's worth noting that we also have some other old legacy cruft 
there that may interfere with your code. In particular, if you look at the 
top of get_dirty_limits(), it *also* does a

unmapped_ratio = 100 - ((global_page_state(NR_FILE_MAPPED) +
global_page_state(NR_ANON_PAGES)) * 100) /
available_memory;

dirty_ratio = vm_dirty_ratio;
if (dirty_ratio > unmapped_ratio / 2)
dirty_ratio = unmapped_ratio / 2;

and that whole unmapped_ratio comparison is probably bogus these days, 
since we now take the mapped dirty pages into account. That code harks 
back to the days before we did that, and dirty ratios only affected 
non-mapped pages.

And in particular, now that I look at it, I wonder if it can even go 
negative (because available_memory may be *smaller* than the 
NR_FILE_MAPPED|ANON_PAGES sum!).

We'll fix up a negative value anyway (because of the clamping of 
dirty_ratio to no less than 5), but the point is that the whole 
unmapped_ratio thing probably doesn't make sense any more, and may well 
make the dirty_ratio not work for you, because you may have a very small 
unmapped_ratio that effectively makes all dirty limits always clamp to a 
very small value.
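
To see how hard that clamp can bite, here's a rough standalone
illustration of the same arithmetic (not kernel code; the page counts
are made-up example numbers for a box whose dirtyable lowmem is mostly
mapped):

/* illustration only: the old unmapped_ratio clamp from get_dirty_limits(),
 * redone in user space with invented numbers */
#include <stdio.h>

int main(void)
{
        unsigned long available_memory = 262144; /* ~1GB of dirtyable lowmem, in 4K pages (example) */
        unsigned long mapped_and_anon  = 235000; /* NR_FILE_MAPPED + NR_ANON_PAGES (example, ~90% mapped) */
        int vm_dirty_ratio = 80;                 /* what the sysctl asked for */

        int unmapped_ratio = 100 - (mapped_and_anon * 100) / available_memory;
        int dirty_ratio = vm_dirty_ratio;

        if (dirty_ratio > unmapped_ratio / 2)
                dirty_ratio = unmapped_ratio / 2;
        if (dirty_ratio < 5)
                dirty_ratio = 5;

        printf("unmapped_ratio=%d, effective dirty_ratio=%d (asked for %d)\n",
               unmapped_ratio, dirty_ratio, vm_dirty_ratio);
        return 0;
}

With those example numbers the 80% you asked for quietly becomes 5%.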

So regardless, I think you may want to try the appended patch *first*.

If this patch makes a difference, please holler. I think it's the correct 
thing to do, but I'm not going to actually commit it without somebody 
saying that it makes a difference (and preferably Peter Zijlstra and 
Andrew acking it too).

Only *after* testing this change is it probably a good idea to test the 
real hack of then removing the highmem_dirtyable_memory() thing. 

Peter? Andrew?

Linus

---
 mm/page-writeback.c |8 
 1 files changed, 0 insertions(+), 8 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 81a91e6..d55cfca 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -297,20 +297,12 @@ get_dirty_limits(long *pbackground, long *pdirty, long *pbdi_dirty,
 {
int background_ratio;   /* Percentages */
int dirty_ratio;
-   int unmapped_ratio;
long background;
long dirty;
unsigned long available_memory = determine_dirtyable_memory();
struct task_struct *tsk;
 
-   unmapped_ratio = 100 - ((global_page_state(NR_FILE_MAPPED) +
-   global_page_state(NR_ANON_PAGES)) * 100) /
-   available_memory;
-
dirty_ratio = vm_dirty_ratio;
-   if (dirty_ratio > unmapped_ratio / 2)
-   dirty_ratio = unmapped_ratio / 2;
-
if (dirty_ratio < 5)
dirty_ratio = 5;
 
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-15 Thread Daniel Barkalow
On Thu, 15 Nov 2007, Theodore Tso wrote:

 On Wed, Nov 14, 2007 at 06:23:34PM -0500, Daniel Barkalow wrote:
  I don't see any reason that we couldn't have a tool accessible to Ubuntu 
  users that does a real git bisect. Git is really good at being scripted 
  by fancy GUIs. It should be easy enough to have a drop down with all of 
  the Ubuntu kernel package releases, where the user selects what works and 
  what doesn't.
 
 It's possible users who haven't yet downloaded a git repository have
 to surmount some obstacles that might cause them to lose interest.
 First, they have to download some 190 megs of git repository, and if
 they have a slow link, that can take a while, and then they have to
 build each kernel, which can take a while.

It should be possible for it to clone only the portion that they actually 
care about based on where the known-good version is. It should also (in 
theory, anyway) be possible to put off some amount of the download until 
it's actually going to be relevant.

 A full kernel build with everything selected can take a good 30 minutes or 
 more, and that's on a fast dual-core machine with 4gigs of memory and 
 7200rpm disk drives. On a slower, memory limited laptop, doing a single 
 kernel build can take more time than the user has patience; multiply 
 that by 7 or 8 builds and test boots, and it starts to get tiresome.

None of this is going to take as long, even on a slow link and a slow 
computer, as waiting for a response to a mailing list post. It'd annoy 
users who are specifically waiting for it, but if the interface is that 
the user says "kernel package X didn't work but the current kernel does", 
and it says "I'll let you know when I've got something to test", and the 
user watches a DVD, and afterward finds a message saying there's something 
to test, and tries it, and reports how it went, and the process repeats 
until it narrows it down to a single commit after a couple of days of the 
user getting occasional responses, it's not that different from asking for 
help online.

 And then on top of that there are the issues about whether there is
 enough support for dealing with hitting kernel revisions that fail due
 to other bugs getting merged in during the -rc1 process, etc.

Could have a distro-provided mask of things that aren't worth testing and 
possibly back-ported fixes for revisions in particular ranges.

 I agree that a tool that automated the bisection process and walked
 the user through it would be helpful, but I believe it would be
 possible for us to do better.

That would probably help for giving the user something to try right away. 
I still think that the main cost to the user is the number of times that 
the user has to stop doing stuff to reboot with a kernel to test, whether 
the test kernels are available quickly from the distro site, slowly built 
locally, or slowly as suggested by humans helping online.

-Daniel
*This .sig left intentionally blank*
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Peter Zijlstra

On Thu, 2007-11-15 at 08:32 -0800, Linus Torvalds wrote:
 
 On Thu, 15 Nov 2007, Bron Gondwana wrote:
  
  I guess we'll be doing the one-liner kernel mod and testing
  that then.
 
 The thing to look at is get_dirty_limits() in mm/page-writeback.c, and 
 in this particular case it's the
 
   unsigned long available_memory = determine_dirtyable_memory();
 
 that's going to bite you. In particular, note the
 
   x -= highmem_dirtyable_memory(x);
 
 that we do in determine_dirtyable_memory().
 
 So in this case, if you basically remove that line, it will allow all of 
 memory to be dirtied (including highmem), and then the background_ratio 
 will work on the whole 6GB.
 
 HOWEVER! It's worth noting that we also have some other old legacy cruft 
 there that may interfere with your code. In particular, if you look at the 
 top of get_dirty_limits(), it *also* does a
 
 unmapped_ratio = 100 - ((global_page_state(NR_FILE_MAPPED) +
 global_page_state(NR_ANON_PAGES)) * 100) /
 available_memory;
 
 dirty_ratio = vm_dirty_ratio;
 if (dirty_ratio > unmapped_ratio / 2)
 dirty_ratio = unmapped_ratio / 2;
 
 and that whole unmapped_ratio comparison is probably bogus these days, 
 since we now take the mapped dirty pages into account. That code harks 
 back to the days before we did that, and dirty ratios only affected 
 non-mapped pages.
 
 And in particular, now that I look at it, I wonder if it can even go 
 negative (because available_memory may be *smaller* than the 
 NR_FILE_MAPPED|ANON_PAGES sum!).
 
 We'll fix up a negative value anyway (because of the clamping of 
 dirty_ratio to no less than 5), but the point is that the whole 
 unmapped_ratio thing probably doesn't make sense any more, and may well 
 make the dirty_ratio not work for you, because you may have a very small 
 unmapped_ratio that effectively makes all dirty limits always clamp to a 
 very small value.
 
 So regardless, I think you may want to try the appended patch *first*.
 
 If this patch makes a difference, please holler. I think it's the correct 
 thing to do, but I'm not going to actually commit it without somebody 
 saying that it makes a difference (and preferably Peter Zijlstra and 
 Andrew acking it too).
 
 Only *after* testing this change is it probably a good idea to test the 
 real hack of then removing the highmem_dirtyable_memory() thing. 
 
 Peter? Andrew?

I wondered about that part the other day when I went through the BDI
dirty code due to that iozone thing..

The initial commit states:

commit d90e4590519d196004efbb308d0d47596ee4befe
Author: akpm akpm
Date:   Sun Oct 13 16:33:20 2002 +

[PATCH] reduce the dirty threshold when there's a lot of mapped

Dirty memory thresholds are currently set by /proc/sys/vm/dirty_ratio.

Background writeout levels are controlled by
/proc/sys/vm/dirty_background_ratio.

Problem is that these levels are hard to get right - they are too
static.  If there is a lot of mapped memory around then the 40%
clamping level causes too much dirty data.  We do lots of scanning in
page reclaim, and the VM generally starts getting into distress.  Extra
swapping, extra page unmapping.

It would be much better to simply tell the caller of write(2) to slow
down - to write out their dirty data sooner, to make those written
pages trivially reclaimable.  Penalise the offender, not the innocent
page allocators.

This patch changes the writer throttling code so that we clamp down
much harder on writers if there is a lot of mapped memory in the
machine.  We only permit memory dirtiers to dirty up to 50% of unmapped
memory before forcing them to clean their own pagecache.

BKrev: 3da9a050Mz7H6VkAR9xo6ongavTMrw

But because dirty mapped pages are no longer special, I'd say the reason
for its existence is gone. So,

Acked-by: Peter Zijlstra [EMAIL PROTECTED]

As for the highmem part, that was due to buffer cache, and unfortunately
that is still true. Although maybe we can do something smart with the
per-bdi stuff.

 ---
  mm/page-writeback.c |8 
  1 files changed, 0 insertions(+), 8 deletions(-)
 
 diff --git a/mm/page-writeback.c b/mm/page-writeback.c
 index 81a91e6..d55cfca 100644
 --- a/mm/page-writeback.c
 +++ b/mm/page-writeback.c
 @@ -297,20 +297,12 @@ get_dirty_limits(long *pbackground, long *pdirty, long *pbdi_dirty,
  {
   int background_ratio;   /* Percentages */
   int dirty_ratio;
 - int unmapped_ratio;
   long background;
   long dirty;
   unsigned long available_memory = determine_dirtyable_memory();
   struct task_struct *tsk;
  
 - unmapped_ratio = 100 - ((global_page_state(NR_FILE_MAPPED) +
 - global_page_state(NR_ANON_PAGES)) * 100) /
 - available_memory;
 -
   dirty_ratio = vm_dirty_ratio;
 

Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Peter Zijlstra

On Thu, 2007-11-15 at 20:40 +0100, Peter Zijlstra wrote:

 As for the highmem part, that was due to buffer cache, and unfortunately
 that is still true. Although maybe we can do something smart with the
 per-bdi stuff.

Something like this ought to do I guess. Although my
mapping_is_buffercache() is the ugliest thing. I'm sure that can be done
better.

Uncompiled, untested

Not-Signed-off-by: Peter Zijlstra [EMAIL PROTECTED]
---
 mm/page-writeback.c |   28 
 1 file changed, 20 insertions(+), 8 deletions(-)

Index: linux-2.6/mm/page-writeback.c
===
--- linux-2.6.orig/mm/page-writeback.c
+++ linux-2.6/mm/page-writeback.c
@@ -280,27 +280,28 @@ static unsigned long highmem_dirtyable_m
 #endif
 }
 
-static unsigned long determine_dirtyable_memory(void)
+static unsigned long determine_dirtyable_memory(int highmem)
 {
unsigned long x;
 
x = global_page_state(NR_FREE_PAGES)
+ global_page_state(NR_INACTIVE)
+ global_page_state(NR_ACTIVE);
-   x -= highmem_dirtyable_memory(x);
+   if (!highmem)
+   x -= highmem_dirtyable_memory(x);
return x + 1;   /* Ensure that we never return 0 */
 }
 
 static void
 get_dirty_limits(long *pbackground, long *pdirty, long *pbdi_dirty,
-struct backing_dev_info *bdi)
+struct backing_dev_info *bdi, int highmem)
 {
int background_ratio;   /* Percentages */
int dirty_ratio;
int unmapped_ratio;
long background;
long dirty;
-   unsigned long available_memory = determine_dirtyable_memory();
+   unsigned long available_memory = determine_dirtyable_memory(highmem);
struct task_struct *tsk;
 
unmapped_ratio = 100 - ((global_page_state(NR_FILE_MAPPED) +
@@ -346,6 +347,16 @@ get_dirty_limits(long *pbackground, long
}
 }
 
+static inline int mapping_is_buffercache(struct address_space *mapping)
+{
+   struct super_block *sb = mapping->host->i_sb;
+
+   if (sb && sb->s_bdev && sb->s_bdev->bd_inode->i_mapping != mapping)
+   return 0;
+
+   return 1;
+}
+
 /*
  * balance_dirty_pages() must be called by processes which are generating dirty
  * data.  It looks at the number of dirty pages in the machine and will force
@@ -364,6 +375,7 @@ static void balance_dirty_pages(struct a
unsigned long write_chunk = sync_writeback_pages();
 
struct backing_dev_info *bdi = mapping->backing_dev_info;
+   int highmem = !mapping_is_buffercache(mapping);
 
for (;;) {
struct writeback_control wbc = {
@@ -375,7 +387,7 @@ static void balance_dirty_pages(struct a
};
 
get_dirty_limits(&background_thresh, &dirty_thresh,
-   &bdi_thresh, bdi);
+   &bdi_thresh, bdi, highmem);
bdi_nr_reclaimable = bdi_stat(bdi, BDI_RECLAIMABLE);
bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
if (bdi_nr_reclaimable + bdi_nr_writeback <= bdi_thresh)
@@ -394,7 +406,7 @@ static void balance_dirty_pages(struct a
writeback_inodes(&wbc);
pages_written += write_chunk - wbc.nr_to_write;
get_dirty_limits(&background_thresh, &dirty_thresh,
-  &bdi_thresh, bdi);
+  &bdi_thresh, bdi, highmem);
}
 
/*
@@ -503,7 +515,7 @@ void throttle_vm_writeout(gfp_t gfp_mask
long dirty_thresh;
 
 for ( ; ; ) {
-   get_dirty_limits(&background_thresh, &dirty_thresh, NULL, NULL);
+   get_dirty_limits(&background_thresh, &dirty_thresh, NULL, NULL, 1);
 
 /*
  * Boost the allowable dirty threshold a bit for page
@@ -546,7 +558,7 @@ static void background_writeout(unsigned
long background_thresh;
long dirty_thresh;
 
-   get_dirty_limits(&background_thresh, &dirty_thresh, NULL, NULL);
+   get_dirty_limits(&background_thresh, &dirty_thresh, NULL, NULL, 1);
if (global_page_state(NR_FILE_DIRTY) +
global_page_state(NR_UNSTABLE_NFS) < background_thresh
&& min_pages <= 0)



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Linus Torvalds


On Thu, 15 Nov 2007, Peter Zijlstra wrote:
 
 Something like this ought to do I guess. Although my
 mapping_is_buffercache() is the ugliest thing. I'm sure that can be done
 better.

No, this absolutely sucks.

Why?

It's totally unacceptable to have per-mapping notions of how much memory 
we have. We used to do *exactly* that, and it's idiocy.

The reason it's unacceptable idiocy is that it means that two processes 
that access different files will then have *TOTALLY*DIFFERENT* notions of 
what the dirty limit is. And as a result, one process will happily write 
lots and lots of dirty stuff and never throttle, and the other process 
will have to throttle all the time - and clean up after the process that 
didn't!

See?

The fact is, because we count dirty pages as one resource, we must also 
have *one* limit.

So this patch is a huge regression. You might not notice it, because if 
everybody writes to the same kind of mapping, nobody will be hurt (they 
all have effectively the same global limit anyway), but you *will* notice 
if you ever have two different values of highmem.

Unacceptable. We used to do exactly what your patch does, and it got fixed 
once. We're not introducing that fundamentally broken concept again.

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Peter Zijlstra

On Thu, 2007-11-15 at 12:56 -0800, Linus Torvalds wrote:
 
 On Thu, 15 Nov 2007, Peter Zijlstra wrote:
  
  Something like this ought to do I guess. Although my
  mapping_is_buffercache() is the ugliest thing. I'm sure that can be done
  better.
 
 No, this absolutely sucks.

Agreed, I was just about to send out an email saying that..

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Peter Zijlstra

On Thu, 2007-11-15 at 21:59 +0100, Peter Zijlstra wrote:
 On Thu, 2007-11-15 at 12:56 -0800, Linus Torvalds wrote:
  
  On Thu, 15 Nov 2007, Peter Zijlstra wrote:
   
   Something like this ought to do I guess. Although my
   mapping_is_buffercache() is the ugliest thing. I'm sure that can be done
   better.
  
  No, this absolutely sucks.
 
 Agreed, I was just about to send out an email saying that.

Say all buffer cache users were against default_backing_dev_info, and
we'd give default_backing_dev_info less, that should work out, right?

( I'm not yet clear on if buffer cache already uses
default_backing_dev_info or not, bdget() seems to suggest it does )





-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Linus Torvalds


On Thu, 15 Nov 2007, Linus Torvalds wrote:
 
 Unacceptable. We used to do exactly what your patch does, and it got fixed 
 once. We're not introducing that fundamentally broken concept again.

Examples of non-broken solutions:
 (a) always use lowmem sizes (what we do now)
 (b) always use total mem sizes (sane but potentially dangerous: but the 
 VM pressure should work! It has serious bounce-buffer issues, though, 
 which is why I think it's crazy even if it's otherwise consistent)
 (c) make all dirty counting be *purely* per-bdi, so that everybody can 
 disagree on what the limits are, but at least they also then use 
 different counters

So it's just the "different writers look at the same dirty counts but then 
interpret it to mean totally different things" that I think is so 
fundamentally bogus. I'm not claiming that what we do now is the only way 
to do things, I just don't think your approach is tenable.

Btw, I actually suspect that while (a) is what we do now, for the specific 
case that Bron has, we could have a /proc/sys/vm option to just enable 
(b). So we don't have to have just one consistent model, we can allow odd 
users (and Bron sounds like one - sorry Bron ;) to just force other, odd, 
but consistent models.

I'd also like to point out that while the bounce buffer issue is not so 
much a HIGHMEM issue on its own (it's really about the device DMA limits, 
which are _independent_ of HIGHMEM, of course), the reason HIGHMEM is 
special is that without HIGHMEM the bounce buffers generally work 
perfectly fine.

The problem with HIGHMEM is that it causes various metadata (dentries, 
inodes, page struct tables etc) to eat up memory prime real estate under 
the same kind of conditions that also dirty a lot of memory. So the reason 
we disallow HIGHMEM from dirty limits is only *partly* the per-device or 
mapping DMA limits, and to a large degree the fact that non-highmem memory 
is special in general, and it is usually the non-highmem areas that are 
constrained - and need to be protected.

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-15 Thread Ben Dooks
On Tue, Nov 13, 2007 at 10:34:37PM +, Russell King wrote:
 On Tue, Nov 13, 2007 at 06:25:16PM +, Alan Cox wrote:
   Given the wide range of ARM platforms today, it is utterly idiotic to
   expect a single person to be able to provide responses for all ARM bugs.
   I for one wish I'd never *VOLUNTEERED* to be a part of the kernel
   bugzilla, and really *WISH* I could pull out of that function.
  
  You can. Perhaps that bugzilla needs to point to some kind of
  [EMAIL PROTECTED] list for the various ARM platform
  maintainers ?
 
 That might work - though it would be hard to get all the platform
 maintainers to be signed up to yet another mailing list, I'm sure
 sufficient would do.

As long as it would just be bug reports, I'm sure that most of us
could be persuaded to subscribe. Adding another list for general
discussions is probably not going to be read, the current list
provides more than enough to keep us busy.

-- 
Ben

Q:  What's a light-year?
A:  One-third less calories than a regular year.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Linus Torvalds


On Thu, 15 Nov 2007, Linus Torvalds wrote:
 
 The problem with HIGHMEM is that it causes various metadata (dentries, 
 inodes, page struct tables etc) to eat up memory prime real estate under 
 the same kind of conditions that also dirty a lot of memory. So the reason 
 we disallow HIGHMEM from dirty limits is only *partly* the per-device or 
 mapping DMA limits, and to a large degree the fact that non-highmem memory 
 is special in general, and it is usually the non-highmem areas that are 
 constrained - and need to be protected.

Final note on this (promise): 

I'd really be very interested to hear if the patch I *do* think makes 
sense (ie the removal of the old unmapped_ratio logic) actually already 
solves most of Bron's problems.

It may well be that that unmapped_ratio logic effectively undid the system 
configuration changes that Bron has done. It doesn't matter if Bron has

From our sysctl.conf:
# This should help reduce flushing on Cache::FastMmap files
vm.dirty_background_ratio = 50
vm.dirty_expire_centisecs = 9000
vm.dirty_ratio = 80
vm.dirty_writeback_centisecs = 3000

if it turns out that the unmapped_ratio logic turns the 80% back down to 
5%.

It may well be that 80% of the non-highmem memory is plenty good enough! 

Sure, older kernels allowed even more of memory to be dirty (since they 
didn't count dirty mappings at all), but we may have a case where the fact 
that we discount the HIGHMEM stuff isn't the major problem in itself, and 
that the dirty_ratio sysctl should be ok - but just gets screwed over by 
that unmapped_ratio logic.

So Bron, if you can test that patch, I'd love to hear if it matters. It 
may not make any difference (maybe you don't actually trigger the 
unmapped_ratio logic at all), but I think it has the potential for being 
totally broken for you.

People that don't change the dirty_ratio from the default values would 
generally never care, because the default dirty-ratio is *already* so low 
that even if the unmapped_ratio logic triggers, it won't much matter!

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Peter Zijlstra

On Thu, 2007-11-15 at 13:14 -0800, Linus Torvalds wrote:
 
 On Thu, 15 Nov 2007, Linus Torvalds wrote:
  
  Unacceptable. We used to do exactly what your patch does, and it got fixed 
  once. We're not introducing that fundamentally broken concept again.
 
 Examples of non-broken solutions:
  (a) always use lowmem sizes (what we do now)
  (b) always use total mem sizes (sane but potentially dangerous: but the 
  VM pressure should work! It has serious bounce-buffer issues, though, 
  which is why I think it's crazy even if it's otherwise consistent)
  (c) make all dirty counting be *purely* per-bdi, so that everybody can 
  disagree on what the limits are, but at least they also then use 
  different counters

I think that (c) is doable. If it's worth the effort, who knows,
apparently there still are people using 32bit kernels on boxen with
mucho memory.

 So it's just the different writers look at the same dirty counts but then 
 interpret it to mean totally different things that I think is so 
 fundamentally bogus. I'm not claiming that what we do now is the only way 
 to do things, I just don't think your approach is tenable.

Agreed, the per mapping thing was utter crap.

 I'd also like to point out that while the bounce buffer issue is not so 
 much a HIGHMEM issue on its own (it's really about the device DMA limits, 
 which are _independent_ of HIGHMEM, of course), the reason HIGHMEM is 
 special is that without HIGHMEM the bounce buffers generally work 
 perfectly fine.
 
 The problem with HIGHMEM is that it causes various metadata (dentries, 
 inodes, page struct tables etc) to eat up memory prime real estate under 
 the same kind of conditions that also dirty a lot of memory. So the reason 
 we disallow HIGHMEM from dirty limits is only *partly* the per-device or 
 mapping DMA limits, and to a large degree the fact that non-highmem memory 
 is special in general, and it is usually the non-highmem areas that are 
 constrained - and need to be protected.

But this problem is already an issue, Anton recently had a case where a
12GB highmem box locked up due to NTFS running out of lowmem - or
something like that.

And I think that with the targeted slab reclaim (or slab defrag as it's
apparently still called) we can properly fix this side of the problem. I
think Rik was looking into doing so.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Linus Torvalds


On Thu, 15 Nov 2007, Peter Zijlstra wrote:
 
 But this problem is already an issue, Anton recently had a case where a
 12GB highmem box locked up due to NTFS running out of lowmem - or
 something like that.

Yeah. I always considered HIGHMEM to just be unusable. It's ok for 
extending to 2-4GB (ie HIGHMEM4G, not 64G), and it's probably borderline 
usable for 4-8G if you are careful.

But quite frankly, I refuse to even care about anything past that. If you 
have 12G (or heaven forbid, even more) in your machine, and you can't be 
bothered to just upgrade to a 64-bit CPU, then quite frankly, *I* 
personally can't be bothered to care.

That's my personal opinion, and I realize that some of the commercial 
vendors may care about their insane customers' satisfaction, but I'm 
simply not interested in insane users. If they have that much RAM (and 
bought it a few years ago when a 64-bit CPU wasn't an option), they can't 
be poor.

So the _only_ explanation today for 12GB on a 32-bit machine is
 (a) insanity
or 
 (b) being so lazy as to not bother to upgrade
and in either case, my personal reaction is I'm *not* crazy, and yes, I'm 
lazy too, and I can't give a rats *ss about those problems.

HIGHMEM was a mistake in the first place. It's one that we can live with, 
but I refuse to support it more than it needs to be supported. And 12GB is 
*way* past the end of what is worth supporting.

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Chris Friesen

Linus Torvalds wrote:


So the _only_ explanation today for 12GB on a 32-bit machine is
 (a) insanity
or 
 (b) being so lazy as to not bother to upgrade
and in either case, my personal reaction is I'm *not* crazy, and yes, I'm 
lazy too, and I can't give a rats *ss about those problems.


How about...

c) they bought it at the beginning of a project and are stuck with it 
because they aren't getting any more money for hardware


d) they've shipped it to the field and have to support it

We've got some 32-bit 8GB boxes for which both of these would hold true.

Chris
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Linus Torvalds


On Thu, 15 Nov 2007, Chris Friesen wrote:
 
 We've got some 32-bit 8GB boxes for which both of these would hold true.

Still not enough of a reason for me to care.

Remember - I'm the guy who refused to merge RH's 4G:4G patches because I 
thought they were an unsupportable nightmare.

I care a lot about future supportability, and HIGHMEM is there purely as a 
temporary wart and blip on the screen.

I did acknowledge that others may care more, but the fact is, I suspect 
that it's going to be cheaper to literally buy and ship a new machine to a 
customer than to really support it in any other form.

Side note: HIGHMEM64G works perfectly fine with 12GB of RAM under 
*limited* loads. If your customer does certain well-defined and simple 
things that don't put huge and varied loads on the VFS or VM layer, then 
12GB+ is probably fine regardless.

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Rob Mueller



That's my personal opinion, and I realize that some of the commercial
vendors may care about their insane customers' satisfaction, but I'm
simply not interested in insane users. If they have that much RAM (and
bought it a few years ago when a 64-bit CPU wasn't an option), they can't
be poor.


From our perspective, the main issue is that for some of these machines we spent 
quite a bit of money on the big RAM (for its day) + lots of 15k RPM SCSI 
drives + multi-year support contracts. They're highly IO bound, and barely 
use 10-20% of their old 2.4Ghz Prestonia Xeon CPUs. It's hard to justify 
junking those machines < 5 years.


We have a couple of 6G machines and some 8G machines using PAE. On the 
whole, they actually have been working really well (hmmm, apart from the 
recent dirty pages issue + reiserfs data=journal leaks + inodes in lowmem 
limits)


Rob

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-15 Thread J. Bruce Fields
On Thu, Nov 15, 2007 at 01:50:43PM +1100, Neil Brown wrote:
 Virtual Folders.
 
 I use VM mode in EMACS, but I believe some other mail readers have the
 same functionality.
 I have a virtual folder called nfs which shows me all mail in my
 inbox which has the string 'nfs' or 'lockd' in a To, Cc, or Subject
 field.  When I visit that folder, I see all mail about nfs, whether it
 was sent to me personally, or to a relevant list, or to lkml.

Hm (googling around for mutt and virtual folders): looks like I can
get most of the way there in mutt with some macros based on its limit
command:

http://www.tummy.com/journals/entries/jafo_20060303_00

Thanks.--b.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: mmap dirty limits on 32 bit kernels (Was: [BUG] New Kernel Bugs)

2007-11-15 Thread Alan Cox
 So the _only_ explanation today for 12GB on a 32-bit machine is
  (a) insanity
 or 
  (b) being so lazy as to not bother to upgrade
 and in either case, my personal reaction is I'm *not* crazy, and yes, I'm 
 lazy too, and I can't give a rats *ss about those problems.

12GB-16GB worked well historically so it's a regression. Above 16GB it's
all utterly mad.

You forgot reason (c) though

(c) 32bit is a tested approved certified etc environment - essentially
conservatism and paranoia, and it's hard to explain to some of these
people that the right answer really is less RAM or 64bit, especially as
they may already know it but have a 12 month process to prove and certify
a system configuration.

 HIGHMEM was a mistake in the first place. It's one that we can live with, 
 but I refuse to support it more than it needs to be supported. And 12GB is 
 *way* past the end of what is worth supporting.

Highmem to 4GB was sensible. Highmem to 8GB was pushing it.

Alan
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [alsa-devel] [BUG] New Kernel Bugs

2007-11-14 Thread Rene Herman

On 15-11-07 05:16, Bron Gondwana wrote:


Totally unrelated - I sent something to the kolab mailing list a couple


[ ... ]

I'm sure if I had something that I considered worth informing the ALSA 
project of, I'd be wary of spending the same effort writing a good post
knowing it may be dropped in between by a list moderator just 
selecting all and bouncing them.


Totally unrelated indeed so why are you spouting crap? If the kolab list has a 
problem take it up with them but keep ALSA out of it. alsa-devel has only 
ever moderated out spam -- nothing else.


Rene.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-14 Thread Linus Torvalds


On Wed, 14 Nov 2007, Linus Torvalds wrote:
> 
> So even at 100% dirty limits, it won't let you dirty more than 1GB on the 
> default 32-bit setup.

Side note: all of these are obviously still just heuristics. If you really 
*do* run on a 32-bit kernel, and you want to have the pain, I'm sure you 
can just disable the dirty limits with a one-liner kernel mod. And if it's 
useful enough, we can certainly expose flags like that.. Not that I expect 
that much anybody else will ever care, but it's not like it's wrong to 
expose the silly heuristics the kernel has to users that have very 
specific loads.

That said, I still do hope you aren't actually using HIGHMEM64G. I was 
really hoping that the people who had enough moolah to buy >4GB of RAM had 
long since also upgraded to a 64-bit machine ;)

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-14 Thread Linus Torvalds


On Thu, 15 Nov 2007, Bron Gondwana wrote:
> 
> So we've already been running those settings for a while.  They didn't
> help.

Ok, so something else is up. If the mmap file is 2G, and you have 6G of 
RAM, you shouldn't be hitting the dirty limits with those setups.

Of course, it may still be that some accounting thing is simply off, and 
the dirty limits trigger *despite* all the proper config settings ;)

> Guess we'd better get on to building a simple test app.

Yeah, if you have something that others can see in action, that is sure 
going to get more people to look at it. 

That said - I'm sincerely hoping that you're not running on a 32-bit 
kernel. Because if so, those percentages are percentages of *normal* 
memory, not highmem (that got changed at one point after people ran out of 
lowmem).

So even at 100% dirty limits, it won't let you dirty more than 1GB on the 
default 32-bit setup.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-14 Thread Bron Gondwana
On Wed, Nov 14, 2007 at 08:24:53PM -0800, Linus Torvalds wrote:
> 
> 
> On Thu, 15 Nov 2007, Bron Gondwana wrote:
> > 
> > And congratulations to him for that.  We almost entirely dropped 2.6.16,
> > but there's a regression some time since then that makes large MMAPed
> > files a major pain (specifically the dcc database clean takes about 5
> > minutes on 2.6.16 and about 12 hours on 2.6.20 or 2.6.23 series kernels)
> > 
> > But we keep putting off writing a small testcase that can repeat the
> > issue so we can bisect it - because it's working fine with 2.6.16 on
> > that machine.
> 
> Heh. I suspect you don't even need to bisect it.
> 
> The big difference with large mmap'ed files is that later kernels will 
> actually track dirty ratios for dirty mmap'ed pages. Earlier kernels never 
> did.
> 
> So in older kernels, you can dirty as much memory as you want, and the 
> kernel will never try to write it back (well - "never" here means one of 
> either (a) you ask it to with msync or (b) you run out of memory, when the 
> kernel then totally falls down and the machine is essentially unusable).
> 
> So *if* the symptom seems to be that the later kernels do a lot more IO, 
> then try to change 
> 
>   /proc/sys/vm/dirty_[background_]ratio
> 
> which is just a percentage of memory (defaults to 5% for background and 
> 10% for foreground dirtying). Turn them both up a lot (say to 50 and 80 
> percent respectively) and see if that makes a difference.

From our sysctl.conf:
# This should help reduce flushing on Cache::FastMmap files
vm.dirty_background_ratio = 50
vm.dirty_expire_centisecs = 9000
vm.dirty_ratio = 80
vm.dirty_writeback_centisecs = 3000

So we've already been running those settings for a while.  They didn't
help.

We also gave this thing its very own dedicated ServeRAID card and
associated RAID1 set of high speed SCSI drives (mainly because they
were just sitting there already attached to the machine and unused,
we don't love DCC that much) and it didn't help.  Helped the rest of
the machine now that the system drive wasn't being pegged 100% for
12 hours a day, but it didn't speed things up any.

It was making some pretty random little scattered changes all through
that file.  Hmm.. here's what the developers said about it:


  First dbclean creates a new dcc_db file by copying from the old file.
  As it copies, it decides whether each record is worth keeping.
  That involves looking up the checksums in the old hash table.  This
  is almost as fast as a simple /bin/cp if the old dcc_db and dcc_db.hash
  files fit in RAM.

  The dbclean creates a new dcc_db.hash file.  This starts with
  creating an empty new dcc_db.hash file.  Then the new dcc_db and
  dcc_db.hash files are mapped into memory, and dbclean creates pointers
  to each checksum in the dcc_db file in the dcc_db.hash file.

  While dbclean is running, dccd unmaps everything and tries to stay out
  of the way.

 
> If so, you'll be the first one to officially even notice this change, I 
> think.

Yay for us.  Thankfully it doesn't affect Cyrus's MMAP usage (read only
with direct seek and write calls to change anything, then remap) or we
would have suffered pretty badly!

Guess we'd better get on to building a simple test app.  The
mmap file that DCC uses is about 2Gb if that makes any difference:

-rw-r--r-- 1 dcc  dcc  2035138560 Nov 15 00:15 dcc_db
-rw-r--r-- 1 dcc  dcc   516612096 Nov 14 06:27 dcc_db.hash
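
(A first rough sketch of the kind of test app I mean - nothing
DCC-specific, the file name, the 2GB size and the fraction of pages
touched are all placeholders; it just dirties scattered pages of a big
shared mapping so the writeback behaviour can be watched while it runs:)

/*
 * sketch of a test app: dirty scattered pages of a large mmap'ed file.
 * Build with -D_FILE_OFFSET_BITS=64 on 32-bit so ftruncate can handle 2GB.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define FILE_SIZE (2UL * 1024 * 1024 * 1024)   /* ~2GB, like dcc_db */
#define PAGE      4096UL
#define TOUCHES   (FILE_SIZE / PAGE / 4)       /* dirty roughly a quarter of the pages */

int main(void)
{
        unsigned long i;
        char *map;
        int fd = open("testfile", O_RDWR | O_CREAT, 0644);

        if (fd < 0 || ftruncate(fd, FILE_SIZE) < 0) {
                perror("open/ftruncate");
                return 1;
        }

        map = mmap(NULL, FILE_SIZE, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
        if (map == MAP_FAILED) {
                perror("mmap");
                return 1;
        }

        srandom(42);
        for (i = 0; i < TOUCHES; i++) {
                /* one single-byte write per randomly chosen page */
                unsigned long off = (random() % (FILE_SIZE / PAGE)) * PAGE;
                map[off]++;
        }

        msync(map, FILE_SIZE, MS_SYNC);   /* force writeback at the end */
        munmap(map, FILE_SIZE);
        close(fd);
        return 0;
}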

The machine has 6Gb of memory and should be able to fit these
files fine:

[EMAIL PROTECTED] hm]$ free
             total       used       free     shared    buffers     cached
Mem:       6232364    5758112     474252          0      41756    3002528
-/+ buffers/cache:    2713828    3518536
Swap:      2048248      74944    1973304


And here's what top says about the process:
  15   0 1914m  57m  41m D5  1.0 346:07.79 dccd

This is on: 2.6.16.55-reiserfix-fai 
  (one small patch to reiserfs, and built with netboot support for FAI)


So yeah - we'll try to get a clearer idea of what it's doing, but the
knob twiddle didn't work for us.
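
(If it helps with getting that clearer idea, here's a trivial sketch that
just prints the Dirty: and Writeback: lines from /proc/meminfo once a
second while dccd or a test app runs - a watch over grep would do just
as well:)

/* sketch: poll the Dirty: and Writeback: lines from /proc/meminfo */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
        char line[128];

        for (;;) {
                FILE *f = fopen("/proc/meminfo", "r");

                if (!f) {
                        perror("/proc/meminfo");
                        return 1;
                }
                while (fgets(line, sizeof(line), f)) {
                        if (!strncmp(line, "Dirty:", 6) ||
                            !strncmp(line, "Writeback:", 10))
                                fputs(line, stdout);
                }
                fclose(f);
                putchar('\n');
                sleep(1);
        }
        return 0;
}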

Bron.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-14 Thread Bron Gondwana
On Tue, Nov 13, 2007 at 10:56:01PM +0100, Christian Kujau wrote:
> On Tue, 13 Nov 2007, Andrew Morton wrote:
>> There are a number of process things we _could_ do.  Like
>> - have bugfix-only kernel releases
>
> Adrian Bunk does (did?) this with 2.6.16.x, although it always seemed to me 
> like an unrewarded one man show. AFAIK not even the big distros are begging 
> for bugfix-only versions, as they too want to have (sell) new features. 
> Mission critical systems might want to require such versions, but I guess 
> they're using heavily customized trees anyway.

And congratulations to him for that.  We almost entirely dropped 2.6.16,
but there's a regression some time since then that makes large MMAPed
files a major pain (specifically the dcc database clean takes about 5
minutes on 2.6.16 and about 12 hours on 2.6.20 or 2.6.23 series kernels)

But we keep putting off writing a small testcase that can repeat the
issue so we can bisect it - because it's working fine with 2.6.16 on
that machine.

Bron.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-14 Thread Linus Torvalds


On Thu, 15 Nov 2007, Bron Gondwana wrote:
> 
> And congratulations to him for that.  We almost entirely dropped 2.6.16,
> but there's a regression some time since then that makes large MMAPed
> files a major pain (specifically the dcc database clean takes about 5
> minutes on 2.6.16 and about 12 hours on 2.6.20 or 2.6.23 series kernels)
> 
> But we keep putting off writing a small testcase that can repeat the
> issue so we can bisect it - because it's working fine with 2.6.16 on
> that machine.

Heh. I suspect you don't even need to bisect it.

The big difference with large mmap'ed files is that later kernels will 
actually track dirty ratios for dirty mmap'ed pages. Earlier kernels never 
did.

So in older kernels, you can dirty as much memory as you want, and the 
kernel will never try to write it back (well - "never" here means one of 
either (a) you ask it to with msync or (b) you run out of memory, when the 
kernel then totally falls down and the machine is essentially unusable).

So *if* the symptom seems to be that the later kernels do a lot more IO, 
then try to change 

/proc/sys/vm/dirty_[background_]ratio

which is just a percentage of memory (defaults to 5% for background and 
10% for foreground dirtying). Turn them both up a lot (say to 50 and 80 
percent respectively) and see if that makes a difference.
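
For illustration only - echoing the numbers into /proc from a shell (or
setting them in sysctl.conf) is the normal way, but a trivial C helper
doing the same thing would look something like this; the 50/80 values are
just the ones suggested above:

/* Illustration only: the same effect as
 *   echo 50 > /proc/sys/vm/dirty_background_ratio
 *   echo 80 > /proc/sys/vm/dirty_ratio
 * (needs root; the values are the percentages discussed above). */
#include <stdio.h>

static int set_knob(const char *path, const char *value)
{
    FILE *f = fopen(path, "w");

    if (!f) {
        perror(path);
        return -1;
    }
    fprintf(f, "%s\n", value);
    return fclose(f);
}

int main(void)
{
    int err = 0;

    err |= set_knob("/proc/sys/vm/dirty_background_ratio", "50");
    err |= set_knob("/proc/sys/vm/dirty_ratio", "80");
    return err ? 1 : 0;
}

The change takes effect immediately and reverts to the defaults on reboot
unless made persistent.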

If so, you'll be the first one to officially even notice this change, I 
think.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [alsa-devel] [BUG] New Kernel Bugs

2007-11-14 Thread Bron Gondwana
On Wed, Nov 14, 2007 at 12:46:24PM +0100, Rene Herman wrote:
> On 14-11-07 11:07, David Miller wrote:
>
> Added Jaroslav and Takashi to the already extensive CC
>
>> From: Russell King <[EMAIL PROTECTED]>
>
>>> So, when are you creating a replacement alsa-devel mailing list on
>>> vger?  That's also subscribers-only.
>> The operative term is "alternative" rather than "replacement".
>> Perhaps this misunderstanding is what you're so upset about.
>> And yes, that alsa list bugs the crap out of me too.  I'm more than
>> happy to provide an alternative for that one as well.
>
> [EMAIL PROTECTED] is not subscriber-only. Same as that arm list, 
> it's _moderated_ for non-subscribers and given that I and other moderators 
> have been doing our best to moderate quickly (I tend to stay logged in to 
> the moderation interface all day for example) what specifically bugged the 
> crap out of you? It's not something a poster needs to concern himself with.

Totally unrelated - I sent something to the kolab mailing list a couple
of days ago (it's moderated for non subscribers) informing them that I
had found the cause of some Cyrus bugs that they had problems with in
the past and providing a link to my post to the cyrus list with the
patches attached.

It sat in the moderation queue and then was rejected with "non
subscriber post to subscription only list".  Not only did the response
come a day later, when I had moved on to other things, but it really
pissed me off that I had put some effort into providing a good quality
post that outlined the specific issues and how they applied to their
project, and it had been summarily dismissed, probably without anyone
putting in the effort to even read it.

There's no way for a non-subscriber to know in advance if the list they
are trying to post to will do that to them, completely negating the
effort put in to writing something worthwhile to inform that community.
It's insular, and it sucks.

So yeah, my attitude now is that the Kolab folks can go screw themselves
and track down the fix on their own or wait until I've convinced
upstream to accept the fixes (likely) and they have moved to the new
version (unlikely for a long time, and meanwhile they're missing out on
the performance increases that having a more stable skiplist library 
would give them)

I'm sure if I had something that I considered worth informing the ALSA
project of, I'd be wary of spending the same effort writing a good
post knowing it may be dropped in the bin by a list moderator just
selecting all and bouncing them.

Bron.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-14 Thread Neil Brown
On Tuesday November 13, [EMAIL PROTECTED] wrote:
> On Tuesday 13 November 2007 07:08, Mark Lord wrote:
> > Ingo Molnar wrote:
> > ..
> >
> > > This is all QA-101 that _cannot be argued against on a rational basis_,
> > > it's just that these sorts of things have been largely ignored for
> > > years, in favor of the all-too-easy "open source means many eyeballs and
> > > that is our QA" answer, which is a _good_ answer but by far not the most
> > > intelligent answer! Today "many eyeballs" is simply not good enough and
> > > nature (and other OS projects) will route us around if we dont change.
> >
> > ..
> >
> > QA-101 and "many eyeballs" are not at all in opposition.
> > The latter is how we find out about bugs on uncommon hardware,
> > and the former is what we need to track them and overall quality.
> >
> > A HUGE problem I have with current "efforts", is that once someone
> > reports a bug, the onus seems to be 99% on the *reporter* to find
> > the exact line of code or commit.  Ghad what a repressive method.
> 
> This is the only method that scales.

That sounds overly harsh, and the rest of your mail sounds much more
moderate and sensible - I can only assume you were using hyperbole??

Putting the "onus on the reporter" is simply not going to work unless
you have a business relationship.  In the community, we are all
volunteering our time (well, maybe my employer is volunteering my time
to do community support, but the effect is the same).

I would hope that the focus of developers is to empower bug reporters
to provide further information (and as has been said, "git bisect" is
a great empowerer).  Some people will be incredibly helpful, especially
if you ask politely and say thank you.  Others won't, for any of a
number of reasons - and maybe that means their bug won't get fixed.

To my eyes, the "only method that scales" is investing effort in
encouraging and training bug reporters.  Some of that effort might not
produce results, but when others among those you have encouraged start
answering the newbie questions on the list and save you the time, you
get a distinct feeling that it was all worth while.


I think we are in agreement - I just wanted to take issue with that
one sentence :-)  The rest is great.

NeilBrown

> 
> Developer has only 24 hours in each day, and sometimes he needs to eat,
> sleep, and maybe even pay attention to e.g. his kids.
> 
> But bug reporters are much more numerous and they have more
> hours in one day combined.
> 
> BUT - it means that developers should try to increase user base,
> not scare users away.
> 
> > And if the "developer" who broke the damn thing, or who at least
> > "claims" to be supporting that code, cannot "reproduce" the bug,
> > they drop it completely.
> 
> Developer should let reporter know that reporter needs to help
> a bit here. Sometimes a bit of hand holding is needed, but it
> pays off because you breed more qualified testers/bug reporters.
> 
> > Contrast that flawed approach with how Linus does things..
> > he thinks through the symptoms, matches them to the code,
> > and figures out what the few possibilities might be,
> > and feeds back some trial balloon patches for the bug reporter to try.
> >
> > MUCH better.
> >
> > And remember, *I'm* an old-time Linux kernel developer.. just think about
> > the people reporting bugs who haven't been around here since 1992..
> 
> Yes. Developers should not grow more and more unhelpful
> and arrogant towards their users just because inexperienced
> users send incomplete/poorly written bug reports.
> They need to provide help, not humiliate/ignore.
> 
> I think we agree here.
> --
> vda
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-14 Thread Neil Brown
On Wednesday November 14, [EMAIL PROTECTED] wrote:
> On Wed, Nov 14, 2007 at 09:38:20AM -0800, Randy Dunlap wrote:
> > On Wed, 14 Nov 2007 15:08:47 +0100 Ingo Molnar wrote:
> > > so please stop this "too busy and too noisy" nonsense already. It was 
> > > nonsense 10 years ago and it's nonsense today. In 10 years the kernel 
> > > grew from a 1 million lines codebase to an 8 million lines codebase, so 
> > > what? Deal with it and be intelligent about filtering your information 
> > > influx instead of imposing a hard pre-filtering criteria that restricts 
> > > intelligent processing of information.
> > 
> > So you have a preferred method of handling email.  Please don't
> > force it on the rest of us.
> 
> I'd be curious for any pointers on tools, actually.  I "read" (ok, skim)
> lkml but still overlook relevant bug reports occasionally.
> (Fortunately, between Trond and Andrew and others forwarding things it's
> not actually a problem, but I'm still curious).

Virtual Folders.

I use VM mode in EMACS, but I believe some other mail readers have the
same functionality.
I have a virtual folder called "nfs" which shows me all mail in my
inbox which has the string 'nfs' or 'lockd' in a To, Cc, or Subject
field.  When I visit that folder, I see all mail about nfs, whether it
was sent to me personally, or to a relevant list, or to lkml.

Admittedly if someone doesn't bother to choose a meaningful Subject,
then I might miss that.  I think this mostly happens when Andrew sends
a "-mm" announcement, asked people to change the subject line when
following up, and someone follows up without changing the subject line
and say "NFS doesn't work any more".

I have another virtual folder which matches "md" and "raid" and
"mdadm" in any header (so when the people from coraid.com talk about
ATA over Ethernet, that gets badly filed, but it is a small cost).

Then I have the "bkernel" (boring kernel) folder for all mail from
lkml that doesn't mention nfs or raid or md, and isn't from or to
me.  That folder I skim every week or so and just read the juicy
debates and look for interesting tidbits from interesting people -
then delete the whole folder, mostly unread.

I don't think I could cope with mail without virtual folders.

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-14 Thread Daniel Barkalow
On Tue, 13 Nov 2007, Theodore Tso wrote:

> There are two parts to this.  One is a Ubuntu development kernel which
> we can give to large numbers of people to expand our testing pool.
> But if we don't do a better job of responding to bug reports that
> would be generated by expanded testing this won't necessarily help us.
> 
> The other is an automated set of standard pre-built bisection points so
> that testers can more easily localize a bug down to a few hundred
> commits without needing to learn how to use "git bisect" (think Ubuntu
> users).

I don't see any reason that we couldn't have a tool accessible to Ubuntu 
users that does a real "git bisect". Git is really good at being scripted 
by fancy GUIs. It should be easy enough to have a drop down with all of 
the Ubuntu kernel package releases, where the user selects what works and 
what doesn't. Then the tool clones a git repository with flags to only get 
relevant parts, and then leads a bisect run, where it's also 
configuring, building, and installing the kernels (as a different grub 
entry), and providing instructions in general. Fundamentally, "git bisect" 
is a really low-interaction process: you tell it a couple of commits, and 
then it does stuff, and then you tell it "I tested, and it worked" or "I 
tested, and it had the problem" or "Something else went wrong", and it 
asks you something new. Other than that, it just takes time (and a build 
system hook, which this tool would handle for the kernel). Eventually, it 
tells you what to report, and you do so.
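
A very rough sketch of that low-interaction core, just to make it concrete
(in C only so it is self-contained; the v2.6.16/v2.6.20 endpoints are
placeholders, and the real tool would do the configure/build/install/reboot
work where the comment is):

/* Rough sketch of the bisect loop such a tool would wrap.  Assumes
 * "git" is on the path and the current directory is a kernel clone. */
#include <stdio.h>
#include <stdlib.h>

static int run(const char *cmd)
{
    printf(">>> %s\n", cmd);
    return system(cmd);
}

int main(void)
{
    char answer[32];

    run("git bisect start");
    run("git bisect bad v2.6.20");    /* first release known to be bad  */
    run("git bisect good v2.6.16");   /* last release known to be good  */

    for (;;) {
        /* the real tool would configure, build and install the candidate
         * kernel as an extra grub entry here, then wait for the user to
         * reboot into it and test */
        printf("Did this kernel work? (good/bad/quit): ");
        if (!fgets(answer, sizeof(answer), stdin) || answer[0] == 'q')
            break;
        if (answer[0] == 'g')
            run("git bisect good");
        else if (answer[0] == 'b')
            run("git bisect bad");
        /* anything else: ask again */
    }

    run("git bisect reset");
    return 0;
}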

-Daniel
*This .sig left intentionally blank*
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-14 Thread Linus Torvalds


On Thu, 15 Nov 2007, Heikki Orsila wrote:
> 
> See
>   http://bugzilla.kernel.org/show_bug.cgi?id=9321
> 
> for more information.

That's a pretty unhelpful thing. It doesn't describe the breakage at all, 
so there is hardly much "more info". 

You've also apparently made all the attachments "octet-streams", so they 
are singularly painful to look at (ie no normal web-browser will show 
them, you have to save them to a file and look at them there). 

That said, I think I'll revert it, since it certainly fits the bill, but 
that's a really quite unreadable bug-report.

I finally found the actual description of the problem (by following 
multiple links), but really, if people want me to revert things, I would 
*strongly* suggest they make it *obvious* what's going on in the email to 
me that says "please revert".

Because quite frankly, if that email doesn't explain why something should 
be reverted (and just points to other things), it doesn't really cut it.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-14 Thread Heikki Orsila
On Wed, Nov 14, 2007 at 11:54:16AM -0800, Linus Torvalds wrote:
> Actually, I'm pretty happy reverting patches that cause regressions even 
> if it *can* be "fixed for release". If there isn't a fix available within 
> a day or two, it should get reverted.
> ...
> Also, please notice the latter part of the suggestion above: even if 
> somebody has bisected down their problem to a specific commit, I really 
> *do* want to hear that actually undoing the commit on top of the current 
> tree acually fixes it again, because sometimes that just isn't the case - 
> sometimes you end up having various interactions that means that reverting 
> a commit might simply not even work.

Ok. drivers/net/skge has been broken for several weeks. I have manually 
fixed the driver at each rc* release since then.

Please revert skge changes, the commit that broke driver is
7fb7ac241162dc51ec0f7644d4a97b2855213c32

See
http://bugzilla.kernel.org/show_bug.cgi?id=9321

for more information. I think Stephen Hemminger is working on the skge 
fix, but it has been several days since I've heard anything from him.

-- 
Heikki Orsila   Barbie's law:
[EMAIL PROTECTED]   "Math is hard, let's go shopping!"
http://www.iki.fi/shd
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-14 Thread Gabriel C
Denys Vlasenko wrote:
> hi Matthew,
> 
> On Wednesday 14 November 2007 06:35, Hannes Reinecke wrote:
>> Matthew Wilcox wrote:
>>> On Wed, Nov 14, 2007 at 12:46:20AM -0700, Denys Vlasenko wrote:
 Finally they replied and asked to rediff it against their
 git tree. I did that and sent patches back. No reply since then.

 And mind you, the patch is not trying to do anything
 complex, it mostly moves code around, removes 'inline',
 adds 'const'. What should I think about it?
>>> I'm waiting for an ACK/NAK from Hannes, the maintainer.  What should I
>>> do?
> 
> You could have informed me about this, and I would talk to Hannes
> myself. This would free up your mind from keeping track of this
> particular patch.
> Parallelize development, prevent things from being forgotten.
> 
> 
> Hi Hannes,
> 
>> I haven't actually been able to test it here (too busy, sorry). If someone
>> else confirms it does its job then
>>
>> Acked-by: Hannes Reinecke <[EMAIL PROTECTED]>
> 
> It's not in my mailbox on this machine, but luckily we have lkml archived
> on the Net. Here is a positive tester report:
> 
> http://lkml.org/lkml/2007/10/15/168:
> 
> ==
> Date  Mon, 15 Oct 2007 15:53:08 +0200
> From  Gabriel C <>
> Subject   Re: [PATCH 0/3] debloat aic7xxx and aic79xx drivers
> 
>>> Compile tested and applies cleanly to 2.6.23.
>>> I don't have this hardware anymore and cannot run test these patches.
>> I can test these patches on an aic7892 controller later on today if you 
> want.
> 
> Works fine for me tested on :
> 
> 03:0e.0 SCSI storage controller [0100]: Adaptec AIC-7892P U160/m [9005:008f] 
> (rev 02)
> 
> Gabriel
> ===

I still run the patches on a box with 2.6.23 and one with 2.6.24-rc1 without 
any problems.
Didn't test rc2/current git, but I can if it's needed.

If you need a Tested-by, let me know.
   
> 
> --
> vda


Gabriel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] New Kernel Bugs

2007-11-14 Thread Denys Vlasenko
hi Matthew,

On Wednesday 14 November 2007 06:35, Hannes Reinecke wrote:
> Matthew Wilcox wrote:
> > On Wed, Nov 14, 2007 at 12:46:20AM -0700, Denys Vlasenko wrote:
> >> Finally they replied and asked to rediff it against their
> >> git tree. I did that and sent patches back. No reply since then.
> >>
> >> And mind you, the patch is not trying to do anything
> >> complex, it mostly moves code around, removes 'inline',
> >> adds 'const'. What should I think about it?
> >
> > I'm waiting for an ACK/NAK from Hannes, the maintainer.  What should I
> > do?

You could have informed me about this, and I would talk to Hannes
myself. This would free up your mind from keeping track of this
particular patch.
Parallelize development, prevent things from being forgotten.


Hi Hannes,

> I haven't actually been able to test it here (too busy, sorry). If someone
> else confirms it does its job then
>
> Acked-by: Hannes Reinecke <[EMAIL PROTECTED]>

It's not in my mailbox on this machine, but luckily we have lkml archived
on the Net. Here is a positive tester report:

http://lkml.org/lkml/2007/10/15/168:

==
Date    Mon, 15 Oct 2007 15:53:08 +0200
From    Gabriel C <>
Subject Re: [PATCH 0/3] debloat aic7xxx and aic79xx drivers

>> Compile tested and applies cleanly to 2.6.23.
>> I don't have this hardware anymore and cannot run test these patches.
> 
> I can test these patches on an aic7892 controller later on today if you 
want.

Works fine for me tested on :

03:0e.0 SCSI storage controller [0100]: Adaptec AIC-7892P U160/m [9005:008f] 
(rev 02)

Gabriel
===

--
vda
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

