Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-10-02 Thread Randy Dunlap
On Tue, 02 Oct 2007 15:36:01 +0200 Peter Zijlstra wrote: > On Fri, 2007-09-28 at 12:16 -0700, Andrew Morton wrote: > > > (Searches for the lockstat documentation) > > > > Did we forget to do that? > > yeah,... > > /me quickly whips up something Thanks. Just some typos noted below. > Signed

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-10-02 Thread Peter Zijlstra
On Fri, 2007-09-28 at 12:16 -0700, Andrew Morton wrote: > (Searches for the lockstat documentation) > > Did we forget to do that? yeah,... /me quickly whips up something Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]> --- Documentation/lockstat.txt | 119 +++

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-10-01 Thread Chuck Ebbert
On 09/29/2007 07:04 AM, Fengguang Wu wrote: > On Thu, Sep 27, 2007 at 11:32:36PM -0700, Chakri n wrote: >> Hi, >> >> In my testing, a unresponsive file system can hang all I/O in the system. >> This is not seen in 2.4. >> >> I started 20 threads doing I/O on a NFS share. They are just doing 4K >> w

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-29 Thread Peter Zijlstra
On Sat, 2007-09-29 at 20:28 +0800, Fengguang Wu wrote: > On Sat, Sep 29, 2007 at 01:48:01PM +0200, Peter Zijlstra wrote: > > On the patch itself, not sure if it would have been enough. As soon as > > there is a single dirty inode on the list one would get caught in the > > same problem as before.

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-29 Thread Fengguang Wu
On Sat, Sep 29, 2007 at 01:48:01PM +0200, Peter Zijlstra wrote: > > On Sat, 2007-09-29 at 19:04 +0800, Fengguang Wu wrote: > > On Thu, Sep 27, 2007 at 11:32:36PM -0700, Chakri n wrote: > > > Hi, > > > > > > In my testing, a unresponsive file system can hang all I/O in the system. > > > This is no

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-29 Thread Peter Zijlstra
On Sat, 2007-09-29 at 19:04 +0800, Fengguang Wu wrote: > On Thu, Sep 27, 2007 at 11:32:36PM -0700, Chakri n wrote: > > Hi, > > > > In my testing, a unresponsive file system can hang all I/O in the system. > > This is not seen in 2.4. > > > > I started 20 threads doing I/O on a NFS share. They ar

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-29 Thread Fengguang Wu
On Thu, Sep 27, 2007 at 11:32:36PM -0700, Chakri n wrote: > Hi, > > In my testing, a unresponsive file system can hang all I/O in the system. > This is not seen in 2.4. > > I started 20 threads doing I/O on a NFS share. They are just doing 4K > writes in a loop. > > Now I stop NFS server hosting

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Daniel Phillips
On Friday 28 September 2007 06:35, Peter Zijlstra wrote: > ,,,it would be grand (and dangerous) if we could provide for a > button that would just kill off all outstanding pages against a dead > device. Substitute "resources" for "pages" and you begin to get an idea of how tricky that actually is

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Daniel Phillips
On Thursday 27 September 2007 23:50, Andrew Morton wrote: > Actually we perhaps could address this at the VFS level in another > way. Processes which are writing to the dead NFS server will > eventually block in balance_dirty_pages() once they've exceeded the > memory limits and will remain blocked

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Chakri n
No change in behavior even on low-memory systems. I confirmed it by running on a 1 GB machine. Thanks --Chakri On 9/28/07, Chakri n <[EMAIL PROTECTED]> wrote: > Here is a snapshot of vmstats when the problem happened. I believe > this could help a little. > > crash> kmem -V >NR_FRE

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Chakri n
Here is a snapshot of vmstats when the problem happened. I believe this could help a little.
crash> kmem -V
NR_FREE_PAGES: 680853
NR_INACTIVE: 95380
NR_ACTIVE: 26891
NR_ANON_PAGES: 2507
NR_FILE_MAPPED: 1832
NR_FILE_PAGES: 119779
NR_FILE_DIR

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Andrew Morton
On Fri, 28 Sep 2007 16:32:18 -0400 Trond Myklebust <[EMAIL PROTECTED]> wrote: > On Fri, 2007-09-28 at 13:10 -0700, Andrew Morton wrote: > > On Fri, 28 Sep 2007 15:52:28 -0400 > > Trond Myklebust <[EMAIL PROTECTED]> wrote: > > > > > On Fri, 2007-09-28 at 12:26 -0700, Andrew Morton wrote: > > > > O

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Trond Myklebust
On Fri, 2007-09-28 at 13:10 -0700, Andrew Morton wrote: > On Fri, 28 Sep 2007 15:52:28 -0400 > Trond Myklebust <[EMAIL PROTECTED]> wrote: > > > On Fri, 2007-09-28 at 12:26 -0700, Andrew Morton wrote: > > > On Fri, 28 Sep 2007 15:16:11 -0400 Trond Myklebust <[EMAIL PROTECTED]> > > > wrote: > > > >

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Daniel Phillips
On Friday 28 September 2007 12:52, Trond Myklebust wrote: > I'm not sure that the hang that is illustrated here is so special. It > is an example of a bog-standard ext3 write, that ends up calling the > NFS client, which is hanging. The fact that it happens to be hanging > on the nfsd process is mo

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Andrew Morton
On Fri, 28 Sep 2007 15:52:28 -0400 Trond Myklebust <[EMAIL PROTECTED]> wrote: > On Fri, 2007-09-28 at 12:26 -0700, Andrew Morton wrote: > > On Fri, 28 Sep 2007 15:16:11 -0400 Trond Myklebust <[EMAIL PROTECTED]> > > wrote: > > > Looking back, they were getting caught up in > > > balance_dirty_page

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Trond Myklebust
On Fri, 2007-09-28 at 12:26 -0700, Andrew Morton wrote: > On Fri, 28 Sep 2007 15:16:11 -0400 Trond Myklebust <[EMAIL PROTECTED]> wrote: > > Looking back, they were getting caught up in > > balance_dirty_pages_ratelimited() and friends. See the attached > > example... > > that one is nfs-on-loopbac

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Andrew Morton
On Fri, 28 Sep 2007 15:16:11 -0400 Trond Myklebust <[EMAIL PROTECTED]> wrote: > On Fri, 2007-09-28 at 11:49 -0700, Andrew Morton wrote: > > On Fri, 28 Sep 2007 13:00:53 -0400 Trond Myklebust <[EMAIL PROTECTED]> > > wrote: > > > Do these patches also cause the memory reclaimers to steer clear of >

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Andrew Morton
On Fri, 28 Sep 2007 20:48:59 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote: > > On Fri, 2007-09-28 at 11:49 -0700, Andrew Morton wrote: > > > Do you know where the stalls are occurring? throttle_vm_writeout(), or via > > direct calls to congestion_wait() from page_alloc.c and vmscan.c? (runni

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Trond Myklebust
On Fri, 2007-09-28 at 11:49 -0700, Andrew Morton wrote: > On Fri, 28 Sep 2007 13:00:53 -0400 Trond Myklebust <[EMAIL PROTECTED]> wrote: > > Do these patches also cause the memory reclaimers to steer clear of > > devices that are congested (and stop waiting on a congested device if > > they see that

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Peter Zijlstra
On Fri, 2007-09-28 at 11:49 -0700, Andrew Morton wrote: > Do you know where the stalls are occurring? throttle_vm_writeout(), or via > direct calls to congestion_wait() from page_alloc.c and vmscan.c? (running > sysrq-w five or ten times will probably be enough to determine this) would it make

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Andrew Morton
On Fri, 28 Sep 2007 13:00:53 -0400 Trond Myklebust <[EMAIL PROTECTED]> wrote: > On Thu, 2007-09-27 at 23:50 -0700, Andrew Morton wrote: > > > Actually we perhaps could address this at the VFS level in another way. > > Processes which are writing to the dead NFS server will eventually block in >

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Andrew Morton
On Fri, 28 Sep 2007 07:28:52 -0600 [EMAIL PROTECTED] (Jonathan Corbet) wrote: > Andrew wrote: > > It's unrelated to the actual value of dirty_thresh: if the machine fills up > > with dirty (or unstable) NFS pages then eventually new writers will block > > until that condition clears. > > > > 2.4

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Trond Myklebust
On Thu, 2007-09-27 at 23:50 -0700, Andrew Morton wrote: > Actually we perhaps could address this at the VFS level in another way. > Processes which are writing to the dead NFS server will eventually block in > balance_dirty_pages() once they've exceeded the memory limits and will > remain blocked

Re: [linux-pm] Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Alan Stern
On Fri, 28 Sep 2007, Peter Zijlstra wrote: > On Fri, 2007-09-28 at 07:28 -0600, Jonathan Corbet wrote: > > Is it really NFS-related? I was trying to back up my 2.6.23-rc8 system > > to an external USB drive the other day when something flaked and the > > drive fell off the bus. That, too, was s

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Peter Zijlstra
On Fri, 2007-09-28 at 07:28 -0600, Jonathan Corbet wrote: > Andrew wrote: > > It's unrelated to the actual value of dirty_thresh: if the machine fills up > > with dirty (or unstable) NFS pages then eventually new writers will block > > until that condition clears. > > > > 2.4 doesn't have this pro

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Jonathan Corbet
Andrew wrote: > It's unrelated to the actual value of dirty_thresh: if the machine fills up > with dirty (or unstable) NFS pages then eventually new writers will block > until that condition clears. > > 2.4 doesn't have this problem at low levels of dirty data because 2.4 > VFS/MM doesn't account

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Chakri n
It works on .23-rc8-mm2 without any problems. The "dd" process does not hang any more. Thanks for all the help. Cheers --Chakri On 9/28/07, Peter Zijlstra <[EMAIL PROTECTED]> wrote: > [ and one copy for the list too ] > > On Fri, 2007-09-28 at 02:20 -0700, Chakri n wrote: > > It's 2.6.23-rc6.

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Peter Zijlstra
[ and one copy for the list too ] On Fri, 2007-09-28 at 02:20 -0700, Chakri n wrote: > It's 2.6.23-rc6. Could you try .23-rc8-mm2. It includes the per bdi stuff.

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Chakri n
It's 2.6.23-rc6. Thanks --Chakri On 9/28/07, Peter Zijlstra <[EMAIL PROTECTED]> wrote: > On Fri, 2007-09-28 at 02:01 -0700, Chakri n wrote: > > Thanks for explaining the adaptive logic. > > > > > However other devices will at that moment try to maintain a limit of 0, > > > which ends up being sim

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Peter Zijlstra
On Fri, 2007-09-28 at 02:01 -0700, Chakri n wrote: > Thanks for explaining the adaptive logic. > > > However other devices will at that moment try to maintain a limit of 0, > > which ends up being similar to a sync mount. > > > > So they'll not get stuck, but they will be slow. > > > > > > Sync s
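
The "limit of 0" remark quoted above can be pictured with a toy model. This is a sketch under stated assumptions, not the kernel code from the per-bdi patches: it assumes each backing device gets a proportional share of the global dirty threshold, clipped to whatever headroom is left under that threshold. The function name `bdi_dirty_limit` and all of its parameters are hypothetical.

```python
def bdi_dirty_limit(global_thresh, total_dirty, bdi_dirty, bdi_share):
    """Toy per-device dirty limit, in pages (illustrative model only).

    global_thresh: system-wide dirty page threshold
    total_dirty:   dirty pages currently held by all devices together
    bdi_dirty:     dirty pages currently held by this device
    bdi_share:     assumed fraction of the threshold given to this device
    """
    # Share of the global threshold this device would get in isolation.
    proportional = int(global_thresh * bdi_share)
    # Pages still free under the global threshold, plus what this
    # device already holds itself.
    headroom = max(global_thresh - total_dirty, 0) + bdi_dirty
    return min(proportional, headroom)
```

In this model, once a dead device holds the whole global threshold, `bdi_dirty_limit(1000, 1000, 0, 0.5)` gives another device a limit of 0: its writers are throttled on every write, much like a sync mount, but they are not stuck, because their own device still completes writeback.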

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Chakri n
Thanks for explaining the adaptive logic. > However other devices will at that moment try to maintain a limit of 0, > which ends up being similar to a sync mount. > > So they'll not get stuck, but they will be slow. > > Sync should be ok, when the situation is bad like this and some one hijacked

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Peter Zijlstra
[ please don't top-post! ] On Fri, 2007-09-28 at 01:27 -0700, Chakri n wrote: > On 9/27/07, Peter Zijlstra <[EMAIL PROTECTED]> wrote: > > On Thu, 2007-09-27 at 23:50 -0700, Andrew Morton wrote: > > > > > What we _don't_ want to happen is for other processes which are writing to > > > other, non-d

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Chakri n
Thanks. The BDI dirty limits sound like a good idea. Is there already a patch for this, which I could try? I believe it works like this: each BDI will have a limit. If the dirty_thresh exceeds the limit, all the I/O on the block device will be synchronous. So, if I have sda & a NFS mount, th

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-28 Thread Peter Zijlstra
On Thu, 2007-09-27 at 23:50 -0700, Andrew Morton wrote: > What we _don't_ want to happen is for other processes which are writing to > other, non-dead devices to get collaterally blocked. We have patches which > might fix that queued for 2.6.24. Peter? Nasty problem, don't do that :-) But yeah

Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-27 Thread Andrew Morton
On Thu, 27 Sep 2007 23:32:36 -0700 "Chakri n" <[EMAIL PROTECTED]> wrote: > Hi, > > In my testing, a unresponsive file system can hang all I/O in the system. > This is not seen in 2.4. > > I started 20 threads doing I/O on a NFS share. They are just doing 4K > writes in a loop. > > Now I stop NF

A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)

2007-09-27 Thread Chakri n
Hi, In my testing, a unresponsive file system can hang all I/O in the system. This is not seen in 2.4. I started 20 threads doing I/O on a NFS share. They are just doing 4K writes in a loop. Now I stop NFS server hosting the NFS share and start a "dd" process to write a file on local EXT3 file s
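
The write load described in this report (20 threads, each doing 4K writes in a loop) could be approximated with a sketch like the following. It is not the reporter's actual test program, which is not shown in the thread. The function names, the file naming scheme, and the bounded `n_writes` loop (the report describes an open-ended loop) are all assumptions made for illustration.

```python
import os
import tempfile
import threading

BLOCK_SIZE = 4096  # 4K writes, as described in the report


def writer(path, n_writes, block_size=BLOCK_SIZE):
    """Write n_writes blocks of block_size bytes to a new file at path."""
    block = b"\0" * block_size
    with open(path, "wb") as f:
        for _ in range(n_writes):
            f.write(block)


def run_writers(target_dir, n_threads=20, n_writes=1000):
    """Run n_threads concurrent writers under target_dir and wait for them.

    target_dir would be a directory on the NFS share in the reported
    scenario; any writable directory works for a dry run.
    """
    paths = [os.path.join(target_dir, f"writer-{i}.dat")
             for i in range(n_threads)]
    threads = [threading.Thread(target=writer, args=(p, n_writes))
               for p in paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return paths
```

A dry run against a temporary directory, for example `run_writers(tempfile.mkdtemp(), n_threads=4, n_writes=16)`, exercises the same write pattern without needing an NFS mount.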

Linux 2.6.23-rc6-git3-krf1

2007-09-13 Thread Michal Piotrowski
Hi, There are a few regression fixes in -krf tree
http://www.stardust.webpages.pl/files/patches/krf/2.6.23-rc6-git3/2.6.23-rc6-git3-krf1.patch.bz2
http://www.stardust.webpages.pl/files/patches/krf/2.6.23-rc6-git3/2.6.23-rc6-git3-krf1.tar.bz2
Vitaly Bordug: oops-while-modprobing-phy-fixed

Linux 2.6.23-rc6

2007-09-10 Thread Linus Torvalds
Kenji Kaneshige (2):
      [IA64] Fix unexpected interrupt vector handling
      [IA64] Clear pending interrupts at CPU boot up time
Kyungmin Park (1):
      [MIPS] i8259: Add disable method.
Laurent Riffard (1):
      Fix broken pata_via cable detection
Linus Torvalds (1):
      Linux 2.6.23-rc6