On Tue, 02 Oct 2007 15:36:01 +0200 Peter Zijlstra wrote:
> On Fri, 2007-09-28 at 12:16 -0700, Andrew Morton wrote:
>
> > (Searches for the lockstat documentation)
> >
> > Did we forget to do that?
>
> yeah,...
>
> /me quickly whips up something
Thanks. Just some typos noted below.
Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]>
---
Documentation/lockstat.txt | 119
On 09/29/2007 07:04 AM, Fengguang Wu wrote:
> On Thu, Sep 27, 2007 at 11:32:36PM -0700, Chakri n wrote:
>> Hi,
>>
>> In my testing, an unresponsive file system can hang all I/O in the system.
>> This is not seen in 2.4.
>>
>> I started 20 threads doing I/O on a NFS share. They are just doing 4K
>>
On Sat, 2007-09-29 at 20:28 +0800, Fengguang Wu wrote:
> On Sat, Sep 29, 2007 at 01:48:01PM +0200, Peter Zijlstra wrote:
> > On the patch itself, not sure if it would have been enough. As soon as
> > there is a single dirty inode on the list one would get caught in the
> > same problem as before.
On Sat, Sep 29, 2007 at 01:48:01PM +0200, Peter Zijlstra wrote:
>
> On Sat, 2007-09-29 at 19:04 +0800, Fengguang Wu wrote:
> > On Thu, Sep 27, 2007 at 11:32:36PM -0700, Chakri n wrote:
> > > Hi,
> > >
> > > In my testing, an unresponsive file system can hang all I/O in the system.
> > > This is
On Sat, 2007-09-29 at 19:04 +0800, Fengguang Wu wrote:
> On Thu, Sep 27, 2007 at 11:32:36PM -0700, Chakri n wrote:
> > Hi,
> >
> > In my testing, an unresponsive file system can hang all I/O in the system.
> > This is not seen in 2.4.
> >
> > I started 20 threads doing I/O on a NFS share. They
On Thu, Sep 27, 2007 at 11:32:36PM -0700, Chakri n wrote:
> Hi,
>
> In my testing, an unresponsive file system can hang all I/O in the system.
> This is not seen in 2.4.
>
> I started 20 threads doing I/O on a NFS share. They are just doing 4K
> writes in a loop.
>
> Now I stop NFS server
On Friday 28 September 2007 06:35, Peter Zijlstra wrote:
> ...it would be grand (and dangerous) if we could provide for a
> button that would just kill off all outstanding pages against a dead
> device.
Substitute "resources" for "pages" and you begin to get an idea of how
tricky that actually is.
On Thursday 27 September 2007 23:50, Andrew Morton wrote:
> Actually we perhaps could address this at the VFS level in another
> way. Processes which are writing to the dead NFS server will
> eventually block in balance_dirty_pages() once they've exceeded the
> memory limits and will remain blocked
No change in behavior even in case of low memory systems. I confirmed
it running on 1Gig machine.
Thanks
--Chakri
On 9/28/07, Chakri n <[EMAIL PROTECTED]> wrote:
> Here is the snapshot of vmstats when the problem happened. I believe
> this could help a little.
>
> crash> kmem -V
>
Here is the snapshot of vmstats when the problem happened. I believe
this could help a little.
crash> kmem -V
NR_FREE_PAGES: 680853
NR_INACTIVE: 95380
NR_ACTIVE: 26891
NR_ANON_PAGES: 2507
NR_FILE_MAPPED: 1832
NR_FILE_PAGES: 119779
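[For scale, the counters above can be converted assuming 4 KiB pages — an assumption, since the mail does not state the architecture. The point of the snapshot survives the arithmetic: roughly 2.6 GiB of pages are free, so the dd hang is not simple memory exhaustion.]

```python
# Convert the counters quoted above to MiB, assuming 4 KiB pages.
# The page size is an assumption; the original mail does not state it.
PAGE_KIB = 4

counters = {
    "NR_FREE_PAGES": 680853,
    "NR_INACTIVE": 95380,
    "NR_ACTIVE": 26891,
    "NR_ANON_PAGES": 2507,
    "NR_FILE_MAPPED": 1832,
    "NR_FILE_PAGES": 119779,
}

def pages_to_mib(pages: int) -> float:
    """Return the size in MiB of `pages` pages of PAGE_KIB KiB each."""
    return pages * PAGE_KIB / 1024

for name, pages in counters.items():
    print(f"{name:>16}: {pages_to_mib(pages):8.1f} MiB")
```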
On Fri, 28 Sep 2007 16:32:18 -0400
Trond Myklebust <[EMAIL PROTECTED]> wrote:
> On Fri, 2007-09-28 at 13:10 -0700, Andrew Morton wrote:
> > On Fri, 28 Sep 2007 15:52:28 -0400
> > Trond Myklebust <[EMAIL PROTECTED]> wrote:
> >
> > > On Fri, 2007-09-28 at 12:26 -0700, Andrew Morton wrote:
> > > >
On Fri, 2007-09-28 at 13:10 -0700, Andrew Morton wrote:
> On Fri, 28 Sep 2007 15:52:28 -0400
> Trond Myklebust <[EMAIL PROTECTED]> wrote:
>
> > On Fri, 2007-09-28 at 12:26 -0700, Andrew Morton wrote:
> > > On Fri, 28 Sep 2007 15:16:11 -0400 Trond Myklebust <[EMAIL PROTECTED]>
> > > wrote:
> > >
On Friday 28 September 2007 12:52, Trond Myklebust wrote:
> I'm not sure that the hang that is illustrated here is so special. It
> is an example of a bog-standard ext3 write, that ends up calling the
> NFS client, which is hanging. The fact that it happens to be hanging
> on the nfsd process is
On Fri, 28 Sep 2007 15:52:28 -0400
Trond Myklebust <[EMAIL PROTECTED]> wrote:
> On Fri, 2007-09-28 at 12:26 -0700, Andrew Morton wrote:
> > On Fri, 28 Sep 2007 15:16:11 -0400 Trond Myklebust <[EMAIL PROTECTED]>
> > wrote:
> > > Looking back, they were getting caught up in
> > >
On Fri, 2007-09-28 at 12:26 -0700, Andrew Morton wrote:
> On Fri, 28 Sep 2007 15:16:11 -0400 Trond Myklebust <[EMAIL PROTECTED]> wrote:
> > Looking back, they were getting caught up in
> > balance_dirty_pages_ratelimited() and friends. See the attached
> > example...
>
> that one is nfs-on-loopback, which
On Fri, 28 Sep 2007 15:16:11 -0400 Trond Myklebust <[EMAIL PROTECTED]> wrote:
> On Fri, 2007-09-28 at 11:49 -0700, Andrew Morton wrote:
> > On Fri, 28 Sep 2007 13:00:53 -0400 Trond Myklebust <[EMAIL PROTECTED]>
> > wrote:
> > > Do these patches also cause the memory reclaimers to steer clear of
On Fri, 28 Sep 2007 20:48:59 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote:
>
> On Fri, 2007-09-28 at 11:49 -0700, Andrew Morton wrote:
>
> > Do you know where the stalls are occurring? throttle_vm_writeout(), or via
> > direct calls to congestion_wait() from page_alloc.c and vmscan.c?
On Fri, 2007-09-28 at 11:49 -0700, Andrew Morton wrote:
> On Fri, 28 Sep 2007 13:00:53 -0400 Trond Myklebust <[EMAIL PROTECTED]> wrote:
> > Do these patches also cause the memory reclaimers to steer clear of
> > devices that are congested (and stop waiting on a congested device if
> > they see
On Fri, 2007-09-28 at 11:49 -0700, Andrew Morton wrote:
> Do you know where the stalls are occurring? throttle_vm_writeout(), or via
> direct calls to congestion_wait() from page_alloc.c and vmscan.c? (running
> sysrq-w five or ten times will probably be enough to determine this)
would it
On Fri, 28 Sep 2007 13:00:53 -0400 Trond Myklebust <[EMAIL PROTECTED]> wrote:
> On Thu, 2007-09-27 at 23:50 -0700, Andrew Morton wrote:
>
> > Actually we perhaps could address this at the VFS level in another way.
> > Processes which are writing to the dead NFS server will eventually block in
>
On Fri, 28 Sep 2007 07:28:52 -0600 [EMAIL PROTECTED] (Jonathan Corbet) wrote:
> Andrew wrote:
> > It's unrelated to the actual value of dirty_thresh: if the machine fills up
> > with dirty (or unstable) NFS pages then eventually new writers will block
> > until that condition clears.
> >
> > 2.4
On Thu, 2007-09-27 at 23:50 -0700, Andrew Morton wrote:
> Actually we perhaps could address this at the VFS level in another way.
> Processes which are writing to the dead NFS server will eventually block in
> balance_dirty_pages() once they've exceeded the memory limits and will
> remain blocked
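[The failure mode Andrew describes — writers eventually blocking in balance_dirty_pages() once a single global dirty limit is hit, even when writing to a healthy device — can be modelled with a toy sketch. All names and numbers below are invented for illustration; this is not kernel code.]

```python
# Toy model of the behaviour discussed above: one *global* dirty
# threshold, so a writer to a healthy device still blocks when a dead
# device's dirty pages pin the system at the limit.

DIRTY_THRESH = 1000  # global limit on dirty pages (illustrative)

def balance_dirty_pages(dirty_per_device, device, writeback_works):
    """Loop (standing in for blocking) until the global dirty count
    drops below the limit; raise if no writeback can make progress.

    dirty_per_device: dict mapping device -> dirty page count
    writeback_works:  dict mapping device -> can its pages be cleaned?
    """
    steps = 0
    while sum(dirty_per_device.values()) >= DIRTY_THRESH:
        # Try to clean some pages; a dead NFS server cleans nothing.
        progress = False
        for dev, works in writeback_works.items():
            if works and dirty_per_device[dev] > 0:
                dirty_per_device[dev] -= min(16, dirty_per_device[dev])
                progress = True
        steps += 1
        if not progress:
            raise RuntimeError(f"writer to {device} stuck: global dirty "
                               "limit held by a dead device")
    return steps
```

With all dirty pages belonging to a dead NFS mount, a writer to a perfectly healthy local disk never gets below the global threshold — which is the collateral blocking the thread is about.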
On Fri, 28 Sep 2007, Peter Zijlstra wrote:
> On Fri, 2007-09-28 at 07:28 -0600, Jonathan Corbet wrote:
> > Is it really NFS-related? I was trying to back up my 2.6.23-rc8 system
> > to an external USB drive the other day when something flaked and the
> > drive fell off the bus. That, too, was
On Fri, 2007-09-28 at 07:28 -0600, Jonathan Corbet wrote:
> Andrew wrote:
> > It's unrelated to the actual value of dirty_thresh: if the machine fills up
> > with dirty (or unstable) NFS pages then eventually new writers will block
> > until that condition clears.
> >
> > 2.4 doesn't have this
Andrew wrote:
> It's unrelated to the actual value of dirty_thresh: if the machine fills up
> with dirty (or unstable) NFS pages then eventually new writers will block
> until that condition clears.
>
> 2.4 doesn't have this problem at low levels of dirty data because 2.4
> VFS/MM doesn't account
It works on .23-rc8-mm2 without any problems.
"dd" process does not hang any more.
Thanks for all the help.
Cheers
--Chakri
On 9/28/07, Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> [ and one copy for the list too ]
>
> On Fri, 2007-09-28 at 02:20 -0700, Chakri n wrote:
> > It's 2.6.23-rc6.
[ and one copy for the list too ]
On Fri, 2007-09-28 at 02:20 -0700, Chakri n wrote:
> It's 2.6.23-rc6.
Could you try .23-rc8-mm2. It includes the per bdi stuff.
It's 2.6.23-rc6.
Thanks
--Chakri
On 9/28/07, Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> On Fri, 2007-09-28 at 02:01 -0700, Chakri n wrote:
> > Thanks for explaining the adaptive logic.
> >
> > > However other devices will at that moment try to maintain a limit of 0,
> > > which ends up being
On Fri, 2007-09-28 at 02:01 -0700, Chakri n wrote:
> Thanks for explaining the adaptive logic.
>
> > However other devices will at that moment try to maintain a limit of 0,
> > which ends up being similar to a sync mount.
> >
> > So they'll not get stuck, but they will be slow.
> >
> >
>
> Sync
Thanks for explaining the adaptive logic.
> However other devices will at that moment try to maintain a limit of 0,
> which ends up being similar to a sync mount.
>
> So they'll not get stuck, but they will be slow.
>
>
Sync should be ok, when the situation is bad like this and someone
hijacked all
[ please don't top-post! ]
On Fri, 2007-09-28 at 01:27 -0700, Chakri n wrote:
> On 9/27/07, Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> > On Thu, 2007-09-27 at 23:50 -0700, Andrew Morton wrote:
> >
> > > What we _don't_ want to happen is for other processes which are writing to
> > > other,
Thanks.
The BDI dirty limits sound like a good idea.
Is there already a patch for this, which I could try?
I believe it works like this,
Each BDI will have a limit. If the dirty_thresh exceeds the limit,
all the I/O on the block device will be synchronous.
so, if I have sda & a NFS mount,
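[Chakri's description is close to the idea behind the per-BDI patches: split the global threshold among backing devices in proportion to their recent writeout, so a dead device's share decays toward 0 and only it is throttled. A toy sketch of that apportioning follows — invented numbers, and a deliberate simplification of the actual kernel algorithm.]

```python
# Sketch of the per-BDI idea discussed above: each backing device gets a
# share of the global dirty threshold proportional to its recent
# writeout. A dead device writes back nothing, so its share tends to 0
# ("similar to a sync mount"), while healthy devices keep theirs.

DIRTY_THRESH = 1000  # global limit on dirty pages (illustrative)

def bdi_dirty_limits(recent_writeout):
    """recent_writeout: dict mapping device -> pages written back
    recently. Returns per-device dirty limits summing to at most
    DIRTY_THRESH."""
    total = sum(recent_writeout.values())
    if total == 0:
        # No writeback anywhere yet: fall back to equal shares.
        share = DIRTY_THRESH // len(recent_writeout)
        return {dev: share for dev in recent_writeout}
    return {dev: DIRTY_THRESH * done // total
            for dev, done in recent_writeout.items()}
```

So with sda doing all the writeback and the dead NFS mount doing none, sda keeps the whole threshold and the NFS mount's limit is 0 — slow for the NFS writers, but no longer blocking everyone else.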
On Thu, 2007-09-27 at 23:50 -0700, Andrew Morton wrote:
> What we _don't_ want to happen is for other processes which are writing to
> other, non-dead devices to get collaterally blocked. We have patches which
> might fix that queued for 2.6.24. Peter?
Nasty problem, don't do that :-)
But yeah,
On Thu, 27 Sep 2007 23:32:36 -0700 "Chakri n" <[EMAIL PROTECTED]> wrote:
> Hi,
>
> In my testing, an unresponsive file system can hang all I/O in the system.
> This is not seen in 2.4.
>
> I started 20 threads doing I/O on a NFS share. They are just doing 4K
> writes in a loop.
>
> Now I stop NFS server hosting the NFS share and start a
> "dd" process to write a file on the local EXT3 filesystem.
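[The reported test boils down to a handful of commands. The sketch below only builds the command strings rather than running them — the mount points, file names, and sizes are hypothetical stand-ins, not taken from the original mail.]

```python
# Sketch of the reproduction described above: 20 writers looping 4K
# writes on an NFS share, then, after the NFS server is stopped out of
# band, a single dd to a local ext3 file. Paths and counts are
# illustrative assumptions.

def build_repro_commands(nfs_mnt="/mnt/nfs",
                         local_file="/mnt/ext3/bigfile",
                         n_writers=20):
    """Return (nfs_writer_cmds, local_dd_cmd) as shell command strings."""
    writers = [
        f"dd if=/dev/zero of={nfs_mnt}/writer-{i} bs=4k count=1000000"
        for i in range(n_writers)
    ]
    # Run after stopping the NFS server; per the report, on 2.6.23-rc6
    # this local write hangs, while 2.4 (and later .23-rc8-mm2) do not.
    local_write = f"dd if=/dev/zero of={local_file} bs=1M count=1024"
    return writers, local_write
```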
Hi,
There are a few regression fixes in -krf tree
http://www.stardust.webpages.pl/files/patches/krf/2.6.23-rc6-git3/2.6.23-rc6-git3-krf1.patch.bz2
http://www.stardust.webpages.pl/files/patches/krf/2.6.23-rc6-git3/2.6.23-rc6-git3-krf1.tar.bz2
Vitaly Bordug (2):
[IA64] Fix unexpected interrupt vector handling
[IA64] Clear pending interrupts at CPU boot up time
Kyungmin Park (1):
[MIPS] i8259: Add disable method.
Laurent Riffard (1):
Fix broken pata_via cable detection
Linus Torvalds (1):
Linux 2.6.23-rc6
Masato Noguchi