On Fri, Feb 02, 2007 at 07:42:52PM +1100, Nick Piggin ([EMAIL PROTECTED]) wrote:
> Anyway, I had a look at your bugzilla test-case and managed to slim it
> down to something that easily shows what the problem is (available on
> request) -- the problem is that recipient of the sendfile is seeing
>
Mark Groves wrote:
Hi,
I have been seeing a problem when using sendfile repeatedly on an
SMP server, which I believe is related to the problem that was
discovered recently with marking dirty pages. The bug, as well as a test
script, is listed at
On Fri, Feb 02, 2007 at 07:42:52PM +1100, Nick Piggin ([EMAIL PROTECTED]) wrote:
Anyway, I had a look at your bugzilla test-case and managed to slim it
down to something that easily shows what the problem is (available on
request) -- the problem is that recipient of the sendfile is seeing
Hi,
I have been seeing a problem when using sendfile repeatedly on an
SMP server, which I believe is related to the problem that was
discovered recently with marking dirty pages. The bug, as well as a test
script, is listed at http://bugzilla.kernel.org/show_bug.cgi?id=7650.
Currently,
On Sun, 7 Jan 2007 12:36:18 +1030
"Tom Lanyon" <[EMAIL PROTECTED]> wrote:
> On 12/27/06, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> > What would also actually be interesting is whether somebody can reproduce
> > this on Reiserfs, for example. I _think_ all the reports I've seen are on
> > ext2
On 1/7/07, Tom Lanyon <[EMAIL PROTECTED]> wrote:
I've been following this thread for a while now as I started
experiencing file corruption in rtorrent when I upgraded to 2.6.19. I
am using reiserfs.
However, moving to 2.6.20-rc3 does indeed seem to fix the issue thus far...
--
Tom Lanyon
On 12/27/06, Linus Torvalds <[EMAIL PROTECTED]> wrote:
What would also actually be interesting is whether somebody can reproduce
this on Reiserfs, for example. I _think_ all the reports I've seen are on
ext2 or ext3, and if this is somehow writeback-related, it could be some
bug that is just
On 12/27/06, Linus Torvalds [EMAIL PROTECTED] wrote:
What would also actually be interesting is whether somebody can reproduce
this on Reiserfs, for example. I _think_ all the reports I've seen are on
ext2 or ext3, and if this is somehow writeback-related, it could be some
bug that is just
On 1/7/07, Tom Lanyon [EMAIL PROTECTED] wrote:
I've been following this thread for a while now as I started
experiencing file corruption in rtorrent when I upgraded to 2.6.19. I
am using reiserfs.
However, moving to 2.6.20-rc3 does indeed seem to fix the issue thus far...
--
Tom Lanyon
On Sun, 7 Jan 2007 12:36:18 +1030
Tom Lanyon [EMAIL PROTECTED] wrote:
On 12/27/06, Linus Torvalds [EMAIL PROTECTED] wrote:
What would also actually be interesting is whether somebody can reproduce
this on Reiserfs, for example. I _think_ all the reports I've seen are on
ext2 or ext3, and
On Fri, 29 Dec 2006 16:58:41 -0800 (PST)
Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>
> On Fri, 29 Dec 2006, Andrew Morton wrote:
> >
> > > > Somewhat nastily, but as ext3 directories are metadata it is appropriate
> > > > that modifications to them be done in terms of buffer_heads (ie:
> > >
On Fri, 29 Dec 2006, Andrew Morton wrote:
>
> > > Somewhat nastily, but as ext3 directories are metadata it is appropriate
> > > that modifications to them be done in terms of buffer_heads (ie: blocks).
> >
> > No. There is nothing "appropriate" about using buffer_heads for metadata.
>
> I
On Fri, 29 Dec 2006, Andrew Morton wrote:
>
> Adam Richter spent considerable time a few years ago trying to make the
> mpage code go direct-to-BIO in all cases and we eventually gave up. The
> conceptual layering of page<->blocks<->bio is pretty clean, and it is hard
> and ugly to fully
On Fri, 29 Dec 2006 16:11:44 -0800 (PST)
Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>
> > JBD implements physical block-based journalling, so it is 100% appropriate
> > that JBD deal with these disk blocks using their buffer_head
> > representation.
>
> And as long as it does that, you just
On Fri, 29 Dec 2006, Andrew Morton wrote:
>
> They're extra. As in "can be optimised away".
Sure. Don't use buffer heads.
> The buffer_head is not an IO container. It is the kernel's core
> representation of a disk block.
Please come back from the 90's.
The buffer heads are nothing but a
On Fri, 29 Dec 2006 18:32:07 -0500
Theodore Tso <[EMAIL PROTECTED]> wrote:
> On Fri, Dec 29, 2006 at 02:42:51PM -0800, Linus Torvalds wrote:
> > I think ext3 is terminally crap by now. It still uses buffer heads in
> > places where it really really shouldn't, and as a result, things like
> >
On Fri, 29 Dec 2006, Theodore Tso wrote:
>
> If we do get this fixed for ext4, one interesting question is whether
> people would accept a patch to backport the fixes to ext3, given the
> grief this is causing the page I/O and VM routines.
I don't think backporting is the smartest option
On Fri, 29 Dec 2006 14:42:51 -0800 (PST)
Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>
> On Fri, 29 Dec 2006, Andrew Morton wrote:
> >
> > - The above change means that we do extra writeout. If a page is dirtied
> > once, kjournald will write it and then pdflush will come along and
> >
On Fri, Dec 29, 2006 at 02:42:51PM -0800, Linus Torvalds wrote:
> I think ext3 is terminally crap by now. It still uses buffer heads in
> places where it really really shouldn't, and as a result, things like
> directory accesses are simply slower than they should be. Sadly, I don't
> think ext4
On Fri, 29 Dec 2006, Andrew Morton wrote:
>
> - The above change means that we do extra writeout. If a page is dirtied
> once, kjournald will write it and then pdflush will come along and
> needlessly write it again.
There's zero extra writeout for any flushing that flushes BY PAGES.
On Fri, 29 Dec 2006 14:16:32 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:
> - Poor old IO accounting broke again.
No it didn't - we're relying upon the behaviour of __set_page_dirty_buffers()
against an already-dirty page.
-
To unsubscribe from this list: send the line "unsubscribe
On Fri, 29 Dec 2006 02:48:35 -0800 (PST)
Linus Torvalds <[EMAIL PROTECTED]> wrote:
> + if (mapping && mapping_cap_account_dirty(mapping)) {
> + /*
> + * Yes, Virginia, this is indeed insane.
> + *
> + * We use this sequence to make sure that
On Fri, 29 Dec 2006, Theodore Tso wrote:
>
> I'm confused. Does this mean that if "fs blocksize"=="VM pagesize"
> this bug can't trigger?
No. Even if there is just a single buffer-head, if the filesystem ever
writes out that _single_ buffer-head out of turn (ie before the VM
actually asks
On Fri, 29 Dec 2006, Nick Piggin wrote:
>
> > It still has a tiny tiny race (see the comment), but I bet nobody can really
> > hit it in real life anyway, and I know several ways to fix it, so I'm not
> > really _that_ worried about it.
>
> Well the race isn't a data loss one, is it? Just a
* Stephen Clark <[EMAIL PROTECTED]> [2006-12-29 10:17]:
> >It works for me now, both your testcase as well as an installation of
> >Debian on this ARM device. I manually applied the patch to 2.6.19.
>
> Can you post a diff against 2.6.19?
--- a/mm/page-writeback.c 2006-11-29
On Fri, Dec 29, 2006 at 12:58:12AM -0800, Linus Torvalds wrote:
> Because what "__set_page_dirty_buffers()" does is that AT THE TIME THE
> "set_page_dirty()" IS CALLED, it will mark all the buffers on that page as
> dirty. That may _sound_ like what we want, but it really isn't. Because by
>
Martin Michlmayr wrote:
* Linus Torvalds <[EMAIL PROTECTED]> [2006-12-29 02:48]:
Can anybody get corruption with this thing applied? It goes on top
of plain v2.6.20-rc2.
It works for me now, both your testcase as well as an installation of
Debian on this ARM device. I manually
* Linus Torvalds <[EMAIL PROTECTED]> [2006-12-29 02:48]:
> Can anybody get corruption with this thing applied? It goes on top
> of plain v2.6.20-rc2.
It works for me now, both your testcase as well as an installation of
Debian on this ARM device. I manually applied the patch to 2.6.19.
Thanks.
Linus Torvalds wrote:
[...]
The patch is mostly a comment. The "real" meat of it is actually just a
few lines.
Can anybody get corruption with this thing applied? It goes on top of
plain v2.6.20-rc2.
No corruption with the testcase here. Will check with rtorrent too later
today but I
* Linus Torvalds <[EMAIL PROTECTED]> wrote:
> > Hmm? I'd love it if somebody else wrote the patch and tested it,
> > because I'm getting sick and tired of this bug ;)
>
> Who the hell am I kidding? I haven't been able to sleep right for the
> last few days over this bug. It was really getting
Hey nice work Linus!
Linus Torvalds wrote:
On Fri, 29 Dec 2006, Linus Torvalds wrote:
Hmm? I'd love it if somebody else wrote the patch and tested it, because
I'm getting sick and tired of this bug ;)
Who the hell am I kidding? I haven't been able to sleep right for the last
few days
On Fri, 2006-12-29 at 02:48 -0800, Linus Torvalds wrote:
>
> On Fri, 29 Dec 2006, Linus Torvalds wrote:
> >
> > Hmm? I'd love it if somebody else wrote the patch and tested it, because
> > I'm getting sick and tired of this bug ;)
>
> Who the hell am I kidding? I haven't been able to sleep
On Fri, 29 Dec 2006, Linus Torvalds wrote:
>
> Hmm? I'd love it if somebody else wrote the patch and tested it, because
> I'm getting sick and tired of this bug ;)
Who the hell am I kidding? I haven't been able to sleep right for the last
few days over this bug. It was really getting to me.
On Thu, 28 Dec 2006, Linus Torvalds wrote:
>
> So everything I have ever seen says that the VM layer is actually doing
> everything right.
That was true, but at the same time, it's not. Let me explain.
> That to me says: "somebody didn't actually write it out". The VM layer
> asked the
On Thu, 28 Dec 2006, Linus Torvalds wrote:
So everything I have ever seen says that the VM layer is actually doing
everything right.
That was true, but at the same time, it's not. Let me explain.
That to me says: "somebody didn't actually write it out". The VM layer
asked the filesystem
On Fri, 29 Dec 2006, Linus Torvalds wrote:
Hmm? I'd love it if somebody else wrote the patch and tested it, because
I'm getting sick and tired of this bug ;)
Who the hell am I kidding? I haven't been able to sleep right for the last
few days over this bug. It was really getting to me.
On Fri, 2006-12-29 at 02:48 -0800, Linus Torvalds wrote:
On Fri, 29 Dec 2006, Linus Torvalds wrote:
Hmm? I'd love it if somebody else wrote the patch and tested it, because
I'm getting sick and tired of this bug ;)
Who the hell am I kidding? I haven't been able to sleep right for the
Hey nice work Linus!
Linus Torvalds wrote:
On Fri, 29 Dec 2006, Linus Torvalds wrote:
Hmm? I'd love it if somebody else wrote the patch and tested it, because
I'm getting sick and tired of this bug ;)
Who the hell am I kidding? I haven't been able to sleep right for the last
few days
* Linus Torvalds [EMAIL PROTECTED] wrote:
Hmm? I'd love it if somebody else wrote the patch and tested it,
because I'm getting sick and tired of this bug ;)
Who the hell am I kidding? I haven't been able to sleep right for the
last few days over this bug. It was really getting to me.
Linus Torvalds wrote:
[...]
The patch is mostly a comment. The "real" meat of it is actually just a
few lines.
Can anybody get corruption with this thing applied? It goes on top of
plain v2.6.20-rc2.
No corruption with the testcase here. Will check with rtorrent too later
today but I
* Linus Torvalds [EMAIL PROTECTED] [2006-12-29 02:48]:
Can anybody get corruption with this thing applied? It goes on top
of plain v2.6.20-rc2.
It works for me now, both your testcase as well as an installation of
Debian on this ARM device. I manually applied the patch to 2.6.19.
Thanks.
--
Martin Michlmayr wrote:
* Linus Torvalds [EMAIL PROTECTED] [2006-12-29 02:48]:
Can anybody get corruption with this thing applied? It goes on top
of plain v2.6.20-rc2.
It works for me now, both your testcase as well as an installation of
Debian on this ARM device. I manually applied
On Fri, Dec 29, 2006 at 12:58:12AM -0800, Linus Torvalds wrote:
Because what __set_page_dirty_buffers() does is that AT THE TIME THE
set_page_dirty() IS CALLED, it will mark all the buffers on that page as
dirty. That may _sound_ like what we want, but it really isn't. Because by
the time
* Stephen Clark [EMAIL PROTECTED] [2006-12-29 10:17]:
It works for me now, both your testcase as well as an installation of
Debian on this ARM device. I manually applied the patch to 2.6.19.
Can you post a diff against 2.6.19?
--- a/mm/page-writeback.c 2006-11-29 21:57:37.0
On Fri, 29 Dec 2006, Nick Piggin wrote:
It still has a tiny tiny race (see the comment), but I bet nobody can really
hit it in real life anyway, and I know several ways to fix it, so I'm not
really _that_ worried about it.
Well the race isn't a data loss one, is it? Just a case where
On Fri, 29 Dec 2006, Theodore Tso wrote:
I'm confused. Does this mean that if "fs blocksize"=="VM pagesize"
this bug can't trigger?
No. Even if there is just a single buffer-head, if the filesystem ever
writes out that _single_ buffer-head out of turn (ie before the VM
actually asks it to,
On Fri, 29 Dec 2006 02:48:35 -0800 (PST)
Linus Torvalds [EMAIL PROTECTED] wrote:
+ if (mapping && mapping_cap_account_dirty(mapping)) {
+ /*
+ * Yes, Virginia, this is indeed insane.
+ *
+ * We use this sequence to make sure that
+
On Fri, 29 Dec 2006 14:16:32 -0800
Andrew Morton [EMAIL PROTECTED] wrote:
- Poor old IO accounting broke again.
No it didn't - we're relying upon the behaviour of __set_page_dirty_buffers()
against an already-dirty page.
On Fri, 29 Dec 2006, Andrew Morton wrote:
- The above change means that we do extra writeout. If a page is dirtied
once, kjournald will write it and then pdflush will come along and
needlessly write it again.
There's zero extra writeout for any flushing that flushes BY PAGES.
Only
On Fri, 29 Dec 2006 14:42:51 -0800 (PST)
Linus Torvalds [EMAIL PROTECTED] wrote:
On Fri, 29 Dec 2006, Andrew Morton wrote:
- The above change means that we do extra writeout. If a page is dirtied
once, kjournald will write it and then pdflush will come along and
needlessly
On Fri, 29 Dec 2006, Andrew Morton wrote:
They're extra. As in "can be optimised away".
Sure. Don't use buffer heads.
The buffer_head is not an IO container. It is the kernel's core
representation of a disk block.
Please come back from the 90's.
The buffer heads are nothing but a
On Fri, Dec 29, 2006 at 02:42:51PM -0800, Linus Torvalds wrote:
I think ext3 is terminally crap by now. It still uses buffer heads in
places where it really really shouldn't, and as a result, things like
directory accesses are simply slower than they should be. Sadly, I don't
think ext4 is
On Fri, 29 Dec 2006, Theodore Tso wrote:
If we do get this fixed for ext4, one interesting question is whether
people would accept a patch to backport the fixes to ext3, given the
grief this is causing the page I/O and VM routines.
I don't think backporting is the smartest option
On Fri, 29 Dec 2006 18:32:07 -0500
Theodore Tso [EMAIL PROTECTED] wrote:
On Fri, Dec 29, 2006 at 02:42:51PM -0800, Linus Torvalds wrote:
I think ext3 is terminally crap by now. It still uses buffer heads in
places where it really really shouldn't, and as a result, things like
directory
On Fri, 29 Dec 2006 16:11:44 -0800 (PST)
Linus Torvalds [EMAIL PROTECTED] wrote:
JBD implements physical block-based journalling, so it is 100% appropriate
that JBD deal with these disk blocks using their buffer_head
representation.
And as long as it does that, you just have to face
On Fri, 29 Dec 2006, Andrew Morton wrote:
Somewhat nastily, but as ext3 directories are metadata it is appropriate
that modifications to them be done in terms of buffer_heads (ie: blocks).
No. There is nothing "appropriate" about using buffer_heads for metadata.
I said
On Fri, 29 Dec 2006 16:58:41 -0800 (PST)
Linus Torvalds [EMAIL PROTECTED] wrote:
On Fri, 29 Dec 2006, Andrew Morton wrote:
Somewhat nastily, but as ext3 directories are metadata it is appropriate
that modifications to them be done in terms of buffer_heads (ie:
blocks).
On Fri, 29 Dec 2006, Andrew Morton wrote:
Adam Richter spent considerable time a few years ago trying to make the
mpage code go direct-to-BIO in all cases and we eventually gave up. The
conceptual layering of page<->blocks<->bio is pretty clean, and it is hard
and ugly to fully optimise away
On Fri, 29 Dec 2006, Segher Boessenkool wrote:
>
> > I think what might be happening is that pdflush writes them out fine,
> > however we don't trap writes by the application _during_ that writeout.
>
> Yeah. I believe that more exactly it happens if the very last
> write to the page causes a
I think what might be happening is that pdflush writes them out fine,
however we don't trap writes by the application _during_ that writeout.
Yeah. I believe that more exactly it happens if the very last
write to the page causes a writeback (due to dirty balancing)
while another writeback for
From: Linus Torvalds <[EMAIL PROTECTED]>
Date: Thu, 28 Dec 2006 12:14:31 -0800 (PST)
> I get corruption - but the whole point is that it's very much pdflush that
> should be writing these pages out.
I think what might be happening is that pdflush writes them out fine,
however we don't trap
On Thu, 2006-12-28 at 11:45 -0800, Andrew Morton wrote:
> On Thu, 28 Dec 2006 11:28:52 -0800 (PST)
> Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
> >
> >
> > On Thu, 28 Dec 2006, Guillaume Chazarain wrote:
> > >
> > > The attached patch fixes the corruption for me.
> >
> > Well, that's a good
On Thu, 28 Dec 2006, Andrew Morton wrote:
>
> It would be interesting to convert your app to do fsync() before
> FADV_DONTNEED. That would take WB_SYNC_NONE out of the picture as well
> (apart from pdflush activity).
I get corruption - but the whole point is that it's very much pdflush that
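The suggestion quoted above, calling fsync() before FADV_DONTNEED, could be sketched in C roughly as follows. This is only an illustration and not code from the thread's test program: the helper name flush_then_drop_cache() is hypothetical, and it assumes a Linux/POSIX system where posix_fadvise() is available.

```c
#define _POSIX_C_SOURCE 200112L  /* expose fsync() and posix_fadvise() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): write the file's dirty
 * pages to disk synchronously, then advise the kernel that the cached
 * pages are no longer needed.
 * Returns 0 on success or a positive errno-style code on failure.
 */
static int flush_then_drop_cache(int fd)
{
    assert(fd >= 0);

    /* fsync() returns only after the file's dirty data has been written. */
    if (fsync(fd) != 0)
        return errno;

    /*
     * offset 0 with len 0 covers the whole file. posix_fadvise()
     * returns the error number directly instead of setting errno.
     */
    return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}
```

Because the pages are already clean when the advice is given, dropping them should not need to start any asynchronous writeback of its own, which is the point of the suggestion.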
On Thu, 28 Dec 2006 11:28:52 -0800 (PST)
Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>
> On Thu, 28 Dec 2006, Guillaume Chazarain wrote:
> >
> > The attached patch fixes the corruption for me.
>
> Well, that's a good hint, but it's really just a symptom. You effectively
> just made the
On Thu, 28 Dec 2006, Guillaume Chazarain wrote:
>
> The attached patch fixes the corruption for me.
Well, that's a good hint, but it's really just a symptom. You effectively
just made the test-program not even try to flush the data to disk, so the
page cache would stay in memory, and you'd
Guillaume Chazarain a écrit :
I get this kind of corruption:
http://guichaz.free.fr/linux-bug/corruption.png
Actually in qemu, I get three different behaviours:
- no corruption at all : with linux-2.4
- corruption only on the first chunks: before [PATCH] mm: balance dirty
pages as identified
On Thu, 28 Dec 2006, Russell King wrote:
>
> Yup, but I have nothing to do with glibc because I refuse to do that
> silly copyright assignment FSF thing. Hopefully someone else can
> resolve it, but...
Yeah, me too.
> _is_ a fix whether _you_ like it or not to work around the issue so
>
On Thu, Dec 28, 2006 at 09:27:12AM -0800, Linus Torvalds wrote:
> On Thu, 28 Dec 2006, Russell King wrote:
> > and if you look at glibc's memset() function, you'll notice that's exactly
> > what you expect if you pass a non-8bit value to it. Ergo, what you're
> > seeing is utterly expected given
On Thu, 28 Dec 2006, Russell King wrote:
>
> and if you look at glibc's memset() function, you'll notice that's exactly
> what you expect if you pass a non-8bit value to it. Ergo, what you're
> seeing is utterly expected given glibc's memset() implementation on ARM.
Guys, you _really_ should
On Thu, 28 Dec 2006, Zhang, Yanmin wrote:
>
> The test program is a process to write/read data. pdflush might write data
> to disk asynchronously. After pdflush writes a page to disk, it will call
> (either by
> softirq) clear_page_dirty to clear the dirty bit after getting the interrupt
>
On Wed, 27 Dec 2006, Chen, Kenneth W wrote:
> >
> > Running the test code, git bisect points its finger at this commit.
> > Reverting
> > this commit on top of 2.6.20-rc2 doesn't trigger the bug from the test code.
> >
> > [PATCH] mm: balance dirty pages
> >
> > Now that we can
On Wed, 27 Dec 2006, Gordon Farquharson wrote:
>
> 100kB and 200kB files always succeed on the ARM system. 400kB and
> larger always seem to fail.
Oh, wow. Yeah, I've just repressed how tiny 32MB is. And especially if you
lowered the /proc/sys/vm/dirty_ratio to a smaller percentage, I guess
On Thu Dec 28 15:09 , Guillaume Chazarain sent:
>I set a qemu environment to test kernels: http://guichaz.free.fr/linux-bug/
>I have corruption with every Fedora release kernel except the first, that is
>2.4.22 works, but 2.6.5, 2.6.9, 2.6.11, 2.6.15 and 2.6.18-1.2798 exhibit
>some corruption.
* Gordon Farquharson <[EMAIL PROTECTED]> [2006-12-28 07:15]:
> Thanks for the fix, Russell.
>
> I can now trigger the (real) problem by using a 25 MB file (100 << 18)
> and the Linksys NSLU2 (ARM, IXP420 processor, 32 MB RAM).
Me too (using 100 << 18). Interestingly, I don't seem to get any
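For reference, the two TARGETSIZE values discussed in this thread work out as shown in the small C sketch below. The macro names TARGETSIZE_SMALL and TARGETSIZE_LARGE and the helper bytes_to_kib() are made up for illustration; only the shift expressions come from the messages.

```c
#include <assert.h>
#include <stddef.h>

/* Shift expressions quoted in the thread; the macro names are illustrative. */
#define TARGETSIZE_SMALL ((size_t)100 << 12)  /* 100 * 4096 bytes   */
#define TARGETSIZE_LARGE ((size_t)100 << 18)  /* 100 * 262144 bytes */

/* Compile-time checks of the sizes mentioned in the messages. */
static_assert(TARGETSIZE_SMALL == (size_t)400 * 1024, "100 << 12 is 400 kB");
static_assert(TARGETSIZE_LARGE == (size_t)25 * 1024 * 1024, "100 << 18 is 25 MB");

/* Convert a byte count to whole KiB (hypothetical helper). */
static size_t bytes_to_kib(size_t bytes)
{
    return bytes / 1024;
}
```

On a machine with 32 MB of RAM, the 25 MB file takes up most of memory, whereas the 400 kB file is small enough to be expected to stay in the page cache.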
I set a qemu environment to test kernels: http://guichaz.free.fr/linux-bug/
I have corruption with every Fedora release kernel except the first, that is
2.4.22 works, but 2.6.5, 2.6.9, 2.6.11, 2.6.15 and 2.6.18-1.2798 exhibit
some corruption.
Command line to test:
qemu root_fs -snapshot
* Russell King <[EMAIL PROTECTED]> [2006-12-28 10:49]:
> > By the way, I just tried it with TARGETSIZE (100 << 12) on a different
> > ARM machine (a Thecus N2100 based on an IOP32x chip with 128 MB of
> > memory) and I see similar results to that from Gordon:
>
> Work around the glibc memset()
On 12/28/06, Russell King <[EMAIL PROTECTED]> wrote:
Fixing Linus' test program to pass nr & 255 to memset results in clean
passes on 2.6.9 on TheCus N2100 (IOP8032x) and 2.6.16.9 StrongARM
machines (as would be expected.)
Thanks for the fix, Russell.
I can now trigger the (real) problem by
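The workaround mentioned above, passing nr & 255 to memset(), might look like the C sketch below. The helpers fill_chunk() and chunk_matches() are hypothetical and are not taken from the actual test program. ISO C specifies that memset() converts its value argument to unsigned char, so on a conforming implementation the mask changes nothing; it only guards against an implementation that mishandles wider values, as reported for ARM in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: fill one chunk with a byte pattern derived from
 * its sequence number nr. Masking with 255 keeps the value handed to
 * memset() within 0..255.
 */
static void fill_chunk(unsigned char *buf, size_t len, unsigned int nr)
{
    assert(buf != NULL);
    memset(buf, (int)(nr & 255), len);
}

/* Return 1 if every byte of the chunk holds the pattern for nr, else 0. */
static int chunk_matches(const unsigned char *buf, size_t len, unsigned int nr)
{
    const unsigned char expected = (unsigned char)(nr & 255);

    for (size_t i = 0; i < len; i++) {
        if (buf[i] != expected)
            return 0;
    }
    return 1;
}
```

A verifier built this way compares against the same masked value that was written, so a mismatch points at lost data rather than at the fill routine.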
On Wed, Dec 27, 2006 at 07:04:34PM -0800, Linus Torvalds wrote:
> [ Modified test-program that tells you where the corruption happens (and
> when the missing parts were supposed to be written out) appended, in
> case people care. ]
Hi
2.6.18 (and 2.6.18.6) is ok, 2.6.19-rc1 is broken. I
On Thu, Dec 28, 2006 at 11:16:59AM +0100, Martin Michlmayr wrote:
> * Gordon Farquharson <[EMAIL PROTECTED]> [2006-12-27 22:38]:
> > >> #define TARGETSIZE (100 << 12)
> > >
> > >That's just 400kB!
> > >
> > >There's no way you should see corruption with that kind of value. It
> > >should all stay
* Gordon Farquharson <[EMAIL PROTECTED]> [2006-12-27 22:38]:
> >> #define TARGETSIZE (100 << 12)
> >
> >That's just 400kB!
> >
> >There's no way you should see corruption with that kind of value. It
> >should all stay solidly in the cache.
> >
> >Is this perhaps with ARM nommu or something else
On Wed, Dec 27, 2006 at 10:20:20PM -0700, Gordon Farquharson wrote:
> I have run the program a few times, and the output is pretty
> consistent. However, when I increase the target size, the difference
> between the expected and actual values is larger.
>
> Written as (749)935(738)
> Chunk 1113
* Gordon Farquharson <[EMAIL PROTECTED]> [2006-12-27 22:38]:
> >That's just 400kB!
> >
> >There's no way you should see corruption with that kind of value. It
> >should all stay solidly in the cache.
> >
> >Is this perhaps with ARM nommu or something else strange? It may be that
> >the program
On Wed, 2006-12-27 at 19:04 -0800, Linus Torvalds wrote:
>
> On Wed, 27 Dec 2006, David Miller wrote:
> > >
> > > I still don't see _why_, though. But maybe smarter people than me can see
> > > it..
> >
> > FWIW this program definitely triggers the bug for me.
>
> Ok, now that I have
I think what might be happening is that pdflush writes them out fine,
however we don't trap writes by the application _during_ that writeout.
Yeah. I believe that more exactly it happens if the very last
write to the page causes a writeback (due to dirty balancing)
while another writeback for
On Fri, 29 Dec 2006, Segher Boessenkool wrote:
I think what might be happening is that pdflush writes them out fine,
however we don't trap writes by the application _during_ that writeout.
Yeah. I believe that more exactly it happens if the very last
write to the page causes a
On Wed, 2006-12-27 at 19:04 -0800, Linus Torvalds wrote:
On Wed, 27 Dec 2006, David Miller wrote:
I still don't see _why_, though. But maybe smarter people than me can see
it..
FWIW this program definitely triggers the bug for me.
Ok, now that I have something simple to do
* Gordon Farquharson [EMAIL PROTECTED] [2006-12-27 22:38]:
That's just 400kB!
There's no way you should see corruption with that kind of value. It
should all stay solidly in the cache.
Is this perhaps with ARM nommu or something else strange? It may be that
the program just doesn't work
On Wed, Dec 27, 2006 at 10:20:20PM -0700, Gordon Farquharson wrote:
I have run the program a few times, and the output is pretty
consistent. However, when I increase the target size, the difference
between the expected and actual values is larger.
Written as (749)935(738)
Chunk 1113
* Gordon Farquharson [EMAIL PROTECTED] [2006-12-27 22:38]:
#define TARGETSIZE (100 << 12)
That's just 400kB!
There's no way you should see corruption with that kind of value. It
should all stay solidly in the cache.
Is this perhaps with ARM nommu or something else strange? It may be that
On Thu, Dec 28, 2006 at 11:16:59AM +0100, Martin Michlmayr wrote:
* Gordon Farquharson [EMAIL PROTECTED] [2006-12-27 22:38]:
#define TARGETSIZE (100 << 12)
That's just 400kB!
There's no way you should see corruption with that kind of value. It
should all stay solidly in the cache.
On Wed, Dec 27, 2006 at 07:04:34PM -0800, Linus Torvalds wrote:
[ Modified test-program that tells you where the corruption happens (and
when the missing parts were supposed to be written out) appended, in
case people care. ]
Hi
2.6.18 (and 2.6.18.6) is ok, 2.6.19-rc1 is broken. I tried
On 12/28/06, Russell King [EMAIL PROTECTED] wrote:
Fixing Linus' test program to pass nr & 255 to memset results in clean
passes on 2.6.9 on TheCus N2100 (IOP8032x) and 2.6.16.9 StrongARM
machines (as would be expected.)
Thanks for the fix, Russell.
I can now trigger the (real) problem by
* Russell King [EMAIL PROTECTED] [2006-12-28 10:49]:
By the way, I just tried it with TARGETSIZE (100 << 12) on a different
ARM machine (a Thecus N2100 based on an IOP32x chip with 128 MB of
memory) and I see similar results to that from Gordon:
Work around the glibc memset() problem by
I set a qemu environment to test kernels: http://guichaz.free.fr/linux-bug/
I have corruption with every Fedora release kernel except the first, that is
2.4.22 works, but 2.6.5, 2.6.9, 2.6.11, 2.6.15 and 2.6.18-1.2798 exhibit
some corruption.
Command line to test:
qemu root_fs -snapshot
* Gordon Farquharson [EMAIL PROTECTED] [2006-12-28 07:15]:
Thanks for the fix, Russell.
I can now trigger the (real) problem by using a 25 MB file (100 << 18)
and the Linksys NSLU2 (ARM, IXP420 processor, 32 MB RAM).
Me too (using 100 << 18). Interestingly, I don't seem to get any
corruption on
On Thu Dec 28 15:09 , Guillaume Chazarain sent:
I set a qemu environment to test kernels: http://guichaz.free.fr/linux-bug/
I have corruption with every Fedora release kernel except the first, that is
2.4.22 works, but 2.6.5, 2.6.9, 2.6.11, 2.6.15 and 2.6.18-1.2798 exhibit
some corruption.
On Wed, 27 Dec 2006, Gordon Farquharson wrote:
100kB and 200kB files always succeed on the ARM system. 400kB and
larger always seem to fail.
Oh, wow. Yeah, I've just repressed how tiny 32MB is. And especially if you
lowered the /proc/sys/vm/dirty_ratio to a smaller percentage, I guess
On Wed, 27 Dec 2006, Chen, Kenneth W wrote:
Running the test code, git bisect points its finger at this commit.
Reverting
this commit on top of 2.6.20-rc2 doesn't trigger the bug from the test code.
[PATCH] mm: balance dirty pages
Now that we can detect writers of