Re: [uml-user] [PATCH 1/1] um: ubd: Fix data corruption

2010-10-15 Thread richard -rw- weinberger
On Thu, Oct 14, 2010 at 3:14 PM, Tejun Heo hte...@gmail.com wrote: Hello, Can you please try this one then? It seems to work here but I can't reproduce the original problem reliably so I'm not really sure. Thanks. diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c index

Re: [uml-user] [PATCH 1/1] um: ubd: Fix data corruption

2010-10-15 Thread Tejun Heo
Hello, On 10/14/2010 04:20 PM, richard -rw- weinberger wrote: It does not work for me. But the error is a different one. :-) Without your patch I've never got this kernel trace. [ 59.85] kworker/0:1: page allocation failure. order:0, mode:0x20 Hmm... you're seeing out of memory

Re: [uml-user] [PATCH 1/1] um: ubd: Fix data corruption

2010-10-15 Thread Chris Frey
On Thu, Oct 14, 2010 at 03:14:28PM +0200, Tejun Heo wrote: Hello, Can you please try this one then? It seems to work here but I can't reproduce the original problem reliably so I'm not really sure. Thanks. diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c index

Re: [uml-user] [PATCH 1/1] um: ubd: Fix data corruption

2010-10-15 Thread Tejun Heo
Hello, Can you please try this one then? It seems to work here but I can't reproduce the original problem reliably so I'm not really sure. Thanks. diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c index 1bcd208..9734994 100644 --- a/arch/um/drivers/ubd_kern.c +++

Re: [uml-user] [PATCH 1/1] um: ubd: Fix data corruption

2010-10-08 Thread Jens Axboe
On 2010-10-05 22:31, Chris Frey wrote: On Tue, Oct 05, 2010 at 10:23:19AM +0200, Tejun Heo wrote: H, can you please give a shot at the following one? Thank you. I applied this patch on top of stock 2.6.35.5 as usual (no other patches) and tested on my maverick image as before. I ran a

Re: [uml-user] [PATCH 1/1] um: ubd: Fix data corruption

2010-10-08 Thread Chris Frey
On Thu, Oct 07, 2010 at 09:58:16AM +0200, Jens Axboe wrote: So how about this? Note that I haven't even compiled this. The request handling logic really should be fixed in there, it's horribly inefficient. Thanks. I fixed the compile error with: + .rq_offset 0, \ to +

Re: [uml-user] [PATCH 1/1] um: ubd: Fix data corruption

2010-10-05 Thread Chris Frey
On Mon, Oct 04, 2010 at 06:37:36PM +0200, Tejun Heo wrote: Hello, sorry about chiming in later. I was off last week. No problem, I'm eager to test patches to fix this. I think we're on the right track. The problem with Jens' patch was that it didn't consider the fact that blk_end_request()

Re: [uml-user] [PATCH 1/1] um: ubd: Fix data corruption

2010-10-05 Thread Tejun Heo
On 10/04/2010 09:51 PM, Chris Frey wrote: On Mon, Oct 04, 2010 at 06:37:36PM +0200, Tejun Heo wrote: Hello, sorry about chiming in later. I was off last week. No problem, I'm eager to test patches to fix this. I think we're on the right track. The problem with Jens' patch was that it

Re: [uml-user] [PATCH 1/1] um: ubd: Fix data corruption

2010-10-05 Thread richard -rw- weinberger
On Wed, Sep 29, 2010 at 7:21 AM, Jens Axboe jax...@fusionio.com wrote: I think we need to find the real fix here, just disabling merging is not a fix (it's just a nasty work-around for the real bug). Jens, Do you have an idea which parts of the code are buggy? -- Cheers, //richard

Re: [uml-user] [PATCH 1/1] um: ubd: Fix data corruption

2010-10-05 Thread Tejun Heo
Hello, sorry about chiming in later. I was off last week. On 09/29/2010 08:34 AM, Chris Frey wrote: On Wed, Sep 29, 2010 at 02:21:07PM +0900, Jens Axboe wrote: This seems to imply that the original commit pinpointed is not the only issue we have in that code atm. I think we need to find

Re: [uml-user] [PATCH 1/1] um: ubd: Fix data corruption

2010-10-01 Thread Chris Frey
On Wed, Sep 29, 2010 at 12:13:10AM +0200, Richard Weinberger wrote: On Wednesday 29 September 2010, 00:00:00, Andrew Morton wrote: This is a workaround, I think? Do we know what the actual bug is? From the comment it appears to be a regression? Yes, it is a workaround. For more details

Re: [uml-user] [PATCH 1/1] um: ubd: Fix data corruption

2010-10-01 Thread Andrew Morton
On Tue, 28 Sep 2010 23:47:36 +0200 Richard Weinberger rich...@nod.at wrote: Under high load the file system gets corrupted. This patch fixes the issue. Many thanks to Janjaap Bos janj...@bos.nl! LKML-Reference: AANLkTi=PTp7YW_eYxtF-H2QSxgei3whWH59wU0C9oCkz () mail ! gmail ! com

Re: [uml-user] [PATCH 1/1] um: ubd: Fix data corruption

2010-10-01 Thread Jens Axboe
On 2010-09-29 07:52, Chris Frey wrote: On Wed, Sep 29, 2010 at 12:13:10AM +0200, Richard Weinberger wrote: On Wednesday 29 September 2010, 00:00:00, Andrew Morton wrote: This is a workaround, I think? Do we know what the actual bug is? From the comment it appears to be a regression? Yes, it

Re: [uml-user] [PATCH 1/1] um: ubd: Fix data corruption

2010-10-01 Thread Janjaap Bos
On Wed, 2010-09-29 at 08:10 +0900, Jens Axboe wrote: It looks like that if we need to restart the requeue, then we use the initial position and not the current index. Does this help? diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c index 1bcd208..81ee063 100644 ---

Re: [uml-user] [PATCH 1/1] um: ubd: Fix data corruption

2010-10-01 Thread Chris Frey
On Wed, Sep 29, 2010 at 12:13:10AM +0200, Richard Weinberger wrote: On Wednesday 29 September 2010, 00:00:00, Andrew Morton wrote: This is a workaround, I think? Do we know what the actual bug is? From the comment it appears to be a regression? Yes, it is a workaround. For more details

Re: [uml-user] [PATCH 1/1] um: ubd: Fix data corruption

2010-10-01 Thread Jens Axboe
On 2010-09-29 10:29, Chris Frey wrote: On Wed, Sep 29, 2010 at 08:10:06AM +0900, Jens Axboe wrote: It looks like that if we need to restart the requeue, then we use the initial position and not the current index. Does this help? diff --git a/arch/um/drivers/ubd_kern.c

Re: [uml-user] [PATCH 1/1] um: ubd: Fix data corruption

2010-10-01 Thread Chris Frey
On Wed, Sep 29, 2010 at 02:21:07PM +0900, Jens Axboe wrote: This seems to imply that the original commit pinpointed is not the only issue we have in that code atm. I think we need to find the real fix here, just disabling merging is not a fix (it's just a nasty work-around for the real