Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-11 Thread Toralf Förster
On 10/11/2013 10:57 AM, Fengguang Wu wrote:
> On Fri, Oct 11, 2013 at 10:42:19AM +0200, Toralf Förster wrote:
>> yeah, now the picture becomes more clear
>> ...
>> net.core.warnings = 0
>>  [ ok ]
>> ick: pause : -717
>>  ick : min_pause : -177
>>ick : max_pause : -717
>>  ick: pages_dirtied : 14
>> 
>> ick: task_ratelimit: 0
> 
> Great and thanks! So it's the max pause calculation went wrong.
> Would help you try the below patch?
> 
Definitely.
I'm running now the test case since 6 hours w/o any issues.
before that usually after 15 - 30 min the bug occurred.

>>From 5420b9bbe42dd0a366d7615e9f3d3724cee725c4 Mon Sep 17 00:00:00 2001
> From: Fengguang Wu 
> Date: Fri, 11 Oct 2013 16:53:26 +0800
> Subject: [PATCH] fix bdi max pause calculation
> 
> Signed-off-by: Fengguang Wu 
> ---
>  mm/page-writeback.c |   10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 3f0c895..241a746 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1104,11 +1104,11 @@ static unsigned long dirty_poll_interval(unsigned 
> long dirty,
>   return 1;
>  }
>  
> -static long bdi_max_pause(struct backing_dev_info *bdi,
> -   unsigned long bdi_dirty)
> +static unsigned long bdi_max_pause(struct backing_dev_info *bdi,
> +unsigned long bdi_dirty)
>  {
> - long bw = bdi->avg_write_bandwidth;
> - long t;
> + unsigned long bw = bdi->avg_write_bandwidth;
> + unsigned long t;
>  
>   /*
>* Limit pause time for small memory systems. If sleeping for too long
> @@ -1120,7 +1120,7 @@ static long bdi_max_pause(struct backing_dev_info *bdi,
>   t = bdi_dirty / (1 + bw / roundup_pow_of_two(1 + HZ / 8));
>   t++;
>  
> - return min_t(long, t, MAX_PAUSE);
> + return min_t(unsigned long, t, MAX_PAUSE);
>  }
>  
>  static long bdi_min_pause(struct backing_dev_info *bdi,
> 


-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-11 Thread Fengguang Wu
On Fri, Oct 11, 2013 at 04:57:01PM +0800, Fengguang Wu wrote:
> On Fri, Oct 11, 2013 at 10:42:19AM +0200, Toralf Förster wrote:
> > yeah, now the picture becomes more clear
> > ...
> > net.core.warnings = 0   
> >   [ ok ]
> > ick: pause : -717
> >  ick : min_pause : -177
> >ick : max_pause : -717
> >  ick: pages_dirtied : 14
> > 
> > ick: task_ratelimit: 0
> 
> Great and thanks! So it's the max pause calculation went wrong.

However I still suspect this is not the main reason for the soft
lockup. Because schedule_timeout() will directly return on negative
timeout. So yes, we have encountered some negative pauses, however
we still need to fix the huge dirtied pages problem which should be
more fundamental. 

Thanks,
Fengguang
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-11 Thread Fengguang Wu
On Fri, Oct 11, 2013 at 10:42:19AM +0200, Toralf Förster wrote:
> yeah, now the picture becomes more clear
> ...
> net.core.warnings = 0 
> [ ok ]
> ick: pause : -717
>  ick : min_pause : -177
>ick : max_pause : -717
>  ick: pages_dirtied : 14
> 
> ick: task_ratelimit: 0

Great and thanks! So it's the max pause calculation went wrong.
Would help you try the below patch?

>From 5420b9bbe42dd0a366d7615e9f3d3724cee725c4 Mon Sep 17 00:00:00 2001
From: Fengguang Wu 
Date: Fri, 11 Oct 2013 16:53:26 +0800
Subject: [PATCH] fix bdi max pause calculation

Signed-off-by: Fengguang Wu 
---
 mm/page-writeback.c |   10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 3f0c895..241a746 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1104,11 +1104,11 @@ static unsigned long dirty_poll_interval(unsigned long 
dirty,
return 1;
 }
 
-static long bdi_max_pause(struct backing_dev_info *bdi,
- unsigned long bdi_dirty)
+static unsigned long bdi_max_pause(struct backing_dev_info *bdi,
+  unsigned long bdi_dirty)
 {
-   long bw = bdi->avg_write_bandwidth;
-   long t;
+   unsigned long bw = bdi->avg_write_bandwidth;
+   unsigned long t;
 
/*
 * Limit pause time for small memory systems. If sleeping for too long
@@ -1120,7 +1120,7 @@ static long bdi_max_pause(struct backing_dev_info *bdi,
t = bdi_dirty / (1 + bw / roundup_pow_of_two(1 + HZ / 8));
t++;
 
-   return min_t(long, t, MAX_PAUSE);
+   return min_t(unsigned long, t, MAX_PAUSE);
 }
 
 static long bdi_min_pause(struct backing_dev_info *bdi,
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-11 Thread Toralf Förster
yeah, now the picture becomes more clear
...
net.core.warnings = 0   
  [ ok ]
ick: pause : -717
 ick : min_pause : -177
   ick : max_pause : -717
 ick: pages_dirtied : 14

ick: task_ratelimit: 0


On 10/11/2013 03:16 AM, Fengguang Wu wrote:
> On Thu, Oct 10, 2013 at 06:49:30PM +0200, Toralf Förster wrote:
>> On 10/10/2013 12:33 AM, Richard Weinberger wrote:
>>> Am 09.10.2013 23:47, schrieb Jan Kara:
 On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
> CC'ing mm folks.
> Please see below.
   Added Fenguang to CC since he is the author of this code.
>>>
>>> Thx, get_maintainer.pl didn't list him.
>>>
> Am 09.10.2013 19:26, schrieb Toralf Förster:
>> On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
>>> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster 
>>>  wrote:
> Hmm, now pages_dirtied is zero, according to the backtrace, but the 
> BUG_ON()
> asserts its strict positive?!?
>
> Can you please try the following instead of the BUG_ON():
>
> if (pause < 0) {
> printk("pages_dirtied = %lu\n", pages_dirtied);
> printk("task_ratelimit = %lu\n", task_ratelimit);
> printk("pause = %ld\n", pause);
> }
>
> Gr{oetje,eeting}s,
>
> Geert
 I tried it in different ways already - I'm completely unsuccessful in 
 getting any printk output.
 As soon as the issue happens I do have a

 BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]

 at stderr of the UML and then no further input is accepted. With 
 uml_mconsole I'm however able
 to run very basic commands like a crash dump, sysrq ond so on.
>>>
>>> You may get an idea of the magnitude of pages_dirtied by using a chain 
>>> of
>>> BUG_ON()s, like:
>>>
>>> BUG_ON(pages_dirtied > 20);
>>> BUG_ON(pages_dirtied > 10);
>>> BUG_ON(pages_dirtied > 1);
>>> BUG_ON(pages_dirtied > 1000);
>>> BUG_ON(pages_dirtied > 100);
>>>
>>> Probably 1 million is already too much for normal operation?
>>>
>> period = HZ * pages_dirtied / task_ratelimit;
>>  BUG_ON(pages_dirtied > 20);
>>  BUG_ON(pages_dirtied > 10);  <-- this 
>> is line 1467
>
> Summary for mm people:
>
> Toralf runs trinty on UML/i386.
> After some time pages_dirtied becomes very large.
> More than 10 pages in this case.
   Huh, this is really strange. pages_dirtied is passed into
 balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
 over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
 task was ever able to dirty that much during its lifetime (but correct me
 if I'm wrong here, with UML and memory backed disks it is not totally
 impossible)... I went through the logic of handling ->nr_dirtied but
 I didn't find any obvious problem there. Hum, maybe one thing - what
 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
 huge, we could possibly accumulate huge current->nr_dirtied.
>>>
>>> Toralf, you can try a snipplet like this one to get the values printed out:
>>> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
>>> index f5236f8..a80e520 100644
>>> --- a/mm/page-writeback.c
>>> +++ b/mm/page-writeback.c
>>> @@ -1463,6 +1463,12 @@ static void balance_dirty_pages(struct address_space 
>>> *mapping,
>>> goto pause;
>>> }
>>> period = HZ * pages_dirtied / task_ratelimit;
>>> +
>>> +   {
>>> +   extern int printf(char *, ...);
>>> +   printf("---> task_ratelimit: %lu\n", 
>>> task_ratelimit);
>>> +   }
>>> +
>>> pause = period;
>>> if (current->dirty_paused_when)
>>> pause -= now - current->dirty_paused_when;
>>>
>>>
>>> Yes, printf(), not printk().
>>> Using this hack we print directly to host's stdout. :)
>>>
>> *head smack* ofc - works fine.
>> So given this diff :
>>
>> iff --git a/mm/page-writeback.c b/mm/page-writeback.c
>> index f5236f8..5a2c337 100644
>> --- a/mm/page-writeback.c
>> +++ b/mm/page-writeback.c
>> @@ -1464,6 +1464,13 @@ static void balance_dirty_pages(struct address_space 
>> *mapping,
>> }
>> period = HZ * pages_dirtied / task_ratelimit;
>> pause = period;
>> +   if (pause < 0)  {
>> +   extern int printf(char *, ...);
>> +   

Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-11 Thread Toralf Förster
On 10/11/2013 10:57 AM, Fengguang Wu wrote:
 On Fri, Oct 11, 2013 at 10:42:19AM +0200, Toralf Förster wrote:
 yeah, now the picture becomes more clear
 ...
 net.core.warnings = 0
  [ ok ]
 ick: pause : -717
  ick : min_pause : -177
ick : max_pause : -717
  ick: pages_dirtied : 14
 
 ick: task_ratelimit: 0
 
 Great and thanks! So it's the max pause calculation went wrong.
 Would help you try the below patch?
 
Definitely.
I'm running now the test case since 6 hours w/o any issues.
before that usually after 15 - 30 min the bug occurred.

From 5420b9bbe42dd0a366d7615e9f3d3724cee725c4 Mon Sep 17 00:00:00 2001
 From: Fengguang Wu fengguang...@intel.com
 Date: Fri, 11 Oct 2013 16:53:26 +0800
 Subject: [PATCH] fix bdi max pause calculation
 
 Signed-off-by: Fengguang Wu fengguang...@intel.com
 ---
  mm/page-writeback.c |   10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)
 
 diff --git a/mm/page-writeback.c b/mm/page-writeback.c
 index 3f0c895..241a746 100644
 --- a/mm/page-writeback.c
 +++ b/mm/page-writeback.c
 @@ -1104,11 +1104,11 @@ static unsigned long dirty_poll_interval(unsigned 
 long dirty,
   return 1;
  }
  
 -static long bdi_max_pause(struct backing_dev_info *bdi,
 -   unsigned long bdi_dirty)
 +static unsigned long bdi_max_pause(struct backing_dev_info *bdi,
 +unsigned long bdi_dirty)
  {
 - long bw = bdi-avg_write_bandwidth;
 - long t;
 + unsigned long bw = bdi-avg_write_bandwidth;
 + unsigned long t;
  
   /*
* Limit pause time for small memory systems. If sleeping for too long
 @@ -1120,7 +1120,7 @@ static long bdi_max_pause(struct backing_dev_info *bdi,
   t = bdi_dirty / (1 + bw / roundup_pow_of_two(1 + HZ / 8));
   t++;
  
 - return min_t(long, t, MAX_PAUSE);
 + return min_t(unsigned long, t, MAX_PAUSE);
  }
  
  static long bdi_min_pause(struct backing_dev_info *bdi,
 


-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-11 Thread Toralf Förster
yeah, now the picture becomes more clear
...
net.core.warnings = 0   
  [ ok ]
ick: pause : -717
 ick : min_pause : -177
   ick : max_pause : -717
 ick: pages_dirtied : 14

ick: task_ratelimit: 0


On 10/11/2013 03:16 AM, Fengguang Wu wrote:
 On Thu, Oct 10, 2013 at 06:49:30PM +0200, Toralf Förster wrote:
 On 10/10/2013 12:33 AM, Richard Weinberger wrote:
 Am 09.10.2013 23:47, schrieb Jan Kara:
 On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
 CC'ing mm folks.
 Please see below.
   Added Fenguang to CC since he is the author of this code.

 Thx, get_maintainer.pl didn't list him.

 Am 09.10.2013 19:26, schrieb Toralf Förster:
 On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
 On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster 
 toralf.foers...@gmx.de wrote:
 Hmm, now pages_dirtied is zero, according to the backtrace, but the 
 BUG_ON()
 asserts its strict positive?!?

 Can you please try the following instead of the BUG_ON():

 if (pause  0) {
 printk(pages_dirtied = %lu\n, pages_dirtied);
 printk(task_ratelimit = %lu\n, task_ratelimit);
 printk(pause = %ld\n, pause);
 }

 Gr{oetje,eeting}s,

 Geert
 I tried it in different ways already - I'm completely unsuccessful in 
 getting any printk output.
 As soon as the issue happens I do have a

 BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]

 at stderr of the UML and then no further input is accepted. With 
 uml_mconsole I'm however able
 to run very basic commands like a crash dump, sysrq ond so on.

 You may get an idea of the magnitude of pages_dirtied by using a chain 
 of
 BUG_ON()s, like:

 BUG_ON(pages_dirtied  20);
 BUG_ON(pages_dirtied  10);
 BUG_ON(pages_dirtied  1);
 BUG_ON(pages_dirtied  1000);
 BUG_ON(pages_dirtied  100);

 Probably 1 million is already too much for normal operation?

 period = HZ * pages_dirtied / task_ratelimit;
  BUG_ON(pages_dirtied  20);
  BUG_ON(pages_dirtied  10);  -- this 
 is line 1467

 Summary for mm people:

 Toralf runs trinty on UML/i386.
 After some time pages_dirtied becomes very large.
 More than 10 pages in this case.
   Huh, this is really strange. pages_dirtied is passed into
 balance_dirty_pages() from current-nr_dirtied. So I wonder how a value
 over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
 task was ever able to dirty that much during its lifetime (but correct me
 if I'm wrong here, with UML and memory backed disks it is not totally
 impossible)... I went through the logic of handling -nr_dirtied but
 I didn't find any obvious problem there. Hum, maybe one thing - what
 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
 huge, we could possibly accumulate huge current-nr_dirtied.

 Toralf, you can try a snipplet like this one to get the values printed out:
 diff --git a/mm/page-writeback.c b/mm/page-writeback.c
 index f5236f8..a80e520 100644
 --- a/mm/page-writeback.c
 +++ b/mm/page-writeback.c
 @@ -1463,6 +1463,12 @@ static void balance_dirty_pages(struct address_space 
 *mapping,
 goto pause;
 }
 period = HZ * pages_dirtied / task_ratelimit;
 +
 +   {
 +   extern int printf(char *, ...);
 +   printf(--- task_ratelimit: %lu\n, 
 task_ratelimit);
 +   }
 +
 pause = period;
 if (current-dirty_paused_when)
 pause -= now - current-dirty_paused_when;


 Yes, printf(), not printk().
 Using this hack we print directly to host's stdout. :)

 *head smack* ofc - works fine.
 So given this diff :

 iff --git a/mm/page-writeback.c b/mm/page-writeback.c
 index f5236f8..5a2c337 100644
 --- a/mm/page-writeback.c
 +++ b/mm/page-writeback.c
 @@ -1464,6 +1464,13 @@ static void balance_dirty_pages(struct address_space 
 *mapping,
 }
 period = HZ * pages_dirtied / task_ratelimit;
 pause = period;
 +   if (pause  0)  {
 +   extern int printf(char *, ...);
 +   printf(overflow : pause : %li\n, pause);
 +   printf(overflow : pages_dirtied : %lu\n, 
 pages_dirtied);
 +   printf(overflow :  task_ratelimit: %lu\n, 
 task_ratelimit);
 +   BUG_ON(1);
 +   }
 if (current-dirty_paused_when)
 pause -= now - current-dirty_paused_when;
 /*
 @@ -1503,6 +1510,13 @@ static void balance_dirty_pages(struct address_space 
 *mapping,
 }

  pause:
 +   if (pause  0)  {
 
 Oops, we 

Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-11 Thread Fengguang Wu
On Fri, Oct 11, 2013 at 10:42:19AM +0200, Toralf Förster wrote:
 yeah, now the picture becomes more clear
 ...
 net.core.warnings = 0 
 [ ok ]
 ick: pause : -717
  ick : min_pause : -177
ick : max_pause : -717
  ick: pages_dirtied : 14
 
 ick: task_ratelimit: 0

Great and thanks! So it's the max pause calculation went wrong.
Would help you try the below patch?

From 5420b9bbe42dd0a366d7615e9f3d3724cee725c4 Mon Sep 17 00:00:00 2001
From: Fengguang Wu fengguang...@intel.com
Date: Fri, 11 Oct 2013 16:53:26 +0800
Subject: [PATCH] fix bdi max pause calculation

Signed-off-by: Fengguang Wu fengguang...@intel.com
---
 mm/page-writeback.c |   10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 3f0c895..241a746 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1104,11 +1104,11 @@ static unsigned long dirty_poll_interval(unsigned long 
dirty,
return 1;
 }
 
-static long bdi_max_pause(struct backing_dev_info *bdi,
- unsigned long bdi_dirty)
+static unsigned long bdi_max_pause(struct backing_dev_info *bdi,
+  unsigned long bdi_dirty)
 {
-   long bw = bdi-avg_write_bandwidth;
-   long t;
+   unsigned long bw = bdi-avg_write_bandwidth;
+   unsigned long t;
 
/*
 * Limit pause time for small memory systems. If sleeping for too long
@@ -1120,7 +1120,7 @@ static long bdi_max_pause(struct backing_dev_info *bdi,
t = bdi_dirty / (1 + bw / roundup_pow_of_two(1 + HZ / 8));
t++;
 
-   return min_t(long, t, MAX_PAUSE);
+   return min_t(unsigned long, t, MAX_PAUSE);
 }
 
 static long bdi_min_pause(struct backing_dev_info *bdi,
-- 
1.7.10.4

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-11 Thread Fengguang Wu
On Fri, Oct 11, 2013 at 04:57:01PM +0800, Fengguang Wu wrote:
 On Fri, Oct 11, 2013 at 10:42:19AM +0200, Toralf Förster wrote:
  yeah, now the picture becomes more clear
  ...
  net.core.warnings = 0   
[ ok ]
  ick: pause : -717
   ick : min_pause : -177
 ick : max_pause : -717
   ick: pages_dirtied : 14
  
  ick: task_ratelimit: 0
 
 Great and thanks! So it's the max pause calculation went wrong.

However I still suspect this is not the main reason for the soft
lockup. Because schedule_timeout() will directly return on negative
timeout. So yes, we have encountered some negative pauses, however
we still need to fix the huge dirtied pages problem which should be
more fundamental. 

Thanks,
Fengguang
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-10 Thread Fengguang Wu
On Thu, Oct 10, 2013 at 06:49:30PM +0200, Toralf Förster wrote:
> On 10/10/2013 12:33 AM, Richard Weinberger wrote:
> > Am 09.10.2013 23:47, schrieb Jan Kara:
> >> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
> >>> CC'ing mm folks.
> >>> Please see below.
> >>   Added Fenguang to CC since he is the author of this code.
> > 
> > Thx, get_maintainer.pl didn't list him.
> > 
> >>> Am 09.10.2013 19:26, schrieb Toralf Förster:
>  On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
> > On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster 
> >  wrote:
> >>> Hmm, now pages_dirtied is zero, according to the backtrace, but the 
> >>> BUG_ON()
> >>> asserts its strict positive?!?
> >>>
> >>> Can you please try the following instead of the BUG_ON():
> >>>
> >>> if (pause < 0) {
> >>> printk("pages_dirtied = %lu\n", pages_dirtied);
> >>> printk("task_ratelimit = %lu\n", task_ratelimit);
> >>> printk("pause = %ld\n", pause);
> >>> }
> >>>
> >>> Gr{oetje,eeting}s,
> >>>
> >>> Geert
> >> I tried it in different ways already - I'm completely unsuccessful in 
> >> getting any printk output.
> >> As soon as the issue happens I do have a
> >>
> >> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
> >>
> >> at stderr of the UML and then no further input is accepted. With 
> >> uml_mconsole I'm however able
> >> to run very basic commands like a crash dump, sysrq ond so on.
> >
> > You may get an idea of the magnitude of pages_dirtied by using a chain 
> > of
> > BUG_ON()s, like:
> >
> > BUG_ON(pages_dirtied > 20);
> > BUG_ON(pages_dirtied > 10);
> > BUG_ON(pages_dirtied > 1);
> > BUG_ON(pages_dirtied > 1000);
> > BUG_ON(pages_dirtied > 100);
> >
> > Probably 1 million is already too much for normal operation?
> >
>  period = HZ * pages_dirtied / task_ratelimit;
>   BUG_ON(pages_dirtied > 20);
>   BUG_ON(pages_dirtied > 10);  <-- this 
>  is line 1467
> >>>
> >>> Summary for mm people:
> >>>
> >>> Toralf runs trinty on UML/i386.
> >>> After some time pages_dirtied becomes very large.
> >>> More than 10 pages in this case.
> >>   Huh, this is really strange. pages_dirtied is passed into
> >> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
> >> over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
> >> task was ever able to dirty that much during its lifetime (but correct me
> >> if I'm wrong here, with UML and memory backed disks it is not totally
> >> impossible)... I went through the logic of handling ->nr_dirtied but
> >> I didn't find any obvious problem there. Hum, maybe one thing - what
> >> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
> >> huge, we could possibly accumulate huge current->nr_dirtied.
> > 
> > Toralf, you can try a snipplet like this one to get the values printed out:
> > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> > index f5236f8..a80e520 100644
> > --- a/mm/page-writeback.c
> > +++ b/mm/page-writeback.c
> > @@ -1463,6 +1463,12 @@ static void balance_dirty_pages(struct address_space 
> > *mapping,
> > goto pause;
> > }
> > period = HZ * pages_dirtied / task_ratelimit;
> > +
> > +   {
> > +   extern int printf(char *, ...);
> > +   printf("---> task_ratelimit: %lu\n", 
> > task_ratelimit);
> > +   }
> > +
> > pause = period;
> > if (current->dirty_paused_when)
> > pause -= now - current->dirty_paused_when;
> > 
> > 
> > Yes, printf(), not printk().
> > Using this hack we print directly to host's stdout. :)
> > 
> *head smack* ofc - works fine.
> So given this diff :
> 
> iff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index f5236f8..5a2c337 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1464,6 +1464,13 @@ static void balance_dirty_pages(struct address_space 
> *mapping,
> }
> period = HZ * pages_dirtied / task_ratelimit;
> pause = period;
> +   if (pause < 0)  {
> +   extern int printf(char *, ...);
> +   printf("overflow : pause : %li\n", pause);
> +   printf("overflow : pages_dirtied : %lu\n", 
> pages_dirtied);
> +   printf("overflow :  task_ratelimit: %lu\n", 
> task_ratelimit);
> +   BUG_ON(1);
> +   }
> if (current->dirty_paused_when)
> pause -= now - current->dirty_paused_when;
> /*
> @@ -1503,6 +1510,13 @@ static void balance_dirty_pages(struct address_space 
> 

Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-10 Thread Toralf Förster
On 10/10/2013 12:33 AM, Richard Weinberger wrote:
> Am 09.10.2013 23:47, schrieb Jan Kara:
>> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
>>> CC'ing mm folks.
>>> Please see below.
>>   Added Fenguang to CC since he is the author of this code.
> 
> Thx, get_maintainer.pl didn't list him.
> 
>>> Am 09.10.2013 19:26, schrieb Toralf Förster:
 On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster  
> wrote:
>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the 
>>> BUG_ON()
>>> asserts its strict positive?!?
>>>
>>> Can you please try the following instead of the BUG_ON():
>>>
>>> if (pause < 0) {
>>> printk("pages_dirtied = %lu\n", pages_dirtied);
>>> printk("task_ratelimit = %lu\n", task_ratelimit);
>>> printk("pause = %ld\n", pause);
>>> }
>>>
>>> Gr{oetje,eeting}s,
>>>
>>> Geert
>> I tried it in different ways already - I'm completely unsuccessful in 
>> getting any printk output.
>> As soon as the issue happens I do have a
>>
>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
>>
>> at stderr of the UML and then no further input is accepted. With 
>> uml_mconsole I'm however able
>> to run very basic commands like a crash dump, sysrq ond so on.
>
> You may get an idea of the magnitude of pages_dirtied by using a chain of
> BUG_ON()s, like:
>
> BUG_ON(pages_dirtied > 20);
> BUG_ON(pages_dirtied > 10);
> BUG_ON(pages_dirtied > 1);
> BUG_ON(pages_dirtied > 1000);
> BUG_ON(pages_dirtied > 100);
>
> Probably 1 million is already too much for normal operation?
>
 period = HZ * pages_dirtied / task_ratelimit;
BUG_ON(pages_dirtied > 20);
BUG_ON(pages_dirtied > 10);  <-- this 
 is line 1467
>>>
>>> Summary for mm people:
>>>
>>> Toralf runs trinty on UML/i386.
>>> After some time pages_dirtied becomes very large.
>>> More than 10 pages in this case.
>>   Huh, this is really strange. pages_dirtied is passed into
>> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
>> over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
>> task was ever able to dirty that much during its lifetime (but correct me
>> if I'm wrong here, with UML and memory backed disks it is not totally
>> impossible)... I went through the logic of handling ->nr_dirtied but
>> I didn't find any obvious problem there. Hum, maybe one thing - what
>> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
>> huge, we could possibly accumulate huge current->nr_dirtied.
> 
> Toralf, you can try a snipplet like this one to get the values printed out:
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index f5236f8..a80e520 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1463,6 +1463,12 @@ static void balance_dirty_pages(struct address_space 
> *mapping,
> goto pause;
> }
> period = HZ * pages_dirtied / task_ratelimit;
> +
> +   {
> +   extern int printf(char *, ...);
> +   printf("---> task_ratelimit: %lu\n", task_ratelimit);
> +   }
> +
> pause = period;
> if (current->dirty_paused_when)
> pause -= now - current->dirty_paused_when;
> 
> 
> Yes, printf(), not printk().
> Using this hack we print directly to host's stdout. :)
> 
*head smack* ofc - works fine.
So given this diff :

iff --git a/mm/page-writeback.c b/mm/page-writeback.c
index f5236f8..5a2c337 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1464,6 +1464,13 @@ static void balance_dirty_pages(struct address_space 
*mapping,
}
period = HZ * pages_dirtied / task_ratelimit;
pause = period;
+   if (pause < 0)  {
+   extern int printf(char *, ...);
+   printf("overflow : pause : %li\n", pause);
+   printf("overflow : pages_dirtied : %lu\n", 
pages_dirtied);
+   printf("overflow :  task_ratelimit: %lu\n", 
task_ratelimit);
+   BUG_ON(1);
+   }
if (current->dirty_paused_when)
pause -= now - current->dirty_paused_when;
/*
@@ -1503,6 +1510,13 @@ static void balance_dirty_pages(struct address_space 
*mapping,
}

 pause:
+   if (pause < 0)  {
+   extern int printf(char *, ...);
+   printf("ick : pause : %li\n", pause);
+   printf("ick: pages_dirtied : %lu\n", pages_dirtied);
+   printf("ick: task_ratelimit: 

Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-10 Thread Fengguang Wu
On Thu, Oct 10, 2013 at 08:52:33AM +0200, Geert Uytterhoeven wrote:
> On Thu, Oct 10, 2013 at 4:46 AM, Fengguang Wu  wrote:
> > On Wed, Oct 09, 2013 at 11:47:33PM +0200, Jan Kara wrote:
> >> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
> >> > Am 09.10.2013 19:26, schrieb Toralf Förster:
> >> > > On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
> >> > >> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster 
> >> > >>  wrote:
> >> >  Hmm, now pages_dirtied is zero, according to the backtrace, but the 
> >> >  BUG_ON()
> >> >  asserts its strict positive?!?
> >> > 
> >> >  Can you please try the following instead of the BUG_ON():
> >> > 
> >> >  if (pause < 0) {
> >> >  printk("pages_dirtied = %lu\n", pages_dirtied);
> >> >  printk("task_ratelimit = %lu\n", task_ratelimit);
> >> >  printk("pause = %ld\n", pause);
> 
> >> > >>> I tried it in different ways already - I'm completely unsuccessful 
> >> > >>> in getting any printk output.
> >> > >>> As soon as the issue happens I do have a
> >> > >>>
> >> > >>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
> >> > >>>
> >> > >>> at stderr of the UML and then no further input is accepted. With 
> >> > >>> uml_mconsole I'm however able
> >> > >>> to run very basic commands like a crash dump, sysrq ond so on.
> >> > >>
> >> > >> You may get an idea of the magnitude of pages_dirtied by using a 
> >> > >> chain of
> >> > >> BUG_ON()s, like:
> >> > >>
> >> > >> BUG_ON(pages_dirtied > 20);
> >> > >> BUG_ON(pages_dirtied > 10);
> >> > >> BUG_ON(pages_dirtied > 1);
> >> > >> BUG_ON(pages_dirtied > 1000);
> >> > >> BUG_ON(pages_dirtied > 100);
> >> > >>
> >> > >> Probably 1 million is already too much for normal operation?
> >> > >>
> >> > > period = HZ * pages_dirtied / task_ratelimit;
> >> > >   BUG_ON(pages_dirtied > 20);
> >> > >   BUG_ON(pages_dirtied > 10);  <-- 
> >> > > this is line 1467
> >> >
> >> > Summary for mm people:
> >> >
> >> > Toralf runs trinty on UML/i386.
> >> > After some time pages_dirtied becomes very large.
> >> > More than 10 pages in this case.
> >>   Huh, this is really strange. pages_dirtied is passed into
> >> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
> >> over 10^9 can get there.
> >
> > I noticed aio_setup_ring() in the call trace and find it recently
> > added a SetPageDirty() call in a loop by commit 36bc08cc01 ("fs/aio:
> > Add support to aio ring pages migration"). So added CC to its authors.
> >
> >> After all that is over 4TB so I somewhat doubt the
> >> task was ever able to dirty that much during its lifetime (but correct me
> >> if I'm wrong here, with UML and memory backed disks it is not totally
> >> impossible)... I went through the logic of handling ->nr_dirtied but
> >> I didn't find any obvious problem there. Hum, maybe one thing - what
> >> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
> >> huge, we could possibly accumulate huge current->nr_dirtied.
> >>
> >> > Thus, period = HZ * pages_dirtied / task_ratelimit overflows
> >> > and period/pause becomes extremely large.
> 
> period/pause are signed long, so they become negative instead of
> extremely large when overflowing.

Yeah. For that we have underflow detect as well: 

if (pause < min_pause) {
...
break;
}

So we'll break out of the loop -- but yeah, whether the break is the
right behavior on underflow is still questionable.

> >> > It looks like io_schedule_timeout() get's called with a very large 
> >> > timeout.
> >> > I don't know why "if (unlikely(pause > max_pause)) {" does not help.
> 
> Because pause is now negative.

So here io_schedule_timeout() won't be called with negative pause.

And if ever io_schedule_timeout() calls schedule_timeout() with
negative timeout, the latter will emit a warning and break out, too:

if (timeout < 0) {
printk(KERN_ERR "schedule_timeout: wrong timeout "
"value %lx\n", timeout);
dump_stack();
current->state = TASK_RUNNING;
goto out;
}

Thanks,
Fengguang
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-10 Thread Geert Uytterhoeven
On Thu, Oct 10, 2013 at 4:46 AM, Fengguang Wu  wrote:
> On Wed, Oct 09, 2013 at 11:47:33PM +0200, Jan Kara wrote:
>> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
>> > Am 09.10.2013 19:26, schrieb Toralf Förster:
>> > > On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
>> > >> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster 
>> > >>  wrote:
>> >  Hmm, now pages_dirtied is zero, according to the backtrace, but the 
>> >  BUG_ON()
>> >  asserts its strict positive?!?
>> > 
>> >  Can you please try the following instead of the BUG_ON():
>> > 
>> >  if (pause < 0) {
>> >  printk("pages_dirtied = %lu\n", pages_dirtied);
>> >  printk("task_ratelimit = %lu\n", task_ratelimit);
>> >  printk("pause = %ld\n", pause);

>> > >>> I tried it in different ways already - I'm completely unsuccessful in 
>> > >>> getting any printk output.
>> > >>> As soon as the issue happens I do have a
>> > >>>
>> > >>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
>> > >>>
>> > >>> at stderr of the UML and then no further input is accepted. With 
>> > >>> uml_mconsole I'm however able
>> > >>> to run very basic commands like a crash dump, sysrq ond so on.
>> > >>
>> > >> You may get an idea of the magnitude of pages_dirtied by using a chain 
>> > >> of
>> > >> BUG_ON()s, like:
>> > >>
>> > >> BUG_ON(pages_dirtied > 20);
>> > >> BUG_ON(pages_dirtied > 10);
>> > >> BUG_ON(pages_dirtied > 1);
>> > >> BUG_ON(pages_dirtied > 1000);
>> > >> BUG_ON(pages_dirtied > 100);
>> > >>
>> > >> Probably 1 million is already too much for normal operation?
>> > >>
>> > > period = HZ * pages_dirtied / task_ratelimit;
>> > >   BUG_ON(pages_dirtied > 20);
>> > >   BUG_ON(pages_dirtied > 10);  <-- this 
>> > > is line 1467
>> >
>> > Summary for mm people:
>> >
>> > Toralf runs trinty on UML/i386.
>> > After some time pages_dirtied becomes very large.
>> > More than 10 pages in this case.
>>   Huh, this is really strange. pages_dirtied is passed into
>> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
>> over 10^9 can get there.
>
> I noticed aio_setup_ring() in the call trace and find it recently
> added a SetPageDirty() call in a loop by commit 36bc08cc01 ("fs/aio:
> Add support to aio ring pages migration"). So added CC to its authors.
>
>> After all that is over 4TB so I somewhat doubt the
>> task was ever able to dirty that much during its lifetime (but correct me
>> if I'm wrong here, with UML and memory backed disks it is not totally
>> impossible)... I went through the logic of handling ->nr_dirtied but
>> I didn't find any obvious problem there. Hum, maybe one thing - what
>> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
>> huge, we could possibly accumulate huge current->nr_dirtied.
>>
>> > Thus, period = HZ * pages_dirtied / task_ratelimit overflows
>> > and period/pause becomes extremely large.

period/pause are signed long, so they become negative instead of
extremely large when overflowing.

>> > It looks like io_schedule_timeout() get's called with a very large timeout.
>> > I don't know why "if (unlikely(pause > max_pause)) {" does not help.

Because pause is now negative.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-10 Thread Geert Uytterhoeven
On Thu, Oct 10, 2013 at 4:46 AM, Fengguang Wu fengguang...@intel.com wrote:
 On Wed, Oct 09, 2013 at 11:47:33PM +0200, Jan Kara wrote:
 On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
  Am 09.10.2013 19:26, schrieb Toralf Förster:
   On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
   On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster 
   toralf.foers...@gmx.de wrote:
   Hmm, now pages_dirtied is zero, according to the backtrace, but the 
   BUG_ON()
   asserts its strict positive?!?
  
   Can you please try the following instead of the BUG_ON():
  
   if (pause  0) {
   printk(pages_dirtied = %lu\n, pages_dirtied);
   printk(task_ratelimit = %lu\n, task_ratelimit);
   printk(pause = %ld\n, pause);

   I tried it in different ways already - I'm completely unsuccessful in 
   getting any printk output.
   As soon as the issue happens I do have a
  
   BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
  
   at stderr of the UML and then no further input is accepted. With 
   uml_mconsole I'm however able
   to run very basic commands like a crash dump, sysrq ond so on.
  
   You may get an idea of the magnitude of pages_dirtied by using a chain 
   of
   BUG_ON()s, like:
  
   BUG_ON(pages_dirtied  20);
   BUG_ON(pages_dirtied  10);
   BUG_ON(pages_dirtied  1);
   BUG_ON(pages_dirtied  1000);
   BUG_ON(pages_dirtied  100);
  
   Probably 1 million is already too much for normal operation?
  
   period = HZ * pages_dirtied / task_ratelimit;
 BUG_ON(pages_dirtied  20);
 BUG_ON(pages_dirtied  10);  -- this 
   is line 1467
 
  Summary for mm people:
 
  Toralf runs trinty on UML/i386.
  After some time pages_dirtied becomes very large.
  More than 10 pages in this case.
   Huh, this is really strange. pages_dirtied is passed into
 balance_dirty_pages() from current-nr_dirtied. So I wonder how a value
 over 10^9 can get there.

 I noticed aio_setup_ring() in the call trace and find it recently
 added a SetPageDirty() call in a loop by commit 36bc08cc01 (fs/aio:
 Add support to aio ring pages migration). So added CC to its authors.

 After all that is over 4TB so I somewhat doubt the
 task was ever able to dirty that much during its lifetime (but correct me
 if I'm wrong here, with UML and memory backed disks it is not totally
 impossible)... I went through the logic of handling -nr_dirtied but
 I didn't find any obvious problem there. Hum, maybe one thing - what
 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
 huge, we could possibly accumulate huge current-nr_dirtied.

  Thus, period = HZ * pages_dirtied / task_ratelimit overflows
  and period/pause becomes extremely large.

period/pause are signed long, so they become negative instead of
extremely large when overflowing.

  It looks like io_schedule_timeout() get's called with a very large timeout.
  I don't know why if (unlikely(pause  max_pause)) { does not help.

Because pause is now negative.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say programmer or something like that.
-- Linus Torvalds
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-10 Thread Fengguang Wu
On Thu, Oct 10, 2013 at 08:52:33AM +0200, Geert Uytterhoeven wrote:
 On Thu, Oct 10, 2013 at 4:46 AM, Fengguang Wu fengguang...@intel.com wrote:
  On Wed, Oct 09, 2013 at 11:47:33PM +0200, Jan Kara wrote:
  On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
   Am 09.10.2013 19:26, schrieb Toralf Förster:
On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster 
toralf.foers...@gmx.de wrote:
Hmm, now pages_dirtied is zero, according to the backtrace, but the 
BUG_ON()
asserts its strict positive?!?
   
Can you please try the following instead of the BUG_ON():
   
if (pause  0) {
printk(pages_dirtied = %lu\n, pages_dirtied);
printk(task_ratelimit = %lu\n, task_ratelimit);
printk(pause = %ld\n, pause);
 
I tried it in different ways already - I'm completely unsuccessful 
in getting any printk output.
As soon as the issue happens I do have a
   
BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
   
at stderr of the UML and then no further input is accepted. With 
uml_mconsole I'm however able
to run very basic commands like a crash dump, sysrq ond so on.
   
You may get an idea of the magnitude of pages_dirtied by using a 
chain of
BUG_ON()s, like:
   
BUG_ON(pages_dirtied  20);
BUG_ON(pages_dirtied  10);
BUG_ON(pages_dirtied  1);
BUG_ON(pages_dirtied  1000);
BUG_ON(pages_dirtied  100);
   
Probably 1 million is already too much for normal operation?
   
period = HZ * pages_dirtied / task_ratelimit;
  BUG_ON(pages_dirtied  20);
  BUG_ON(pages_dirtied  10);  -- 
this is line 1467
  
   Summary for mm people:
  
   Toralf runs trinty on UML/i386.
   After some time pages_dirtied becomes very large.
   More than 10 pages in this case.
Huh, this is really strange. pages_dirtied is passed into
  balance_dirty_pages() from current-nr_dirtied. So I wonder how a value
  over 10^9 can get there.
 
  I noticed aio_setup_ring() in the call trace and find it recently
  added a SetPageDirty() call in a loop by commit 36bc08cc01 (fs/aio:
  Add support to aio ring pages migration). So added CC to its authors.
 
  After all that is over 4TB so I somewhat doubt the
  task was ever able to dirty that much during its lifetime (but correct me
  if I'm wrong here, with UML and memory backed disks it is not totally
  impossible)... I went through the logic of handling -nr_dirtied but
  I didn't find any obvious problem there. Hum, maybe one thing - what
  'task_ratelimit' values do you see in balance_dirty_pages? If that one was
  huge, we could possibly accumulate huge current-nr_dirtied.
 
   Thus, period = HZ * pages_dirtied / task_ratelimit overflows
   and period/pause becomes extremely large.
 
 period/pause are signed long, so they become negative instead of
 extremely large when overflowing.

Yeah. For that we have underflow detect as well: 

if (pause  min_pause) {
...
break;
}

So we'll break out of the loop -- but yeah, whether the break is the
right behavior on underflow is still questionable.

   It looks like io_schedule_timeout() get's called with a very large 
   timeout.
   I don't know why if (unlikely(pause  max_pause)) { does not help.
 
 Because pause is now negative.

So here io_schedule_timeout() won't be called with negative pause.

And if ever io_schedule_timeout() calls schedule_timeout() with
negative timeout, the latter will emit a warning and break out, too:

if (timeout  0) {
printk(KERN_ERR schedule_timeout: wrong timeout 
value %lx\n, timeout);
dump_stack();
current-state = TASK_RUNNING;
goto out;
}

Thanks,
Fengguang
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-10 Thread Toralf Förster
On 10/10/2013 12:33 AM, Richard Weinberger wrote:
 Am 09.10.2013 23:47, schrieb Jan Kara:
 On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
 CC'ing mm folks.
 Please see below.
   Added Fenguang to CC since he is the author of this code.
 
 Thx, get_maintainer.pl didn't list him.
 
 Am 09.10.2013 19:26, schrieb Toralf Förster:
 On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
 On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster toralf.foers...@gmx.de 
 wrote:
 Hmm, now pages_dirtied is zero, according to the backtrace, but the 
 BUG_ON()
 asserts its strict positive?!?

 Can you please try the following instead of the BUG_ON():

 if (pause  0) {
 printk(pages_dirtied = %lu\n, pages_dirtied);
 printk(task_ratelimit = %lu\n, task_ratelimit);
 printk(pause = %ld\n, pause);
 }

 Gr{oetje,eeting}s,

 Geert
 I tried it in different ways already - I'm completely unsuccessful in 
 getting any printk output.
 As soon as the issue happens I do have a

 BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]

 at stderr of the UML and then no further input is accepted. With 
 uml_mconsole I'm however able
 to run very basic commands like a crash dump, sysrq ond so on.

 You may get an idea of the magnitude of pages_dirtied by using a chain of
 BUG_ON()s, like:

 BUG_ON(pages_dirtied  20);
 BUG_ON(pages_dirtied  10);
 BUG_ON(pages_dirtied  1);
 BUG_ON(pages_dirtied  1000);
 BUG_ON(pages_dirtied  100);

 Probably 1 million is already too much for normal operation?

 period = HZ * pages_dirtied / task_ratelimit;
BUG_ON(pages_dirtied  20);
BUG_ON(pages_dirtied  10);  -- this 
 is line 1467

 Summary for mm people:

 Toralf runs trinty on UML/i386.
 After some time pages_dirtied becomes very large.
 More than 10 pages in this case.
   Huh, this is really strange. pages_dirtied is passed into
 balance_dirty_pages() from current-nr_dirtied. So I wonder how a value
 over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
 task was ever able to dirty that much during its lifetime (but correct me
 if I'm wrong here, with UML and memory backed disks it is not totally
 impossible)... I went through the logic of handling -nr_dirtied but
 I didn't find any obvious problem there. Hum, maybe one thing - what
 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
 huge, we could possibly accumulate huge current-nr_dirtied.
 
 Toralf, you can try a snipplet like this one to get the values printed out:
 diff --git a/mm/page-writeback.c b/mm/page-writeback.c
 index f5236f8..a80e520 100644
 --- a/mm/page-writeback.c
 +++ b/mm/page-writeback.c
 @@ -1463,6 +1463,12 @@ static void balance_dirty_pages(struct address_space 
 *mapping,
 goto pause;
 }
 period = HZ * pages_dirtied / task_ratelimit;
 +
 +   {
 +   extern int printf(char *, ...);
 +   printf(--- task_ratelimit: %lu\n, task_ratelimit);
 +   }
 +
 pause = period;
 if (current-dirty_paused_when)
 pause -= now - current-dirty_paused_when;
 
 
 Yes, printf(), not printk().
 Using this hack we print directly to host's stdout. :)
 
*head smack* ofc - works fine.
So given this diff :

iff --git a/mm/page-writeback.c b/mm/page-writeback.c
index f5236f8..5a2c337 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1464,6 +1464,13 @@ static void balance_dirty_pages(struct address_space 
*mapping,
}
period = HZ * pages_dirtied / task_ratelimit;
pause = period;
+   if (pause  0)  {
+   extern int printf(char *, ...);
+   printf(overflow : pause : %li\n, pause);
+   printf(overflow : pages_dirtied : %lu\n, 
pages_dirtied);
+   printf(overflow :  task_ratelimit: %lu\n, 
task_ratelimit);
+   BUG_ON(1);
+   }
if (current-dirty_paused_when)
pause -= now - current-dirty_paused_when;
/*
@@ -1503,6 +1510,13 @@ static void balance_dirty_pages(struct address_space 
*mapping,
}

 pause:
+   if (pause  0)  {
+   extern int printf(char *, ...);
+   printf(ick : pause : %li\n, pause);
+   printf(ick: pages_dirtied : %lu\n, pages_dirtied);
+   printf(ick: task_ratelimit: %lu\n, task_ratelimit);
+   BUG_ON(1);
+   }
trace_balance_dirty_pages(bdi,
  dirty_thresh,
  background_thresh,


I got this :




 * Starting local
net.core.warnings = 0

Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-10 Thread Fengguang Wu
On Thu, Oct 10, 2013 at 06:49:30PM +0200, Toralf Förster wrote:
 On 10/10/2013 12:33 AM, Richard Weinberger wrote:
  Am 09.10.2013 23:47, schrieb Jan Kara:
  On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
  CC'ing mm folks.
  Please see below.
Added Fenguang to CC since he is the author of this code.
  
  Thx, get_maintainer.pl didn't list him.
  
  Am 09.10.2013 19:26, schrieb Toralf Förster:
  On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
  On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster 
  toralf.foers...@gmx.de wrote:
  Hmm, now pages_dirtied is zero, according to the backtrace, but the 
  BUG_ON()
  asserts its strict positive?!?
 
  Can you please try the following instead of the BUG_ON():
 
  if (pause  0) {
  printk(pages_dirtied = %lu\n, pages_dirtied);
  printk(task_ratelimit = %lu\n, task_ratelimit);
  printk(pause = %ld\n, pause);
  }
 
  Gr{oetje,eeting}s,
 
  Geert
  I tried it in different ways already - I'm completely unsuccessful in 
  getting any printk output.
  As soon as the issue happens I do have a
 
  BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
 
  at stderr of the UML and then no further input is accepted. With 
  uml_mconsole I'm however able
  to run very basic commands like a crash dump, sysrq ond so on.
 
  You may get an idea of the magnitude of pages_dirtied by using a chain 
  of
  BUG_ON()s, like:
 
  BUG_ON(pages_dirtied  20);
  BUG_ON(pages_dirtied  10);
  BUG_ON(pages_dirtied  1);
  BUG_ON(pages_dirtied  1000);
  BUG_ON(pages_dirtied  100);
 
  Probably 1 million is already too much for normal operation?
 
  period = HZ * pages_dirtied / task_ratelimit;
   BUG_ON(pages_dirtied  20);
   BUG_ON(pages_dirtied  10);  -- this 
  is line 1467
 
  Summary for mm people:
 
  Toralf runs trinty on UML/i386.
  After some time pages_dirtied becomes very large.
  More than 10 pages in this case.
Huh, this is really strange. pages_dirtied is passed into
  balance_dirty_pages() from current-nr_dirtied. So I wonder how a value
  over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
  task was ever able to dirty that much during its lifetime (but correct me
  if I'm wrong here, with UML and memory backed disks it is not totally
  impossible)... I went through the logic of handling -nr_dirtied but
  I didn't find any obvious problem there. Hum, maybe one thing - what
  'task_ratelimit' values do you see in balance_dirty_pages? If that one was
  huge, we could possibly accumulate huge current-nr_dirtied.
  
  Toralf, you can try a snipplet like this one to get the values printed out:
  diff --git a/mm/page-writeback.c b/mm/page-writeback.c
  index f5236f8..a80e520 100644
  --- a/mm/page-writeback.c
  +++ b/mm/page-writeback.c
  @@ -1463,6 +1463,12 @@ static void balance_dirty_pages(struct address_space 
  *mapping,
  goto pause;
  }
  period = HZ * pages_dirtied / task_ratelimit;
  +
  +   {
  +   extern int printf(char *, ...);
  +   printf(--- task_ratelimit: %lu\n, 
  task_ratelimit);
  +   }
  +
  pause = period;
  if (current-dirty_paused_when)
  pause -= now - current-dirty_paused_when;
  
  
  Yes, printf(), not printk().
  Using this hack we print directly to host's stdout. :)
  
 *head smack* ofc - works fine.
 So given this diff :
 
 iff --git a/mm/page-writeback.c b/mm/page-writeback.c
 index f5236f8..5a2c337 100644
 --- a/mm/page-writeback.c
 +++ b/mm/page-writeback.c
 @@ -1464,6 +1464,13 @@ static void balance_dirty_pages(struct address_space 
 *mapping,
 }
 period = HZ * pages_dirtied / task_ratelimit;
 pause = period;
 +   if (pause  0)  {
 +   extern int printf(char *, ...);
 +   printf(overflow : pause : %li\n, pause);
 +   printf(overflow : pages_dirtied : %lu\n, 
 pages_dirtied);
 +   printf(overflow :  task_ratelimit: %lu\n, 
 task_ratelimit);
 +   BUG_ON(1);
 +   }
 if (current-dirty_paused_when)
 pause -= now - current-dirty_paused_when;
 /*
 @@ -1503,6 +1510,13 @@ static void balance_dirty_pages(struct address_space 
 *mapping,
 }
 
  pause:
 +   if (pause  0)  {

Oops, we got 

ick : pause : -984

here! Since pause is bounded by [min_pause, max_pause], it means
something goes wrong with the bounds. Would you help print min_pause
and max_pause as well?

It seems there are long = unsigned long conversion problems in
bdi_max_pause(), which might make max_pause a negative number. And the
min_pause calculation is based on max_pause to some degree..

Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-09 Thread Fengguang Wu
On Wed, Oct 09, 2013 at 11:47:33PM +0200, Jan Kara wrote:
> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
> > CC'ing mm folks.
> > Please see below.
>   Added Fenguang to CC since he is the author of this code.

Thanks!

> > Am 09.10.2013 19:26, schrieb Toralf Förster:
> > > On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
> > >> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster  
> > >> wrote:
> >  Hmm, now pages_dirtied is zero, according to the backtrace, but the 
> >  BUG_ON()
> >  asserts its strict positive?!?
> > 
> >  Can you please try the following instead of the BUG_ON():
> > 
> >  if (pause < 0) {
> >  printk("pages_dirtied = %lu\n", pages_dirtied);
> >  printk("task_ratelimit = %lu\n", task_ratelimit);
> >  printk("pause = %ld\n", pause);
> >  }
> > 
> >  Gr{oetje,eeting}s,
> > 
> >  Geert
> > >>> I tried it in different ways already - I'm completely unsuccessful in 
> > >>> getting any printk output.
> > >>> As soon as the issue happens I do have a
> > >>>
> > >>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
> > >>>
> > >>> at stderr of the UML and then no further input is accepted. With 
> > >>> uml_mconsole I'm however able
> > >>> to run very basic commands like a crash dump, sysrq ond so on.
> > >>
> > >> You may get an idea of the magnitude of pages_dirtied by using a chain of
> > >> BUG_ON()s, like:
> > >>
> > >> BUG_ON(pages_dirtied > 20);
> > >> BUG_ON(pages_dirtied > 10);
> > >> BUG_ON(pages_dirtied > 1);
> > >> BUG_ON(pages_dirtied > 1000);
> > >> BUG_ON(pages_dirtied > 100);
> > >>
> > >> Probably 1 million is already too much for normal operation?
> > >>
> > > period = HZ * pages_dirtied / task_ratelimit;
> > >   BUG_ON(pages_dirtied > 20);
> > >   BUG_ON(pages_dirtied > 10);  <-- this 
> > > is line 1467
> > 
> > Summary for mm people:
> > 
> > Toralf runs trinty on UML/i386.
> > After some time pages_dirtied becomes very large.
> > More than 10 pages in this case.
>   Huh, this is really strange. pages_dirtied is passed into
> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
> over 10^9 can get there.

I noticed aio_setup_ring() in the call trace and find it recently
added a SetPageDirty() call in a loop by commit 36bc08cc01 ("fs/aio:
Add support to aio ring pages migration"). So added CC to its authors.

> After all that is over 4TB so I somewhat doubt the
> task was ever able to dirty that much during its lifetime (but correct me
> if I'm wrong here, with UML and memory backed disks it is not totally
> impossible)... I went through the logic of handling ->nr_dirtied but
> I didn't find any obvious problem there. Hum, maybe one thing - what
> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
> huge, we could possibly accumulate huge current->nr_dirtied.
> 
> > Thus, period = HZ * pages_dirtied / task_ratelimit overflows
> > and period/pause becomes extremely large.

Yes, that's possible.

> > It looks like io_schedule_timeout() get's called with a very large timeout.
> > I don't know why "if (unlikely(pause > max_pause)) {" does not help.

The test will sure work and limit pause to <= max_pause. However it's
very possible balance_dirty_pages() cannot break out of the loop (or
being called repeatedly) and block the task.

I'm afraid there are no one to clear the dirty pages, which makes
balance_dirty_pages() waiting for ever.

Thanks,
Fengguang

> > 
> > > the back trace is :
> > > 
> > > tfoerste@n22 ~/devel/linux $ gdb --core=/mnt/ramdisk/core 
> > > /home/tfoerste/devel/linux/linux -batch -ex bt
> > > [New LWP 6911]
> > > Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk 
> > > ubda=/home/tfoerste/virtual/uml/tr'.
> > > Program terminated with signal 6, Aborted.
> > > #0  0xb77a7424 in __kernel_vsyscall ()
> > > #0  0xb77a7424 in __kernel_vsyscall ()
> > > #1  0x083bdf35 in kill ()
> > > #2  0x0807296d in uml_abort () at arch/um/os-Linux/util.c:93
> > > #3  0x08072ca5 in os_dump_core () at arch/um/os-Linux/util.c:148
> > > #4  0x080623c4 in panic_exit (self=0x85c1558 , 
> > > unused1=0, unused2=0x85f76e0 ) at arch/um/kernel/um_arch.c:240
> > > #5  0x0809ba86 in notifier_call_chain (nl=0x0, val=0, v=0x85f76e0 
> > > , nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
> > > #6  0x0809bba1 in __atomic_notifier_call_chain (nh=0x85f76c4 
> > > , val=0, v=0x85f76e0 , nr_to_call=0, 
> > > nr_calls=0x0) at kernel/notifier.c:182
> > > #7  0x0809bbdf in atomic_notifier_call_chain (nh=0x0, val=0, v=0x0) at 
> > > kernel/notifier.c:191
> > > #8  0x0841b5bc in panic (fmt=0x0) at kernel/panic.c:130
> > > #9  0x0841c470 in balance_dirty_pages (pages_dirtied=23, 
> > > mapping=) at mm/page-writeback.c:1467
> > > #10 0x080d3595 in balance_dirty_pages_ratelimited (mapping=0x6) at 
> > > mm/page-writeback.c:1661

Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-09 Thread Richard Weinberger
Am 09.10.2013 23:47, schrieb Jan Kara:
> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
>> CC'ing mm folks.
>> Please see below.
>   Added Fenguang to CC since he is the author of this code.

Thx, get_maintainer.pl didn't list him.

>> Am 09.10.2013 19:26, schrieb Toralf Förster:
>>> On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
 On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster  
 wrote:
>> Hmm, now pages_dirtied is zero, according to the backtrace, but the 
>> BUG_ON()
>> asserts its strict positive?!?
>>
>> Can you please try the following instead of the BUG_ON():
>>
>> if (pause < 0) {
>> printk("pages_dirtied = %lu\n", pages_dirtied);
>> printk("task_ratelimit = %lu\n", task_ratelimit);
>> printk("pause = %ld\n", pause);
>> }
>>
>> Gr{oetje,eeting}s,
>>
>> Geert
> I tried it in different ways already - I'm completely unsuccessful in 
> getting any printk output.
> As soon as the issue happens I do have a
>
> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
>
> at stderr of the UML and then no further input is accepted. With 
> uml_mconsole I'm however able
> to run very basic commands like a crash dump, sysrq ond so on.

 You may get an idea of the magnitude of pages_dirtied by using a chain of
 BUG_ON()s, like:

 BUG_ON(pages_dirtied > 20);
 BUG_ON(pages_dirtied > 10);
 BUG_ON(pages_dirtied > 1);
 BUG_ON(pages_dirtied > 1000);
 BUG_ON(pages_dirtied > 100);

 Probably 1 million is already too much for normal operation?

>>> period = HZ * pages_dirtied / task_ratelimit;
>>> BUG_ON(pages_dirtied > 20);
>>> BUG_ON(pages_dirtied > 10);  <-- this 
>>> is line 1467
>>
>> Summary for mm people:
>>
>> Toralf runs trinty on UML/i386.
>> After some time pages_dirtied becomes very large.
>> More than 10 pages in this case.
>   Huh, this is really strange. pages_dirtied is passed into
> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
> over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
> task was ever able to dirty that much during its lifetime (but correct me
> if I'm wrong here, with UML and memory backed disks it is not totally
> impossible)... I went through the logic of handling ->nr_dirtied but
> I didn't find any obvious problem there. Hum, maybe one thing - what
> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
> huge, we could possibly accumulate huge current->nr_dirtied.

Toralf, you can try a snipplet like this one to get the values printed out:
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index f5236f8..a80e520 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1463,6 +1463,12 @@ static void balance_dirty_pages(struct address_space 
*mapping,
goto pause;
}
period = HZ * pages_dirtied / task_ratelimit;
+
+   {
+   extern int printf(char *, ...);
+   printf("---> task_ratelimit: %lu\n", task_ratelimit);
+   }
+
pause = period;
if (current->dirty_paused_when)
pause -= now - current->dirty_paused_when;


Yes, printf(), not printk().
Using this hack we print directly to host's stdout. :)

Thanks,
//richard
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-09 Thread Jan Kara
On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
> CC'ing mm folks.
> Please see below.
  Added Fenguang to CC since he is the author of this code.

> Am 09.10.2013 19:26, schrieb Toralf Förster:
> > On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
> >> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster  
> >> wrote:
>  Hmm, now pages_dirtied is zero, according to the backtrace, but the 
>  BUG_ON()
>  asserts its strict positive?!?
> 
>  Can you please try the following instead of the BUG_ON():
> 
>  if (pause < 0) {
>  printk("pages_dirtied = %lu\n", pages_dirtied);
>  printk("task_ratelimit = %lu\n", task_ratelimit);
>  printk("pause = %ld\n", pause);
>  }
> 
>  Gr{oetje,eeting}s,
> 
>  Geert
> >>> I tried it in different ways already - I'm completely unsuccessful in 
> >>> getting any printk output.
> >>> As soon as the issue happens I do have a
> >>>
> >>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
> >>>
> >>> at stderr of the UML and then no further input is accepted. With 
> >>> uml_mconsole I'm however able
> >>> to run very basic commands like a crash dump, sysrq ond so on.
> >>
> >> You may get an idea of the magnitude of pages_dirtied by using a chain of
> >> BUG_ON()s, like:
> >>
> >> BUG_ON(pages_dirtied > 20);
> >> BUG_ON(pages_dirtied > 10);
> >> BUG_ON(pages_dirtied > 1);
> >> BUG_ON(pages_dirtied > 1000);
> >> BUG_ON(pages_dirtied > 100);
> >>
> >> Probably 1 million is already too much for normal operation?
> >>
> > period = HZ * pages_dirtied / task_ratelimit;
> > BUG_ON(pages_dirtied > 20);
> > BUG_ON(pages_dirtied > 10);  <-- this 
> > is line 1467
> 
> Summary for mm people:
> 
> Toralf runs trinty on UML/i386.
> After some time pages_dirtied becomes very large.
> More than 10 pages in this case.
  Huh, this is really strange. pages_dirtied is passed into
balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
task was ever able to dirty that much during its lifetime (but correct me
if I'm wrong here, with UML and memory backed disks it is not totally
impossible)... I went through the logic of handling ->nr_dirtied but
I didn't find any obvious problem there. Hum, maybe one thing - what
'task_ratelimit' values do you see in balance_dirty_pages? If that one was
huge, we could possibly accumulate huge current->nr_dirtied.

> Thus, period = HZ * pages_dirtied / task_ratelimit overflows
> and period/pause becomes extremely large.
> 
> It looks like io_schedule_timeout() get's called with a very large timeout.
> I don't know why "if (unlikely(pause > max_pause)) {" does not help.
> 
> 
> > the back trace is :
> > 
> > tfoerste@n22 ~/devel/linux $ gdb --core=/mnt/ramdisk/core 
> > /home/tfoerste/devel/linux/linux -batch -ex bt
> > [New LWP 6911]
> > Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk 
> > ubda=/home/tfoerste/virtual/uml/tr'.
> > Program terminated with signal 6, Aborted.
> > #0  0xb77a7424 in __kernel_vsyscall ()
> > #0  0xb77a7424 in __kernel_vsyscall ()
> > #1  0x083bdf35 in kill ()
> > #2  0x0807296d in uml_abort () at arch/um/os-Linux/util.c:93
> > #3  0x08072ca5 in os_dump_core () at arch/um/os-Linux/util.c:148
> > #4  0x080623c4 in panic_exit (self=0x85c1558 , 
> > unused1=0, unused2=0x85f76e0 ) at arch/um/kernel/um_arch.c:240
> > #5  0x0809ba86 in notifier_call_chain (nl=0x0, val=0, v=0x85f76e0 
> > , nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
> > #6  0x0809bba1 in __atomic_notifier_call_chain (nh=0x85f76c4 
> > , val=0, v=0x85f76e0 , nr_to_call=0, 
> > nr_calls=0x0) at kernel/notifier.c:182
> > #7  0x0809bbdf in atomic_notifier_call_chain (nh=0x0, val=0, v=0x0) at 
> > kernel/notifier.c:191
> > #8  0x0841b5bc in panic (fmt=0x0) at kernel/panic.c:130
> > #9  0x0841c470 in balance_dirty_pages (pages_dirtied=23, mapping= > out>) at mm/page-writeback.c:1467
> > #10 0x080d3595 in balance_dirty_pages_ratelimited (mapping=0x6) at 
> > mm/page-writeback.c:1661
> > #11 0x080e4a3f in __do_fault (mm=0x45ba3600, vma=0x48777b90, 
> > address=1084678144, pmd=0x0, pgoff=0, flags=0, orig_pte=) 
> > at mm/memory.c:3452
> > #12 0x080e6e0f in do_linear_fault (orig_pte=..., flags=, 
> > pmd=, address=, vma=, 
> > mm=, page_table=) at mm/memory.c:3486
> > #13 handle_pte_fault (flags=, pmd=, 
> > pte=, address=, vma=, 
> > mm=) at mm/memory.c:3710
> > #14 __handle_mm_fault (flags=, address=, 
> > vma=, mm=) at mm/memory.c:3845
> > #15 handle_mm_fault (mm=0x45ba3600, vma=0x487034c8, address=1084678144, 
> > flags=1) at mm/memory.c:3868
> > #16 0x080e7817 in __get_user_pages (tsk=0x48705800, mm=0x45ba3600, 
> > start=1084678144, nr_pages=1025, gup_flags=519, pages=0x48558000, vmas=0x0, 
> > nonblocking=0x0) at mm/memory.c:1822
> > #17 0x080e7ae3 in 

Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-09 Thread Richard Weinberger
CC'ing mm folks.
Please see below.

Am 09.10.2013 19:26, schrieb Toralf Förster:
> On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
>> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster  
>> wrote:
 Hmm, now pages_dirtied is zero, according to the backtrace, but the 
 BUG_ON()
 asserts its strict positive?!?

 Can you please try the following instead of the BUG_ON():

 if (pause < 0) {
 printk("pages_dirtied = %lu\n", pages_dirtied);
 printk("task_ratelimit = %lu\n", task_ratelimit);
 printk("pause = %ld\n", pause);
 }

 Gr{oetje,eeting}s,

 Geert
>>> I tried it in different ways already - I'm completely unsuccessful in 
>>> getting any printk output.
>>> As soon as the issue happens I do have a
>>>
>>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
>>>
>>> at stderr of the UML and then no further input is accepted. With 
>>> uml_mconsole I'm however able
>>> to run very basic commands like a crash dump, sysrq ond so on.
>>
>> You may get an idea of the magnitude of pages_dirtied by using a chain of
>> BUG_ON()s, like:
>>
>> BUG_ON(pages_dirtied > 20);
>> BUG_ON(pages_dirtied > 10);
>> BUG_ON(pages_dirtied > 1);
>> BUG_ON(pages_dirtied > 1000);
>> BUG_ON(pages_dirtied > 100);
>>
>> Probably 1 million is already too much for normal operation?
>>
> period = HZ * pages_dirtied / task_ratelimit;
>   BUG_ON(pages_dirtied > 20);
>   BUG_ON(pages_dirtied > 10);  <-- this 
> is line 1467

Summary for mm people:

Toralf runs trinty on UML/i386.
After some time pages_dirtied becomes very large.
More than 10 pages in this case.

Thus, period = HZ * pages_dirtied / task_ratelimit overflows
and period/pause becomes extremely large.

It looks like io_schedule_timeout() get's called with a very large timeout.
I don't know why "if (unlikely(pause > max_pause)) {" does not help.

Any ideas?

Thanks,
//richard

> the back trace is :
> 
> tfoerste@n22 ~/devel/linux $ gdb --core=/mnt/ramdisk/core 
> /home/tfoerste/devel/linux/linux -batch -ex bt
> [New LWP 6911]
> Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk 
> ubda=/home/tfoerste/virtual/uml/tr'.
> Program terminated with signal 6, Aborted.
> #0  0xb77a7424 in __kernel_vsyscall ()
> #0  0xb77a7424 in __kernel_vsyscall ()
> #1  0x083bdf35 in kill ()
> #2  0x0807296d in uml_abort () at arch/um/os-Linux/util.c:93
> #3  0x08072ca5 in os_dump_core () at arch/um/os-Linux/util.c:148
> #4  0x080623c4 in panic_exit (self=0x85c1558 , 
> unused1=0, unused2=0x85f76e0 ) at arch/um/kernel/um_arch.c:240
> #5  0x0809ba86 in notifier_call_chain (nl=0x0, val=0, v=0x85f76e0 
> , nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
> #6  0x0809bba1 in __atomic_notifier_call_chain (nh=0x85f76c4 
> , val=0, v=0x85f76e0 , nr_to_call=0, 
> nr_calls=0x0) at kernel/notifier.c:182
> #7  0x0809bbdf in atomic_notifier_call_chain (nh=0x0, val=0, v=0x0) at 
> kernel/notifier.c:191
> #8  0x0841b5bc in panic (fmt=0x0) at kernel/panic.c:130
> #9  0x0841c470 in balance_dirty_pages (pages_dirtied=23, mapping= out>) at mm/page-writeback.c:1467
> #10 0x080d3595 in balance_dirty_pages_ratelimited (mapping=0x6) at 
> mm/page-writeback.c:1661
> #11 0x080e4a3f in __do_fault (mm=0x45ba3600, vma=0x48777b90, 
> address=1084678144, pmd=0x0, pgoff=0, flags=0, orig_pte=) at 
> mm/memory.c:3452
> #12 0x080e6e0f in do_linear_fault (orig_pte=..., flags=, 
> pmd=, address=, vma=, 
> mm=, page_table=) at mm/memory.c:3486
> #13 handle_pte_fault (flags=, pmd=, 
> pte=, address=, vma=, 
> mm=) at mm/memory.c:3710
> #14 __handle_mm_fault (flags=, address=, 
> vma=, mm=) at mm/memory.c:3845
> #15 handle_mm_fault (mm=0x45ba3600, vma=0x487034c8, address=1084678144, 
> flags=1) at mm/memory.c:3868
> #16 0x080e7817 in __get_user_pages (tsk=0x48705800, mm=0x45ba3600, 
> start=1084678144, nr_pages=1025, gup_flags=519, pages=0x48558000, vmas=0x0, 
> nonblocking=0x0) at mm/memory.c:1822
> #17 0x080e7ae3 in get_user_pages (tsk=0x0, mm=0x0, start=0, nr_pages=0, 
> write=1, force=0, pages=0x48777b90, vmas=0x6) at mm/memory.c:2019
> #18 0x08143dc6 in aio_setup_ring (ctx=) at fs/aio.c:340
> #19 ioctx_alloc (nr_events=) at fs/aio.c:605
> #20 SYSC_io_setup (ctxp=, nr_events=) at 
> fs/aio.c:1122
> #21 SyS_io_setup (nr_events=65535, ctxp=135081984) at fs/aio.c:1105
> #22 0x08062984 in handle_syscall (r=0x487059d4) at 
> arch/um/kernel/skas/syscall.c:35
> #23 0x08074fb5 in handle_trap (local_using_sysemu=, 
> regs=, pid=) at 
> arch/um/os-Linux/skas/process.c:198
> #24 userspace (regs=0x487059d4) at arch/um/os-Linux/skas/process.c:431
> #25 0x0805f750 in fork_handler () at arch/um/kernel/process.c:160
> #26 0x in ?? ()
> 
> 
> 
>> Gr{oetje,eeting}s,
>>
>> Geert
>>
>> --
>> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- 
>> ge...@linux-m68k.org
>>
>> In personal 

Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-09 Thread Richard Weinberger
CC'ing mm folks.
Please see below.

Am 09.10.2013 19:26, schrieb Toralf Förster:
 On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
 On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster toralf.foers...@gmx.de 
 wrote:
 Hmm, now pages_dirtied is zero, according to the backtrace, but the 
 BUG_ON()
 asserts its strict positive?!?

 Can you please try the following instead of the BUG_ON():

 if (pause  0) {
 printk(pages_dirtied = %lu\n, pages_dirtied);
 printk(task_ratelimit = %lu\n, task_ratelimit);
 printk(pause = %ld\n, pause);
 }

 Gr{oetje,eeting}s,

 Geert
 I tried it in different ways already - I'm completely unsuccessful in 
 getting any printk output.
 As soon as the issue happens I do have a

 BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]

 at stderr of the UML and then no further input is accepted. With 
 uml_mconsole I'm however able
 to run very basic commands like a crash dump, sysrq ond so on.

 You may get an idea of the magnitude of pages_dirtied by using a chain of
 BUG_ON()s, like:

 BUG_ON(pages_dirtied  20);
 BUG_ON(pages_dirtied  10);
 BUG_ON(pages_dirtied  1);
 BUG_ON(pages_dirtied  1000);
 BUG_ON(pages_dirtied  100);

 Probably 1 million is already too much for normal operation?

 period = HZ * pages_dirtied / task_ratelimit;
   BUG_ON(pages_dirtied  20);
   BUG_ON(pages_dirtied  10);  -- this 
 is line 1467

Summary for mm people:

Toralf runs trinty on UML/i386.
After some time pages_dirtied becomes very large.
More than 10 pages in this case.

Thus, period = HZ * pages_dirtied / task_ratelimit overflows
and period/pause becomes extremely large.

It looks like io_schedule_timeout() get's called with a very large timeout.
I don't know why if (unlikely(pause  max_pause)) { does not help.

Any ideas?

Thanks,
//richard

 the back trace is :
 
 tfoerste@n22 ~/devel/linux $ gdb --core=/mnt/ramdisk/core 
 /home/tfoerste/devel/linux/linux -batch -ex bt
 [New LWP 6911]
 Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk 
 ubda=/home/tfoerste/virtual/uml/tr'.
 Program terminated with signal 6, Aborted.
 #0  0xb77a7424 in __kernel_vsyscall ()
 #0  0xb77a7424 in __kernel_vsyscall ()
 #1  0x083bdf35 in kill ()
 #2  0x0807296d in uml_abort () at arch/um/os-Linux/util.c:93
 #3  0x08072ca5 in os_dump_core () at arch/um/os-Linux/util.c:148
 #4  0x080623c4 in panic_exit (self=0x85c1558 panic_exit_notifier, 
 unused1=0, unused2=0x85f76e0 buf.16221) at arch/um/kernel/um_arch.c:240
 #5  0x0809ba86 in notifier_call_chain (nl=0x0, val=0, v=0x85f76e0 
 buf.16221, nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
 #6  0x0809bba1 in __atomic_notifier_call_chain (nh=0x85f76c4 
 panic_notifier_list, val=0, v=0x85f76e0 buf.16221, nr_to_call=0, 
 nr_calls=0x0) at kernel/notifier.c:182
 #7  0x0809bbdf in atomic_notifier_call_chain (nh=0x0, val=0, v=0x0) at 
 kernel/notifier.c:191
 #8  0x0841b5bc in panic (fmt=0x0) at kernel/panic.c:130
 #9  0x0841c470 in balance_dirty_pages (pages_dirtied=23, mapping=optimized 
 out) at mm/page-writeback.c:1467
 #10 0x080d3595 in balance_dirty_pages_ratelimited (mapping=0x6) at 
 mm/page-writeback.c:1661
 #11 0x080e4a3f in __do_fault (mm=0x45ba3600, vma=0x48777b90, 
 address=1084678144, pmd=0x0, pgoff=0, flags=0, orig_pte=incomplete type) at 
 mm/memory.c:3452
 #12 0x080e6e0f in do_linear_fault (orig_pte=..., flags=optimized out, 
 pmd=optimized out, address=optimized out, vma=optimized out, 
 mm=optimized out, page_table=optimized out) at mm/memory.c:3486
 #13 handle_pte_fault (flags=optimized out, pmd=optimized out, 
 pte=optimized out, address=optimized out, vma=optimized out, 
 mm=optimized out) at mm/memory.c:3710
 #14 __handle_mm_fault (flags=optimized out, address=optimized out, 
 vma=optimized out, mm=optimized out) at mm/memory.c:3845
 #15 handle_mm_fault (mm=0x45ba3600, vma=0x487034c8, address=1084678144, 
 flags=1) at mm/memory.c:3868
 #16 0x080e7817 in __get_user_pages (tsk=0x48705800, mm=0x45ba3600, 
 start=1084678144, nr_pages=1025, gup_flags=519, pages=0x48558000, vmas=0x0, 
 nonblocking=0x0) at mm/memory.c:1822
 #17 0x080e7ae3 in get_user_pages (tsk=0x0, mm=0x0, start=0, nr_pages=0, 
 write=1, force=0, pages=0x48777b90, vmas=0x6) at mm/memory.c:2019
 #18 0x08143dc6 in aio_setup_ring (ctx=optimized out) at fs/aio.c:340
 #19 ioctx_alloc (nr_events=optimized out) at fs/aio.c:605
 #20 SYSC_io_setup (ctxp=optimized out, nr_events=optimized out) at 
 fs/aio.c:1122
 #21 SyS_io_setup (nr_events=65535, ctxp=135081984) at fs/aio.c:1105
 #22 0x08062984 in handle_syscall (r=0x487059d4) at 
 arch/um/kernel/skas/syscall.c:35
 #23 0x08074fb5 in handle_trap (local_using_sysemu=optimized out, 
 regs=optimized out, pid=optimized out) at 
 arch/um/os-Linux/skas/process.c:198
 #24 userspace (regs=0x487059d4) at arch/um/os-Linux/skas/process.c:431
 #25 0x0805f750 in fork_handler () at 

Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-09 Thread Jan Kara
On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
 CC'ing mm folks.
 Please see below.
  Added Fenguang to CC since he is the author of this code.

 Am 09.10.2013 19:26, schrieb Toralf Förster:
  On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
  On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster toralf.foers...@gmx.de 
  wrote:
  Hmm, now pages_dirtied is zero, according to the backtrace, but the 
  BUG_ON()
  asserts its strict positive?!?
 
  Can you please try the following instead of the BUG_ON():
 
  if (pause  0) {
  printk(pages_dirtied = %lu\n, pages_dirtied);
  printk(task_ratelimit = %lu\n, task_ratelimit);
  printk(pause = %ld\n, pause);
  }
 
  Gr{oetje,eeting}s,
 
  Geert
  I tried it in different ways already - I'm completely unsuccessful in 
  getting any printk output.
  As soon as the issue happens I do have a
 
  BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
 
  at stderr of the UML and then no further input is accepted. With 
  uml_mconsole I'm however able
  to run very basic commands like a crash dump, sysrq ond so on.
 
  You may get an idea of the magnitude of pages_dirtied by using a chain of
  BUG_ON()s, like:
 
  BUG_ON(pages_dirtied  20);
  BUG_ON(pages_dirtied  10);
  BUG_ON(pages_dirtied  1);
  BUG_ON(pages_dirtied  1000);
  BUG_ON(pages_dirtied  100);
 
  Probably 1 million is already too much for normal operation?
 
  period = HZ * pages_dirtied / task_ratelimit;
  BUG_ON(pages_dirtied  20);
  BUG_ON(pages_dirtied  10);  -- this 
  is line 1467
 
 Summary for mm people:
 
 Toralf runs trinty on UML/i386.
 After some time pages_dirtied becomes very large.
 More than 10 pages in this case.
  Huh, this is really strange. pages_dirtied is passed into
balance_dirty_pages() from current-nr_dirtied. So I wonder how a value
over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
task was ever able to dirty that much during its lifetime (but correct me
if I'm wrong here, with UML and memory backed disks it is not totally
impossible)... I went through the logic of handling -nr_dirtied but
I didn't find any obvious problem there. Hum, maybe one thing - what
'task_ratelimit' values do you see in balance_dirty_pages? If that one was
huge, we could possibly accumulate huge current-nr_dirtied.

 Thus, period = HZ * pages_dirtied / task_ratelimit overflows
 and period/pause becomes extremely large.
 
 It looks like io_schedule_timeout() get's called with a very large timeout.
 I don't know why if (unlikely(pause  max_pause)) { does not help.
 
 
  the back trace is :
  
  tfoerste@n22 ~/devel/linux $ gdb --core=/mnt/ramdisk/core 
  /home/tfoerste/devel/linux/linux -batch -ex bt
  [New LWP 6911]
  Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk 
  ubda=/home/tfoerste/virtual/uml/tr'.
  Program terminated with signal 6, Aborted.
  #0  0xb77a7424 in __kernel_vsyscall ()
  #0  0xb77a7424 in __kernel_vsyscall ()
  #1  0x083bdf35 in kill ()
  #2  0x0807296d in uml_abort () at arch/um/os-Linux/util.c:93
  #3  0x08072ca5 in os_dump_core () at arch/um/os-Linux/util.c:148
  #4  0x080623c4 in panic_exit (self=0x85c1558 panic_exit_notifier, 
  unused1=0, unused2=0x85f76e0 buf.16221) at arch/um/kernel/um_arch.c:240
  #5  0x0809ba86 in notifier_call_chain (nl=0x0, val=0, v=0x85f76e0 
  buf.16221, nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
  #6  0x0809bba1 in __atomic_notifier_call_chain (nh=0x85f76c4 
  panic_notifier_list, val=0, v=0x85f76e0 buf.16221, nr_to_call=0, 
  nr_calls=0x0) at kernel/notifier.c:182
  #7  0x0809bbdf in atomic_notifier_call_chain (nh=0x0, val=0, v=0x0) at 
  kernel/notifier.c:191
  #8  0x0841b5bc in panic (fmt=0x0) at kernel/panic.c:130
  #9  0x0841c470 in balance_dirty_pages (pages_dirtied=23, mapping=optimized 
  out) at mm/page-writeback.c:1467
  #10 0x080d3595 in balance_dirty_pages_ratelimited (mapping=0x6) at 
  mm/page-writeback.c:1661
  #11 0x080e4a3f in __do_fault (mm=0x45ba3600, vma=0x48777b90, 
  address=1084678144, pmd=0x0, pgoff=0, flags=0, orig_pte=incomplete type) 
  at mm/memory.c:3452
  #12 0x080e6e0f in do_linear_fault (orig_pte=..., flags=optimized out, 
  pmd=optimized out, address=optimized out, vma=optimized out, 
  mm=optimized out, page_table=optimized out) at mm/memory.c:3486
  #13 handle_pte_fault (flags=optimized out, pmd=optimized out, 
  pte=optimized out, address=optimized out, vma=optimized out, 
  mm=optimized out) at mm/memory.c:3710
  #14 __handle_mm_fault (flags=optimized out, address=optimized out, 
  vma=optimized out, mm=optimized out) at mm/memory.c:3845
  #15 handle_mm_fault (mm=0x45ba3600, vma=0x487034c8, address=1084678144, 
  flags=1) at mm/memory.c:3868
  #16 0x080e7817 in __get_user_pages (tsk=0x48705800, mm=0x45ba3600, 
  start=1084678144, nr_pages=1025, gup_flags=519, pages=0x48558000, vmas=0x0, 
  nonblocking=0x0) at 

Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-09 Thread Richard Weinberger
Am 09.10.2013 23:47, schrieb Jan Kara:
 On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
 CC'ing mm folks.
 Please see below.
   Added Fenguang to CC since he is the author of this code.

Thx, get_maintainer.pl didn't list him.

 Am 09.10.2013 19:26, schrieb Toralf Förster:
 On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
 On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster toralf.foers...@gmx.de 
 wrote:
 Hmm, now pages_dirtied is zero, according to the backtrace, but the 
 BUG_ON()
 asserts its strict positive?!?

 Can you please try the following instead of the BUG_ON():

 if (pause  0) {
 printk(pages_dirtied = %lu\n, pages_dirtied);
 printk(task_ratelimit = %lu\n, task_ratelimit);
 printk(pause = %ld\n, pause);
 }

 Gr{oetje,eeting}s,

 Geert
 I tried it in different ways already - I'm completely unsuccessful in 
 getting any printk output.
 As soon as the issue happens I do have a

 BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]

 at stderr of the UML and then no further input is accepted. With 
 uml_mconsole I'm however able
 to run very basic commands like a crash dump, sysrq ond so on.

 You may get an idea of the magnitude of pages_dirtied by using a chain of
 BUG_ON()s, like:

 BUG_ON(pages_dirtied  20);
 BUG_ON(pages_dirtied  10);
 BUG_ON(pages_dirtied  1);
 BUG_ON(pages_dirtied  1000);
 BUG_ON(pages_dirtied  100);

 Probably 1 million is already too much for normal operation?

 period = HZ * pages_dirtied / task_ratelimit;
 BUG_ON(pages_dirtied  20);
 BUG_ON(pages_dirtied  10);  -- this 
 is line 1467

 Summary for mm people:

 Toralf runs trinty on UML/i386.
 After some time pages_dirtied becomes very large.
 More than 10 pages in this case.
   Huh, this is really strange. pages_dirtied is passed into
 balance_dirty_pages() from current-nr_dirtied. So I wonder how a value
 over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
 task was ever able to dirty that much during its lifetime (but correct me
 if I'm wrong here, with UML and memory backed disks it is not totally
 impossible)... I went through the logic of handling -nr_dirtied but
 I didn't find any obvious problem there. Hum, maybe one thing - what
 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
 huge, we could possibly accumulate huge current-nr_dirtied.

Toralf, you can try a snipplet like this one to get the values printed out:
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index f5236f8..a80e520 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1463,6 +1463,12 @@ static void balance_dirty_pages(struct address_space 
*mapping,
goto pause;
}
period = HZ * pages_dirtied / task_ratelimit;
+
+   {
+   extern int printf(char *, ...);
+   printf(--- task_ratelimit: %lu\n, task_ratelimit);
+   }
+
pause = period;
if (current-dirty_paused_when)
pause -= now - current-dirty_paused_when;


Yes, printf(), not printk().
Using this hack we print directly to host's stdout. :)

Thanks,
//richard
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-09 Thread Fengguang Wu
On Wed, Oct 09, 2013 at 11:47:33PM +0200, Jan Kara wrote:
 On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
  CC'ing mm folks.
  Please see below.
   Added Fenguang to CC since he is the author of this code.

Thanks!

  Am 09.10.2013 19:26, schrieb Toralf Förster:
   On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
   On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster toralf.foers...@gmx.de 
   wrote:
   Hmm, now pages_dirtied is zero, according to the backtrace, but the 
   BUG_ON()
   asserts its strict positive?!?
  
   Can you please try the following instead of the BUG_ON():
  
   if (pause  0) {
   printk(pages_dirtied = %lu\n, pages_dirtied);
   printk(task_ratelimit = %lu\n, task_ratelimit);
   printk(pause = %ld\n, pause);
   }
  
   Gr{oetje,eeting}s,
  
   Geert
   I tried it in different ways already - I'm completely unsuccessful in 
   getting any printk output.
   As soon as the issue happens I do have a
  
   BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
  
   at stderr of the UML and then no further input is accepted. With 
   uml_mconsole I'm however able
   to run very basic commands like a crash dump, sysrq ond so on.
  
   You may get an idea of the magnitude of pages_dirtied by using a chain of
   BUG_ON()s, like:
  
   BUG_ON(pages_dirtied  20);
   BUG_ON(pages_dirtied  10);
   BUG_ON(pages_dirtied  1);
   BUG_ON(pages_dirtied  1000);
   BUG_ON(pages_dirtied  100);
  
   Probably 1 million is already too much for normal operation?
  
   period = HZ * pages_dirtied / task_ratelimit;
 BUG_ON(pages_dirtied  20);
 BUG_ON(pages_dirtied  10);  -- this 
   is line 1467
  
  Summary for mm people:
  
  Toralf runs trinty on UML/i386.
  After some time pages_dirtied becomes very large.
  More than 10 pages in this case.
   Huh, this is really strange. pages_dirtied is passed into
 balance_dirty_pages() from current-nr_dirtied. So I wonder how a value
 over 10^9 can get there.

I noticed aio_setup_ring() in the call trace and find it recently
added a SetPageDirty() call in a loop by commit 36bc08cc01 (fs/aio:
Add support to aio ring pages migration). So added CC to its authors.

 After all that is over 4TB so I somewhat doubt the
 task was ever able to dirty that much during its lifetime (but correct me
 if I'm wrong here, with UML and memory backed disks it is not totally
 impossible)... I went through the logic of handling -nr_dirtied but
 I didn't find any obvious problem there. Hum, maybe one thing - what
 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
 huge, we could possibly accumulate huge current-nr_dirtied.
 
  Thus, period = HZ * pages_dirtied / task_ratelimit overflows
  and period/pause becomes extremely large.

Yes, that's possible.

  It looks like io_schedule_timeout() get's called with a very large timeout.
  I don't know why if (unlikely(pause  max_pause)) { does not help.

The test will sure work and limit pause to = max_pause. However it's
very possible balance_dirty_pages() cannot break out of the loop (or
being called repeatedly) and block the task.

I'm afraid there are no one to clear the dirty pages, which makes
balance_dirty_pages() waiting for ever.

Thanks,
Fengguang

  
   the back trace is :
   
   tfoerste@n22 ~/devel/linux $ gdb --core=/mnt/ramdisk/core 
   /home/tfoerste/devel/linux/linux -batch -ex bt
   [New LWP 6911]
   Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk 
   ubda=/home/tfoerste/virtual/uml/tr'.
   Program terminated with signal 6, Aborted.
   #0  0xb77a7424 in __kernel_vsyscall ()
   #0  0xb77a7424 in __kernel_vsyscall ()
   #1  0x083bdf35 in kill ()
   #2  0x0807296d in uml_abort () at arch/um/os-Linux/util.c:93
   #3  0x08072ca5 in os_dump_core () at arch/um/os-Linux/util.c:148
   #4  0x080623c4 in panic_exit (self=0x85c1558 panic_exit_notifier, 
   unused1=0, unused2=0x85f76e0 buf.16221) at arch/um/kernel/um_arch.c:240
   #5  0x0809ba86 in notifier_call_chain (nl=0x0, val=0, v=0x85f76e0 
   buf.16221, nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
   #6  0x0809bba1 in __atomic_notifier_call_chain (nh=0x85f76c4 
   panic_notifier_list, val=0, v=0x85f76e0 buf.16221, nr_to_call=0, 
   nr_calls=0x0) at kernel/notifier.c:182
   #7  0x0809bbdf in atomic_notifier_call_chain (nh=0x0, val=0, v=0x0) at 
   kernel/notifier.c:191
   #8  0x0841b5bc in panic (fmt=0x0) at kernel/panic.c:130
   #9  0x0841c470 in balance_dirty_pages (pages_dirtied=23, 
   mapping=optimized out) at mm/page-writeback.c:1467
   #10 0x080d3595 in balance_dirty_pages_ratelimited (mapping=0x6) at 
   mm/page-writeback.c:1661
   #11 0x080e4a3f in __do_fault (mm=0x45ba3600, vma=0x48777b90, 
   address=1084678144, pmd=0x0, pgoff=0, flags=0, orig_pte=incomplete 
   type) at mm/memory.c:3452
   #12 0x080e6e0f in do_linear_fault (orig_pte=..., flags=optimized out, 
   

Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-08 Thread Toralf Förster
Well, the quick hack below at least works for the moment to
overcome the soft lookup and the hang/unresponsiveness of the 32 bit
user mode linux guest :


diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index f5236f8..7e9483c 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1503,6 +1503,8 @@ static void balance_dirty_pages(struct
address_space *mapping,
}

 pause:
+   if (pause < 0)
+   break;
trace_balance_dirty_pages(bdi,
  dirty_thresh,
  background_thresh,



I'm not proud of it but after starring at the source code in
mm/page-writeback.c too often and too long currently I don't have any
better clue.

WRT to debug of the culprit: neither printk nor friends worked (maybe
b/c the affected process is just hanging ?) and BUG_ON doesn't gave me
any new clues.


On 10/06/2013 10:26 PM, Geert Uytterhoeven wrote:
> On Sun, Oct 6, 2013 at 10:08 PM, Toralf Förster  
> wrote:
>> On 10/06/2013 08:38 PM, Geert Uytterhoeven wrote:
>>> On Sun, Oct 6, 2013 at 4:17 PM, Toralf Förster  
>>> wrote:
 The UML stopped here :
 ...
 if (unlikely(task_ratelimit == 0)) {
 period = max_pause;
 pause = max_pause;
 BUG_ON(pause < 0);
 goto pause;
 }
 BUG_ON(pages_dirtied < 0);
 BUG_ON(task_ratelimit < 0);
 period = HZ * pages_dirtied / task_ratelimit;
 BUG_ON(period < 0); <--here
>>>
>>> So pages_dirtied becomes that big compared to task_ratelimit (both are
>>> "unsigned long"), that period (which is "long", just like "pause") overflows
>>> into a negative number.
>>>
>>> This is indeed much more likely to happen on 32-bit.
>>>
 The back trace is :
>>>
 #9  0x08411c64 in balance_dirty_pages (pages_dirtied=9, mapping=>>> out>) at mm/page-writeback.c:1471
>>>
>>> But here pages_dirtied is only 9??
> 
>> Well, this points to an overflow or ? :
> 
> Negative indicates an overflow, but pages_dirtied doesn't.
> 
>> tfoerste@n22 ~/devel/linux $ nl -ba mm/page-writeback.c | grep -A 5 -B 5 1468
>>   1463  BUG_ON(pause < 0);
>>   1464  goto pause;
>>   1465  }
>>   1466  period = HZ * pages_dirtied / task_ratelimit;
>>   1467  pause = period;
>>   1468  BUG_ON(pause < 0 && pages_dirtied > 0 && 
>> task_ratelimit > 0);
>>   1469  if (current->dirty_paused_when)
>>   1470  pause -= now - current->dirty_paused_when;
>>   1471  /*
>>   1472   * For less than 1s think time (ext3/4 may block the 
>> dirtier
>>   1473   * for up to 800ms from time to time on 1-HDD; so 
>> does xfs,
>>
>>
>> and the back trace is :
>>
>> #9  0x08411c6c in balance_dirty_pages (pages_dirtied=0, mapping=> out>) at mm/page-writeback.c:1468
> 
> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
> asserts its strict positive?!?
> 
> Can you please try the following instead of the BUG_ON():
> 
> if (pause < 0) {
> printk("pages_dirtied = %lu\n", pages_dirtied);
> printk("task_ratelimit = %lu\n", task_ratelimit);
> printk("pause = %ld\n", pause);
> }
> 
> Gr{oetje,eeting}s,
> 
> Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- 
> ge...@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like 
> that.
> -- Linus Torvalds
> 


-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [uml-devel] BUG: soft lockup for a user mode linux image

2013-10-08 Thread Toralf Förster
Well, the quickdirty hack below at least works for the moment to
overcome the soft lookup and the hang/unresponsiveness of the 32 bit
user mode linux guest :


diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index f5236f8..7e9483c 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1503,6 +1503,8 @@ static void balance_dirty_pages(struct
address_space *mapping,
}

 pause:
+   if (pause  0)
+   break;
trace_balance_dirty_pages(bdi,
  dirty_thresh,
  background_thresh,



I'm not proud of it but after starring at the source code in
mm/page-writeback.c too often and too long currently I don't have any
better clue.

WRT to debug of the culprit: neither printk nor friends worked (maybe
b/c the affected process is just hanging ?) and BUG_ON doesn't gave me
any new clues.


On 10/06/2013 10:26 PM, Geert Uytterhoeven wrote:
 On Sun, Oct 6, 2013 at 10:08 PM, Toralf Förster toralf.foers...@gmx.de 
 wrote:
 On 10/06/2013 08:38 PM, Geert Uytterhoeven wrote:
 On Sun, Oct 6, 2013 at 4:17 PM, Toralf Förster toralf.foers...@gmx.de 
 wrote:
 The UML stopped here :
 ...
 if (unlikely(task_ratelimit == 0)) {
 period = max_pause;
 pause = max_pause;
 BUG_ON(pause  0);
 goto pause;
 }
 BUG_ON(pages_dirtied  0);
 BUG_ON(task_ratelimit  0);
 period = HZ * pages_dirtied / task_ratelimit;
 BUG_ON(period  0); --here

 So pages_dirtied becomes that big compared to task_ratelimit (both are
 unsigned long), that period (which is long, just like pause) overflows
 into a negative number.

 This is indeed much more likely to happen on 32-bit.

 The back trace is :

 #9  0x08411c64 in balance_dirty_pages (pages_dirtied=9, mapping=optimized 
 out) at mm/page-writeback.c:1471

 But here pages_dirtied is only 9??
 
 Well, this points to an overflow or ? :
 
 Negative indicates an overflow, but pages_dirtied doesn't.
 
 tfoerste@n22 ~/devel/linux $ nl -ba mm/page-writeback.c | grep -A 5 -B 5 1468
   1463  BUG_ON(pause  0);
   1464  goto pause;
   1465  }
   1466  period = HZ * pages_dirtied / task_ratelimit;
   1467  pause = period;
   1468  BUG_ON(pause  0  pages_dirtied  0  
 task_ratelimit  0);
   1469  if (current-dirty_paused_when)
   1470  pause -= now - current-dirty_paused_when;
   1471  /*
   1472   * For less than 1s think time (ext3/4 may block the 
 dirtier
   1473   * for up to 800ms from time to time on 1-HDD; so 
 does xfs,


 and the back trace is :

 #9  0x08411c6c in balance_dirty_pages (pages_dirtied=0, mapping=optimized 
 out) at mm/page-writeback.c:1468
 
 Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
 asserts its strict positive?!?
 
 Can you please try the following instead of the BUG_ON():
 
 if (pause  0) {
 printk(pages_dirtied = %lu\n, pages_dirtied);
 printk(task_ratelimit = %lu\n, task_ratelimit);
 printk(pause = %ld\n, pause);
 }
 
 Gr{oetje,eeting}s,
 
 Geert
 
 --
 Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- 
 ge...@linux-m68k.org
 
 In personal conversations with technical people, I call myself a hacker. But
 when I'm talking to journalists I just say programmer or something like 
 that.
 -- Linus Torvalds
 


-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/