On 09.10.2013 23:47, Jan Kara wrote:
> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
>> CC'ing mm folks.
>> Please see below.
> Added Fengguang to CC since he is the author of this code.

Thx, get_maintainer.pl didn't list him.

>> On 09.10.2013 19:26, Toralf Förster wrote:
>>> On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
>>>> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster <toralf.foers...@gmx.de> wrote:
>>>>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the
>>>>>> BUG_ON() asserts it's strictly positive?!?
>>>>>>
>>>>>> Can you please try the following instead of the BUG_ON():
>>>>>>
>>>>>> if (pause < 0) {
>>>>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
>>>>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
>>>>>>         printk("pause = %ld\n", pause);
>>>>>> }
>>>>>>
>>>>>> Gr{oetje,eeting}s,
>>>>>>
>>>>>> Geert
>>>>> I tried it in different ways already - I'm completely unsuccessful in
>>>>> getting any printk output.
>>>>> As soon as the issue happens I do have a
>>>>>
>>>>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
>>>>>
>>>>> at stderr of the UML and then no further input is accepted. With
>>>>> uml_mconsole I'm however able to run very basic commands like a
>>>>> crash dump, sysrq and so on.
>>>>
>>>> You may get an idea of the magnitude of pages_dirtied by using a chain of
>>>> BUG_ON()s, like:
>>>>
>>>> BUG_ON(pages_dirtied > 2000000000);
>>>> BUG_ON(pages_dirtied > 1000000000);
>>>> BUG_ON(pages_dirtied > 100000000);
>>>> BUG_ON(pages_dirtied > 10000000);
>>>> BUG_ON(pages_dirtied > 1000000);
>>>>
>>>> Probably 1 million is already too much for normal operation?
>>>>
>>> period = HZ * pages_dirtied / task_ratelimit;
>>> BUG_ON(pages_dirtied > 2000000000);
>>> BUG_ON(pages_dirtied > 1000000000);   <-------------- this is line 1467
>>
>> Summary for mm people:
>>
>> Toralf runs trinity on UML/i386.
>> After some time pages_dirtied becomes very large.
>> More than 1000000000 pages in this case.
> Huh, this is really strange. pages_dirtied is passed into
> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
> over 10^9 can get there.
> After all, that is over 4TB so I somewhat doubt the task was ever able to
> dirty that much during its lifetime (but correct me if I'm wrong here,
> with UML and memory backed disks it is not totally impossible)... I went
> through the logic of handling ->nr_dirtied but I didn't find any obvious
> problem there. Hum, maybe one thing - what 'task_ratelimit' values do you
> see in balance_dirty_pages? If that one was huge, we could possibly
> accumulate huge current->nr_dirtied.

Toralf, you can try a snippet like this one to get the values printed out:

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index f5236f8..a80e520 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1463,6 +1463,12 @@ static void balance_dirty_pages(struct address_space *mapping,
 			goto pause;
 		}
 		period = HZ * pages_dirtied / task_ratelimit;
+
+		{
+			extern int printf(char *, ...);
+			printf("---> task_ratelimit: %lu\n", task_ratelimit);
+		}
+
 		pause = period;
 		if (current->dirty_paused_when)
 			pause -= now - current->dirty_paused_when;

Yes, printf(), not printk().
Using this hack we print directly to the host's stdout. :)

Thanks,
//richard

_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel
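
For reference, the magnitudes discussed in this thread can be checked in user space. The sketch below is only an illustration under stated assumptions: a 4 KiB page size and HZ = 100 are assumed (neither value is given in the thread), and all function and macro names are hypothetical. It shows that 10^9 pages is roughly 4 TB, and that the expression period = HZ * pages_dirtied / task_ratelimit wraps around when evaluated in 32 bits, the width of unsigned long on i386. Whether that wrap-around plays any role in the reported lockup is not established here.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions, not taken from the thread: 4 KiB pages and HZ = 100. */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_HZ        100u

/* Bytes represented by a page count, computed in 64 bits. */
uint64_t pages_to_bytes(uint64_t pages)
{
	return pages * SKETCH_PAGE_SIZE;
}

/*
 * period = HZ * pages_dirtied / task_ratelimit, with the product
 * truncated to 32 bits to mimic unsigned long arithmetic on i386.
 * The product wraps modulo 2^32 before the division.
 */
uint32_t period_32bit(uint32_t pages_dirtied, uint32_t task_ratelimit)
{
	uint32_t product;

	assert(task_ratelimit != 0);
	product = (uint32_t)((uint64_t)SKETCH_HZ * pages_dirtied);
	return product / task_ratelimit;
}

/* The same expression without wrap-around, for comparison. */
uint64_t period_64bit(uint64_t pages_dirtied, uint64_t task_ratelimit)
{
	assert(task_ratelimit != 0);
	return (uint64_t)SKETCH_HZ * pages_dirtied / task_ratelimit;
}
```

With pages_dirtied = 1000000000 and a made-up task_ratelimit of 1000, period_32bit() returns 1215752 while period_64bit() returns 100000000, and pages_to_bytes(1000000000) is 4096000000000.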