Re: 100% iowait on one of cpus in current -git
On 11/1/07, Torsten Kaiser <[EMAIL PROTECTED]> wrote:
> On 11/1/07, Fengguang Wu <[EMAIL PROTECTED]> wrote:
> > Thank you. Maybe we can start by the applied debug patch :-)
>
> Will apply it and try to recreate this.

Patch applied, used emerge to install a 2.6.24-rc1 kernel.

I had no complete stalls, but three times during the move from tmpfs to
the main xfs the emerge got noticeably slower. There still was writeout
happening, but as emerge prints out every file it has written, during the
pause not one file was processed.

vmstat 10:
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd    free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  1      0 3146424    332 614768    0    0   134  1849  438  2515  3  4 91  2
 0  0      0 3146644    332 614784    0    0     2  1628  507   646  0  2 85 13
 0  0      0 3146868    332 614868    0    0     5  2359  527  1076  0  3 97  0
 1  0      0 3144372    332 616148    0    0    96  2829  607  2666  2  5 92  0  -> normal writeout
 0  0      0 3140560    332 618144    0    0   152  2764  633  3308  3  6 91  0
 0  0      0 3137332    332 619908    0    0   114  1801  588  2858  3  4 93  0
 0  0      0 3136912    332 620136    0    0    20   827  393  1605  1  2 98  0  -> first stall
 0  0      0 3137088    332 620136    0    0     0   557  339  1437  0  1 99  0
 0  0      0 3137160    332 620136    0    0     0   642  310  1400  0  1 99  0
 0  0      0 3136588    332 620172    0    0     6  2972  527  1195  0  3 80 16
 0  0      0 3136276    332 620348    0    0    10  2668  558  1195  0  3 96  0
 0  0      0 3135228    332 620424    0    0     8  2712  522  1311  0  4 96  0
 0  0      0 3131740    332 621524    0    0    75  2935  559  2457  2  5 93  0
 0  0      0 3128348    332 622972    0    0    85  1470  490  2607  3  4 93  0
 0  0      0 3129292    332 622972    0    0     0   527  353  1398  0  1 99  0  -> second longer stall
 0  0      0 3128520    332 623028    0    0     6   488  249  1390  0  1 99  0
 0  0      0 3128236    332 623028    0    0     0   482  222  1222  0  1 99  0
 0  0      0 3128408    332 623028    0    0     0   585  269  1301  0  0 99  0
 0  0      0 3128532    332 623028    0    0     0   610  262  1278  0  0 99  0
 0  0      0 3128568    332 623028    0    0     0   636  345  1639  0  1 99  0
 0  0      0 3129032    332 623040    0    0     1   664  337  1466  0  1 99  0
 0  0      0 3129484    332 623040    0    0     0   658  300  1508  0  0 100 0
 0  0      0 3129576    332 623040    0    0     0   562  271  1454  0  1 99  0
 0  0      0 3129736    332 623040    0    0     0   627  278  1406  0  1 99  0
 0  0      0 3129368    332 623040    0    0     0   507  274  1301  0  1 99  0
 0  0      0 3129004    332 623040    0    0     0   444  211  1213  0  0 99  0
 0  1      0 3127260    332 623040    0    0     0  1036  305  1242  0  1 95  4
 0  0      0 3126280    332 623128    0    0     7  4241  555  1575  1  5 84 10
 0  0      0 3124948    332 623232    0    0     6  4194  529  1505  1  4 95  0
 0  0      0 3125228    332 624168    0    0    58  1966  586  1964  2  4 94  0  -> emerge resumed to normal speed, without any intervention from my side
 0  0      0 3120932    332 625904    0    0   112  1546  546  2565  3  4 93  0
 0  0      0 3118012    332 627568    0    0   128  1542  612  2705  3  4 93  0

From syslog:

first stall:
[ 575.05] mm/page-writeback.c 655 wb_kupdate: pdflush(285) 47259 global 610 0 0 wc __ tw 1023 sk 0
[ 586.35] mm/page-writeback.c 655 wb_kupdate: pdflush(285) 50465 global 6117 0 0 wc _M tw 967 sk 0
[ 586.36] mm/page-writeback.c 655 wb_kupdate: pdflush(285) 50408 global 6117 0 0 wc __ tw 1022 sk 0
[ 599.90] mm/page-writeback.c 655 wb_kupdate: pdflush(285) 53523 global 11141 0 0 wc __ tw 1009 sk 0
[ 635.78] mm/page-writeback.c 655 wb_kupdate: pdflush(285) 59397 global 12757 124 0 wc __ tw 0 sk 0
[ 638.47] mm/page-writeback.c 418 balance_dirty_pages: emerge(6113) 1536 global 11405 51 0 wc __ tw 0 sk 0
[ 638.82] mm/page-writeback.c 655 wb_kupdate: pdflush(285) 58373 global 11276 48 0 wc __ tw -1 sk 0
[ 641.26] mm/page-writeback.c 655 wb_kupdate: pdflush(285) 57348 global 10565 100 0 wc __ tw 0 sk 0
[ 643.98] mm/page-writeback.c 655 wb_kupdate: pdflush(285) 56324 global 9788 103 0 wc __ tw -1 sk 0
[ 646.12] mm/page-writeback.c 655 wb_kupdate: pdflush(285) 55299 global 8912 6 0 wc __ tw 0 sk 0

second stall:
[ 664.04] mm/page-writeback.c 655 wb_kupdate: pdflush(285) 48117 global 2864 81 0 wc _M tw -13 sk 0
[ 664.40] mm/page-writeback.c 655 wb_kupdate: pdflush(285) 47080 global 1995 137 0 wc _M tw 176 sk 0
[ 664.51] mm/page-writeback.c 655 wb_kupdate: pdflush(285) 46232 global 1929 267 0 wc __ tw 880 sk 0
cron[6927]: (root) CMD (test -x /usr/sbin/run-crons && /usr/sbin/run-crons )
[ 809.56] mm/page-writeback.c 655 wb_kupdate: pdflush(285) 49422 global 19166 217 0 wc _M tw 380 sk 0
[ 811.72] mm/page-writeback.c 655
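The stall signature in the vmstat output above -- writeout still happening (bo > 0) while the CPU sits at roughly 99% idle -- can be picked out mechanically. A minimal sketch, not part of the original report; the `detect_stalls` helper name is ours, and it assumes the default 16-column `vmstat <interval>` row layout:

```shell
detect_stalls() {
    # Print vmstat samples matching the stall signature reported above:
    # writeout still happening (bo > 0) while the CPU is >= 99% idle.
    # Assumes the default `vmstat <interval>` row layout:
    #   r b swpd free buff cache si so bi bo in cs us sy id wa
    # (NF >= 16 tolerates trailing annotations on a row.)
    awk 'NF >= 16 && $10 > 0 && $15 >= 99 { print "possible stall:", $0 }'
}
```

Used as `vmstat 10 | detect_stalls`, it would flag the "first stall" and "second longer stall" rows above while skipping the normal-writeout rows.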
Re: 100% iowait on one of cpus in current -git
On 11/1/07, Fengguang Wu <[EMAIL PROTECTED]> wrote:
> On Wed, Oct 31, 2007 at 04:22:10PM +0100, Torsten Kaiser wrote:
> > Since 2.6.23-mm1 I also experience strange hangs during heavy writeouts.
> > Each time I noticed this I was using emerge (package util from the
> > gentoo distribution) to install/upgrade a package. The last step,
> > where this hang occurred, is moving the prepared files from a tmpfs
> > partition to the main xfs filesystem.
> > The hangs were not fatal, after a few seconds everything resumed to
> > normal, so I was not able to capture a good image of what was
> > happening.
>
> Thank you for the detailed report.
>
> How severe were the hangs? Only writeouts stalled, all apps stalled, or
> could you not type and run new commands?

Only writeout stalled. The emerge that was moving the files hung, but
everything else worked normally. I was able to run new commands, like
copying the /proc/meminfo.

[snip]

> > After this SysRq+W writeback resumed again. Possible that writing
> > above into the syslog triggered that.
>
> Maybe. Are the log files on another disk/partition?

No, everything was going to /.

What might be interesting is that doing
cat /proc/meminfo >~/stall/meminfo
did not resume the writeback. So there might be some threshold that only
was broken with the additional write from syslog-ng. Or syslog-ng does
some flushing, I don't know. (I'm using the syslog-ng package from
gentoo: http://www.balabit.com/products/syslog_ng/ , version 2.0.5)

> > The source tmpfs is mounted without any special parameters, but the
> > target xfs filesystem resides on a dm-crypt device that is on top of
> > a 3 disk RAID5 md.
> > During the hang all CPUs were idle.
>
> No iowaits? ;-)

No, I have a KSysGuard in my taskbar that showed no activity at all.
OK, the subject does not match for my case, but there was also a tmpfs
involved. And I found no thread with stalls on xfs. :-)

> > The system is x86_64 with CONFIG_NO_HZ=y, but was still receiving ~330
> > interrupts per second because of the bttv driver. (But I was not using
> > that device at this time.)
> >
> > I'm willing to test patches or provide more information, but lack
> > a good testcase to trigger this on demand.
>
> Thank you. Maybe we can start by the applied debug patch :-)

Will apply it and try to recreate this. Thanks for looking into it.

Torsten
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
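The "some threshold" speculation above can at least be quantified: if the threshold in question is one of the vm.dirty_* limits, those are expressed as a fraction of memory, so it helps to know what fraction of MemTotal is currently dirty. A small sketch under that assumption (`dirty_pct` is a hypothetical helper name, not from the thread), fed with /proc/meminfo-format input:

```shell
dirty_pct() {
    # Report the percentage of MemTotal currently dirty, given
    # /proc/meminfo-format input on stdin. During the stall described
    # above, Dirty held at 44948 kB out of 4061808 kB total (~1.1%),
    # i.e. well below the usual default vm.dirty_* thresholds.
    awk '/^MemTotal:/ { total = $2 }
         /^Dirty:/   { dirty = $2 }
         END         { printf "%.2f\n", 100 * dirty / total }'
}
```

Usage would be `dirty_pct < /proc/meminfo`, compared by hand against /proc/sys/vm/dirty_background_ratio and /proc/sys/vm/dirty_ratio.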
Re: 100% iowait on one of cpus in current -git
On Wed, Oct 31, 2007 at 04:22:10PM +0100, Torsten Kaiser wrote:
> On 10/22/07, Fengguang Wu <[EMAIL PROTECTED]> wrote:
> > On Mon, Oct 22, 2007 at 09:10:45PM +0800, Fengguang Wu wrote:
> > > Hmm, maybe it's a reiserfs related issue. Do you have the full log file?
> >
> > Bingo! It can be reproduced in -mm on reiserfs:
> >
> > # mkfs.reiserfs /dev/sdb1
> > # mount /dev/sdb1 /test
> > # cp bin /test
> > wait for a while
> > # dmesg
> > [...]
> > [ 418.346113] requeue_io 308: inode 6 size 302 at 08:11(sdb1)
> > [ 418.346119] requeue_io 308: inode 7 size 196 at 08:11(sdb1)
> > [ 418.346125] requeue_io 308: inode 8 size 85 at 08:11(sdb1)
>
> Since 2.6.23-mm1 I also experience strange hangs during heavy writeouts.
> Each time I noticed this I was using emerge (package util from the
> gentoo distribution) to install/upgrade a package. The last step,
> where this hang occurred, is moving the prepared files from a tmpfs
> partition to the main xfs filesystem.
> The hangs were not fatal, after a few seconds everything resumed to
> normal, so I was not able to capture a good image of what was
> happening.

Thank you for the detailed report.

How severe were the hangs? Only writeouts stalled, all apps stalled, or
could you not type and run new commands?

> Today it happened again, but a little more obvious. During the moving
> process the writeout stalled completely for several minutes until I hit
> SysRq+W.
>
> /proc/meminfo:
> MemTotal:      4061808 kB
> MemFree:        881332 kB
> Buffers:             0 kB
> Cached:        2566628 kB
> SwapCached:         64 kB
> Active:         926612 kB
> Inactive:      1959136 kB
> SwapTotal:     9775416 kB
> SwapFree:      9775296 kB
> Dirty:           44948 kB
> Writeback:           0 kB
> AnonPages:      319068 kB
> Mapped:          52844 kB
> Slab:           235572 kB
> SReclaimable:   164408 kB
> SUnreclaim:      71164 kB
> PageTables:       9576 kB
> NFS_Unstable:        0 kB
> Bounce:              0 kB
> CommitLimit:  11806320 kB
> Committed_AS:   544520 kB
> VmallocTotal: 34359738367 kB
> VmallocUsed:     35004 kB
> VmallocChunk: 34359702447 kB
> HugePages_Total:     0
> HugePages_Free:      0
> HugePages_Rsvd:      0
> HugePages_Surp:      0
> Hugepagesize:     2048 kB
>
> The 'Dirty' count did not decrease during this time and 'Writeback'
> stayed at 0.
> I also have /proc/pagetypeinfo, but I see nothing interesting in
> there. (But will send it, if needed)
>
> The output from SysRq+W:
> SysRq : Show Blocked State
>   task                        PC stack   pid father
> pdflush       D 81001fcc2a88     0   285      2
>  810005d55580 0046 0800 0071
>  0400 8022d61c 80817b00 80817b00
>  80813f40 80817b00 810100893b18
> Call Trace:
>  [] task_rq_lock+0x4c/0x90
>  [] __wake_up_common+0x5a/0x90
>  [] __down+0xa7/0x11e
>  [] default_wake_function+0x0/0x10
>  [] __down_failed+0x35/0x3a
>  [] xfs_buf_lock+0x3e/0x40
>  [] _xfs_buf_find+0x13e/0x240
>  [] xfs_buf_get_flags+0x6f/0x190
>  [] xfs_buf_read_flags+0x12/0xa0
>  [] xfs_trans_read_buf+0x64/0x340
>  [] xfs_itobp+0x81/0x1e0
>  [] xfs_buf_rele+0x2e/0xd0
>  [] xfs_iflush+0xfe/0x520
>  [] __down_read_trylock+0x42/0x60
>  [] xfs_inode_item_push+0x12/0x20
>  [] xfs_trans_push_ail+0x267/0x2b0
>  [] xlog_ticket_get+0xfb/0x140
>  [] xfs_log_reserve+0xee/0x120
>  [] xfs_trans_reserve+0xa8/0x210
>  [] xfs_iomap_write_allocate+0xfa/0x410
>  [] __split_bio+0x38a/0x3c0
>  [] xfs_start_page_writeback+0x27/0x60
>  [] xfs_iomap+0x26c/0x310
>  [] xfs_map_blocks+0x38/0x90
>  [] xfs_page_state_convert+0x2b8/0x630
>  [] xfs_vm_writepage+0x6f/0x120
>  [] __writepage+0xa/0x30
>  [] write_cache_pages+0x23e/0x330
>  [] __writepage+0x0/0x30
>  [] xfs_iflush+0x3b7/0x520
>  [] _xfs_buf_ioapply+0x222/0x320
>  [] __up_read+0x21/0xb0
>  [] xfs_iunlock+0x5c/0xc0
>  [] do_writepages+0x20/0x40
>  [] __writeback_single_inode+0xb0/0x380
>  [] dm_table_any_congested+0x2e/0x80
>  [] generic_sync_sb_inodes+0x20d/0x330
>  [] writeback_inodes+0xa2/0xe0
>  [] wb_kupdate+0xa6/0x120
>  [] pdflush+0x0/0x1e0
>  [] pdflush+0x110/0x1e0
>  [] wb_kupdate+0x0/0x120
>  [] kthread+0x4b/0x80
>  [] child_rip+0xa/0x12
>  [] kthread+0x0/0x80
>  [] child_rip+0x0/0x12
>
> emerge        D 81001fcc2a88     0  3221   8163
>  81008c0679f8 0086 81008c067988 8024a719
>  8100060fb008 8022c8ea 80817b00 80817b00
>  80813f40 80817b00 810100893b18
> Call Trace:
>  [] autoremove_wake_function+0x9/0x30
>  [] __wake_up_common+0x5a/0x90
>  [] __wake_up_common+0x5a/0x90
>  [] __down+0xa7/0x11e
>  [] default_wake_function+0x0/0x10
>  [] __down_failed+0x35/0x3a
>  [] xfs_buf_lock+0x3e/0x40
>  [] _xfs_buf_find+0x13e/0x240
>  [] xfs_buf_get_flags+0x6f/0x190
>  [] xfs_buf_read_flags+0x12/0xa0
>  [] xfs_trans_read_buf+0x64/0x340
>  [] xfs_itobp+0x81/0x1e0
>  [] xfs_buf_rele+0x2e/0xd0
>  [] xfs_iflush+0xfe/0x520
Re: 100% iowait on one of cpus in current -git
On 10/22/07, Fengguang Wu <[EMAIL PROTECTED]> wrote:
> On Mon, Oct 22, 2007 at 09:10:45PM +0800, Fengguang Wu wrote:
> > Hmm, maybe it's a reiserfs related issue. Do you have the full log file?
>
> Bingo! It can be reproduced in -mm on reiserfs:
>
> # mkfs.reiserfs /dev/sdb1
> # mount /dev/sdb1 /test
> # cp bin /test
> wait for a while
> # dmesg
> [...]
> [ 418.346113] requeue_io 308: inode 6 size 302 at 08:11(sdb1)
> [ 418.346119] requeue_io 308: inode 7 size 196 at 08:11(sdb1)
> [ 418.346125] requeue_io 308: inode 8 size 85 at 08:11(sdb1)

Since 2.6.23-mm1 I also experience strange hangs during heavy writeouts.
Each time I noticed this I was using emerge (package util from the
gentoo distribution) to install/upgrade a package. The last step,
where this hang occurred, is moving the prepared files from a tmpfs
partition to the main xfs filesystem.
The hangs were not fatal, after a few seconds everything resumed to
normal, so I was not able to capture a good image of what was happening.

Today it happened again, but a little more obvious. During the moving
process the writeout stalled completely for several minutes until I hit
SysRq+W.

/proc/meminfo:
MemTotal:      4061808 kB
MemFree:        881332 kB
Buffers:             0 kB
Cached:        2566628 kB
SwapCached:         64 kB
Active:         926612 kB
Inactive:      1959136 kB
SwapTotal:     9775416 kB
SwapFree:      9775296 kB
Dirty:           44948 kB
Writeback:           0 kB
AnonPages:      319068 kB
Mapped:          52844 kB
Slab:           235572 kB
SReclaimable:   164408 kB
SUnreclaim:      71164 kB
PageTables:       9576 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:  11806320 kB
Committed_AS:   544520 kB
VmallocTotal: 34359738367 kB
VmallocUsed:     35004 kB
VmallocChunk: 34359702447 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
HugePages_Surp:      0
Hugepagesize:     2048 kB

The 'Dirty' count did not decrease during this time and 'Writeback'
stayed at 0.
I also have /proc/pagetypeinfo, but I see nothing interesting in there.
(But will send it, if needed)

The output from SysRq+W:
SysRq : Show Blocked State
  task                        PC stack   pid father
pdflush       D 81001fcc2a88     0   285      2
 810005d55580 0046 0800 0071
 0400 8022d61c 80817b00 80817b00
 80813f40 80817b00 810100893b18
Call Trace:
 [] task_rq_lock+0x4c/0x90
 [] __wake_up_common+0x5a/0x90
 [] __down+0xa7/0x11e
 [] default_wake_function+0x0/0x10
 [] __down_failed+0x35/0x3a
 [] xfs_buf_lock+0x3e/0x40
 [] _xfs_buf_find+0x13e/0x240
 [] xfs_buf_get_flags+0x6f/0x190
 [] xfs_buf_read_flags+0x12/0xa0
 [] xfs_trans_read_buf+0x64/0x340
 [] xfs_itobp+0x81/0x1e0
 [] xfs_buf_rele+0x2e/0xd0
 [] xfs_iflush+0xfe/0x520
 [] __down_read_trylock+0x42/0x60
 [] xfs_inode_item_push+0x12/0x20
 [] xfs_trans_push_ail+0x267/0x2b0
 [] xlog_ticket_get+0xfb/0x140
 [] xfs_log_reserve+0xee/0x120
 [] xfs_trans_reserve+0xa8/0x210
 [] xfs_iomap_write_allocate+0xfa/0x410
 [] __split_bio+0x38a/0x3c0
 [] xfs_start_page_writeback+0x27/0x60
 [] xfs_iomap+0x26c/0x310
 [] xfs_map_blocks+0x38/0x90
 [] xfs_page_state_convert+0x2b8/0x630
 [] xfs_vm_writepage+0x6f/0x120
 [] __writepage+0xa/0x30
 [] write_cache_pages+0x23e/0x330
 [] __writepage+0x0/0x30
 [] xfs_iflush+0x3b7/0x520
 [] _xfs_buf_ioapply+0x222/0x320
 [] __up_read+0x21/0xb0
 [] xfs_iunlock+0x5c/0xc0
 [] do_writepages+0x20/0x40
 [] __writeback_single_inode+0xb0/0x380
 [] dm_table_any_congested+0x2e/0x80
 [] generic_sync_sb_inodes+0x20d/0x330
 [] writeback_inodes+0xa2/0xe0
 [] wb_kupdate+0xa6/0x120
 [] pdflush+0x0/0x1e0
 [] pdflush+0x110/0x1e0
 [] wb_kupdate+0x0/0x120
 [] kthread+0x4b/0x80
 [] child_rip+0xa/0x12
 [] kthread+0x0/0x80
 [] child_rip+0x0/0x12

emerge        D 81001fcc2a88     0  3221   8163
 81008c0679f8 0086 81008c067988 8024a719
 8100060fb008 8022c8ea 80817b00 80817b00
 80813f40 80817b00 810100893b18
Call Trace:
 [] autoremove_wake_function+0x9/0x30
 [] __wake_up_common+0x5a/0x90
 [] __wake_up_common+0x5a/0x90
 [] __down+0xa7/0x11e
 [] default_wake_function+0x0/0x10
 [] __down_failed+0x35/0x3a
 [] xfs_buf_lock+0x3e/0x40
 [] _xfs_buf_find+0x13e/0x240
 [] xfs_buf_get_flags+0x6f/0x190
 [] xfs_buf_read_flags+0x12/0xa0
 [] xfs_trans_read_buf+0x64/0x340
 [] xfs_itobp+0x81/0x1e0
 [] xfs_buf_rele+0x2e/0xd0
 [] xfs_iflush+0xfe/0x520
 [] __down_read_trylock+0x42/0x60
 [] xfs_inode_item_push+0x12/0x20
 [] xfs_trans_push_ail+0x267/0x2b0
 [] xfs_log_reserve+0x72/0x120
 [] xfs_trans_reserve+0xa8/0x210
 [] kmem_zone_zalloc+0x32/0x50
 [] xfs_itruncate_finish+0xfb/0x310
 [] xfs_free_eofblocks+0x23b/0x280
 [] xfs_release+0x153/0x200
 [] xfs_file_release+0x10/0x20
 [] __fput+0xb1/0x220
 [] filp_close+0x54/0x90
 [] sys_close+0x9f/0x100
 [] system_call+0x7e/0x83

After this SysRq+W writeback resumed again. Possible that writing above
into the syslog triggered that.
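For long SysRq+W dumps like the one above, a quick index of which tasks are blocked and the first listed frame of each trace can make scanning easier. A rough sketch, not a tool from this thread (`blocked_in` is our name); it keys off the task line's "D" state and the frames after "Call Trace:":

```shell
blocked_in() {
    # Given a SysRq+W "Show Blocked State" dump on stdin, print each
    # D-state task together with the first frame of its call trace.
    # Assumes task lines look like "pdflush       D <addr> 0 285 2"
    # and frame lines contain a symbol+0xOFF/0xLEN token.
    awk '/ D /            { task = $1 }
         /^Call Trace:/   { want = 1; next }
         want && /\+0x/   { print task ": " $NF; want = 0 }'
}
```

Run over the dump above it would emit one line each for pdflush and emerge, showing both stuck behind the same buffer-lock path.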
Re: 100% iowait on one of cpus in current -git
On Mon, Oct 22, 2007 at 09:10:45PM +0800, Fengguang Wu wrote:
> Hmm, maybe it's a reiserfs-related issue. Do you have the full log file?

Bingo! It can be reproduced in -mm on reiserfs:

# mkfs.reiserfs /dev/sdb1
# mount /dev/sdb1 /test
# cp bin /test
# dmesg
[...]
[  418.346113] requeue_io 308: inode 6 size 302 at 08:11(sdb1)
[  418.346119] requeue_io 308: inode 7 size 196 at 08:11(sdb1)
[  418.346125] requeue_io 308: inode 8 size 85 at 08:11(sdb1)
[  418.346131] requeue_io 308: inode 9 size 180 at 08:11(sdb1)
[  418.346136] requeue_io 308: inode 10 size 1488 at 08:11(sdb1)
[  418.346142] requeue_io 308: inode 12 size 1358 at 08:11(sdb1)
[  418.346148] requeue_io 308: inode 13 size 482 at 08:11(sdb1)
[  418.346153] requeue_io 308: inode 14 size 171 at 08:11(sdb1)
[  418.346159] requeue_io 308: inode 15 size 93 at 08:11(sdb1)
[  418.346164] requeue_io 308: inode 16 size 81 at 08:11(sdb1)
[  418.346170] requeue_io 308: inode 17 size 212 at 08:11(sdb1)
[  418.346176] requeue_io 308: inode 18 size 431 at 08:11(sdb1)
[  418.346181] requeue_io 308: inode 19 size 231 at 08:11(sdb1)
[  418.346187] requeue_io 308: inode 20 size 1756 at 08:11(sdb1)
[  418.346193] requeue_io 308: inode 21 size 1229 at 08:11(sdb1)
[  418.346198] requeue_io 308: inode 22 size 157 at 08:11(sdb1)
[  418.346204] requeue_io 308: inode 23 size 3430 at 08:11(sdb1)
[  418.346210] requeue_io 308: inode 24 size 200 at 08:11(sdb1)
[  418.346215] requeue_io 308: inode 25 size 202 at 08:11(sdb1)
[  418.346221] requeue_io 308: inode 26 size 386 at 08:11(sdb1)
[  418.346226] requeue_io 308: inode 27 size 264 at 08:11(sdb1)
[  418.346232] requeue_io 308: inode 28 size 268 at 08:11(sdb1)
[  418.346238] requeue_io 308: inode 29 size 1228 at 08:11(sdb1)
[  418.346243] requeue_io 308: inode 30 size 404 at 08:11(sdb1)
[  418.346249] requeue_io 308: inode 31 size 2452 at 08:11(sdb1)
[  418.346255] requeue_io 308: inode 32 size 1236 at 08:11(sdb1)
[  418.346260] requeue_io 308: inode 33 size 655 at 08:11(sdb1)
[  418.346266] requeue_io 308: inode 35 size 330 at 08:11(sdb1)
[  418.346272] requeue_io 308: inode 36 size 248 at 08:11(sdb1)
[  418.346277] requeue_io 308: inode 37 size 683 at 08:11(sdb1)
[  418.346283] requeue_io 308: inode 38 size 1451 at 08:11(sdb1)
[  418.346288] requeue_io 308: inode 39 size 894 at 08:11(sdb1)
[  418.346294] requeue_io 308: inode 40 size 879 at 08:11(sdb1)
[  418.346300] requeue_io 308: inode 42 size 797 at 08:11(sdb1)
[  418.346305] requeue_io 308: inode 43 size 1314 at 08:11(sdb1)
[  418.346311] requeue_io 308: inode 44 size 1463 at 08:11(sdb1)
[  418.346317] requeue_io 308: inode 45 size 3032 at 08:11(sdb1)
[  418.346322] requeue_io 308: inode 46 size 325 at 08:11(sdb1)
[  418.346328] requeue_io 308: inode 47 size 583 at 08:11(sdb1)
[  418.346334] requeue_io 308: inode 48 size 1660 at 08:11(sdb1)
[  418.346339] requeue_io 308: inode 49 size 3159 at 08:11(sdb1)
[  418.346345] requeue_io 308: inode 50 size 510 at 08:11(sdb1)
[  418.346350] requeue_io 308: inode 51 size 100 at 08:11(sdb1)
[  418.346356] requeue_io 308: inode 52 size 143 at 08:11(sdb1)
[  418.346370] requeue_io 308: inode 53 size 954 at 08:11(sdb1)
[  418.346373] requeue_io 308: inode 54 size 322 at 08:11(sdb1)
[  418.346376] requeue_io 308: inode 55 size 970 at 08:11(sdb1)
[  418.346379] requeue_io 308: inode 57 size 483 at 08:11(sdb1)
[  418.346382] requeue_io 308: inode 58 size 1125 at 08:11(sdb1)
[  418.346385] requeue_io 308: inode 59 size 2196 at 08:11(sdb1)
[  418.346388] requeue_io 308: inode 60 size 104 at 08:11(sdb1)
[  418.346391] requeue_io 308: inode 61 size 488 at 08:11(sdb1)
[  418.346394] requeue_io 308: inode 62 size 116 at 08:11(sdb1)
[  418.346397] requeue_io 308: inode 63 size 907 at 08:11(sdb1)
[  418.346400] requeue_io 308: inode 64 size 1076 at 08:11(sdb1)
[  418.346403] requeue_io 308: inode 65 size 460 at 08:11(sdb1)
[  418.346406] requeue_io 308: inode 66 size 1092 at 08:11(sdb1)
[  418.346409] requeue_io 308: inode 67 size 424 at 08:11(sdb1)
[  418.346412] requeue_io 308: inode 68 size 696 at 08:11(sdb1)
[  418.346415] requeue_io 308: inode 70 size 137 at 08:11(sdb1)
[  418.346418] requeue_io 308: inode 71 size 201 at 08:11(sdb1)
[  418.346421] requeue_io 308: inode 72 size 150 at 08:11(sdb1)
[  418.346424] requeue_io 308: inode 73 size 188 at 08:11(sdb1)
[  418.346427] requeue_io 308: inode 75 size 1208 at 08:11(sdb1)
[  418.346431] requeue_io 308: inode 76 size 493 at 08:11(sdb1)
[  418.346434] requeue_io 308: inode 77 size 484 at 08:11(sdb1)
[  418.346437] requeue_io 308: inode 78 size 356 at 08:11(sdb1)
[  418.346440] requeue_io 308: inode 79 size 895 at 08:11(sdb1)
[  418.346443] requeue_io 308: inode 80 size 847 at 08:11(sdb1)
[  418.346446] requeue_io 308: inode 81 size 3281 at 08:11(sdb1)
[  418.346449] requeue_io 308: inode 82 size 3329 at 08:11(sdb1)
[  418.346452] requeue_io 308: inode 83 size 115 at 08:11(sdb1)
[  418.346455] requeue_io 308: inode 84 size 644 at 08:11(sdb1)
[  418.346458] requeue_io 308: inode 85
Re: 100% iowait on one of cpus in current -git
On Monday 22 October 2007 15:10:45 Fengguang Wu wrote:
> On Mon, Oct 22, 2007 at 03:05:35PM +0200, Maxim Levitsky wrote:
> > Hi,
> > Thank you very much too, for helping me.
>
> You are welcome :-)
>
> > Applied.
> > Had to kill klogd, since the kernel generates tons of "redirtied inode" messages.
> > The kern.log is 863 KB, so I don't think it is polite to attach it.
> > Don't know whether it is OK to put it on pastebin either.
> >
> > Anyway, it shows lots of "redirtied inode ..." messages,
> > and while most of them are "at 08:02(sda2)", my reiserfs root partition, some are
> >
> > "Oct 22 14:50:27 MAIN kernel: [   73.643794] redirtied inode 2582 size 0 at 00:0f(tmpfs), line 300."
> >
> > "line 300" is shown always
> >
> > (I have /var/run, /var/lock, /dev mounted as tmpfs, default kubuntu setup)
>
> Thank you for testing this out.
>
> Hmm, maybe it's a reiserfs-related issue. Do you have the full log file?

No, I don't think so; like I said, it sometimes shows the same message on tmpfs.

> Thank you,
> Fengguang

Best Regards,
Maxim Levitsky
- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: 100% iowait on one of cpus in current -git
On Mon, Oct 22, 2007 at 03:05:35PM +0200, Maxim Levitsky wrote:
> Hi,
> Thank you very much too, for helping me.

You are welcome :-)

> Applied.
> Had to kill klogd, since the kernel generates tons of "redirtied inode" messages.
> The kern.log is 863 KB, so I don't think it is polite to attach it.
> Don't know whether it is OK to put it on pastebin either.
>
> Anyway, it shows lots of "redirtied inode ..." messages,
> and while most of them are "at 08:02(sda2)", my reiserfs root partition, some are
>
> "Oct 22 14:50:27 MAIN kernel: [   73.643794] redirtied inode 2582 size 0 at 00:0f(tmpfs), line 300."
>
> "line 300" is shown always
>
> (I have /var/run, /var/lock, /dev mounted as tmpfs, default kubuntu setup)

Thank you for testing this out.

Hmm, maybe it's a reiserfs-related issue. Do you have the full log file?

Thank you,
Fengguang
Re: 100% iowait on one of cpus in current -git
On Monday 22 October 2007 14:37:07 Fengguang Wu wrote:
> On Mon, Oct 22, 2007 at 02:21:21PM +0200, Maxim Levitsky wrote:
> > I bisected this bug to exactly this commit:
> >
> > 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b
> > writeback: introduce writeback_control.more_io to indicate more io
> >
> > Reverting it and compiling latest git shows no more signs of that bug.
>
> Thank you very much.
>
> I guess your system has some difficulty in writing back some inodes.
> (i.e. a bug disclosed by this patch; the 100% iowait only makes it
> more obvious)
>
> I cannot reproduce it with your .config, so would you recompile and
> run the kernel with the above commit _and_ the below debugging patch?
>
> Thank you,
> Fengguang
> ---
>
>  fs/fs-writeback.c |   15 ++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> --- linux-2.6.23-git17.orig/fs/fs-writeback.c
> +++ linux-2.6.23-git17/fs/fs-writeback.c
> @@ -164,12 +164,25 @@ static void redirty_tail(struct inode *i
>  	list_move(&inode->i_list, &sb->s_dirty);
>  }
>
> +#define requeue_io(inode)				\
> +	do {						\
> +		__requeue_io(inode, __LINE__);		\
> +	} while (0)
> +
>  /*
>   * requeue inode for re-scanning after sb->s_io list is exhausted.
>   */
> -static void requeue_io(struct inode *inode)
> +static void __requeue_io(struct inode *inode, int line)
>  {
>  	list_move(&inode->i_list, &inode->i_sb->s_more_io);
> +
> +	printk(KERN_DEBUG "redirtied inode %lu size %llu at %02x:%02x(%s), line %d.\n",
> +		inode->i_ino,
> +		i_size_read(inode),
> +		MAJOR(inode->i_sb->s_dev),
> +		MINOR(inode->i_sb->s_dev),
> +		inode->i_sb->s_id,
> +		line);
>  }
>
>  static void inode_sync_complete(struct inode *inode)

Hi,

Thank you very much too, for helping me.

Applied.
Had to kill klogd, since the kernel generates tons of "redirtied inode" messages.
The kern.log is 863 KB, so I don't think it is polite to attach it.
Don't know whether it is OK to put it on pastebin either.

Anyway, it shows lots of "redirtied inode ..." messages,
and while most of them are "at 08:02(sda2)", my reiserfs root partition, some are

"Oct 22 14:50:27 MAIN kernel: [   73.643794] redirtied inode 2582 size 0 at 00:0f(tmpfs), line 300."

"line 300" is shown always

(I have /var/run, /var/lock, /dev mounted as tmpfs, default kubuntu setup)

Best regards,
Maxim Levitsky
Re: 100% iowait on one of cpus in current -git
On Mon, Oct 22, 2007 at 02:21:21PM +0200, Maxim Levitsky wrote:
> I bisected this bug to exactly this commit:
>
> 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b
> writeback: introduce writeback_control.more_io to indicate more io
>
> Reverting it and compiling latest git shows no more signs of that bug.

Thank you very much.

I guess your system has some difficulty in writing back some inodes.
(i.e. a bug disclosed by this patch; the 100% iowait only makes it
more obvious)

I cannot reproduce it with your .config, so would you recompile and
run the kernel with the above commit _and_ the below debugging patch?

Thank you,
Fengguang
---

 fs/fs-writeback.c |   15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

--- linux-2.6.23-git17.orig/fs/fs-writeback.c
+++ linux-2.6.23-git17/fs/fs-writeback.c
@@ -164,12 +164,25 @@ static void redirty_tail(struct inode *i
 	list_move(&inode->i_list, &sb->s_dirty);
 }

+#define requeue_io(inode)				\
+	do {						\
+		__requeue_io(inode, __LINE__);		\
+	} while (0)
+
 /*
  * requeue inode for re-scanning after sb->s_io list is exhausted.
  */
-static void requeue_io(struct inode *inode)
+static void __requeue_io(struct inode *inode, int line)
 {
 	list_move(&inode->i_list, &inode->i_sb->s_more_io);
+
+	printk(KERN_DEBUG "redirtied inode %lu size %llu at %02x:%02x(%s), line %d.\n",
+		inode->i_ino,
+		i_size_read(inode),
+		MAJOR(inode->i_sb->s_dev),
+		MINOR(inode->i_sb->s_dev),
+		inode->i_sb->s_id,
+		line);
 }

 static void inode_sync_complete(struct inode *inode)
Re: 100% iowait on one of cpus in current -git
On Monday 22 October 2007 13:19:08 Fengguang Wu wrote:
> On Mon, Oct 22, 2007 at 12:58:11PM +0200, Maxim Levitsky wrote:
> > On Monday 22 October 2007 12:55:25 Fengguang Wu wrote:
> > > On Mon, Oct 22, 2007 at 12:40:24PM +0200, Maxim Levitsky wrote:
> > > > On Monday 22 October 2007 12:22:10 Peter Zijlstra wrote:
> > > > > > [  673.365631] pdflush       D c21bdecc     0   221      2
> > > > > > [  673.365635]  c21bdee0 0046 0002 c21bdecc c21bdec4 c21b3000 0002
> > > > > > [  673.365643]  c0134892 c21b3164 c1e00200 0001 c7109280 c21bdec0 c03ff849 c21bdef0
> > > > > > [  673.365650]  00052974 00ff c21bdef0 000529dc
> > > > > > [  673.365657] Call Trace:
> > > > > > [  673.365659]  [<c03fd728>] schedule_timeout+0x48/0xc0
> > > > > > [  673.365663]  [<c03fd50e>] io_schedule_timeout+0x5e/0xb0
> > > > > > [  673.365667]  [<c0170d11>] congestion_wait+0x71/0x90
> > > > > > [  673.365671]  [<c016b92e>] wb_kupdate+0x9e/0xf0
> > > > > > [  673.365675]  [<c016beb2>] pdflush+0x102/0x1d0
> > > > > > [  673.365679]  [<c013fa82>] kthread+0x42/0x70
> > > > > > [  673.365683]  [<c01050df>] kernel_thread_helper+0x7/0x18
> > > > >
> > > > > That looks more like the inode writeback patches from Wu than the per
> > > > > bdi dirty stuff. The latter typically hangs in balance_dirty_pages().
> > > >
> > > > Yes, you are right,
> > > >
> > > > both revisions 1f7d6668c29b1dfa307a44844f9bb38356fc989b and
> > > > 3e26c149c358529b1605f8959341d34bc4b880a3 work fine.
> > > > But I didn't pay attention that those are before
> > > > f4a1c2bce002f683801bcdbbc9fd89804614fb6b.
> > > > So, back to the drawing board :-)
> > > >
> > > > Will test revision 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b, just after
> > > > the writeback patches.
> > >
> > > Thank you. I'll try if I can reproduce it locally...
> > >
> > > Fengguang
> >
> > Bingo,
> >
> > Revision 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b shows this bug.
> >
> > I will now bisect to find the exact patch that caused this bug.
>
> This one is most relevant:
>
> 1f7decf6d9f06dac008b8d66935c0c3b18e564f9
> writeback: introduce writeback_control.more_io to indicate more io

Exactly.

> Still compiling the kernel...
>
> Thank you,
> Fengguang

Hi,

I bisected this bug to exactly this commit:

2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b
writeback: introduce writeback_control.more_io to indicate more io

Reverting it and compiling latest git shows no more signs of that bug.

Thanks,
Best regards,
Maxim Levitsky
Re: 100% iowait on one of cpus in current -git
On Mon, Oct 22, 2007 at 12:58:11PM +0200, Maxim Levitsky wrote:
> On Monday 22 October 2007 12:55:25 Fengguang Wu wrote:
> > On Mon, Oct 22, 2007 at 12:40:24PM +0200, Maxim Levitsky wrote:
> > > On Monday 22 October 2007 12:22:10 Peter Zijlstra wrote:
> > > > > [  673.365631] pdflush       D c21bdecc     0   221      2
> > > > > [  673.365635]  c21bdee0 0046 0002 c21bdecc c21bdec4 c21b3000 0002
> > > > > [  673.365643]  c0134892 c21b3164 c1e00200 0001 c7109280 c21bdec0 c03ff849 c21bdef0
> > > > > [  673.365650]  00052974 00ff c21bdef0 000529dc
> > > > > [  673.365657] Call Trace:
> > > > > [  673.365659]  [<c03fd728>] schedule_timeout+0x48/0xc0
> > > > > [  673.365663]  [<c03fd50e>] io_schedule_timeout+0x5e/0xb0
> > > > > [  673.365667]  [<c0170d11>] congestion_wait+0x71/0x90
> > > > > [  673.365671]  [<c016b92e>] wb_kupdate+0x9e/0xf0
> > > > > [  673.365675]  [<c016beb2>] pdflush+0x102/0x1d0
> > > > > [  673.365679]  [<c013fa82>] kthread+0x42/0x70
> > > > > [  673.365683]  [<c01050df>] kernel_thread_helper+0x7/0x18
> > > >
> > > > That looks more like the inode writeback patches from Wu than the per
> > > > bdi dirty stuff. The latter typically hangs in balance_dirty_pages().
> > >
> > > Yes, you are right,
> > >
> > > both revisions 1f7d6668c29b1dfa307a44844f9bb38356fc989b and
> > > 3e26c149c358529b1605f8959341d34bc4b880a3 work fine.
> > > But I didn't pay attention that those are before
> > > f4a1c2bce002f683801bcdbbc9fd89804614fb6b.
> > > So, back to the drawing board :-)
> > >
> > > Will test revision 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b, just after
> > > the writeback patches.
> >
> > Thank you. I'll try if I can reproduce it locally...
> >
> > Fengguang
>
> Bingo,
>
> Revision 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b shows this bug.
>
> I will now bisect to find the exact patch that caused this bug.

This one is most relevant:

1f7decf6d9f06dac008b8d66935c0c3b18e564f9
writeback: introduce writeback_control.more_io to indicate more io

Still compiling the kernel...

Thank you,
Fengguang
Re: 100% iowait on one of cpus in current -git
On Monday 22 October 2007 12:55:25 Fengguang Wu wrote:
> On Mon, Oct 22, 2007 at 12:40:24PM +0200, Maxim Levitsky wrote:
> > On Monday 22 October 2007 12:22:10 Peter Zijlstra wrote:
> > > > [  673.365631] pdflush       D c21bdecc     0   221      2
> > > > [  673.365635]  c21bdee0 0046 0002 c21bdecc c21bdec4 c21b3000 0002
> > > > [  673.365643]  c0134892 c21b3164 c1e00200 0001 c7109280 c21bdec0 c03ff849 c21bdef0
> > > > [  673.365650]  00052974 00ff c21bdef0 000529dc
> > > > [  673.365657] Call Trace:
> > > > [  673.365659]  [<c03fd728>] schedule_timeout+0x48/0xc0
> > > > [  673.365663]  [<c03fd50e>] io_schedule_timeout+0x5e/0xb0
> > > > [  673.365667]  [<c0170d11>] congestion_wait+0x71/0x90
> > > > [  673.365671]  [<c016b92e>] wb_kupdate+0x9e/0xf0
> > > > [  673.365675]  [<c016beb2>] pdflush+0x102/0x1d0
> > > > [  673.365679]  [<c013fa82>] kthread+0x42/0x70
> > > > [  673.365683]  [<c01050df>] kernel_thread_helper+0x7/0x18
> > >
> > > That looks more like the inode writeback patches from Wu than the per
> > > bdi dirty stuff. The latter typically hangs in balance_dirty_pages().
> >
> > Yes, you are right,
> >
> > both revisions 1f7d6668c29b1dfa307a44844f9bb38356fc989b and
> > 3e26c149c358529b1605f8959341d34bc4b880a3 work fine.
> > But I didn't pay attention that those are before
> > f4a1c2bce002f683801bcdbbc9fd89804614fb6b.
> > So, back to the drawing board :-)
> >
> > Will test revision 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b, just after
> > the writeback patches.
>
> Thank you. I'll try if I can reproduce it locally...
>
> Fengguang

Bingo,

Revision 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b shows this bug.

I will now bisect to find the exact patch that caused this bug.

Thanks,
Maxim Levitsky
Re: 100% iowait on one of cpus in current -git
On Mon, Oct 22, 2007 at 12:40:24PM +0200, Maxim Levitsky wrote:
> On Monday 22 October 2007 12:22:10 Peter Zijlstra wrote:
> > > [  673.365631] pdflush       D c21bdecc     0   221      2
> > > [  673.365635]  c21bdee0 0046 0002 c21bdecc c21bdec4 c21b3000 0002
> > > [  673.365643]  c0134892 c21b3164 c1e00200 0001 c7109280 c21bdec0 c03ff849 c21bdef0
> > > [  673.365650]  00052974 00ff c21bdef0 000529dc
> > > [  673.365657] Call Trace:
> > > [  673.365659]  [<c03fd728>] schedule_timeout+0x48/0xc0
> > > [  673.365663]  [<c03fd50e>] io_schedule_timeout+0x5e/0xb0
> > > [  673.365667]  [<c0170d11>] congestion_wait+0x71/0x90
> > > [  673.365671]  [<c016b92e>] wb_kupdate+0x9e/0xf0
> > > [  673.365675]  [<c016beb2>] pdflush+0x102/0x1d0
> > > [  673.365679]  [<c013fa82>] kthread+0x42/0x70
> > > [  673.365683]  [<c01050df>] kernel_thread_helper+0x7/0x18
> >
> > That looks more like the inode writeback patches from Wu than the per
> > bdi dirty stuff. The latter typically hangs in balance_dirty_pages().
>
> Yes, you are right,
>
> both revisions 1f7d6668c29b1dfa307a44844f9bb38356fc989b and
> 3e26c149c358529b1605f8959341d34bc4b880a3 work fine.
> But I didn't pay attention that those are before
> f4a1c2bce002f683801bcdbbc9fd89804614fb6b.
> So, back to the drawing board :-)
>
> Will test revision 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b, just after
> the writeback patches.

Thank you. I'll try if I can reproduce it locally...

Fengguang
Re: 100% iowait on one of cpus in current -git
On Monday 22 October 2007 12:22:10 Peter Zijlstra wrote:
> On Mon, 2007-10-22 at 11:59 +0200, Maxim Levitsky wrote:
> > On Monday 22 October 2007 11:41:57 Peter Zijlstra wrote:
> > > On Mon, 2007-10-22 at 08:22 +0200, Maxim Levitsky wrote:
> > > > Hi,
> > > >
> > > > I found a bug in current -git:
> > > >
> > > > On my system one of the cpus stays 100% in iowait mode (I have a Core 2 Duo).
> > > > Otherwise the system works OK, no disk activity and/or slowdown.
> > > > Suspecting that this is a swap-related problem I tried to turn swap off,
> > > > but it doesn't affect anything.
> > > > It is probably some accounting bug.
> > > >
> > > > If I start with init=/bin/bash, then this disappears.
> > > > I tried then to start the usual /etc/init.d scripts, and the first one to
> > > > show this bug was gpm.
> > > > But then I rebooted the system to X without gpm, and I still see 100%
> > > > iowait.
> > > >
> > > > No additional messages in dmesg.
> > >
> > > does sysrq-t show any D state tasks?
> >
> > This one:
> > Probably per-block device dirty writeback?
> > I am compiling now revision 1f7d6668c29b1dfa307a44844f9bb38356fc989b
> > Thanks for the pointer.
> >
> > [  673.365631] pdflush       D c21bdecc     0   221      2
> > [  673.365635]  c21bdee0 0046 0002 c21bdecc c21bdec4 c21b3000 0002
> > [  673.365643]  c0134892 c21b3164 c1e00200 0001 c7109280 c21bdec0 c03ff849 c21bdef0
> > [  673.365650]  00052974 00ff c21bdef0 000529dc
> > [  673.365657] Call Trace:
> > [  673.365659]  [<c03fd728>] schedule_timeout+0x48/0xc0
> > [  673.365663]  [<c03fd50e>] io_schedule_timeout+0x5e/0xb0
> > [  673.365667]  [<c0170d11>] congestion_wait+0x71/0x90
> > [  673.365671]  [<c016b92e>] wb_kupdate+0x9e/0xf0
> > [  673.365675]  [<c016beb2>] pdflush+0x102/0x1d0
> > [  673.365679]  [<c013fa82>] kthread+0x42/0x70
> > [  673.365683]  [<c01050df>] kernel_thread_helper+0x7/0x18
>
> That looks more like the inode writeback patches from Wu than the per
> bdi dirty stuff. The latter typically hangs in balance_dirty_pages().

Yes, you are right,

both revisions 1f7d6668c29b1dfa307a44844f9bb38356fc989b and
3e26c149c358529b1605f8959341d34bc4b880a3 work fine.
But I didn't pay attention that those are before
f4a1c2bce002f683801bcdbbc9fd89804614fb6b.
So, back to the drawing board :-)

Will test revision 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b, just after
the writeback patches.

Thanks,
Best regards,
Maxim Levitsky
Re: 100% iowait on one of cpus in current -git
On Mon, 2007-10-22 at 11:59 +0200, Maxim Levitsky wrote:
> On Monday 22 October 2007 11:41:57 Peter Zijlstra wrote:
> > On Mon, 2007-10-22 at 08:22 +0200, Maxim Levitsky wrote:
> > > Hi,
> > >
> > > I found a bug in current -git:
> > >
> > > On my system one of the cpus stays 100% in iowait mode (I have a Core 2 Duo).
> > > Otherwise the system works OK, no disk activity and/or slowdown.
> > > Suspecting that this is a swap-related problem I tried to turn swap off,
> > > but it doesn't affect anything.
> > > It is probably some accounting bug.
> > >
> > > If I start with init=/bin/bash, then this disappears.
> > > I tried then to start the usual /etc/init.d scripts, and the first one to
> > > show this bug was gpm.
> > > But then I rebooted the system to X without gpm, and I still see 100%
> > > iowait.
> > >
> > > No additional messages in dmesg.
> >
> > does sysrq-t show any D state tasks?
>
> This one:
> Probably per-block device dirty writeback?
> I am compiling now revision 1f7d6668c29b1dfa307a44844f9bb38356fc989b
> Thanks for the pointer.
>
> [  673.365631] pdflush       D c21bdecc     0   221      2
> [  673.365635]  c21bdee0 0046 0002 c21bdecc c21bdec4 c21b3000 0002
> [  673.365643]  c0134892 c21b3164 c1e00200 0001 c7109280 c21bdec0 c03ff849 c21bdef0
> [  673.365650]  00052974 00ff c21bdef0 000529dc
> [  673.365657] Call Trace:
> [  673.365659]  [<c03fd728>] schedule_timeout+0x48/0xc0
> [  673.365663]  [<c03fd50e>] io_schedule_timeout+0x5e/0xb0
> [  673.365667]  [<c0170d11>] congestion_wait+0x71/0x90
> [  673.365671]  [<c016b92e>] wb_kupdate+0x9e/0xf0
> [  673.365675]  [<c016beb2>] pdflush+0x102/0x1d0
> [  673.365679]  [<c013fa82>] kthread+0x42/0x70
> [  673.365683]  [<c01050df>] kernel_thread_helper+0x7/0x18

That looks more like the inode writeback patches from Wu than the per
bdi dirty stuff. The latter typically hangs in balance_dirty_pages().
Re: 100% iowait on one of cpus in current -git
On Monday 22 October 2007 11:41:57 Peter Zijlstra wrote:
> On Mon, 2007-10-22 at 08:22 +0200, Maxim Levitsky wrote:
> > Hi,
> >
> > I found a bug in current -git:
> >
> > On my system one of the cpus stays 100% in iowait mode (I have a Core 2 Duo).
> > Otherwise the system works OK, no disk activity and/or slowdown.
> > Suspecting that this is a swap-related problem I tried to turn swap off, but
> > it doesn't affect anything.
> > It is probably some accounting bug.
> >
> > If I start with init=/bin/bash, then this disappears.
> > I tried then to start the usual /etc/init.d scripts, and the first one to show
> > this bug was gpm.
> > But then I rebooted the system to X without gpm, and I still see 100%
> > iowait.
> >
> > No additional messages in dmesg.
>
> does sysrq-t show any D state tasks?

This one:
Probably per-block device dirty writeback?
I am compiling now revision 1f7d6668c29b1dfa307a44844f9bb38356fc989b
Thanks for the pointer.

[  673.365631] pdflush       D c21bdecc     0   221      2
[  673.365635]  c21bdee0 0046 0002 c21bdecc c21bdec4 c21b3000 0002
[  673.365643]  c0134892 c21b3164 c1e00200 0001 c7109280 c21bdec0 c03ff849 c21bdef0
[  673.365650]  00052974 00ff c21bdef0 000529dc
[  673.365657] Call Trace:
[  673.365659]  [<c03fd728>] schedule_timeout+0x48/0xc0
[  673.365663]  [<c03fd50e>] io_schedule_timeout+0x5e/0xb0
[  673.365667]  [<c0170d11>] congestion_wait+0x71/0x90
[  673.365671]  [<c016b92e>] wb_kupdate+0x9e/0xf0
[  673.365675]  [<c016beb2>] pdflush+0x102/0x1d0
[  673.365679]  [<c013fa82>] kthread+0x42/0x70
[  673.365683]  [<c01050df>] kernel_thread_helper+0x7/0x18

Best regards,
Maxim Levitsky
Re: 100% iowait on one of cpus in current -git
On Monday 22 October 2007 11:11:52 Paolo Ornati wrote:
> On Mon, 22 Oct 2007 08:22:52 +0200
> Maxim Levitsky <[EMAIL PROTECTED]> wrote:
>
> > I tried to bisect this, but eventually I ran into other bugs that cause
> > the system to oops early.
>
> You can pick a different revision to test with:
> git-reset --hard "SHA1"
>
> Choose one with "git-bisect visualize".

Well, I know that, and I did try this a lot.
The problem is that between the good and bad revisions there are a few bugs
that cause the system to oops early, thus I can't tell whether the 100% iowait
bug is present or not.

Best regards,
Maxim Levitsky
Re: 100% iowait on one of cpus in current -git
On Mon, 2007-10-22 at 08:22 +0200, Maxim Levitsky wrote:
> Hi,
>
> I found a bug in current -git:
>
> On my system one of the cpus stays 100% in iowait mode (I have a Core 2 Duo).
> Otherwise the system works OK, no disk activity and/or slowdown.
> Suspecting that this is a swap-related problem I tried to turn swap off, but
> it doesn't affect anything.
> It is probably some accounting bug.
>
> If I start with init=/bin/bash, then this disappears.
> I tried then to start the usual /etc/init.d scripts, and the first one to show
> this bug was gpm.
> But then I rebooted the system to X without gpm, and I still see 100% iowait.
>
> No additional messages in dmesg.

does sysrq-t show any D state tasks?
Re: 100% iowait on one of cpus in current -git
On Mon, 22 Oct 2007 08:22:52 +0200
Maxim Levitsky <[EMAIL PROTECTED]> wrote:

> I tried to bisect this, but eventually I ran into other bugs that cause
> the system to oops early.

You can pick a different revision to test with:
git-reset --hard "SHA1"

Choose one with "git-bisect visualize".

--
Paolo Ornati
Linux 2.6.23-ge8b8c977 on x86_64
Re: 100% iowait on one of cpus in current -git
On Mon, 22 Oct 2007 08:22:52 +0200 Maxim Levitsky [EMAIL PROTECTED] wrote: I tried to bisect this, but eventually I run into other bugs that cause system to oops early. You can pick a different revision to test with: git-reset --hard SHA1 Choose one with git-bisect visualize. -- Paolo Ornati Linux 2.6.23-ge8b8c977 on x86_64 - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 100% iowait on one of cpus in current -git
On Mon, 2007-10-22 at 08:22 +0200, Maxim Levitsky wrote: Hi, I found a bug in current -git: On my system on of cpus stays 100% in iowait mode (I have core 2 duo) Otherwise the system works OK, no disk activity and/or slowdown. Suspecting that this is a swap-related problem I tried to turn swap of, but it doesn't affect anything. It is probably some accounting bug. If I start with init=/bin/bash, then this disappears. I tried then to start usual /etc/init.d scripts then, and first one to show this bug was gpm. but then I rebooted the system to X without gpm, and I still see 100% iowait. No additional messages in dmesg. does sysrq-t show any D state tasks? - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 100% iowait on one of cpus in current -git
On Monday 22 October 2007 11:11:52 Paolo Ornati wrote: On Mon, 22 Oct 2007 08:22:52 +0200 Maxim Levitsky [EMAIL PROTECTED] wrote: I tried to bisect this, but eventually I run into other bugs that cause system to oops early. You can pick a different revision to test with: git-reset --hard SHA1 Choose one with git-bisect visualize. Well, I know that, and I did try this a lot. The problem is that between good and bad revisions there are few bugs that cause the system to oops early, thus I can't tell whenever the 100% iowait bug is present or not. Best regards, Maxim Levitsky - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 100% iowait on one of cpus in current -git
On Monday 22 October 2007 11:41:57 Peter Zijlstra wrote: On Mon, 2007-10-22 at 08:22 +0200, Maxim Levitsky wrote: Hi, I found a bug in current -git: On my system on of cpus stays 100% in iowait mode (I have core 2 duo) Otherwise the system works OK, no disk activity and/or slowdown. Suspecting that this is a swap-related problem I tried to turn swap of, but it doesn't affect anything. It is probably some accounting bug. If I start with init=/bin/bash, then this disappears. I tried then to start usual /etc/init.d scripts then, and first one to show this bug was gpm. but then I rebooted the system to X without gpm, and I still see 100% iowait. No additional messages in dmesg. does sysrq-t show any D state tasks? This one: Probably per-block device dirty writeback? I am compiling now revision 1f7d6668c29b1dfa307a44844f9bb38356fc989b Thanks for the pointer. [ 673.365631] pdflush D c21bdecc 0 221 2 [ 673.365635]c21bdee0 0046 0002 c21bdecc c21bdec4 c21b3000 0002 [ 673.365643]c0134892 c21b3164 c1e00200 0001 c7109280 c21bdec0 c03ff849 c21bdef0 [ 673.365650]00052974 00ff c21bdef0 000529dc [ 673.365657] Call Trace: [ 673.365659] [c03fd728] schedule_timeout+0x48/0xc0 [ 673.365663] [c03fd50e] io_schedule_timeout+0x5e/0xb0 [ 673.365667] [c0170d11] congestion_wait+0x71/0x90 [ 673.365671] [c016b92e] wb_kupdate+0x9e/0xf0 [ 673.365675] [c016beb2] pdflush+0x102/0x1d0 [ 673.365679] [c013fa82] kthread+0x42/0x70 [ 673.365683] [c01050df] kernel_thread_helper+0x7/0x18 Best regards, Maxim Levitsky - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 100% iowait on one of cpus in current -git
On Mon, 2007-10-22 at 11:59 +0200, Maxim Levitsky wrote: On Monday 22 October 2007 11:41:57 Peter Zijlstra wrote: On Mon, 2007-10-22 at 08:22 +0200, Maxim Levitsky wrote: Hi, I found a bug in current -git: On my system on of cpus stays 100% in iowait mode (I have core 2 duo) Otherwise the system works OK, no disk activity and/or slowdown. Suspecting that this is a swap-related problem I tried to turn swap of, but it doesn't affect anything. It is probably some accounting bug. If I start with init=/bin/bash, then this disappears. I tried then to start usual /etc/init.d scripts then, and first one to show this bug was gpm. but then I rebooted the system to X without gpm, and I still see 100% iowait. No additional messages in dmesg. does sysrq-t show any D state tasks? This one: Probably per-block device dirty writeback? I am compiling now revision 1f7d6668c29b1dfa307a44844f9bb38356fc989b Thanks for the pointer. [ 673.365631] pdflush D c21bdecc 0 221 2 [ 673.365635]c21bdee0 0046 0002 c21bdecc c21bdec4 c21b3000 0002 [ 673.365643]c0134892 c21b3164 c1e00200 0001 c7109280 c21bdec0 c03ff849 c21bdef0 [ 673.365650]00052974 00ff c21bdef0 000529dc [ 673.365657] Call Trace: [ 673.365659] [c03fd728] schedule_timeout+0x48/0xc0 [ 673.365663] [c03fd50e] io_schedule_timeout+0x5e/0xb0 [ 673.365667] [c0170d11] congestion_wait+0x71/0x90 [ 673.365671] [c016b92e] wb_kupdate+0x9e/0xf0 [ 673.365675] [c016beb2] pdflush+0x102/0x1d0 [ 673.365679] [c013fa82] kthread+0x42/0x70 [ 673.365683] [c01050df] kernel_thread_helper+0x7/0x18 That looks more like the inode writeback patches from Wu than the per bdi dirty stuff. The later typically hangs in balance_dirty_pages(). - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 100% iowait on one of cpus in current -git
On Monday 22 October 2007 12:22:10 Peter Zijlstra wrote:
> [...]
> That looks more like the inode writeback patches from Wu than the per-bdi
> dirty stuff. The latter typically hangs in balance_dirty_pages().

Yes, you are right: both revisions 1f7d6668c29b1dfa307a44844f9bb38356fc989b
and 3e26c149c358529b1605f8959341d34bc4b880a3 work fine. But I didn't pay
attention that those are before f4a1c2bce002f683801bcdbbc9fd89804614fb6b.
So, back to the drawing board :-) Will test revision
2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b, just after the writeback patches.

Thanks,
Best regards,
Maxim Levitsky
Re: 100% iowait on one of cpus in current -git
On Mon, Oct 22, 2007 at 12:40:24PM +0200, Maxim Levitsky wrote:
> [...]
> So, back to the drawing board :-) Will test revision
> 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b, just after the writeback patches.

Thank you. I'll see if I can reproduce it locally...

Fengguang
Re: 100% iowait on one of cpus in current -git
On Monday 22 October 2007 12:55:25 Fengguang Wu wrote:
> [...]
> Thank you. I'll see if I can reproduce it locally...

Bingo. Revision 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b shows this bug.
I will now bisect to find the exact patch that caused it.

Thanks,
Maxim Levitsky
Re: 100% iowait on one of cpus in current -git
On Mon, Oct 22, 2007 at 12:58:11PM +0200, Maxim Levitsky wrote:
> [...]
> Bingo. Revision 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b shows this bug.
> I will now bisect to find the exact patch that caused it.

This one is most relevant:

	1f7decf6d9f06dac008b8d66935c0c3b18e564f9
	writeback: introduce writeback_control.more_io to indicate more io

Still compiling the kernel...

Thank you,
Fengguang
Re: 100% iowait on one of cpus in current -git
On Monday 22 October 2007 13:19:08 Fengguang Wu wrote:
> [...]
> This one is most relevant:
>
> 	1f7decf6d9f06dac008b8d66935c0c3b18e564f9
> 	writeback: introduce writeback_control.more_io to indicate more io

Exactly.

> Still compiling the kernel...

Hi,

I bisected this bug to exactly this commit:

	2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b
	writeback: introduce writeback_control.more_io to indicate more io

Reverting it and compiling latest git shows no more signs of that bug.

Thanks,
Best regards,
Maxim Levitsky
Re: 100% iowait on one of cpus in current -git
On Mon, Oct 22, 2007 at 02:21:21PM +0200, Maxim Levitsky wrote:
> I bisected this bug to exactly this commit:
>
> 	2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b
> 	writeback: introduce writeback_control.more_io to indicate more io
>
> Reverting it and compiling latest git shows no more signs of that bug.

Thank you very much. I guess your system has some difficulty in writing
back some inodes (i.e. a bug disclosed by this patch; the 100% iowait
only makes it more obvious).

I cannot reproduce it with your .config, so would you recompile and run
the kernel with the above commit _and_ the below debugging patch?

Thank you,
Fengguang
---
 fs/fs-writeback.c |   15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

--- linux-2.6.23-git17.orig/fs/fs-writeback.c
+++ linux-2.6.23-git17/fs/fs-writeback.c
@@ -164,12 +164,25 @@ static void redirty_tail(struct inode *i
 	list_move(&inode->i_list, &sb->s_dirty);
 }
 
+#define requeue_io(inode)						\
+	do {								\
+		__requeue_io(inode, __LINE__);				\
+	} while (0)
+
 /*
  * requeue inode for re-scanning after ->s_io list is exhausted.
  */
-static void requeue_io(struct inode *inode)
+static void __requeue_io(struct inode *inode, int line)
 {
 	list_move(&inode->i_list, &inode->i_sb->s_more_io);
+
+	printk(KERN_DEBUG "redirtied inode %lu size %llu at %02x:%02x(%s), line %d.\n",
+			inode->i_ino,
+			i_size_read(inode),
+			MAJOR(inode->i_sb->s_dev),
+			MINOR(inode->i_sb->s_dev),
+			inode->i_sb->s_id,
+			line);
 }
 
 static void inode_sync_complete(struct inode *inode)
Re: 100% iowait on one of cpus in current -git
On Monday 22 October 2007 14:37:07 Fengguang Wu wrote:
> [...]
> I cannot reproduce it with your .config, so would you recompile and run
> the kernel with the above commit _and_ the below debugging patch?
> [...]

Hi,

Thank you very much too, for helping me.

Applied. Had to kill klogd, since the kernel generates tons of "redirtied
inode" messages. The size of kern.log is 863 KB, thus I don't think it is
polite to attach it. Don't know whether it is ok to put it on pastebin
either.

Anyway, it shows lots of "redirtied inode ..." messages, and while most of
them are at 08:02(sda2), my reiserfs root partition, some are like

Oct 22 14:50:27 MAIN kernel: [   73.643794] redirtied inode 2582 size 0 at 00:0f(tmpfs), line 300.

Line 300 is shown always. (I have /var/run, /var/lock, /dev mounted as
tmpfs, default kubuntu setup.)

Best regards,
Maxim Levitsky
Re: 100% iowait on one of cpus in current -git
On Mon, Oct 22, 2007 at 03:05:35PM +0200, Maxim Levitsky wrote:
> Hi,
>
> Thank you very much too, for helping me.

You are welcome :-)

> Applied. Had to kill klogd, since the kernel generates tons of "redirtied
> inode" messages. [...]
> Line 300 is shown always. (I have /var/run, /var/lock, /dev mounted as
> tmpfs, default kubuntu setup.)

Thank you for testing it out. Hmm, maybe it's a reiserfs-related issue.
Do you have the full log file?

Thank you,
Fengguang
Re: 100% iowait on one of cpus in current -git
On Monday 22 October 2007 15:10:45 Fengguang Wu wrote:
> [...]
> Thank you for testing it out. Hmm, maybe it's a reiserfs-related issue.
> Do you have the full log file?

No, I don't think so; like I said, it sometimes shows the same message
on tmpfs.

Best Regards,
Maxim Levitsky
Re: 100% iowait on one of cpus in current -git
On Mon, Oct 22, 2007 at 09:10:45PM +0800, Fengguang Wu wrote:
> Hmm, maybe it's a reiserfs-related issue. Do you have the full log file?

Bingo! It can be reproduced in -mm on reiserfs:

# mkfs.reiserfs /dev/sdb1
# mount /dev/sdb1 /test
# cp bin /test

wait for a while

# dmesg
[...]
[  418.346113] requeue_io 308: inode 6 size 302 at 08:11(sdb1)
[  418.346119] requeue_io 308: inode 7 size 196 at 08:11(sdb1)
[  418.346125] requeue_io 308: inode 8 size 85 at 08:11(sdb1)
[  418.346131] requeue_io 308: inode 9 size 180 at 08:11(sdb1)
[  418.346136] requeue_io 308: inode 10 size 1488 at 08:11(sdb1)
[  418.346142] requeue_io 308: inode 12 size 1358 at 08:11(sdb1)
[  418.346148] requeue_io 308: inode 13 size 482 at 08:11(sdb1)
[  418.346153] requeue_io 308: inode 14 size 171 at 08:11(sdb1)
[  418.346159] requeue_io 308: inode 15 size 93 at 08:11(sdb1)
[  418.346164] requeue_io 308: inode 16 size 81 at 08:11(sdb1)
[  418.346170] requeue_io 308: inode 17 size 212 at 08:11(sdb1)
[  418.346176] requeue_io 308: inode 18 size 431 at 08:11(sdb1)
[  418.346181] requeue_io 308: inode 19 size 231 at 08:11(sdb1)
[  418.346187] requeue_io 308: inode 20 size 1756 at 08:11(sdb1)
[  418.346193] requeue_io 308: inode 21 size 1229 at 08:11(sdb1)
[  418.346198] requeue_io 308: inode 22 size 157 at 08:11(sdb1)
[  418.346204] requeue_io 308: inode 23 size 3430 at 08:11(sdb1)
[  418.346210] requeue_io 308: inode 24 size 200 at 08:11(sdb1)
[  418.346215] requeue_io 308: inode 25 size 202 at 08:11(sdb1)
[  418.346221] requeue_io 308: inode 26 size 386 at 08:11(sdb1)
[  418.346226] requeue_io 308: inode 27 size 264 at 08:11(sdb1)
[  418.346232] requeue_io 308: inode 28 size 268 at 08:11(sdb1)
[  418.346238] requeue_io 308: inode 29 size 1228 at 08:11(sdb1)
[  418.346243] requeue_io 308: inode 30 size 404 at 08:11(sdb1)
[  418.346249] requeue_io 308: inode 31 size 2452 at 08:11(sdb1)
[  418.346255] requeue_io 308: inode 32 size 1236 at 08:11(sdb1)
[  418.346260] requeue_io 308: inode 33 size 655 at 08:11(sdb1)
[  418.346266] requeue_io 308: inode 35 size 330 at 08:11(sdb1)
[  418.346272] requeue_io 308: inode 36 size 248 at 08:11(sdb1)
[  418.346277] requeue_io 308: inode 37 size 683 at 08:11(sdb1)
[  418.346283] requeue_io 308: inode 38 size 1451 at 08:11(sdb1)
[  418.346288] requeue_io 308: inode 39 size 894 at 08:11(sdb1)
[  418.346294] requeue_io 308: inode 40 size 879 at 08:11(sdb1)
[  418.346300] requeue_io 308: inode 42 size 797 at 08:11(sdb1)
[  418.346305] requeue_io 308: inode 43 size 1314 at 08:11(sdb1)
[  418.346311] requeue_io 308: inode 44 size 1463 at 08:11(sdb1)
[  418.346317] requeue_io 308: inode 45 size 3032 at 08:11(sdb1)
[  418.346322] requeue_io 308: inode 46 size 325 at 08:11(sdb1)
[  418.346328] requeue_io 308: inode 47 size 583 at 08:11(sdb1)
[  418.346334] requeue_io 308: inode 48 size 1660 at 08:11(sdb1)
[  418.346339] requeue_io 308: inode 49 size 3159 at 08:11(sdb1)
[  418.346345] requeue_io 308: inode 50 size 510 at 08:11(sdb1)
[  418.346350] requeue_io 308: inode 51 size 100 at 08:11(sdb1)
[  418.346356] requeue_io 308: inode 52 size 143 at 08:11(sdb1)
[  418.346370] requeue_io 308: inode 53 size 954 at 08:11(sdb1)
[  418.346373] requeue_io 308: inode 54 size 322 at 08:11(sdb1)
[  418.346376] requeue_io 308: inode 55 size 970 at 08:11(sdb1)
[  418.346379] requeue_io 308: inode 57 size 483 at 08:11(sdb1)
[  418.346382] requeue_io 308: inode 58 size 1125 at 08:11(sdb1)
[  418.346385] requeue_io 308: inode 59 size 2196 at 08:11(sdb1)
[  418.346388] requeue_io 308: inode 60 size 104 at 08:11(sdb1)
[  418.346391] requeue_io 308: inode 61 size 488 at 08:11(sdb1)
[  418.346394] requeue_io 308: inode 62 size 116 at 08:11(sdb1)
[  418.346397] requeue_io 308: inode 63 size 907 at 08:11(sdb1)
[  418.346400] requeue_io 308: inode 64 size 1076 at 08:11(sdb1)
[  418.346403] requeue_io 308: inode 65 size 460 at 08:11(sdb1)
[  418.346406] requeue_io 308: inode 66 size 1092 at 08:11(sdb1)
[  418.346409] requeue_io 308: inode 67 size 424 at 08:11(sdb1)
[  418.346412] requeue_io 308: inode 68 size 696 at 08:11(sdb1)
[  418.346415] requeue_io 308: inode 70 size 137 at 08:11(sdb1)
[  418.346418] requeue_io 308: inode 71 size 201 at 08:11(sdb1)
[  418.346421] requeue_io 308: inode 72 size 150 at 08:11(sdb1)
[  418.346424] requeue_io 308: inode 73 size 188 at 08:11(sdb1)
[  418.346427] requeue_io 308: inode 75 size 1208 at 08:11(sdb1)
[  418.346431] requeue_io 308: inode 76 size 493 at 08:11(sdb1)
[  418.346434] requeue_io 308: inode 77 size 484 at 08:11(sdb1)
[  418.346437] requeue_io 308: inode 78 size 356 at 08:11(sdb1)
[  418.346440] requeue_io 308: inode 79 size 895 at 08:11(sdb1)
[  418.346443] requeue_io 308: inode 80 size 847 at 08:11(sdb1)
[  418.346446] requeue_io 308: inode 81 size 3281 at 08:11(sdb1)
[  418.346449] requeue_io 308: inode 82 size 3329 at 08:11(sdb1)
[  418.346452] requeue_io 308: inode 83 size 115 at 08:11(sdb1)
[  418.346455] requeue_io 308: inode 84 size 644 at 08:11(sdb1)
[  418.346458] requeue_io