Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation
On Sunday 09 September 2007, Rafael J. Wysocki wrote:
> On Sunday, 9 September 2007 16:00, Andrey Borzenkov wrote:
> > On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > > On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > > > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > > > suspend to disk :)
> > > >
> > > > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs + single
> > > > pata_ali patch to switch off DMA on CD-ROM), single root on reiserfs,
> > > > libata with pata_ali driver.
> > > >
> > > > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > > > system hung at least once in every rcX. Up to rc6 those lockups were
> > > > absolutely silent (black screen without reaction to any key). In rc6 I
> > > > just got something different. After resume I got on screen:
> > > >
> > > > swsusp: Marking nosave pages: 0009f000-0010
> > > > swsusp: Basic memory bitmaps created
> > > > swsusp: Basic memory bitmaps freed
> > > >
> > > > After that it just sits there doing nothing. There was brief sound of HDD
> > > > but I suspect it was related more to power-on. System was responding to
> > > > power-on button press:
> > > >
> > > > ACPI Error (event-0305): No installed handler for fixed event [0002 20070125]
> > > >
> > > > And SysRq was functioning.
> > >
> > > That probably means that there's a deadlock somewhere in there.
> > >
> > > > Unfortunately I do not have serial console so I
> > > > copy manually stacks from several last screens of output; I have tried to
> > > > make a photo but right now my kbluetooth is refusing to work at all so I
> > > > cannot transfer them :( (but I suspect quality would be too bad anyway)
> > > >
> > > > laptop_mode D
> > > > io_schedule+0xe/0x20
> > >
> > > Looks suspicious to me. Can you identify what line of code this points to?
> > >
> > > > sync_buffer+0x35/0x40
> > > > __wait_on_bit+0x45/0x70
> > > > out_of_line_wait_on_bit+0x6c/0x80
> > > > __wait_on_buffer+0x27/0x30
> > > > search_by_key+0x15e/0x1250 [reiserfs]
> > > > reiserfs_read_locked_inode+0x64/0x570 [reiserfs]
> > > > reiserfs_iget+0x7e/0xa0 [reiserfs]
> > > > reiserfs_lookup+0xc7/0x120 [reiserfs]
> > > > do_lookup+0x138/0x180
> > > > __link_path_walk+0x787/0xce0
> > > > link_path_walk+0x44/0xc0
> > > > path_walk+0x18/0x20
> > > > do_path_lookup+0x88/0x210
> > > > __path_lookup_intent_open+0x4d/0x90
> > > > path_lookup_open+0x1f/0x30
> > > > open_exec+0x28/0xb0
> > > > do_execve+0x36/0x1d0
> > > > sys_execve+0x2e/0x80
> > > > sysenter_past_esp+0x5f/0x99
> > > >
> > > > 90clock D
> > > > __mutex_lock_slowpath+0xa1/0x290
> > > > mutex_lock+0x21/0x30
> > > > do_lookup+0xa1/0x180
> > > > __link_path_walk+0x44/0xc0
> > > > path_walk+0x18/0x20
> > > > do_path_lookup+0x78/0x210
> > > > __user_walk_fd+0x38/0x50
> > > > vfs_stat_fd+0x21/0x50
> > > > vfs_stat+0x11/0x20
> > > > sys_stat64+0x14/0x30
> > > > sysenter_past_esp+0x5f/0x99
> > > >
> > > > alsactl D
> > > > io_schedule+0xe/0x20
> > >
> > > Same here. Hmm.
> > >
> > > > sync_page+0x35/0x40
> > > > __wait_on_bit_lock+0x3f/0x70
> > > > __lock_page+0x68/0x70
> > > > filemap_nopage+0x16c/0x300
> > > > __handle_mm_fault+0x1d7/0x610
> > > > do_page_fault+0x1d7/0x610
> > > > error_code+0x6a/0x70
> > > > padzero+0x1f/0x30
> > > > load_elf_binary+0x743/0x1ab0
> > > > search_binary_handler+0x7b/0x1f0
> > > > do_execve+0x137/0x1d0
> > > > sys_execve+0x2e/0x80
> > > > sysenter_past_esp+0x5f/0x90
> > > >
> > > > After that I could remount, sync and reboot using SysRq (well, after
> > > > reboot it still insisted on replaying insane number of transactions so
> > > > maybe it did *not* remount / ro after all). Before reboot there was
> > > > brief output that resembled lockdep warnings, but it went too fast to be
> > > > readable.
> > > >
> > > > usual stuff follows
> > >
> > > I see you're using CFQ as the default IO scheduler. Can you please switch
> > > to AS and see if that changes anything?
> >
> > I just had the same lockup on resume using AS with 2.6.23-rc5.
>
> Hm. Does your root partition sit on reiserfs?

I already answered this but yes, I do.

I just had it again on 2.6.24-rc3. Same thing - keys working (to some extent),
Alt-SysRq allows me to reboot; unfortunately I switched (unintentionally)
from resume message to tty1 and it was in funny state so SysRq-t was lost,
but I pretty much suspect it would be the same.

Well, not sure how to debug a problem that pops up once in three or four
releases ...
Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation
On Wednesday, 21 of November 2007, Andrey Borzenkov wrote:
> On Sunday 09 September 2007, Rafael J. Wysocki wrote:
> > On Sunday, 9 September 2007 16:00, Andrey Borzenkov wrote:
> > > On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > > > On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > > > > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > > > > suspend to disk :)
> > > > >
> > > > > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs + single
> > > > > pata_ali patch to switch off DMA on CD-ROM), single root on reiserfs,
> > > > > libata with pata_ali driver.
> > > > >
> > > > > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > > > > system hung at least once in every rcX. Up to rc6 those lockups were
> > > > > absolutely silent (black screen without reaction to any key). In rc6 I
> > > > > just got something different. After resume I got on screen:
> > > > >
> > > > > swsusp: Marking nosave pages: 0009f000-0010
> > > > > swsusp: Basic memory bitmaps created
> > > > > swsusp: Basic memory bitmaps freed
> > > > >
> > > > > After that it just sits there doing nothing. There was brief sound of HDD
> > > > > but I suspect it was related more to power-on. System was responding to
> > > > > power-on button press:
> > > > >
> > > > > ACPI Error (event-0305): No installed handler for fixed event [0002 20070125]
> > > > >
> > > > > And SysRq was functioning.
> > > >
> > > > That probably means that there's a deadlock somewhere in there.
> > > >
> > > > > Unfortunately I do not have serial console so I
> > > > > copy manually stacks from several last screens of output; I have tried to
> > > > > make a photo but right now my kbluetooth is refusing to work at all so I
> > > > > cannot transfer them :( (but I suspect quality would be too bad anyway)
> > > > >
> > > > > laptop_mode D
> > > > > io_schedule+0xe/0x20
> > > >
> > > > Looks suspicious to me. Can you identify what line of code this points to?
> > > >
> > > > > sync_buffer+0x35/0x40
> > > > > __wait_on_bit+0x45/0x70
> > > > > out_of_line_wait_on_bit+0x6c/0x80
> > > > > __wait_on_buffer+0x27/0x30
> > > > > search_by_key+0x15e/0x1250 [reiserfs]
> > > > > reiserfs_read_locked_inode+0x64/0x570 [reiserfs]
> > > > > reiserfs_iget+0x7e/0xa0 [reiserfs]
> > > > > reiserfs_lookup+0xc7/0x120 [reiserfs]
> > > > > do_lookup+0x138/0x180
> > > > > __link_path_walk+0x787/0xce0
> > > > > link_path_walk+0x44/0xc0
> > > > > path_walk+0x18/0x20
> > > > > do_path_lookup+0x88/0x210
> > > > > __path_lookup_intent_open+0x4d/0x90
> > > > > path_lookup_open+0x1f/0x30
> > > > > open_exec+0x28/0xb0
> > > > > do_execve+0x36/0x1d0
> > > > > sys_execve+0x2e/0x80
> > > > > sysenter_past_esp+0x5f/0x99
> > > > >
> > > > > 90clock D
> > > > > __mutex_lock_slowpath+0xa1/0x290
> > > > > mutex_lock+0x21/0x30
> > > > > do_lookup+0xa1/0x180
> > > > > __link_path_walk+0x44/0xc0
> > > > > path_walk+0x18/0x20
> > > > > do_path_lookup+0x78/0x210
> > > > > __user_walk_fd+0x38/0x50
> > > > > vfs_stat_fd+0x21/0x50
> > > > > vfs_stat+0x11/0x20
> > > > > sys_stat64+0x14/0x30
> > > > > sysenter_past_esp+0x5f/0x99
> > > > >
> > > > > alsactl D
> > > > > io_schedule+0xe/0x20
> > > >
> > > > Same here. Hmm.
> > > >
> > > > > sync_page+0x35/0x40
> > > > > __wait_on_bit_lock+0x3f/0x70
> > > > > __lock_page+0x68/0x70
> > > > > filemap_nopage+0x16c/0x300
> > > > > __handle_mm_fault+0x1d7/0x610
> > > > > do_page_fault+0x1d7/0x610
> > > > > error_code+0x6a/0x70
> > > > > padzero+0x1f/0x30
> > > > > load_elf_binary+0x743/0x1ab0
> > > > > search_binary_handler+0x7b/0x1f0
> > > > > do_execve+0x137/0x1d0
> > > > > sys_execve+0x2e/0x80
> > > > > sysenter_past_esp+0x5f/0x90
> > > > >
> > > > > After that I could remount, sync and reboot using SysRq (well, after
> > > > > reboot it still insisted on replaying insane number of transactions so
> > > > > maybe it did *not* remount / ro after all). Before reboot there was
> > > > > brief output that resembled lockdep warnings, but it went too fast to be
> > > > > readable.
> > > > >
> > > > > usual stuff follows
> > > >
> > > > I see you're using CFQ as the default IO scheduler. Can you please switch
> > > > to AS and see if that changes anything?
> > >
> > > I just had the same lockup on resume using AS with 2.6.23-rc5.
> >
> > Hm. Does your root partition sit on reiserfs?
>
> I already answered this but yes, I do.
>
> I just had it again on 2.6.24-rc3. Same thing - keys working (to some extent),
> Alt-SysRq allows me to reboot; unfortunately I switched (unintentionally)
> from resume message to tty1 and it was in funny state so SysRq-t was lost,
> but I pretty much suspect it would be the same.
>
> Well, not sure how to debug a problem that pops up once in three or four
> releases ...

And you never know when it happens ...

I have no idea. It's probably related to your hardware configuration somehow,
as it doesn't seem to be reproducible in general.

Regards,
Rafael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
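
Editorial note for readers following the traces: each hand-copied frame in this thread has the form symbol+0xOFFSET/0xSIZE, optionally followed by a module name in square brackets. The sketch below is not part of the original thread; it only shows one way to split such frames into fields, and the names Frame and parse_frame are invented for illustration. The symbol and offset are the pieces a debugger needs for the "what line of code does this point to" question, for example gdb's `list *(io_schedule+0xe)` against a vmlinux built with debug info, assuming such a build is available.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    """One stack-trace entry (hypothetical helper type, not a kernel API)."""
    symbol: str
    offset: int            # bytes from the start of the function
    size: int              # total size of the function in bytes
    module: Optional[str]  # e.g. "reiserfs"; None for built-in code


# symbol+0xOFFSET/0xSIZE, optionally followed by " [module]"
_FRAME_RE = re.compile(
    r"^(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+0x(?P<offset>[0-9a-fA-F]+)"
    r"/0x(?P<size>[0-9a-fA-F]+)"
    r"(?:\s+\[(?P<module>[\w-]+)\])?$"
)


def parse_frame(line: str) -> Optional[Frame]:
    """Return the parsed frame, or None if the line is not a trace entry."""
    m = _FRAME_RE.match(line.strip())
    if m is None:
        return None
    return Frame(m["symbol"], int(m["offset"], 16), int(m["size"], 16), m["module"])


if __name__ == "__main__":
    for text in (
        "io_schedule+0xe/0x20",
        "search_by_key+0x15e/0x1250 [reiserfs]",
        "laptop_mode D",  # task header line, not a frame
    ):
        print(text, "->", parse_frame(text))
```

Lines that do not fit the pattern, such as the "laptop_mode D" task headers or a frame with a mistyped separator, come back as None, which is a cheap way to spot transcription slips in a trace copied by hand.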
Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation
On Sunday 09 September 2007, Rafael J. Wysocki wrote:
> On Sunday, 9 September 2007 16:00, Andrey Borzenkov wrote:
> > On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > > On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > > > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > > > suspend to disk :)
> > > >
> > > > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs + single
> > > > pata_ali patch to switch off DMA on CD-ROM), single root on reiserfs,
> > > > libata with pata_ali driver.
> > > >
> > > > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > > > system hung at least once in every rcX. Up to rc6 those lockups were
> > > > absolutely silent (black screen without reaction to any key). In rc6 I
> > > > just got something different. After resume I got on screen:
> > > >
> > > > swsusp: Marking nosave pages: 0009f000-0010
> > > > swsusp: Basic memory bitmaps created
> > > > swsusp: Basic memory bitmaps freed
> > > >
> > > > After that it just sits there doing nothing. There was brief sound of HDD
> > > > but I suspect it was related more to power-on. System was responding to
> > > > power-on button press:
> > > >
> > > > ACPI Error (event-0305): No installed handler for fixed event [0002 20070125]
> > > >
> > > > And SysRq was functioning.
> > >
> > > That probably means that there's a deadlock somewhere in there.
> > >
> > > > Unfortunately I do not have serial console so I
> > > > copy manually stacks from several last screens of output; I have tried to
> > > > make a photo but right now my kbluetooth is refusing to work at all so I
> > > > cannot transfer them :( (but I suspect quality would be too bad anyway)
> > > >
> > > > laptop_mode D
> > > > io_schedule+0xe/0x20
> > >
> > > Looks suspicious to me. Can you identify what line of code this points to?
> > >
> > > > sync_buffer+0x35/0x40
> > > > __wait_on_bit+0x45/0x70
> > > > out_of_line_wait_on_bit+0x6c/0x80
> > > > __wait_on_buffer+0x27/0x30
> > > > search_by_key+0x15e/0x1250 [reiserfs]
> > > > reiserfs_read_locked_inode+0x64/0x570 [reiserfs]
> > > > reiserfs_iget+0x7e/0xa0 [reiserfs]
> > > > reiserfs_lookup+0xc7/0x120 [reiserfs]
> > > > do_lookup+0x138/0x180
> > > > __link_path_walk+0x787/0xce0
> > > > link_path_walk+0x44/0xc0
> > > > path_walk+0x18/0x20
> > > > do_path_lookup+0x88/0x210
> > > > __path_lookup_intent_open+0x4d/0x90
> > > > path_lookup_open+0x1f/0x30
> > > > open_exec+0x28/0xb0
> > > > do_execve+0x36/0x1d0
> > > > sys_execve+0x2e/0x80
> > > > sysenter_past_esp+0x5f/0x99
> > > >
> > > > 90clock D
> > > > __mutex_lock_slowpath+0xa1/0x290
> > > > mutex_lock+0x21/0x30
> > > > do_lookup+0xa1/0x180
> > > > __link_path_walk+0x44/0xc0
> > > > path_walk+0x18/0x20
> > > > do_path_lookup+0x78/0x210
> > > > __user_walk_fd+0x38/0x50
> > > > vfs_stat_fd+0x21/0x50
> > > > vfs_stat+0x11/0x20
> > > > sys_stat64+0x14/0x30
> > > > sysenter_past_esp+0x5f/0x99
> > > >
> > > > alsactl D
> > > > io_schedule+0xe/0x20
> > >
> > > Same here. Hmm.
> > >
> > > > sync_page+0x35/0x40
> > > > __wait_on_bit_lock+0x3f/0x70
> > > > __lock_page+0x68/0x70
> > > > filemap_nopage+0x16c/0x300
> > > > __handle_mm_fault+0x1d7/0x610
> > > > do_page_fault+0x1d7/0x610
> > > > error_code+0x6a/0x70
> > > > padzero+0x1f/0x30
> > > > load_elf_binary+0x743/0x1ab0
> > > > search_binary_handler+0x7b/0x1f0
> > > > do_execve+0x137/0x1d0
> > > > sys_execve+0x2e/0x80
> > > > sysenter_past_esp+0x5f/0x90
> > > >
> > > > After that I could remount, sync and reboot using SysRq (well, after
> > > > reboot it still insisted on replaying insane number of transactions so
> > > > maybe it did *not* remount / ro after all). Before reboot there was
> > > > brief output that resembled lockdep warnings, but it went too fast to be
> > > > readable.
> > > >
> > > > usual stuff follows
> > >
> > > I see you're using CFQ as the default IO scheduler. Can you please switch
> > > to AS and see if that changes anything?
> >
> > I just had the same lockup on resume using AS with 2.6.23-rc5.
>
> Hm. Does your root partition sit on reiserfs?

yes - "single root on reiserfs"
Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation
On Sunday, 9 September 2007 16:00, Andrey Borzenkov wrote:
> On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > > suspend to disk :)
> > >
> > > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs + single
> > > pata_ali patch to switch off DMA on CD-ROM), single root on reiserfs,
> > > libata with pata_ali driver.
> > >
> > > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > > system hung at least once in every rcX. Up to rc6 those lockups were
> > > absolutely silent (black screen without reaction to any key). In rc6 I
> > > just got something different. After resume I got on screen:
> > >
> > > swsusp: Marking nosave pages: 0009f000-0010
> > > swsusp: Basic memory bitmaps created
> > > swsusp: Basic memory bitmaps freed
> > >
> > > After that it just sits there doing nothing. There was brief sound of HDD
> > > but I suspect it was related more to power-on. System was responding to
> > > power-on button press:
> > >
> > > ACPI Error (event-0305): No installed handler for fixed event [0002 20070125]
> > >
> > > And SysRq was functioning.
> >
> > That probably means that there's a deadlock somewhere in there.
> >
> > > Unfortunately I do not have serial console so I
> > > copy manually stacks from several last screens of output; I have tried to
> > > make a photo but right now my kbluetooth is refusing to work at all so I
> > > cannot transfer them :( (but I suspect quality would be too bad anyway)
> > >
> > > laptop_mode D
> > > io_schedule+0xe/0x20
> >
> > Looks suspicious to me. Can you identify what line of code this points to?
> >
> > > sync_buffer+0x35/0x40
> > > __wait_on_bit+0x45/0x70
> > > out_of_line_wait_on_bit+0x6c/0x80
> > > __wait_on_buffer+0x27/0x30
> > > search_by_key+0x15e/0x1250 [reiserfs]
> > > reiserfs_read_locked_inode+0x64/0x570 [reiserfs]
> > > reiserfs_iget+0x7e/0xa0 [reiserfs]
> > > reiserfs_lookup+0xc7/0x120 [reiserfs]
> > > do_lookup+0x138/0x180
> > > __link_path_walk+0x787/0xce0
> > > link_path_walk+0x44/0xc0
> > > path_walk+0x18/0x20
> > > do_path_lookup+0x88/0x210
> > > __path_lookup_intent_open+0x4d/0x90
> > > path_lookup_open+0x1f/0x30
> > > open_exec+0x28/0xb0
> > > do_execve+0x36/0x1d0
> > > sys_execve+0x2e/0x80
> > > sysenter_past_esp+0x5f/0x99
> > >
> > > 90clock D
> > > __mutex_lock_slowpath+0xa1/0x290
> > > mutex_lock+0x21/0x30
> > > do_lookup+0xa1/0x180
> > > __link_path_walk+0x44/0xc0
> > > path_walk+0x18/0x20
> > > do_path_lookup+0x78/0x210
> > > __user_walk_fd+0x38/0x50
> > > vfs_stat_fd+0x21/0x50
> > > vfs_stat+0x11/0x20
> > > sys_stat64+0x14/0x30
> > > sysenter_past_esp+0x5f/0x99
> > >
> > > alsactl D
> > > io_schedule+0xe/0x20
> >
> > Same here. Hmm.
> >
> > > sync_page+0x35/0x40
> > > __wait_on_bit_lock+0x3f/0x70
> > > __lock_page+0x68/0x70
> > > filemap_nopage+0x16c/0x300
> > > __handle_mm_fault+0x1d7/0x610
> > > do_page_fault+0x1d7/0x610
> > > error_code+0x6a/0x70
> > > padzero+0x1f/0x30
> > > load_elf_binary+0x743/0x1ab0
> > > search_binary_handler+0x7b/0x1f0
> > > do_execve+0x137/0x1d0
> > > sys_execve+0x2e/0x80
> > > sysenter_past_esp+0x5f/0x90
> > >
> > > After that I could remount, sync and reboot using SysRq (well, after
> > > reboot it still insisted on replaying insane number of transactions so
> > > maybe it did *not* remount / ro after all). Before reboot there was
> > > brief output that resembled lockdep warnings, but it went too fast to be
> > > readable.
> > >
> > > usual stuff follows
> >
> > I see you're using CFQ as the default IO scheduler. Can you please switch
> > to AS and see if that changes anything?
>
> I just had the same lockup on resume using AS with 2.6.23-rc5.

Hm. Does your root partition sit on reiserfs?
Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation
On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > suspend to disk :)
> >
> > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs + single
> > pata_ali patch to switch off DMA on CD-ROM), single root on reiserfs,
> > libata with pata_ali driver.
> >
> > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > system hung at least once in every rcX. Up to rc6 those lockups were
> > absolutely silent (black screen without reaction to any key). In rc6 I
> > just got something different. After resume I got on screen:
> >
> > swsusp: Marking nosave pages: 0009f000-0010
> > swsusp: Basic memory bitmaps created
> > swsusp: Basic memory bitmaps freed
> >
> > After that it just sits there doing nothing. There was brief sound of HDD
> > but I suspect it was related more to power-on. System was responding to
> > power-on button press:
> >
> > ACPI Error (event-0305): No installed handler for fixed event [0002 20070125]
> >
> > And SysRq was functioning.
>
> That probably means that there's a deadlock somewhere in there.
>
> > Unfortunately I do not have serial console so I
> > copy manually stacks from several last screens of output; I have tried to
> > make a photo but right now my kbluetooth is refusing to work at all so I
> > cannot transfer them :( (but I suspect quality would be too bad anyway)
> >
> > laptop_mode D
> > io_schedule+0xe/0x20
>
> Looks suspicious to me. Can you identify what line of code this points to?
>
> > sync_buffer+0x35/0x40
> > __wait_on_bit+0x45/0x70
> > out_of_line_wait_on_bit+0x6c/0x80
> > __wait_on_buffer+0x27/0x30
> > search_by_key+0x15e/0x1250 [reiserfs]
> > reiserfs_read_locked_inode+0x64/0x570 [reiserfs]
> > reiserfs_iget+0x7e/0xa0 [reiserfs]
> > reiserfs_lookup+0xc7/0x120 [reiserfs]
> > do_lookup+0x138/0x180
> > __link_path_walk+0x787/0xce0
> > link_path_walk+0x44/0xc0
> > path_walk+0x18/0x20
> > do_path_lookup+0x88/0x210
> > __path_lookup_intent_open+0x4d/0x90
> > path_lookup_open+0x1f/0x30
> > open_exec+0x28/0xb0
> > do_execve+0x36/0x1d0
> > sys_execve+0x2e/0x80
> > sysenter_past_esp+0x5f/0x99
> >
> > 90clock D
> > __mutex_lock_slowpath+0xa1/0x290
> > mutex_lock+0x21/0x30
> > do_lookup+0xa1/0x180
> > __link_path_walk+0x44/0xc0
> > path_walk+0x18/0x20
> > do_path_lookup+0x78/0x210
> > __user_walk_fd+0x38/0x50
> > vfs_stat_fd+0x21/0x50
> > vfs_stat+0x11/0x20
> > sys_stat64+0x14/0x30
> > sysenter_past_esp+0x5f/0x99
> >
> > alsactl D
> > io_schedule+0xe/0x20
>
> Same here. Hmm.
>
> > sync_page+0x35/0x40
> > __wait_on_bit_lock+0x3f/0x70
> > __lock_page+0x68/0x70
> > filemap_nopage+0x16c/0x300
> > __handle_mm_fault+0x1d7/0x610
> > do_page_fault+0x1d7/0x610
> > error_code+0x6a/0x70
> > padzero+0x1f/0x30
> > load_elf_binary+0x743/0x1ab0
> > search_binary_handler+0x7b/0x1f0
> > do_execve+0x137/0x1d0
> > sys_execve+0x2e/0x80
> > sysenter_past_esp+0x5f/0x90
> >
> > After that I could remount, sync and reboot using SysRq (well, after
> > reboot it still insisted on replaying insane number of transactions so
> > maybe it did *not* remount / ro after all). Before reboot there was
> > brief output that resembled lockdep warnings, but it went too fast to be
> > readable.
> >
> > usual stuff follows
>
> I see you're using CFQ as the default IO scheduler. Can you please switch
> to AS and see if that changes anything?

I just had the same lockup on resume using AS with 2.6.23-rc5.
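
Editorial note on the CFQ versus AS test discussed in this message: on kernels of this era the per-device file /sys/block/<dev>/queue/scheduler lists the available IO schedulers with the active one in square brackets, for example "noop anticipatory deadline [cfq]", and writing a scheduler name to that file as root switches it. The sketch below is not from the thread; it assumes that listing format, the function names are invented for illustration, and the device name "sda" is only a placeholder.

```python
import re
from pathlib import Path


def active_scheduler(listing: str) -> str:
    """Return the bracketed (active) entry from a sysfs scheduler listing."""
    m = re.search(r"\[([^\]]+)\]", listing)
    if m is None:
        raise ValueError(f"no active scheduler marked in {listing!r}")
    return m.group(1)


def read_active_scheduler(device: str = "sda") -> str:
    """Read the listing for one block device; 'sda' is a placeholder name."""
    path = Path(f"/sys/block/{device}/queue/scheduler")
    return active_scheduler(path.read_text())


if __name__ == "__main__":
    # Sample listings in the assumed format, so the demo does not touch sysfs.
    print(active_scheduler("noop anticipatory deadline [cfq]"))
    print(active_scheduler("noop [anticipatory] deadline cfq"))
```

Recording the active scheduler next to each hang report makes it easier to confirm afterwards which scheduler was really in use when a given lockup occurred.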
Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation
On Sunday, 9 September 2007 16:00, Andrey Borzenkov wrote:
> On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > > suspend to disk :)
> > >
> > > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs + single
> > > pata_ali patch to switch off DMA on CD-ROM), single root on reiserfs,
> > > libata with pata_ali driver.
> > >
> > > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > > system hung at least once in every rcX. Up to rc6 those lockups were
> > > absolutely silent (black screen without reaction to any key). In rc6 I
> > > just got something different. After resume I got on screem:
> > >
> > > swsusp: Marking nosave pages: 0009f000-0010
> > > swsusp: Basic memory bitmaps created
> > > swsusp: Basic memory bitmaps freed
> > >
> > > After that it just sits there doing nothing. Ther was brief sound of
> > > HDD but I suspect it was related more to power-on. System was
> > > responding to power-on button press:
> > >
> > > ACPI Error (event-0305): No installed handler for fixed event [0002
> > > 20070125]
> > >
> > > And SysRq was functioning.
> >
> > That probably means that there's a deadlock somewhere in there.
> >
> > > Unfortunately I do not have serial console so I
> > > copy manually stacks from several last screens of output; I have tried
> > > to make a photo but right now my kbluetooth is refusing to work at all
> > > so I cannot transfer them :( (but I suspect quality would be too bad
> > > anyway)
> > >
> > > laptop_mode D
> > > io_schedule+0xe/0x20
> >
> > Looks suspicious to me. Can you identify what line of code this points
> > to?
> >
> > > sync_buffer+0x35/0x40
> > > __wait_on_bit+0x45/0x70
> > > out_of_line_wait_on_bit+0x6c/0x80
> > > __wait_on_buffer+0x27/0x30
> > > search_by_key+0x15e/0x1250 [reiserfs]
> > > reiserfs_read_locked_inode+0x64/0x570 [reiserfs]
> > > reiserfs_iget+0x7e/0xa0 [reiserfs]
> > > reiserfs_lookup+0xc7/0x120 [reiserfs]
> > > do_lookup+0x138/0x180
> > > __link_path_walk+0x787/0xce0
> > > link_path_walk+0x44/0xc0
> > > path_walk+0x18/0x20
> > > do_path_lookup_0x88/0x210
> > > __path_lookupintent_open+0x4d/0x90
> > > path_lookup_open+0x1f/0x30
> > > open_exec+0x28/0xb0
> > > do_execve+0x36/0x1d0
> > > sys_execve+0x2e/0x80
> > > sysenter_past_esp+0x5f/0x99
> > >
> > > 90clock D
> > > __mutex_lock_slow_path+0xa1/0x290
> > > mutex_lock+0x21/0x30
> > > do_lookup+0xa1/0x180
> > > __link_path_walk+0x44/0xc0
> > > path_walk+0x18/0x20
> > > do_path_lookup+0x78/0x210
> > > __user_walk_fd+0x38/0x50
> > > vfs_stat_fd+0x21/0x50
> > > vfs_stat+0x11/0x20
> > > sys_stat64+0x14/0x30
> > > sysenter_past_esp+0x5f/0x99
> > >
> > > alsactl D
> > > io_schedule+0xe/0x20
> >
> > Same here. Hmm.
> >
> > > sync_page+0x35/0x40
> > > __wait_on_bit_lock+0x3f/0x70
> > > __lock_page+0x68/0x70
> > > filemap_nopage+0x16c/0x300
> > > __handle_mm_faul+0x1d7/0x610
> > > do_page_fault+0x1d7/0x610
> > > error_code+0x6a/0x70
> > > padzero+0x1f/0x30
> > > load_elf_binary+0x743/0x1ab0
> > > search_binary_handler+0x7b/0x1f0
> > > do_execve+0x137/0x1d0
> > > sys_execve+0x2e/0x80
> > > sysenter_past_esp+0x5f/0x90
> > >
> > > After that I could remount, sync and reboot using SysRq (well, after
> > > reboot it still insisted on replaying insane number of transactions so
> > > may be it did *not* remount / ro after all). Before reboot there was
> > > brief output that resembled lockdep warnings, but it went too fast to
> > > be readable.
> > >
> > > usual stuff follows
> >
> > I see you're using CFQ as the default IO scheduler. Can you please
> > switch to AS and see if that changes anything?
>
> I just had the same lockup on resume using AS with 2.6.23-rc5.

Hm. Does your root partition sit on reiserfs?
Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation
On Sunday 09 September 2007, Rafael J. Wysocki wrote:
> On Sunday, 9 September 2007 16:00, Andrey Borzenkov wrote:
> > On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > > On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > > > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > > > suspend to disk :)
> > > >
> > > > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs +
> > > > single pata_ali patch to switch off DMA on CD-ROM), single root on
> > > > reiserfs, libata with pata_ali driver.
> > > >
> > > > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > > > system hung at least once in every rcX. Up to rc6 those lockups were
> > > > absolutely silent (black screen without reaction to any key). In rc6
> > > > I just got something different. After resume I got on screem:
> > > >
> > > > swsusp: Marking nosave pages: 0009f000-0010
> > > > swsusp: Basic memory bitmaps created
> > > > swsusp: Basic memory bitmaps freed
> > > >
> > > > After that it just sits there doing nothing. Ther was brief sound of
> > > > HDD but I suspect it was related more to power-on. System was
> > > > responding to power-on button press:
> > > >
> > > > ACPI Error (event-0305): No installed handler for fixed event [0002
> > > > 20070125]
> > > >
> > > > And SysRq was functioning.
> > >
> > > That probably means that there's a deadlock somewhere in there.
> > >
> > > > Unfortunately I do not have serial console so I
> > > > copy manually stacks from several last screens of output; I have
> > > > tried to make a photo but right now my kbluetooth is refusing to work
> > > > at all so I cannot transfer them :( (but I suspect quality would be
> > > > too bad anyway)
> > > >
> > > > laptop_mode D
> > > > io_schedule+0xe/0x20
> > >
> > > Looks suspicious to me. Can you identify what line of code this points
> > > to?
> > >
> > > > sync_buffer+0x35/0x40
> > > > __wait_on_bit+0x45/0x70
> > > > out_of_line_wait_on_bit+0x6c/0x80
> > > > __wait_on_buffer+0x27/0x30
> > > > search_by_key+0x15e/0x1250 [reiserfs]
> > > > reiserfs_read_locked_inode+0x64/0x570 [reiserfs]
> > > > reiserfs_iget+0x7e/0xa0 [reiserfs]
> > > > reiserfs_lookup+0xc7/0x120 [reiserfs]
> > > > do_lookup+0x138/0x180
> > > > __link_path_walk+0x787/0xce0
> > > > link_path_walk+0x44/0xc0
> > > > path_walk+0x18/0x20
> > > > do_path_lookup_0x88/0x210
> > > > __path_lookupintent_open+0x4d/0x90
> > > > path_lookup_open+0x1f/0x30
> > > > open_exec+0x28/0xb0
> > > > do_execve+0x36/0x1d0
> > > > sys_execve+0x2e/0x80
> > > > sysenter_past_esp+0x5f/0x99
> > > >
> > > > 90clock D
> > > > __mutex_lock_slow_path+0xa1/0x290
> > > > mutex_lock+0x21/0x30
> > > > do_lookup+0xa1/0x180
> > > > __link_path_walk+0x44/0xc0
> > > > path_walk+0x18/0x20
> > > > do_path_lookup+0x78/0x210
> > > > __user_walk_fd+0x38/0x50
> > > > vfs_stat_fd+0x21/0x50
> > > > vfs_stat+0x11/0x20
> > > > sys_stat64+0x14/0x30
> > > > sysenter_past_esp+0x5f/0x99
> > > >
> > > > alsactl D
> > > > io_schedule+0xe/0x20
> > >
> > > Same here. Hmm.
> > >
> > > > sync_page+0x35/0x40
> > > > __wait_on_bit_lock+0x3f/0x70
> > > > __lock_page+0x68/0x70
> > > > filemap_nopage+0x16c/0x300
> > > > __handle_mm_faul+0x1d7/0x610
> > > > do_page_fault+0x1d7/0x610
> > > > error_code+0x6a/0x70
> > > > padzero+0x1f/0x30
> > > > load_elf_binary+0x743/0x1ab0
> > > > search_binary_handler+0x7b/0x1f0
> > > > do_execve+0x137/0x1d0
> > > > sys_execve+0x2e/0x80
> > > > sysenter_past_esp+0x5f/0x90
> > > >
> > > > After that I could remount, sync and reboot using SysRq (well, after
> > > > reboot it still insisted on replaying insane number of transactions
> > > > so may be it did *not* remount / ro after all). Before reboot there
> > > > was brief output that resembled lockdep warnings, but it went too
> > > > fast to be readable.
> > > >
> > > > usual stuff follows
> > >
> > > I see you're using CFQ as the default IO scheduler. Can you please
> > > switch to AS and see if that changes anything?
> >
> > I just had the same lockup on resume using AS with 2.6.23-rc5.
>
> Hm. Does your root partition sit on reiserfs?

yes - single root on reiserfs
Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation
Sorry for top posting, but this is MAYBE a related matter; I am not sure.

The thing is, I am running libata and reiserfs on a RAID5 with 6 disks, and
after I changed to libata it has worked excellently (before, it used to give
DMA errors and then go boom). However, now I sometimes see, if there is some
load on the array, that the HDD LEDs go fully on and, for ~10 sec to 1 min,
all IO just stops completely; after that time, it resumes and works
perfectly.

Any ideas? I also bring this up as it may give a clue as to what causes
this.

- I also use CFQ

On Sun, 2007-09-02 at 15:29 +0400, Andrey Borzenkov wrote:
> On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > On Saturday, 30 June 2007 23:34, Andrey Borzenkov wrote:
> > > On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > > > On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > > > > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > > > > suspend to disk :)
> > > > >
> > > > > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs +
> > > > > single pata_ali patch to switch off DMA on CD-ROM), single root on
> > > > > reiserfs, libata with pata_ali driver.
> > > > >
> > > > > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > > > > system hung at least once in every rcX. Up to rc6 those lockups were
> > > > > absolutely silent (black screen without reaction to any key). In rc6
> > > > > I just got something different. After resume I got on screem:
> > > > >
> > > > > swsusp: Marking nosave pages: 0009f000-0010
> > > > > swsusp: Basic memory bitmaps created
> > > > > swsusp: Basic memory bitmaps freed
> > > > >
> > > > > After that it just sits there doing nothing. Ther was brief sound of
> > > > > HDD but I suspect it was related more to power-on. System was
> > > > > responding to power-on button press:
> > > > >
> > > > > ACPI Error (event-0305): No installed handler for fixed event
> > > > > [0002 20070125]
> > > > >
> > > > > And SysRq was functioning.
> > > >
> > > > That probably means that there's a deadlock somewhere in there.
> > > >
> > > > > Unfortunately I do not have serial console so I
> > > > > copy manually stacks from several last screens of output; I have
> > > > > tried to make a photo but right now my kbluetooth is refusing to work
> > > > > at all so I cannot transfer them :( (but I suspect quality would be
> > > > > too bad anyway)
> > > > >
> > > > > laptop_mode D
> > > > > io_schedule+0xe/0x20
> > > >
> > > > Looks suspicious to me. Can you identify what line of code this points
> > > > to?
> > >
> > > If you could explain how to ...
> >
> > Michal has already done that. :-)
> >
> > [--snip--]
> >
> > > > I see you're using CFQ as the default IO scheduler. Can you please
> > > > switch to AS and see if that changes anything?
> > >
> > > Sure, but given that I have no idea how to reproduce the lockup, we may
> > > never know whether it actually helped.
> >
> > Well, if the lockup never happens with AS, that will indicate something ...
> >
>
> I thought it is gone but it just happened again with 2.6.23-rc5. I thought I
> have been running AS but no, I did use CFQ. Now I definitely switched to AS
> default; let's see ...
Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation
On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> On Saturday, 30 June 2007 23:34, Andrey Borzenkov wrote:
> > On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > > On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > > > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > > > suspend to disk :)
> > > >
> > > > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs +
> > > > single pata_ali patch to switch off DMA on CD-ROM), single root on
> > > > reiserfs, libata with pata_ali driver.
> > > >
> > > > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > > > system hung at least once in every rcX. Up to rc6 those lockups were
> > > > absolutely silent (black screen without reaction to any key). In rc6
> > > > I just got something different. After resume I got on screem:
> > > >
> > > > swsusp: Marking nosave pages: 0009f000-0010
> > > > swsusp: Basic memory bitmaps created
> > > > swsusp: Basic memory bitmaps freed
> > > >
> > > > After that it just sits there doing nothing. Ther was brief sound of
> > > > HDD but I suspect it was related more to power-on. System was
> > > > responding to power-on button press:
> > > >
> > > > ACPI Error (event-0305): No installed handler for fixed event
> > > > [0002 20070125]
> > > >
> > > > And SysRq was functioning.
> > >
> > > That probably means that there's a deadlock somewhere in there.
> > >
> > > > Unfortunately I do not have serial console so I
> > > > copy manually stacks from several last screens of output; I have
> > > > tried to make a photo but right now my kbluetooth is refusing to work
> > > > at all so I cannot transfer them :( (but I suspect quality would be
> > > > too bad anyway)
> > > >
> > > > laptop_mode D
> > > > io_schedule+0xe/0x20
> > >
> > > Looks suspicious to me. Can you identify what line of code this points
> > > to?
> >
> > If you could explain how to ...
>
> Michal has already done that. :-)
>
> [--snip--]
>
> > > I see you're using CFQ as the default IO scheduler. Can you please
> > > switch to AS and see if that changes anything?
> >
> > Sure, but given that I have no idea how to reproduce the lockup, we may
> > never know whether it actually helped.
>
> Well, if the lockup never happens with AS, that will indicate something ...
>

I thought it was gone, but it just happened again with 2.6.23-rc5. I thought
I had been running AS, but no, I was still using CFQ. Now I have definitely
switched to AS as the default; let's see ...
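The mix-up described in this message (believing AS was active while CFQ was still in use) could be caught by checking the active elevator before each hibernation test. A minimal sketch, assuming the kernel exposes the per-device scheduler list in /sys/block/<dev>/queue/scheduler with the active entry in square brackets; the device name hda and both helper names are hypothetical and not taken from this thread:

```python
import re
from pathlib import Path


def active_scheduler(listing: str) -> str:
    """Return the bracketed (active) entry from a scheduler listing.

    The listing is expected to look like "noop anticipatory deadline [cfq]".
    """
    match = re.search(r"\[([^\]]+)\]", listing)
    if match is None:
        raise ValueError(f"no active scheduler marked in {listing!r}")
    return match.group(1)


def read_active_scheduler(device: str = "hda") -> str:
    """Read the active scheduler for a block device.

    The default device name is only an assumption for illustration.
    """
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return active_scheduler(path.read_text())


if __name__ == "__main__":
    # Sample listing only; on a real system call read_active_scheduler().
    print(active_scheduler("noop anticipatory deadline [cfq]"))
```

Logging this value next to each suspend/resume attempt would make it clear which scheduler a given lockup belongs to.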
Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation
On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> On Saturday, 30 June 2007 23:34, Andrey Borzenkov wrote:
> > On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > > On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > > > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > > > suspend to disk :)
> > > >
> > > > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs +
> > > > single pata_ali patch to switch off DMA on CD-ROM), single root on
> > > > reiserfs, libata with pata_ali driver.
> > > >
> > > > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > > > system hung at least once in every rcX. Up to rc6 those lockups were
> > > > absolutely silent (black screen without reaction to any key). In rc6
> > > > I just got something different. After resume I got on screem:
> > > >
> > > > swsusp: Marking nosave pages: 0009f000-0010
> > > > swsusp: Basic memory bitmaps created
> > > > swsusp: Basic memory bitmaps freed
> > > >
> > > > After that it just sits there doing nothing. Ther was brief sound of
> > > > HDD but I suspect it was related more to power-on. System was
> > > > responding to power-on button press:
> > > >
> > > > ACPI Error (event-0305): No installed handler for fixed event
> > > > [0002 20070125]
> > > >
> > > > And SysRq was functioning.
> > >
> > > That probably means that there's a deadlock somewhere in there.
> > >
> > > > Unfortunately I do not have serial console so I
> > > > copy manually stacks from several last screens of output; I have
> > > > tried to make a photo but right now my kbluetooth is refusing to work
> > > > at all so I cannot transfer them :( (but I suspect quality would be
> > > > too bad anyway)
> > > >
> > > > laptop_mode D
> > > > io_schedule+0xe/0x20
> > >
> > > Looks suspicious to me. Can you identify what line of code this points
> > > to?
> >
> > If you could explain how to ...
>
> Michal has already done that. :-)
>

(gdb) l *io_schedule+0xe
0xc02aa84e is in io_schedule (include2/asm/atomic.h:110).
105      *
106      * Atomically decrements @v by 1.
107      */
108     static __inline__ void atomic_dec(atomic_t *v)
109     {
110             __asm__ __volatile__(
111                     LOCK_PREFIX "decl %0"
112                     :"+m" (v->counter));
113     }
114

> [--snip--]
>
> > > I see you're using CFQ as the default IO scheduler. Can you please
> > > switch to AS and see if that changes anything?
> >
> > Sure, but given that I have no idea how to reproduce the lockup, we may
> > never know whether it actually helped.
>
> Well, if the lockup never happens with AS, that will indicate something ...
>

Well, I was about to say that it is probably gone when it hit again. Now
with IDE, so we can at least rule out libata. But reiserfs is still in the
path, so I Cc the list.

It is stock 2.6.22. I will switch to AS, but it took 2 weeks to happen with
CFQ, and in one week I will be off for a couple of weeks.

Here is hand-copied information about locks and blocked processes.

Showing all locks held in the system:

1 lock held by syslogd/2515:
 #0: (&inode->i_mutex){--..}, at: [<c02ab4d15>] mutex_lock+0x21/0x30
1 lock held by X/3800:
 #0: (&mm->mmap_sem){}, at: [<c02ae2e8>] do_page_fault+0x1e8/0x5e0
6 times mingetty - 1 lock held by mingetty/3838:
 #0: (&tty->atomic_read_lock){--..}, at: [<c02ab0915>] mutex_lock_interruptible+0x21/0x30
2 times zsh - 1 lock held by zsh/4276:
 #0: (&tty->atomic_read_lock){--..}, at: [<c02ab0915>] mutex_lock_interruptible+0x21/0x30
1 lock held by consolehelper-g/21231:
 #0: (&inode->i_mutex){--..}, at: [<c02ab4b1>] mutex_lock+0x21/0x30
1 lock held by kio_http/31282:
 #0: (&inode->i_mutex){--..}, at: [<c02ab4b1>] mutex_lock+0x21/0x30

and list of blocked tasks:

syslogd
io_schedule+0xe/0x20
sync_buffer+0x35/0x40
__wait_on_bit+0x45/0x70
out_of_line_wait_on_bit+0x50/0x60
__wait_on_buffer+0x27/0x30
flush_commit_list+0x397/0x610 [reiserfs]
do_journal_end+0xadc/0xc90 [reiserfs]
journal_end_sync+0x5d/0x70 [reiserfs]
reiserfs_commit_for_inode+0x17e/0x1a0 [reiserfs]
reiserfs_sync_file+0x2d/0x70 [reiserfs]
do_fsync+0x28/0x40
sys_fsync+0xd/0x10
sysenter_past_esp+0x5f/0x99

X
io_schedule+0xe/0x20
sync_page+0x3a/0x50
__wait_on_bit_lock+0x3f/0x70
__lock_page+0x4c/0x60
__handle_mm_fault+0x657/0x860
do_page_fault+0x2f4/0x5e0
error_code+0x6a/0x70

consolehelper
io_schedule+0xe/0x20
sync_buffer+0x35/0x40
__wait_on_bit+0x45/0x70
out_of_line_wait_on_bit+0x50/0x60
__wait_on_buffer+0x27/0x30
search_by_key+0x17e/0x1370 [reiserfs]
search_by_entry_key+0x1c/0x2a0 [reiserfs]
reiserfs_find_entry+0x7d/0x3a0 [reiserfs]
reiserfs_lookup+0x75/0x120 [reiserfs]
do_lookup+0x133/0x180
__link_path_walk+0x765/0xd10
link_path_walk+0x44/0xc0
path_walk+0x18/0x20
do_path_lookup+0x7c/0x200
__path_lookup_intent_open+0x1f/0x30
open_namei+0x66/0x670
do_filp_open+0x2c/0x50
do_sys_open+0x47/0xd0
sys_open+0x1c/0x20
syscall_call+0x7/0xb

kio_http
io_schedule+0xe/0x20
sync_buffer+0x35/0x40
__wait_on_bit+0x45/0x70
out_of_line_wait_on_bit+0x50/0x60
__wait_on_buffer+0x27/0x30
search_by_key+0x17e/0x1370 [reiserfs]
reiserfs_read_locked_inode+0x63/0x570
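With hand-copied traces like the ones in this message, it can help to tabulate which tasks are parked behind the same wait. A minimal sketch, assuming one task name or one frame per line in the func+0xoff/0xsize [module] form used in this thread; parse_traces and group_by_wait_site are hypothetical helper names, not part of any kernel tool:

```python
import re
from collections import defaultdict

# One frame looks like "sync_buffer+0x35/0x40", optionally followed by "[module]".
FRAME_RE = re.compile(
    r"^(?P<func>[A-Za-z_]\w*)\+0x(?P<off>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)"
    r"(?:\s+\[(?P<module>\w+)\])?$"
)


def parse_traces(text):
    """Map each task name to its list of (function, module) frames."""
    traces = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = FRAME_RE.match(line)
        if match:
            if current is not None:
                traces[current].append((match["func"], match["module"]))
        else:
            # Any non-frame line starts a new task; drop a trailing state such as "D".
            current = line.split()[0]
            traces[current] = []
    return traces


def group_by_wait_site(traces):
    """Group tasks by the first frame below io_schedule, i.e. what they wait in."""
    groups = defaultdict(list)
    for task, frames in traces.items():
        site = next((func for func, _ in frames if func != "io_schedule"), None)
        groups[site].append(task)
    return dict(groups)


if __name__ == "__main__":
    sample = """
    syslogd
    io_schedule+0xe/0x20
    sync_buffer+0x35/0x40
    X
    io_schedule+0xe/0x20
    sync_page+0x3a/0x50
    """
    print(group_by_wait_site(parse_traces(sample)))
```

Applied to the blocked-task list in this message, such a grouping would put syslogd, consolehelper and kio_http under sync_buffer and X under sync_page, which matches the observation that every blocked task is waiting for I/O to complete.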
Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation
Hi!

> > ACPI Error (event-0305): No installed handler for fixed event [0002
> > 20070125]
> >
> > And SysRq was functioning.
>
> That probably means that there's a deadlock somewhere in there.
>
> > Unfortunately I do not have serial console so I
> > copy manually stacks from several last screens of output; I have tried to
> > make a photo but right now my kbluetooth is refusing to work at all so I
> > cannot transfer them :( (but I suspect quality would be too bad anyway)
> >
> > laptop_mode D
> > io_schedule+0xe/0x20
>
> Looks suspicious to me. Can you identify what line of code this points to?

Actually, I see laptop_mode being locked. laptop_mode does disk spindowns,
which is somewhat unusual. Does it happen without laptop mode?

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation
On Saturday, 30 June 2007 23:34, Andrey Borzenkov wrote:
> On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > > suspend to disk :)
> > >
> > > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs + single
> > > pata_ali patch to switch off DMA on CD-ROM), single root on reiserfs,
> > > libata with pata_ali driver.
> > >
> > > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > > system hung at least once in every rcX. Up to rc6 those lockups were
> > > absolutely silent (black screen without reaction to any key). In rc6 I
> > > just got something different. After resume I got on screem:
> > >
> > > swsusp: Marking nosave pages: 0009f000-0010
> > > swsusp: Basic memory bitmaps created
> > > swsusp: Basic memory bitmaps freed
> > >
> > > After that it just sits there doing nothing. Ther was brief sound of HDD
> > > but I suspect it was related more to power-on. System was responding to
> > > power-on button press:
> > >
> > > ACPI Error (event-0305): No installed handler for fixed event [0002
> > > 20070125]
> > >
> > > And SysRq was functioning.
> >
> > That probably means that there's a deadlock somewhere in there.
> >
> > > Unfortunately I do not have serial console so I
> > > copy manually stacks from several last screens of output; I have tried to
> > > make a photo but right now my kbluetooth is refusing to work at all so I
> > > cannot transfer them :( (but I suspect quality would be too bad anyway)
> > >
> > > laptop_mode D
> > > io_schedule+0xe/0x20
> >
> > Looks suspicious to me. Can you identify what line of code this points to?
> >
>
> If you could explain how to ...

Michal has already done that. :-)

[--snip--]

> > I see you're using CFQ as the default IO scheduler. Can you please switch
> > to AS and see if that changes anything?
> >
>
> Sure, but given that I have no idea how to reproduce the lockup, we may never
> know whether it actually helped.

Well, if the lockup never happens with AS, that will indicate something ...

Greetings,
Rafael

--
"Premature optimization is the root of all evil." - Donald Knuth
Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation
Hi!

> > ACPI Error (event-0305): No installed handler for fixed event [0002
> > 20070125]
> >
> > And SysRq was functioning.
>
> That probably means that there's a deadlock somewhere in there.
>
> > Unfortunately I do not have serial console so I
> > copy manually stacks from several last screens of output; I have tried to
> > make a photo but right now my kbluetooth is refusing to work at all so I
> > cannot transfer them :( (but I suspect quality would be too bad anyway)
> >
> > laptop_mode D
> > io_schedule+0xe/0x20
>
> Looks suspicious to me. Can you identify what line of code this points to?

Actually, I see laptop_mode being locked. laptop_mode does disk spindowns,
which is somehow unusual. Does it happen w/o laptop mode?

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation
Andrey Borzenkov wrote:
> On Sunday 01 July 2007, Rafael J. Wysocki wrote:
>> On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
>>> Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
>>> suspend to disk :)
>>>
>>> Environment - vanilla kernel (2.6.22-rc6 currently + squashfs + single
>>> pata_ali patch to switch off DMA on CD-ROM), single root on reiserfs,
>>> libata with pata_ali driver.
>>>
>>> Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
>>> system hung at least once in every rcX. Up to rc6 those lockups were
>>> absolutely silent (black screen without reaction to any key). In rc6 I
>>> just got something different. After resume I got on screen:
>>>
>>> swsusp: Marking nosave pages: 0009f000-0010
>>> swsusp: Basic memory bitmaps created
>>> swsusp: Basic memory bitmaps freed
>>>
>>> After that it just sits there doing nothing. There was brief sound of HDD
>>> but I suspect it was related more to power-on. System was responding to
>>> power-on button press:
>>>
>>> ACPI Error (event-0305): No installed handler for fixed event [0002
>>> 20070125]
>>>
>>> And SysRq was functioning.
>> That probably means that there's a deadlock somewhere in there.
>>
>>> Unfortunately I do not have serial console so I
>>> copy manually stacks from several last screens of output; I have tried to
>>> make a photo but right now my kbluetooth is refusing to work at all so I
>>> cannot transfer them :( (but I suspect quality would be too bad anyway)
>>>
>>> laptop_mode D
>>> io_schedule+0xe/0x20
>> Looks suspicious to me. Can you identify what line of code this points to?
>>
>
> If you could explain how to ...

gdb vmlinux
(gdb) l *io_schedule+0xe

Regards,
Michal

--
LOG
http://www.stardust.webpages.pl/log/
Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation
On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > suspend to disk :)
> >
> > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs + single
> > pata_ali patch to switch off DMA on CD-ROM), single root on reiserfs,
> > libata with pata_ali driver.
> >
> > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > system hung at least once in every rcX. Up to rc6 those lockups were
> > absolutely silent (black screen without reaction to any key). In rc6 I
> > just got something different. After resume I got on screen:
> >
> > swsusp: Marking nosave pages: 0009f000-0010
> > swsusp: Basic memory bitmaps created
> > swsusp: Basic memory bitmaps freed
> >
> > After that it just sits there doing nothing. There was brief sound of HDD
> > but I suspect it was related more to power-on. System was responding to
> > power-on button press:
> >
> > ACPI Error (event-0305): No installed handler for fixed event [0002
> > 20070125]
> >
> > And SysRq was functioning.
>
> That probably means that there's a deadlock somewhere in there.
>
> > Unfortunately I do not have serial console so I
> > copy manually stacks from several last screens of output; I have tried to
> > make a photo but right now my kbluetooth is refusing to work at all so I
> > cannot transfer them :( (but I suspect quality would be too bad anyway)
> >
> > laptop_mode D
> > io_schedule+0xe/0x20
>
> Looks suspicious to me. Can you identify what line of code this points to?
>

If you could explain how to ... (I never understood what those two numbers
mean :) ) Here is the disassembled function:

                  .section .sched.text
                  .p2align 4,,15
                  .globl io_schedule
                  .type io_schedule,@function
          io_schedule:
 0cd0 55          pushl %ebp
 0cd1 89E5        movl %esp,%ebp
 0cd3 FF05140A    incl per_cpu__runqueues+2388
 0cd9 E8FC        call schedule
      FF
 0cde FF0D140A    decl per_cpu__runqueues+2388
 0ce4 5D          popl %ebp
 0ce5 C3          ret
                  .size io_schedule,.-io_schedule

> > sync_buffer+0x35/0x40
> > __wait_on_bit+0x45/0x70
> > out_of_line_wait_on_bit+0x6c/0x80
> > __wait_on_buffer+0x27/0x30
> > search_by_key+0x15e/0x1250 [reiserfs]
> > reiserfs_read_locked_inode+0x64/0x570 [reiserfs]
> > reiserfs_iget+0x7e/0xa0 [reiserfs]
> > reiserfs_lookup+0xc7/0x120 [reiserfs]
> > do_lookup+0x138/0x180
> > __link_path_walk+0x787/0xce0
> > link_path_walk+0x44/0xc0
> > path_walk+0x18/0x20
> > do_path_lookup+0x88/0x210
> > __path_lookup_intent_open+0x4d/0x90
> > path_lookup_open+0x1f/0x30
> > open_exec+0x28/0xb0
> > do_execve+0x36/0x1d0
> > sys_execve+0x2e/0x80
> > sysenter_past_esp+0x5f/0x99
> >
> > 90clock D
> > __mutex_lock_slow_path+0xa1/0x290
> > mutex_lock+0x21/0x30
> > do_lookup+0xa1/0x180
> > __link_path_walk+0x44/0xc0
> > path_walk+0x18/0x20
> > do_path_lookup+0x78/0x210
> > __user_walk_fd+0x38/0x50
> > vfs_stat_fd+0x21/0x50
> > vfs_stat+0x11/0x20
> > sys_stat64+0x14/0x30
> > sysenter_past_esp+0x5f/0x99
> >
> > alsactl D
> > io_schedule+0xe/0x20
>
> Same here. Hmm.
>
> > sync_page+0x35/0x40
> > __wait_on_bit_lock+0x3f/0x70
> > __lock_page+0x68/0x70
> > filemap_nopage+0x16c/0x300
> > __handle_mm_fault+0x1d7/0x610
> > do_page_fault+0x1d7/0x610
> > error_code+0x6a/0x70
> > padzero+0x1f/0x30
> > load_elf_binary+0x743/0x1ab0
> > search_binary_handler+0x7b/0x1f0
> > do_execve+0x137/0x1d0
> > sys_execve+0x2e/0x80
> > sysenter_past_esp+0x5f/0x90
> >
> > After that I could remount, sync and reboot using SysRq (well, after
> > reboot it still insisted on replaying insane number of transactions so
> > maybe it did *not* remount / ro after all). Before reboot there was
> > brief output that resembled lockdep warnings, but it went too fast to be
> > readable.
> >
> > usual stuff follows
>
> I see you're using CFQ as the default IO scheduler. Can you please switch
> to AS and see if that changes anything?
>

Sure, but given that I have no idea how to reproduce the lockup, we may never
know whether it actually helped.
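On the two numbers: in a trace entry such as `io_schedule+0xe/0x20`, the first is the offset of the address from the start of the symbol and the second is the size the kernel's symbol table attributes to that symbol, both in hex. A trailing `[reiserfs]` names the module the symbol lives in. A minimal sketch of decoding such entries follows; the helper names are made up for illustration.

```python
import re

# symbol+0xOFFSET/0xSIZE, optionally followed by " [module]"
FRAME_RE = re.compile(
    r"^(?P<symbol>[A-Za-z_.][\w.]*)"
    r"\+0x(?P<offset>[0-9a-fA-F]+)"
    r"/0x(?P<size>[0-9a-fA-F]+)"
    r"(?:\s+\[(?P<module>[^\]]+)\])?$"
)


def parse_frame(entry):
    """Return (symbol, offset, size, module) for one stack trace entry."""
    match = FRAME_RE.match(entry.strip())
    if match is None:
        raise ValueError(f"not a symbol+offset/size entry: {entry!r}")
    return (
        match.group("symbol"),
        int(match.group("offset"), 16),
        int(match.group("size"), 16),
        match.group("module"),
    )


def frame_address(symbol_start, offset):
    """Address inside the function, given where the symbol starts."""
    return symbol_start + offset


if __name__ == "__main__":
    symbol, offset, size, module = parse_frame("io_schedule+0xe/0x20")
    # 0x0cd0 is where io_schedule starts in the listing quoted in this message.
    print(symbol, hex(frame_address(0x0CD0, offset)), hex(size), module)
```

With the listing in this message, `io_schedule` starts at 0x0cd0, so `+0xe` is 0x0cde: the `decl` right after `call schedule`, which is the return address of a task sleeping inside `schedule()`. The code itself ends at 0x0ce5; the reported size of 0x20 is presumably the result of the 16-byte alignment of whatever symbol follows.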
Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation
On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> Since 2.6.18 I do not have suspend to RAM; now I am starting to lose suspend
> to disk :)
>
> Environment - vanilla kernel (2.6.22-rc6 currently + squashfs + single
> pata_ali patch to switch off DMA on CD-ROM), single root on reiserfs, libata
> with pata_ali driver.
>
> Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc system
> hung at least once in every rcX. Up to rc6 those lockups were absolutely
> silent (black screen without reaction to any key). In rc6 I just got
> something different. After resume I got on screen:
>
> swsusp: Marking nosave pages: 0009f000-0010
> swsusp: Basic memory bitmaps created
> swsusp: Basic memory bitmaps freed
>
> After that it just sits there doing nothing. There was brief sound of HDD
> but I suspect it was related more to power-on. System was responding to
> power-on button press:
>
> ACPI Error (event-0305): No installed handler for fixed event [0002
> 20070125]
>
> And SysRq was functioning.

That probably means that there's a deadlock somewhere in there.

> Unfortunately I do not have serial console so I
> copy manually stacks from several last screens of output; I have tried to
> make a photo but right now my kbluetooth is refusing to work at all so I
> cannot transfer them :( (but I suspect quality would be too bad anyway)
>
> laptop_mode D
> io_schedule+0xe/0x20

Looks suspicious to me. Can you identify what line of code this points to?

> sync_buffer+0x35/0x40
> __wait_on_bit+0x45/0x70
> out_of_line_wait_on_bit+0x6c/0x80
> __wait_on_buffer+0x27/0x30
> search_by_key+0x15e/0x1250 [reiserfs]
> reiserfs_read_locked_inode+0x64/0x570 [reiserfs]
> reiserfs_iget+0x7e/0xa0 [reiserfs]
> reiserfs_lookup+0xc7/0x120 [reiserfs]
> do_lookup+0x138/0x180
> __link_path_walk+0x787/0xce0
> link_path_walk+0x44/0xc0
> path_walk+0x18/0x20
> do_path_lookup+0x88/0x210
> __path_lookup_intent_open+0x4d/0x90
> path_lookup_open+0x1f/0x30
> open_exec+0x28/0xb0
> do_execve+0x36/0x1d0
> sys_execve+0x2e/0x80
> sysenter_past_esp+0x5f/0x99
>
> 90clock D
> __mutex_lock_slow_path+0xa1/0x290
> mutex_lock+0x21/0x30
> do_lookup+0xa1/0x180
> __link_path_walk+0x44/0xc0
> path_walk+0x18/0x20
> do_path_lookup+0x78/0x210
> __user_walk_fd+0x38/0x50
> vfs_stat_fd+0x21/0x50
> vfs_stat+0x11/0x20
> sys_stat64+0x14/0x30
> sysenter_past_esp+0x5f/0x99
>
> alsactl D
> io_schedule+0xe/0x20

Same here. Hmm.

> sync_page+0x35/0x40
> __wait_on_bit_lock+0x3f/0x70
> __lock_page+0x68/0x70
> filemap_nopage+0x16c/0x300
> __handle_mm_fault+0x1d7/0x610
> do_page_fault+0x1d7/0x610
> error_code+0x6a/0x70
> padzero+0x1f/0x30
> load_elf_binary+0x743/0x1ab0
> search_binary_handler+0x7b/0x1f0
> do_execve+0x137/0x1d0
> sys_execve+0x2e/0x80
> sysenter_past_esp+0x5f/0x90
>
> After that I could remount, sync and reboot using SysRq (well, after reboot
> it still insisted on replaying insane number of transactions so maybe it did
> *not* remount / ro after all). Before reboot there was brief output that
> resembled lockdep warnings, but it went too fast to be readable.
>
> usual stuff follows

I see you're using CFQ as the default IO scheduler. Can you please switch
to AS and see if that changes anything?

Greetings,
Rafael

--
"Premature optimization is the root of all evil." - Donald Knuth
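The "Same here" observation, that two of the three blocked tasks sit in `io_schedule`, can be made mechanical. The sketch below assumes the hand-copied layout used in this thread (a `name D` line followed by one `symbol+0x.../0x...` frame per line, possibly behind `>` quoting), which is not the full SysRq-T format. The function names are hypothetical.

```python
from collections import defaultdict


def parse_blocked_tasks(text):
    """Map each D-state task name to its list of frame symbols, top first."""
    tasks = {}
    current = None
    for raw in text.splitlines():
        # Drop mail quoting such as "> > " before looking at the line.
        line = raw.lstrip("> ").strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) == 2 and fields[1] == "D":
            # "laptop_mode D" starts a new task in uninterruptible sleep.
            current = fields[0]
            tasks[current] = []
        elif current is not None and "+0x" in fields[0]:
            # Keep only the symbol name from "symbol+0xOFF/0xSIZE".
            tasks[current].append(fields[0].split("+0x", 1)[0])
        # Anything else (reply text, mistyped frames) is ignored.
    return tasks


def group_by_top_frame(tasks):
    """Group task names by the function they are sleeping in."""
    groups = defaultdict(list)
    for name, frames in tasks.items():
        if frames:
            groups[frames[0]].append(name)
    return dict(groups)
```

Fed the traces quoted above, this puts `laptop_mode` and `alsactl` under `io_schedule` and `90clock` under `__mutex_lock_slow_path`. That is consistent with two tasks waiting for disk I/O that never completes and a third queued behind a lock, though the trace alone does not prove which task holds it.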