Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation

2007-11-21 Thread Andrey Borzenkov
On Sunday 09 September 2007, Rafael J. Wysocki wrote:
> On Sunday, 9 September 2007 16:00, Andrey Borzenkov wrote:
> > On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > > On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > > > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > > > suspend to disk :)
> > > >
> > > > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs + single
> > > > pata_ali patch to switch off DMA on CD-ROM), single root on reiserfs,
> > > > libata with pata_ali driver.
> > > >
> > > > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > > > system hung at least once in every rcX. Up to rc6 those lockups were
> > > > absolutely silent (black screen without reaction to any key). In rc6 I
> > > > just got something different. After resume I got on screen:
> > > >
> > > > swsusp: Marking nosave pages: 0009f000-0010
> > > > swsusp: Basic memory bitmaps created
> > > > swsusp: Basic memory bitmaps freed
> > > >
> > > > After that it just sits there doing nothing. There was brief sound of HDD
> > > > but I suspect it was related more to power-on. System was responding to
> > > > power-on button press:
> > > >
> > > > ACPI Error (event-0305): No installed handler for fixed event [0002
> > > > 20070125]
> > > >
> > > > And SysRq was functioning.
> > >
> > > That probably means that there's a deadlock somewhere in there.
> > >
> > > > Unfortunately I do not have serial console so I
> > > > copy manually stacks from several last screens of output; I have tried to
> > > > make a photo but right now my kbluetooth is refusing to work at all so I
> > > > cannot transfer them :( (but I suspect quality would be too bad anyway)
> > > >
> > > > laptop_mode D
> > > > io_schedule+0xe/0x20
> > >
> > > Looks suspicious to me.  Can you identify what line of code this points to?
> > >
> > > > sync_buffer+0x35/0x40
> > > > __wait_on_bit+0x45/0x70
> > > > out_of_line_wait_on_bit+0x6c/0x80
> > > > __wait_on_buffer+0x27/0x30
> > > > search_by_key+0x15e/0x1250 [reiserfs]
> > > > reiserfs_read_locked_inode+0x64/0x570 [reiserfs]
> > > > reiserfs_iget+0x7e/0xa0 [reiserfs]
> > > > reiserfs_lookup+0xc7/0x120 [reiserfs]
> > > > do_lookup+0x138/0x180
> > > > __link_path_walk+0x787/0xce0
> > > > link_path_walk+0x44/0xc0
> > > > path_walk+0x18/0x20
> > > > do_path_lookup+0x88/0x210
> > > > __path_lookup_intent_open+0x4d/0x90
> > > > path_lookup_open+0x1f/0x30
> > > > open_exec+0x28/0xb0
> > > > do_execve+0x36/0x1d0
> > > > sys_execve+0x2e/0x80
> > > > sysenter_past_esp+0x5f/0x99
> > > >
> > > > 90clock D
> > > > __mutex_lock_slow_path+0xa1/0x290
> > > > mutex_lock+0x21/0x30
> > > > do_lookup+0xa1/0x180
> > > > __link_path_walk+0x44/0xc0
> > > > path_walk+0x18/0x20
> > > > do_path_lookup+0x78/0x210
> > > > __user_walk_fd+0x38/0x50
> > > > vfs_stat_fd+0x21/0x50
> > > > vfs_stat+0x11/0x20
> > > > sys_stat64+0x14/0x30
> > > > sysenter_past_esp+0x5f/0x99
> > > >
> > > > alsactl D
> > > > io_schedule+0xe/0x20
> > >
> > > Same here.  Hmm.
> > >
> > > > sync_page+0x35/0x40
> > > > __wait_on_bit_lock+0x3f/0x70
> > > > __lock_page+0x68/0x70
> > > > filemap_nopage+0x16c/0x300
> > > > __handle_mm_fault+0x1d7/0x610
> > > > do_page_fault+0x1d7/0x610
> > > > error_code+0x6a/0x70
> > > > padzero+0x1f/0x30
> > > > load_elf_binary+0x743/0x1ab0
> > > > search_binary_handler+0x7b/0x1f0
> > > > do_execve+0x137/0x1d0
> > > > sys_execve+0x2e/0x80
> > > > sysenter_past_esp+0x5f/0x90
> > > >
> > > > After that I could remount, sync and reboot using SysRq (well, after
> > > > reboot it still insisted on replaying insane number of transactions so
> > > > maybe it did *not* remount / ro after all). Before reboot there was
> > > > brief output that resembled lockdep warnings, but it went too fast to be
> > > > readable.
> > > >
> > > > usual stuff follows
> > >
> > > I see you're using CFQ as the default IO scheduler.  Can you please switch
> > > to AS and see if that changes anything?
> > >
> > 
> > I just had the same lockup on resume using AS with 2.6.23-rc5.
> 
> Hm.  Does your root partition sit on reiserfs?

I already answered this but yes, I do.

Just had it again on 2.6.24-rc3. Same thing: keys work (to some extent) and
Alt-SysRq allows me to reboot. Unfortunately I switched (unintentionally)
from the resume messages to tty1 and it was in a funny state, so the SysRq-t
output was lost, but I strongly suspect it would be the same.

Well, I am not sure how to debug a problem that pops up once in three or four
releases ...
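
One possible way to follow up on the earlier request to map a frame such as
io_schedule+0xe/0x20 to a source line is gdb's "list *(symbol+offset)"
command, run against a vmlinux (or module object) that carries debug info and
comes from the same build as the running kernel. The sketch below only turns
hand-copied frames into such commands; parse_frame and gdb_list_command are
hypothetical helper names introduced here for illustration, and frames that do
not follow the symbol+0xoffset/0xsize form (for example, copy errors) are
skipped rather than guessed at.

```python
import re

# Matches frames such as "search_by_key+0x15e/0x1250 [reiserfs]".
# The trailing "[module]" part is optional.
FRAME_RE = re.compile(
    r"^(?P<symbol>[A-Za-z_][A-Za-z0-9_.]*)\+0x(?P<offset>[0-9a-fA-F]+)"
    r"/0x(?P<size>[0-9a-fA-F]+)(?:\s+\[(?P<module>[^\]]+)\])?$"
)


def parse_frame(line):
    """Return (symbol, offset, size, module), or None if the line is not a frame."""
    # Drop mail quote markers and surrounding whitespace before matching.
    text = line.lstrip("> \t").strip()
    match = FRAME_RE.match(text)
    if match is None:
        return None
    return (
        match.group("symbol"),
        int(match.group("offset"), 16),
        int(match.group("size"), 16),
        match.group("module"),  # None for frames in the core kernel
    )


def gdb_list_command(line):
    """Build a gdb 'list' command for one frame, or None for non-frame lines."""
    frame = parse_frame(line)
    if frame is None:
        return None
    symbol, offset, _size, _module = frame
    return f"list *({symbol}+{offset:#x})"


if __name__ == "__main__":
    for raw in ("> > io_schedule+0xe/0x20", "laptop_mode D"):
        print(raw, "->", gdb_list_command(raw))
```

Frames that carry a module tag such as [reiserfs] would presumably need the
matching module object loaded in gdb instead of vmlinux; that part is left out
of the sketch.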


Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation

2007-11-21 Thread Rafael J. Wysocki
On Wednesday, 21 of November 2007, Andrey Borzenkov wrote:
> On Sunday 09 September 2007, Rafael J. Wysocki wrote:
> > On Sunday, 9 September 2007 16:00, Andrey Borzenkov wrote:
> > > On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > > > On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > > > > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > > > > suspend to disk :)
> > > > >
> > > > > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs + single
> > > > > pata_ali patch to switch off DMA on CD-ROM), single root on reiserfs,
> > > > > libata with pata_ali driver.
> > > > >
> > > > > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > > > > system hung at least once in every rcX. Up to rc6 those lockups were
> > > > > absolutely silent (black screen without reaction to any key). In rc6 I
> > > > > just got something different. After resume I got on screen:
> > > > >
> > > > > swsusp: Marking nosave pages: 0009f000-0010
> > > > > swsusp: Basic memory bitmaps created
> > > > > swsusp: Basic memory bitmaps freed
> > > > >
> > > > > After that it just sits there doing nothing. There was brief sound of HDD
> > > > > but I suspect it was related more to power-on. System was responding to
> > > > > power-on button press:
> > > > >
> > > > > ACPI Error (event-0305): No installed handler for fixed event [0002
> > > > > 20070125]
> > > > >
> > > > > And SysRq was functioning.
> > > >
> > > > That probably means that there's a deadlock somewhere in there.
> > > >
> > > > > Unfortunately I do not have serial console so I
> > > > > copy manually stacks from several last screens of output; I have tried to
> > > > > make a photo but right now my kbluetooth is refusing to work at all so I
> > > > > cannot transfer them :( (but I suspect quality would be too bad anyway)
> > > > >
> > > > > laptop_mode D
> > > > >   io_schedule+0xe/0x20
> > > >
> > > > Looks suspicious to me.  Can you identify what line of code this points to?
> > > >
> > > > >   sync_buffer+0x35/0x40
> > > > >   __wait_on_bit+0x45/0x70
> > > > >   out_of_line_wait_on_bit+0x6c/0x80
> > > > >   __wait_on_buffer+0x27/0x30
> > > > >   search_by_key+0x15e/0x1250 [reiserfs]
> > > > >   reiserfs_read_locked_inode+0x64/0x570 [reiserfs]
> > > > >   reiserfs_iget+0x7e/0xa0 [reiserfs]
> > > > >   reiserfs_lookup+0xc7/0x120 [reiserfs]
> > > > >   do_lookup+0x138/0x180
> > > > >   __link_path_walk+0x787/0xce0
> > > > >   link_path_walk+0x44/0xc0
> > > > >   path_walk+0x18/0x20
> > > > >   do_path_lookup+0x88/0x210
> > > > >   __path_lookup_intent_open+0x4d/0x90
> > > > >   path_lookup_open+0x1f/0x30
> > > > >   open_exec+0x28/0xb0
> > > > >   do_execve+0x36/0x1d0
> > > > >   sys_execve+0x2e/0x80
> > > > >   sysenter_past_esp+0x5f/0x99
> > > > >
> > > > > 90clock D
> > > > >   __mutex_lock_slow_path+0xa1/0x290
> > > > >   mutex_lock+0x21/0x30
> > > > >   do_lookup+0xa1/0x180
> > > > >   __link_path_walk+0x44/0xc0
> > > > >   path_walk+0x18/0x20
> > > > >   do_path_lookup+0x78/0x210
> > > > >   __user_walk_fd+0x38/0x50
> > > > >   vfs_stat_fd+0x21/0x50
> > > > >   vfs_stat+0x11/0x20
> > > > >   sys_stat64+0x14/0x30
> > > > >   sysenter_past_esp+0x5f/0x99
> > > > >
> > > > > alsactl D
> > > > >   io_schedule+0xe/0x20
> > > >
> > > > Same here.  Hmm.
> > > >
> > > > >   sync_page+0x35/0x40
> > > > >   __wait_on_bit_lock+0x3f/0x70
> > > > >   __lock_page+0x68/0x70
> > > > >   filemap_nopage+0x16c/0x300
> > > > >   __handle_mm_fault+0x1d7/0x610
> > > > >   do_page_fault+0x1d7/0x610
> > > > >   error_code+0x6a/0x70
> > > > >   padzero+0x1f/0x30
> > > > >   load_elf_binary+0x743/0x1ab0
> > > > >   search_binary_handler+0x7b/0x1f0
> > > > >   do_execve+0x137/0x1d0
> > > > >   sys_execve+0x2e/0x80
> > > > >   sysenter_past_esp+0x5f/0x90
> > > > >
> > > > > After that I could remount, sync and reboot using SysRq (well, after
> > > > > reboot it still insisted on replaying insane number of transactions so
> > > > > maybe it did *not* remount / ro after all). Before reboot there was
> > > > > brief output that resembled lockdep warnings, but it went too fast to be
> > > > > readable.
> > > > >
> > > > > usual stuff follows
> > > >
> > > > I see you're using CFQ as the default IO scheduler.  Can you please switch
> > > > to AS and see if that changes anything?
> > > >
> > >
> > > I just had the same lockup on resume using AS with 2.6.23-rc5.
> >
> > Hm.  Does your root partition sit on reiserfs?
>
> I already answered this but yes, I do.
>
> Just had it again on 2.6.24-rc3. Same thing: keys work (to some extent) and
> Alt-SysRq allows me to reboot. Unfortunately I switched (unintentionally)
> from the resume messages to tty1 and it was in a funny state, so the SysRq-t
> output was lost, but I strongly suspect it would be the same.
>
> Well, I am not sure how to debug a problem that pops up once in three or four
> releases ...

And you never know when it happens ...

I have no idea.  It's probably related to your hardware configuration somehow,
as it doesn't seem to be reproducible in general.

Regards,
Rafael
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation

2007-09-09 Thread Andrey Borzenkov
On Sunday 09 September 2007, Rafael J. Wysocki wrote:
> On Sunday, 9 September 2007 16:00, Andrey Borzenkov wrote:
> > On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > > On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > > > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > > > suspend to disk :)
> > > >
> > > > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs +
> > > > single pata_ali patch to switch off DMA on CD-ROM), single root on
> > > > reiserfs, libata with pata_ali driver.
> > > >
> > > > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > > > system hung at least once in every rcX. Up to rc6 those lockups were
> > > > absolutely silent (black screen without reaction to any key). In rc6
> > > > I just got something different. After resume I got on screen:
> > > >
> > > > swsusp: Marking nosave pages: 0009f000-0010
> > > > swsusp: Basic memory bitmaps created
> > > > swsusp: Basic memory bitmaps freed
> > > >
> > > > After that it just sits there doing nothing. There was brief sound of
> > > > HDD but I suspect it was related more to power-on. System was
> > > > responding to power-on button press:
> > > >
> > > > ACPI Error (event-0305): No installed handler for fixed event
> > > > [0002 20070125]
> > > >
> > > > And SysRq was functioning.
> > >
> > > That probably means that there's a deadlock somewhere in there.
> > >
> > > > Unfortunately I do not have serial console so I
> > > > copy manually stacks from several last screens of output; I have
> > > > tried to make a photo but right now my kbluetooth is refusing to work
> > > > at all so I cannot transfer them :( (but I suspect quality would be
> > > > too bad anyway)
> > > >
> > > > laptop_mode D
> > > > io_schedule+0xe/0x20
> > >
> > > Looks suspicious to me.  Can you identify what line of code this points
> > > to?
> > >
> > > > sync_buffer+0x35/0x40
> > > > __wait_on_bit+0x45/0x70
> > > > out_of_line_wait_on_bit+0x6c/0x80
> > > > __wait_on_buffer+0x27/0x30
> > > > search_by_key+0x15e/0x1250 [reiserfs]
> > > > reiserfs_read_locked_inode+0x64/0x570 [reiserfs]
> > > > reiserfs_iget+0x7e/0xa0 [reiserfs]
> > > > reiserfs_lookup+0xc7/0x120 [reiserfs]
> > > > do_lookup+0x138/0x180
> > > > __link_path_walk+0x787/0xce0
> > > > link_path_walk+0x44/0xc0
> > > > path_walk+0x18/0x20
> > > > do_path_lookup+0x88/0x210
> > > > __path_lookup_intent_open+0x4d/0x90
> > > > path_lookup_open+0x1f/0x30
> > > > open_exec+0x28/0xb0
> > > > do_execve+0x36/0x1d0
> > > > sys_execve+0x2e/0x80
> > > > sysenter_past_esp+0x5f/0x99
> > > >
> > > > 90clock D
> > > > __mutex_lock_slow_path+0xa1/0x290
> > > > mutex_lock+0x21/0x30
> > > > do_lookup+0xa1/0x180
> > > > __link_path_walk+0x44/0xc0
> > > > path_walk+0x18/0x20
> > > > do_path_lookup+0x78/0x210
> > > > __user_walk_fd+0x38/0x50
> > > > vfs_stat_fd+0x21/0x50
> > > > vfs_stat+0x11/0x20
> > > > sys_stat64+0x14/0x30
> > > > sysenter_past_esp+0x5f/0x99
> > > >
> > > > alsactl D
> > > > io_schedule+0xe/0x20
> > >
> > > Same here.  Hmm.
> > >
> > > > sync_page+0x35/0x40
> > > > __wait_on_bit_lock+0x3f/0x70
> > > > __lock_page+0x68/0x70
> > > > filemap_nopage+0x16c/0x300
> > > > __handle_mm_fault+0x1d7/0x610
> > > > do_page_fault+0x1d7/0x610
> > > > error_code+0x6a/0x70
> > > > padzero+0x1f/0x30
> > > > load_elf_binary+0x743/0x1ab0
> > > > search_binary_handler+0x7b/0x1f0
> > > > do_execve+0x137/0x1d0
> > > > sys_execve+0x2e/0x80
> > > > sysenter_past_esp+0x5f/0x90
> > > >
> > > > After that I could remount, sync and reboot using SysRq (well, after
> > > > reboot it still insisted on replaying insane number of transactions
> > > > so maybe it did *not* remount / ro after all). Before reboot there
> > > > was brief output that resembled lockdep warnings, but it went too
> > > > fast to be readable.
> > > >
> > > > usual stuff follows
> > >
> > > I see you're using CFQ as the default IO scheduler.  Can you please
> > > switch to AS and see if that changes anything?
> >
> > I just had the same lockup on resume using AS with 2.6.23-rc5.
>
> Hm.  Does your root partition sit on reiserfs?

yes - "single root on reiserfs"



Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation

2007-09-09 Thread Rafael J. Wysocki
On Sunday, 9 September 2007 16:00, Andrey Borzenkov wrote:
> On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > > suspend to disk :)
> > >
> > > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs + single
> > > pata_ali patch to switch off DMA on CD-ROM), single root on reiserfs,
> > > libata with pata_ali driver.
> > >
> > > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > > system hung at least once in every rcX. Up to rc6 those lockups were
> > > absolutely silent (black screen without reaction to any key). In rc6 I
> > > just got something different. After resume I got on screen:
> > >
> > > swsusp: Marking nosave pages: 0009f000-0010
> > > swsusp: Basic memory bitmaps created
> > > swsusp: Basic memory bitmaps freed
> > >
> > > After that it just sits there doing nothing. There was brief sound of HDD
> > > but I suspect it was related more to power-on. System was responding to
> > > power-on button press:
> > >
> > > ACPI Error (event-0305): No installed handler for fixed event [0002
> > > 20070125]
> > >
> > > And SysRq was functioning.
> >
> > That probably means that there's a deadlock somewhere in there.
> >
> > > Unfortunately I do not have serial console so I
> > > copy manually stacks from several last screens of output; I have tried to
> > > make a photo but right now my kbluetooth is refusing to work at all so I
> > > cannot transfer them :( (but I suspect quality would be too bad anyway)
> > >
> > > laptop_mode D
> > >   io_schedule+0xe/0x20
> >
> > Looks suspicious to me.  Can you identify what line of code this points to?
> >
> > >   sync_buffer+0x35/0x40
> > >   __wait_on_bit+0x45/0x70
> > >   out_of_line_wait_on_bit+0x6c/0x80
> > >   __wait_on_buffer+0x27/0x30
> > >   search_by_key+0x15e/0x1250 [reiserfs]
> > >   reiserfs_read_locked_inode+0x64/0x570 [reiserfs]
> > >   reiserfs_iget+0x7e/0xa0 [reiserfs]
> > >   reiserfs_lookup+0xc7/0x120 [reiserfs]
> > >   do_lookup+0x138/0x180
> > >   __link_path_walk+0x787/0xce0
> > >   link_path_walk+0x44/0xc0
> > >   path_walk+0x18/0x20
> > >   do_path_lookup+0x88/0x210
> > >   __path_lookup_intent_open+0x4d/0x90
> > >   path_lookup_open+0x1f/0x30
> > >   open_exec+0x28/0xb0
> > >   do_execve+0x36/0x1d0
> > >   sys_execve+0x2e/0x80
> > >   sysenter_past_esp+0x5f/0x99
> > >
> > > 90clock D
> > >   __mutex_lock_slow_path+0xa1/0x290
> > >   mutex_lock+0x21/0x30
> > >   do_lookup+0xa1/0x180
> > >   __link_path_walk+0x44/0xc0
> > >   path_walk+0x18/0x20
> > >   do_path_lookup+0x78/0x210
> > >   __user_walk_fd+0x38/0x50
> > >   vfs_stat_fd+0x21/0x50
> > >   vfs_stat+0x11/0x20
> > >   sys_stat64+0x14/0x30
> > >   sysenter_past_esp+0x5f/0x99
> > >
> > > alsactl D
> > >   io_schedule+0xe/0x20
> >
> > Same here.  Hmm.
> >
> > >   sync_page+0x35/0x40
> > >   __wait_on_bit_lock+0x3f/0x70
> > >   __lock_page+0x68/0x70
> > >   filemap_nopage+0x16c/0x300
> > >   __handle_mm_fault+0x1d7/0x610
> > >   do_page_fault+0x1d7/0x610
> > >   error_code+0x6a/0x70
> > >   padzero+0x1f/0x30
> > >   load_elf_binary+0x743/0x1ab0
> > >   search_binary_handler+0x7b/0x1f0
> > >   do_execve+0x137/0x1d0
> > >   sys_execve+0x2e/0x80
> > >   sysenter_past_esp+0x5f/0x90
> > >
> > > After that I could remount, sync and reboot using SysRq (well, after
> > > reboot it still insisted on replaying insane number of transactions so
> > > maybe it did *not* remount / ro after all). Before reboot there was
> > > brief output that resembled lockdep warnings, but it went too fast to be
> > > readable.
> > >
> > > usual stuff follows
> >
> > I see you're using CFQ as the default IO scheduler.  Can you please switch
> > to AS and see if that changes anything?
> >
> 
> I just had the same lockup on resume using AS with 2.6.23-rc5.

Hm.  Does your root partition sit on reiserfs?


Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation

2007-09-09 Thread Andrey Borzenkov
On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > suspend to disk :)
> >
> > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs + single
> > pata_ali patch to switch off DMA on CD-ROM), single root on reiserfs,
> > libata with pata_ali driver.
> >
> > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > system hung at least once in every rcX. Up to rc6 those lockups were
> > absolutely silent (black screen without reaction to any key). In rc6 I
> > just got something different. After resume I got on screen:
> >
> > swsusp: Marking nosave pages: 0009f000-0010
> > swsusp: Basic memory bitmaps created
> > swsusp: Basic memory bitmaps freed
> >
> > After that it just sits there doing nothing. There was brief sound of HDD
> > but I suspect it was related more to power-on. System was responding to
> > power-on button press:
> >
> > ACPI Error (event-0305): No installed handler for fixed event [0002
> > 20070125]
> >
> > And SysRq was functioning.
>
> That probably means that there's a deadlock somewhere in there.
>
> > Unfortunately I do not have serial console so I
> > copy manually stacks from several last screens of output; I have tried to
> > make a photo but right now my kbluetooth is refusing to work at all so I
> > cannot transfer them :( (but I suspect quality would be too bad anyway)
> >
> > laptop_mode D
> > io_schedule+0xe/0x20
>
> Looks suspicious to me.  Can you identify what line of code this points to?
>
> > sync_buffer+0x35/0x40
> > __wait_on_bit+0x45/0x70
> > out_of_line_wait_on_bit+0x6c/0x80
> > __wait_on_buffer+0x27/0x30
> > search_by_key+0x15e/0x1250 [reiserfs]
> > reiserfs_read_locked_inode+0x64/0x570 [reiserfs]
> > reiserfs_iget+0x7e/0xa0 [reiserfs]
> > reiserfs_lookup+0xc7/0x120 [reiserfs]
> > do_lookup+0x138/0x180
> > __link_path_walk+0x787/0xce0
> > link_path_walk+0x44/0xc0
> > path_walk+0x18/0x20
> > do_path_lookup+0x88/0x210
> > __path_lookup_intent_open+0x4d/0x90
> > path_lookup_open+0x1f/0x30
> > open_exec+0x28/0xb0
> > do_execve+0x36/0x1d0
> > sys_execve+0x2e/0x80
> > sysenter_past_esp+0x5f/0x99
> >
> > 90clock D
> > __mutex_lock_slow_path+0xa1/0x290
> > mutex_lock+0x21/0x30
> > do_lookup+0xa1/0x180
> > __link_path_walk+0x44/0xc0
> > path_walk+0x18/0x20
> > do_path_lookup+0x78/0x210
> > __user_walk_fd+0x38/0x50
> > vfs_stat_fd+0x21/0x50
> > vfs_stat+0x11/0x20
> > sys_stat64+0x14/0x30
> > sysenter_past_esp+0x5f/0x99
> >
> > alsactl D
> > io_schedule+0xe/0x20
>
> Same here.  Hmm.
>
> > sync_page+0x35/0x40
> > __wait_on_bit_lock+0x3f/0x70
> > __lock_page+0x68/0x70
> > filemap_nopage+0x16c/0x300
> > __handle_mm_fault+0x1d7/0x610
> > do_page_fault+0x1d7/0x610
> > error_code+0x6a/0x70
> > padzero+0x1f/0x30
> > load_elf_binary+0x743/0x1ab0
> > search_binary_handler+0x7b/0x1f0
> > do_execve+0x137/0x1d0
> > sys_execve+0x2e/0x80
> > sysenter_past_esp+0x5f/0x90
> >
> > After that I could remount, sync and reboot using SysRq (well, after
> > reboot it still insisted on replaying insane number of transactions so
> > maybe it did *not* remount / ro after all). Before reboot there was
> > brief output that resembled lockdep warnings, but it went too fast to be
> > readable.
> >
> > usual stuff follows
>
> I see you're using CFQ as the default IO scheduler.  Can you please switch
> to AS and see if that changes anything?
>

I just had the same lockup on resume using AS with 2.6.23-rc5.



Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation

2007-09-09 Thread Rafael J. Wysocki
On Sunday, 9 September 2007 16:00, Andrey Borzenkov wrote:
> On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > > suspend to disk :)
> > >
> > > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs + single
> > > pata_ali patch to switch off DMA on CD-ROM), single root on reiserfs,
> > > libata with pata_ali driver.
> > >
> > > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > > system hung at least once in every rcX. Up to rc6 those lockups were
> > > absolutely silent (black screen without reaction to any key). In rc6 I
> > > just got something different. After resume I got on screen:
> > >
> > > swsusp: Marking nosave pages: 0009f000-0010
> > > swsusp: Basic memory bitmaps created
> > > swsusp: Basic memory bitmaps freed
> > >
> > > After that it just sits there doing nothing. There was brief sound of HDD
> > > but I suspect it was related more to power-on. System was responding to
> > > power-on button press:
> > >
> > > ACPI Error (event-0305): No installed handler for fixed event [0002
> > > 20070125]
> > >
> > > And SysRq was functioning.
> >
> > That probably means that there's a deadlock somewhere in there.
> >
> > > Unfortunately I do not have serial console so I
> > > copy manually stacks from several last screens of output; I have tried to
> > > make a photo but right now my kbluetooth is refusing to work at all so I
> > > cannot transfer them :( (but I suspect quality would be too bad anyway)
> > >
> > > laptop_mode D
> > > io_schedule+0xe/0x20
> >
> > Looks suspicious to me.  Can you identify what line of code this points to?
> >
> > > sync_buffer+0x35/0x40
> > > __wait_on_bit+0x45/0x70
> > > out_of_line_wait_on_bit+0x6c/0x80
> > > __wait_on_buffer+0x27/0x30
> > > search_by_key+0x15e/0x1250 [reiserfs]
> > > reiserfs_read_locked_inode+0x64/0x570 [reiserfs]
> > > reiserfs_iget+0x7e/0xa0 [reiserfs]
> > > reiserfs_lookup+0xc7/0x120 [reiserfs]
> > > do_lookup+0x138/0x180
> > > __link_path_walk+0x787/0xce0
> > > link_path_walk+0x44/0xc0
> > > path_walk+0x18/0x20
> > > do_path_lookup+0x88/0x210
> > > __path_lookup_intent_open+0x4d/0x90
> > > path_lookup_open+0x1f/0x30
> > > open_exec+0x28/0xb0
> > > do_execve+0x36/0x1d0
> > > sys_execve+0x2e/0x80
> > > sysenter_past_esp+0x5f/0x99
> > >
> > > 90clock D
> > > __mutex_lock_slow_path+0xa1/0x290
> > > mutex_lock+0x21/0x30
> > > do_lookup+0xa1/0x180
> > > __link_path_walk+0x44/0xc0
> > > path_walk+0x18/0x20
> > > do_path_lookup+0x78/0x210
> > > __user_walk_fd+0x38/0x50
> > > vfs_stat_fd+0x21/0x50
> > > vfs_stat+0x11/0x20
> > > sys_stat64+0x14/0x30
> > > sysenter_past_esp+0x5f/0x99
> > >
> > > alsactl D
> > > io_schedule+0xe/0x20
> >
> > Same here.  Hmm.
> >
> > > sync_page+0x35/0x40
> > > __wait_on_bit_lock+0x3f/0x70
> > > __lock_page+0x68/0x70
> > > filemap_nopage+0x16c/0x300
> > > __handle_mm_fault+0x1d7/0x610
> > > do_page_fault+0x1d7/0x610
> > > error_code+0x6a/0x70
> > > padzero+0x1f/0x30
> > > load_elf_binary+0x743/0x1ab0
> > > search_binary_handler+0x7b/0x1f0
> > > do_execve+0x137/0x1d0
> > > sys_execve+0x2e/0x80
> > > sysenter_past_esp+0x5f/0x90
> > >
> > > After that I could remount, sync and reboot using SysRq (well, after
> > > reboot it still insisted on replaying insane number of transactions so
> > > may be it did *not* remount / ro after all). Before reboot there was
> > > brief output that resembled lockdep warnings, but it went too fast to be
> > > readable.
> > >
> > > usual stuff follows
> >
> > I see you're using CFQ as the default IO scheduler.  Can you please switch
> > to AS and see if that changes anything?
> >
>
> I just had the same lockup on resume using AS with 2.6.23-rc5.

Hm.  Does your root partition sit on reiserfs?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation

2007-09-09 Thread Andrey Borzenkov
On Sunday 09 September 2007, Rafael J. Wysocki wrote:
> On Sunday, 9 September 2007 16:00, Andrey Borzenkov wrote:
> > On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > > On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > > > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > > > suspend to disk :)
> > > >
> > > > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs +
> > > > single pata_ali patch to switch off DMA on CD-ROM), single root on
> > > > reiserfs, libata with pata_ali driver.
> > > >
> > > > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > > > system hung at least once in every rcX. Up to rc6 those lockups were
> > > > absolutely silent (black screen without reaction to any key). In rc6
> > > > I just got something different. After resume I got on screen:
> > > >
> > > > swsusp: Marking nosave pages: 0009f000-0010
> > > > swsusp: Basic memory bitmaps created
> > > > swsusp: Basic memory bitmaps freed
> > > >
> > > > After that it just sits there doing nothing. There was brief sound of
> > > > HDD but I suspect it was related more to power-on. System was
> > > > responding to power-on button press:
> > > >
> > > > ACPI Error (event-0305): No installed handler for fixed event
> > > > [0002 20070125]
> > > >
> > > > And SysRq was functioning.
> > >
> > > That probably means that there's a deadlock somewhere in there.
> > >
> > > > Unfortunately I do not have serial console so I
> > > > copy manually stacks from several last screens of output; I have
> > > > tried to make a photo but right now my kbluetooth is refusing to work
> > > > at all so I cannot transfer them :( (but I suspect quality would be
> > > > too bad anyway)
> > > >
> > > > laptop_mode D
> > > > io_schedule+0xe/0x20
> > >
> > > Looks suspicious to me.  Can you identify what line of code this points
> > > to?
> > >
> > > > sync_buffer+0x35/0x40
> > > > __wait_on_bit+0x45/0x70
> > > > out_of_line_wait_on_bit+0x6c/0x80
> > > > __wait_on_buffer+0x27/0x30
> > > > search_by_key+0x15e/0x1250 [reiserfs]
> > > > reiserfs_read_locked_inode+0x64/0x570 [reiserfs]
> > > > reiserfs_iget+0x7e/0xa0 [reiserfs]
> > > > reiserfs_lookup+0xc7/0x120 [reiserfs]
> > > > do_lookup+0x138/0x180
> > > > __link_path_walk+0x787/0xce0
> > > > link_path_walk+0x44/0xc0
> > > > path_walk+0x18/0x20
> > > > do_path_lookup+0x88/0x210
> > > > __path_lookup_intent_open+0x4d/0x90
> > > > path_lookup_open+0x1f/0x30
> > > > open_exec+0x28/0xb0
> > > > do_execve+0x36/0x1d0
> > > > sys_execve+0x2e/0x80
> > > > sysenter_past_esp+0x5f/0x99
> > > >
> > > > 90clock D
> > > > __mutex_lock_slow_path+0xa1/0x290
> > > > mutex_lock+0x21/0x30
> > > > do_lookup+0xa1/0x180
> > > > __link_path_walk+0x44/0xc0
> > > > path_walk+0x18/0x20
> > > > do_path_lookup+0x78/0x210
> > > > __user_walk_fd+0x38/0x50
> > > > vfs_stat_fd+0x21/0x50
> > > > vfs_stat+0x11/0x20
> > > > sys_stat64+0x14/0x30
> > > > sysenter_past_esp+0x5f/0x99
> > > >
> > > > alsactl D
> > > > io_schedule+0xe/0x20
> > >
> > > Same here.  Hmm.
> > >
> > > > sync_page+0x35/0x40
> > > > __wait_on_bit_lock+0x3f/0x70
> > > > __lock_page+0x68/0x70
> > > > filemap_nopage+0x16c/0x300
> > > > __handle_mm_fault+0x1d7/0x610
> > > > do_page_fault+0x1d7/0x610
> > > > error_code+0x6a/0x70
> > > > padzero+0x1f/0x30
> > > > load_elf_binary+0x743/0x1ab0
> > > > search_binary_handler+0x7b/0x1f0
> > > > do_execve+0x137/0x1d0
> > > > sys_execve+0x2e/0x80
> > > > sysenter_past_esp+0x5f/0x90
> > > >
> > > > After that I could remount, sync and reboot using SysRq (well, after
> > > > reboot it still insisted on replaying insane number of transactions
> > > > so may be it did *not* remount / ro after all). Before reboot there
> > > > was brief output that resembled lockdep warnings, but it went too
> > > > fast to be readable.
> > > >
> > > > usual stuff follows
> > >
> > > I see you're using CFQ as the default IO scheduler.  Can you please
> > > switch to AS and see if that changes anything?
> >
> > I just had the same lockup on resume using AS with 2.6.23-rc5.
>
> Hm.  Does your root partition sit on reiserfs?

yes - single root on reiserfs




Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation

2007-09-02 Thread Kasper Sandberg
Sorry for top posting, but this is MAYBE a related matter, i am not
sure.

the thing is, i am running with libata and reiserfs on a raid5 with 6
disks, and after i changed to libata it has worked excellently (before
it used to give DMA errors and then go boom).

however now i sometimes, if there's some load on the array, see that the
hdd leds go fully on, and for ~10sec to 1 min, all IO just stops
completely, and after that time, it resumes and works perfectly.

any ideas? and i also bring this up as it may give a clue as to what
causes this. - I also use CFQ

On Sun, 2007-09-02 at 15:29 +0400, Andrey Borzenkov wrote:
> On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > On Saturday, 30 June 2007 23:34, Andrey Borzenkov wrote:
> > > On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > > > On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > > > > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > > > > suspend to disk :)
> > > > >
> > > > > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs +
> > > > > single pata_ali patch to switch off DMA on CD-ROM), single root on
> > > > > reiserfs, libata with pata_ali driver.
> > > > >
> > > > > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > > > > system hung at least once in every rcX. Up to rc6 those lockups were
> > > > > absolutely silent (black screen without reaction to any key). In rc6
> > > > > I just got something different. After resume I got on screen:
> > > > >
> > > > > swsusp: Marking nosave pages: 0009f000-0010
> > > > > swsusp: Basic memory bitmaps created
> > > > > swsusp: Basic memory bitmaps freed
> > > > >
> > > > > After that it just sits there doing nothing. There was brief sound of
> > > > > HDD but I suspect it was related more to power-on. System was
> > > > > responding to power-on button press:
> > > > >
> > > > > ACPI Error (event-0305): No installed handler for fixed event
> > > > > [0002 20070125]
> > > > >
> > > > > And SysRq was functioning.
> > > >
> > > > That probably means that there's a deadlock somewhere in there.
> > > >
> > > > > Unfortunately I do not have serial console so I
> > > > > copy manually stacks from several last screens of output; I have
> > > > > tried to make a photo but right now my kbluetooth is refusing to work
> > > > > at all so I cannot transfer them :( (but I suspect quality would be
> > > > > too bad anyway)
> > > > >
> > > > > laptop_mode D
> > > > >   io_schedule+0xe/0x20
> > > >
> > > > Looks suspicious to me.  Can you identify what line of code this points
> > > > to?
> > >
> > > If you could explain how to ...
> >
> > Michal has already done that. :-)
> >
> > [--snip--]
> >
> > > > I see you're using CFQ as the default IO scheduler.  Can you please
> > > > switch to AS and see if that changes anything?
> > >
> > > Sure, but given that I have no idea how to reproduce the lockup, we may
> > > never know whether it actually helped.
> >
> > Well, if the lockup never happens with AS, that will indicate something ...
> >
> 
> I thought it was gone, but it just happened again with 2.6.23-rc5. I thought I
> had been running AS, but no, I was using CFQ. Now I have definitely made AS the
> default; let's see ...



Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation

2007-09-02 Thread Andrey Borzenkov
On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> On Saturday, 30 June 2007 23:34, Andrey Borzenkov wrote:
> > On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > > On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > > > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > > > suspend to disk :)
> > > >
> > > > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs +
> > > > single pata_ali patch to switch off DMA on CD-ROM), single root on
> > > > reiserfs, libata with pata_ali driver.
> > > >
> > > > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > > > system hung at least once in every rcX. Up to rc6 those lockups were
> > > > absolutely silent (black screen without reaction to any key). In rc6
> > > > I just got something different. After resume I got on screen:
> > > >
> > > > swsusp: Marking nosave pages: 0009f000-0010
> > > > swsusp: Basic memory bitmaps created
> > > > swsusp: Basic memory bitmaps freed
> > > >
> > > > After that it just sits there doing nothing. There was brief sound of
> > > > HDD but I suspect it was related more to power-on. System was
> > > > responding to power-on button press:
> > > >
> > > > ACPI Error (event-0305): No installed handler for fixed event
> > > > [0002 20070125]
> > > >
> > > > And SysRq was functioning.
> > >
> > > That probably means that there's a deadlock somewhere in there.
> > >
> > > > Unfortunately I do not have serial console so I
> > > > copy manually stacks from several last screens of output; I have
> > > > tried to make a photo but right now my kbluetooth is refusing to work
> > > > at all so I cannot transfer them :( (but I suspect quality would be
> > > > too bad anyway)
> > > >
> > > > laptop_mode D
> > > > io_schedule+0xe/0x20
> > >
> > > Looks suspicious to me.  Can you identify what line of code this points
> > > to?
> >
> > If you could explain how to ...
>
> Michal has already done that. :-)
>
> [--snip--]
>
> > > I see you're using CFQ as the default IO scheduler.  Can you please
> > > switch to AS and see if that changes anything?
> >
> > Sure, but given that I have no idea how to reproduce the lockup, we may
> > never know whether it actually helped.
>
> Well, if the lockup never happens with AS, that will indicate something ...
>

I thought it was gone, but it just happened again with 2.6.23-rc5. I thought I
had been running AS, but no, I was using CFQ. Now I have definitely made AS the
default; let's see ...
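
Since it is easy to be mistaken about which elevator is actually in use, the
active one can be read back from sysfs, where the kernel lists the available
schedulers and marks the current one with square brackets. A minimal Python
sketch of that check; the device name "sda" and both helper names are
assumptions for illustration, not something taken from this thread:

```python
def active_scheduler(sysfs_text):
    """Return the scheduler name shown in [brackets], or None if none is marked."""
    for word in sysfs_text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def read_active_scheduler(device="sda"):
    """Read /sys/block/<device>/queue/scheduler; 'sda' is only a placeholder."""
    path = "/sys/block/%s/queue/scheduler" % device
    with open(path) as handle:
        return active_scheduler(handle.read())


if __name__ == "__main__":
    # Typical file content on a kernel of this era, with CFQ selected.
    print(active_scheduler("noop anticipatory deadline [cfq]"))
```

Running the same check right after boot and again after a resume would show
whether the scheduler under test was really the one in effect when a lockup
happened.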




Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation

2007-07-15 Thread Andrey Borzenkov
On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> On Saturday, 30 June 2007 23:34, Andrey Borzenkov wrote:
> > On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > > On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > > > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > > > suspend to disk :)
> > > >
> > > > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs +
> > > > single pata_ali patch to switch off DMA on CD-ROM), single root on
> > > > reiserfs, libata with pata_ali driver.
> > > >
> > > > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > > > system hung at least once in every rcX. Up to rc6 those lockups were
> > > > absolutely silent (black screen without reaction to any key). In rc6
> > > > I just got something different. After resume I got on screen:
> > > >
> > > > swsusp: Marking nosave pages: 0009f000-0010
> > > > swsusp: Basic memory bitmaps created
> > > > swsusp: Basic memory bitmaps freed
> > > >
> > > > After that it just sits there doing nothing. There was brief sound of
> > > > HDD but I suspect it was related more to power-on. System was
> > > > responding to power-on button press:
> > > >
> > > > ACPI Error (event-0305): No installed handler for fixed event
> > > > [0002 20070125]
> > > >
> > > > And SysRq was functioning.
> > >
> > > That probably means that there's a deadlock somewhere in there.
> > >
> > > > Unfortunately I do not have serial console so I
> > > > copy manually stacks from several last screens of output; I have
> > > > tried to make a photo but right now my kbluetooth is refusing to work
> > > > at all so I cannot transfer them :( (but I suspect quality would be
> > > > too bad anyway)
> > > >
> > > > laptop_mode D
> > > > io_schedule+0xe/0x20
> > >
> > > Looks suspicious to me.  Can you identify what line of code this points
> > > to?
> >
> > If you could explain how to ...
>
> Michal has already done that. :-)
>

(gdb) l *io_schedule+0xe
0xc02aa84e is in io_schedule (include2/asm/atomic.h:110).
105  *
106  * Atomically decrements @v by 1.
107  */
108 static __inline__ void atomic_dec(atomic_t *v)
109 {
110 __asm__ __volatile__(
111 LOCK_PREFIX "decl %0"
112 :"+m" (v->counter));
113 }
114
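
The same lookup can be scripted for the remaining frames of a hand-copied
trace. A minimal Python sketch, assuming each frame is written as
symbol+0xoffset/0xsize with an optional [module] suffix; the helper names
parse_frame and gdb_commands are made up for illustration. Frames inside a
module such as reiserfs would additionally need gdb to be given the module's
object file, since vmlinux alone does not contain those symbols:

```python
import re

# One frame looks like "search_by_key+0x15e/0x1250 [reiserfs]":
# symbol, offset into the function, function size, optional module name.
FRAME_RE = re.compile(
    r"^\s*(?P<symbol>[A-Za-z_][\w.]*)\+(?P<offset>0x[0-9a-fA-F]+)"
    r"/(?P<size>0x[0-9a-fA-F]+)(?:\s+\[(?P<module>\w+)\])?\s*$"
)


def parse_frame(line):
    """Return (symbol, offset, size, module), or None if the line is not a frame."""
    match = FRAME_RE.match(line)
    if match is None:
        return None
    return (
        match.group("symbol"),
        int(match.group("offset"), 16),
        int(match.group("size"), 16),
        match.group("module"),
    )


def gdb_commands(trace_lines):
    """Build one gdb 'list' command per line that parses as a frame."""
    commands = []
    for line in trace_lines:
        frame = parse_frame(line)
        if frame is None:
            continue  # task names, blank lines, mistyped frames
        symbol, offset, _size, _module = frame
        commands.append("list *%s+%#x" % (symbol, offset))
    return commands


if __name__ == "__main__":
    trace = ["syslogd", " io_schedule+0xe/0x20", " sync_buffer+0x35/0x40"]
    for command in gdb_commands(trace):
        print(command)
```

Lines that do not parse are skipped silently, so mistyped frames in a
hand-copied trace should be checked by eye before trusting the result.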

> [--snip--]
>
> > > I see you're using CFQ as the default IO scheduler.  Can you please
> > > switch to AS and see if that changes anything?
> >
> > Sure, but given that I have no idea how to reproduce the lockup, we may
> > never know whether it actually helped.
>
> Well, if the lockup never happens with AS, that will indicate something ...
>

Well, I was about to say that it was probably gone when it hit again. Now with
IDE, so we can at least rule out libata. But reiserfs is still in the path, so I
Cc the list. It is stock 2.6.22. I will switch to AS, but it took 2 weeks to
happen with CFQ, and in one week I will be off for a couple of weeks.

Here is hand-copied information about locks and blocked processes.

Showing all locks held in the system:

1 lock held by syslogd/2515
 #0: (&inode->i_mutex){--..}, at: [<c02ab4d15>] mutex_lock+0x21/0x30
1 lock held by X/3800:
 #0: (&mm->mmap_sem){}, at: [<c02ae2e8>] do_page_fault+0x1e8/0x5e0
6 times mingetty - 1 lock held by mingetty/3838:
 #0: (&tty->atomic_read_lock){--..}, at: [<c02ab0915>] mutex_lock_interruptible+0x21/0x30
2 times zsh - 1 lock held by zsh/4276:
 #0: (&tty->atomic_read_lock){--..}, at: [<c02ab0915>] mutex_lock_interruptible+0x21/0x30
1 lock held by consolehelper-g/21231:
 #0: (&inode->i_mutex){--..}, at: [<c02ab4b1>] mutex_lock+0x21/0x30
1 lock held by kio_http/31282:
 #0: (&inode->i_mutex){--..}, at: [<c02ab4b1>] mutex_lock+0x21/0x30

and list of blocked tasks:

syslogd
 io_schedule+0xe/0x20
 sync_buffer+0x35/0x40
 __wait_on_bit+0x45/0x70
 out_of_line_wait_on_bit+0x50/0x60
 __wait_on_buffer+0x27/0x30
 flush_commit_list+0x397/0x610 [reiserfs]
 do_journal_end+0xadc/0xc90 [reiserfs]
 journal_end_sync+0x5d/0x70 [reiserfs]
 reiserfs_commit_for_inode+0x17e/0x1a0 [reiserfs]
 reiserfs_sync_file+0x2d/0x70 [reiserfs]
 do_fsync+0x28/0x40
 sys_fsync+0xd/0x10
 sysenter_past_esp+0x5f/0x99

X
 io_schedule+0xe/0x20
 sync_page+0x3a/0x50
 __wait_on_bit_lock+0x3f/0x70
 __lock_page+0x4c/0x60
 __handle_mm_fault+0x657/0x860
 do_page_fault+0x2f4/0x5e0
 error_code+0x6a/0x70

consolehelper
 io_schedule+0xe/0x20
 sync_buffer+0x35/0x40
 __wait_on_bit+0x45/0x70
 out_of_line_wait_on_bit+0x50/0x60
 __wait_on_buffer+0x27/0x30
 search_by_key+0x17e/0x1370 [reiserfs]
 search_by_entry_key+0x1c/0x2a0 [reiserfs]
 reiserfs_find_entry+0x7d/0x3a0 [reiserfs]
 reiserfs_lookup+0x75/0x120 [reiserfs]
 do_lookup+0x133/0x180
 __link_path_walk+0x765/0xd10
 link_path_walk+0x44/0xc0
 path_walk+0x18/0x20
 do_path_lookup+0x7c/0x200
 __path_lookup_intent_open+0x1f/0x30
 open_namei+0x66/0x670
 do_filp_open+0x2c/0x50
 do_sys_open+0x47/0xd0
 sys_open+0x1c/0x20
 syscall_call+0x7/0xb

kio_http
 io_schedule+0xe/0x20
 sync_buffer+0x35/0x40
 __wait_on_bit+0x45/0x70
 out_of_line_wait_on_bit+0x50/0x60
 __wait_on_buffer+0x27/0x30
 search_by_key+0x17e/0x1370 [reiserfs]
 reiserfs_read_locked_inode+0x63/0x570

Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation

2007-07-01 Thread Pavel Machek
Hi!

> > ACPI Error (event-0305): No installed handler for fixed event [0002 
> > 20070125]
> > 
> > And SysRq was functioning.
> 
> That probably means that there's a deadlock somewhere in there.
> 
> > Unfortunately I do not have serial console so I  
> > copy manually stacks from several last screens of output; I have tried to 
> > make a photo but right now my kbluetooth is refusing to work at all so I 
> > cannot transfer them :( (but I suspect quality would be too bad anyway)
> > 
> > laptop_mode D
> > io_schedule+0xe/0x20
> 
> Looks suspicious to me.  Can you identify what line of code this points to?

Actually, I see laptop_mode being locked. laptop_mode does disk
spindowns, which is somewhat unusual. Does it happen w/o laptop mode?
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation

2007-07-01 Thread Rafael J. Wysocki
On Saturday, 30 June 2007 23:34, Andrey Borzenkov wrote:
> On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> > On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > > suspend to disk :)
> > >
> > > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs + single
> > > pata_ali patch to switch off DMA on CD-ROM), single root on reiserfs,
> > > libata with pata_ali driver.
> > >
> > > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > > system hung at least once in every rcX. Up to rc6 those lockups were
> > > absolutely silent (black screen without reaction to any key). In rc6 I
> > > just got something different. After resume I got on screen:
> > >
> > > swsusp: Marking nosave pages: 0009f000-0010
> > > swsusp: Basic memory bitmaps created
> > > swsusp: Basic memory bitmaps freed
> > >
> > > After that it just sits there doing nothing. There was brief sound of HDD
> > > but I suspect it was related more to power-on. System was responding to
> > > power-on button press:
> > >
> > > ACPI Error (event-0305): No installed handler for fixed event [0002
> > > 20070125]
> > >
> > > And SysRq was functioning.
> >
> > That probably means that there's a deadlock somewhere in there.
> >
> > > Unfortunately I do not have serial console so I
> > > copy manually stacks from several last screens of output; I have tried to
> > > make a photo but right now my kbluetooth is refusing to work at all so I
> > > cannot transfer them :( (but I suspect quality would be too bad anyway)
> > >
> > > laptop_mode D
> > >   io_schedule+0xe/0x20
> >
> > Looks suspicious to me.  Can you identify what line of code this points to?
> >
> 
> If you could explain how to ... 

Michal has already done that. :-)

[--snip--]
> >
> > I see you're using CFQ as the default IO scheduler.  Can you please switch
> > to AS and see if that changes anything?
> >
> 
> Sure, but given that I have no idea how to reproduce the lockup, we may never 
> know whether it actually helped.

Well, if the lockup never happens with AS, that will indicate something ...
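
For anyone repeating this test: on 2.6-series kernels the I/O scheduler
of a disk can normally be inspected through
/sys/block/<dev>/queue/scheduler, where the active scheduler is shown in
square brackets, and changed at run time by writing a scheduler name
(for example "anticipatory") into that file. The Python sketch below
only parses such a line; the sample string and the function names are
assumptions made for illustration, not output captured from the
reporter's machine.

```python
import re


def active_scheduler(sysfs_text):
    """Return the scheduler name shown in [brackets], or None if none is marked."""
    match = re.search(r"\[([^\]]+)\]", sysfs_text)
    return match.group(1) if match else None


def available_schedulers(sysfs_text):
    """Return every scheduler name listed, with the brackets removed."""
    return [name.strip("[]") for name in sysfs_text.split()]


if __name__ == "__main__":
    # Assumed sample of what the sysfs file may contain on a CFQ system.
    sample = "noop anticipatory deadline [cfq]"
    print("active:", active_scheduler(sample))
    print("available:", available_schedulers(sample))
```

If "anticipatory" is missing from the list, AS was presumably not
built into that kernel, so the comparison would need a rebuild first.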

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation

2007-06-30 Thread Michal Piotrowski
Andrey Borzenkov wrote:
> On Sunday 01 July 2007, Rafael J. Wysocki wrote:
>> On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
>>> Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
>>> suspend to disk :)
>>>
>>> Environment - vanilla kernel (2.6.22-rc6 currently + squashfs + single
>>> pata_ali patch to switch off DMA on CD-ROM), single root on reiserfs,
>>> libata with pata_ali driver.
>>>
>>> Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
>>> system hung at least once in every rcX. Up to rc6 those lockups were
>>> absolutely silent (black screen without reaction to any key). In rc6 I
>>> just got something different. After resume I got on screen:
>>>
>>> swsusp: Marking nosave pages: 0009f000-0010
>>> swsusp: Basic memory bitmaps created
>>> swsusp: Basic memory bitmaps freed
>>>
>>> After that it just sits there doing nothing. There was brief sound of HDD
>>> but I suspect it was related more to power-on. System was responding to
>>> power-on button press:
>>>
>>> ACPI Error (event-0305): No installed handler for fixed event [0002
>>> 20070125]
>>>
>>> And SysRq was functioning.
>> That probably means that there's a deadlock somewhere in there.
>>
>>> Unfortunately I do not have serial console so I
>>> copy manually stacks from several last screens of output; I have tried to
>>> make a photo but right now my kbluetooth is refusing to work at all so I
>>> cannot transfer them :( (but I suspect quality would be too bad anyway)
>>>
>>> laptop_mode D
>>> io_schedule+0xe/0x20
>> Looks suspicious to me.  Can you identify what line of code this points to?
>>
> 
> If you could explain how to ...

gdb vmlinux

(gdb) l *io_schedule+0xe
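
As background for the command above: in a trace entry such as
io_schedule+0xe/0x20, the first number is the offset into the function
and the second is the function's total size, both in hex. A small
Python sketch that splits such an entry and builds the matching gdb
command follows; the helper names are hypothetical. Frames tagged with
a module name such as [reiserfs] live in that module, so vmlinux alone
will probably not resolve them.

```python
import re

# symbol+0xOFFSET/0xSIZE, optionally followed by " [module]"
FRAME_RE = re.compile(
    r"^(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+(?P<offset>0x[0-9a-fA-F]+)"
    r"/(?P<size>0x[0-9a-fA-F]+)"
    r"(?:\s+\[(?P<module>[^\]]+)\])?$"
)


def parse_frame(line):
    """Split one trace line into symbol, offset, size and optional module."""
    match = FRAME_RE.match(line.strip())
    if match is None:
        raise ValueError("not a symbol+offset/size frame: %r" % line)
    return {
        "symbol": match.group("symbol"),
        "offset": int(match.group("offset"), 16),
        "size": int(match.group("size"), 16),
        "module": match.group("module"),  # None for built-in code
    }


def gdb_list_command(frame):
    """Build the gdb 'list' command for the address inside the function."""
    return "l *%s+%#x" % (frame["symbol"], frame["offset"])


if __name__ == "__main__":
    frame = parse_frame("io_schedule+0xe/0x20")
    print(frame)
    print(gdb_list_command(frame))
```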

Regards,
Michal

-- 
LOG
http://www.stardust.webpages.pl/log/


Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation

2007-06-30 Thread Andrey Borzenkov
On Sunday 01 July 2007, Rafael J. Wysocki wrote:
> On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> > Since 2.6.18 I do not have suspend to RAM; now I am starting to lose
> > suspend to disk :)
> >
> > Environment - vanilla kernel (2.6.22-rc6 currently + squashfs + single
> > pata_ali patch to switch off DMA on CD-ROM), single root on reiserfs,
> > libata with pata_ali driver.
> >
> > Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc
> > system hung at least once in every rcX. Up to rc6 those lockups were
> > absolutely silent (black screen without reaction to any key). In rc6 I
> > just got something different. After resume I got on screen:
> >
> > swsusp: Marking nosave pages: 0009f000-0010
> > swsusp: Basic memory bitmaps created
> > swsusp: Basic memory bitmaps freed
> >
> > After that it just sits there doing nothing. There was brief sound of HDD
> > but I suspect it was related more to power-on. System was responding to
> > power-on button press:
> >
> > ACPI Error (event-0305): No installed handler for fixed event [0002
> > 20070125]
> >
> > And SysRq was functioning.
>
> That probably means that there's a deadlock somewhere in there.
>
> > Unfortunately I do not have serial console so I
> > copy manually stacks from several last screens of output; I have tried to
> > make a photo but right now my kbluetooth is refusing to work at all so I
> > cannot transfer them :( (but I suspect quality would be too bad anyway)
> >
> > laptop_mode D
> > io_schedule+0xe/0x20
>
> Looks suspicious to me.  Can you identify what line of code this points to?
>

If you could explain how to ... (I never understood what those two numbers 
mean :) ) Here is the disassembled function:

 4168   .section .sched.text
 4169   .p2align 4,,15
 4170   .globl io_schedule
 4171   .type io_schedule,@function
 4172   io_schedule:
 4173 0cd0 55   pushl %ebp
 4174 0cd1 89E5 movl %esp,%ebp
 4175
 4176 0cd3 FF05140A incl per_cpu__runqueues+2388
 4176  
 4177
 4178 0cd9 E8FC call schedule
 4178  FF
 4179
 4180 0cde FF0D140A decl per_cpu__runqueues+2388
 4180  
 4181
 4182 0ce4 5D   popl %ebp
 4183 0ce5 C3   ret
 4184   .size io_schedule,.-io_schedule
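
The listing can be read against the trace entry io_schedule+0xe/0x20.
The function starts at 0x0cd0 in the listing, so offset 0xe is 0x0cde,
the instruction right after "call schedule". That is the return
address, which suggests the task is asleep inside schedule(). The size
0x20 fits the 16-byte alignment requested by .p2align, assuming the
next symbol starts at the next aligned address. A Python sketch of the
arithmetic, with addresses copied from the listing and hypothetical
helper names:

```python
# (address, instruction) pairs copied from the listing above.
LISTING = [
    (0x0CD0, "pushl %ebp"),
    (0x0CD1, "movl %esp,%ebp"),
    (0x0CD3, "incl per_cpu__runqueues+2388"),
    (0x0CD9, "call schedule"),
    (0x0CDE, "decl per_cpu__runqueues+2388"),
    (0x0CE4, "popl %ebp"),
    (0x0CE5, "ret"),
]


def instruction_at(listing, base, offset):
    """Return the last instruction starting at or before base+offset."""
    target = base + offset
    found = None
    for addr, text in listing:
        if addr <= target:
            found = (addr, text)
    return found


def previous_instruction(listing, addr):
    """Return the instruction that starts immediately before addr."""
    earlier = [entry for entry in listing if entry[0] < addr]
    return earlier[-1] if earlier else None


def padded_size(start, end, alignment):
    """Distance from start to the first aligned address at or after end."""
    next_start = (end + alignment - 1) & ~(alignment - 1)
    return next_start - start


if __name__ == "__main__":
    base = LISTING[0][0]
    addr, text = instruction_at(LISTING, base, 0xE)
    print("offset 0xe -> %#06x %s" % (addr, text))
    print("preceded by:", previous_instruction(LISTING, addr)[1])
    # 'ret' at 0x0ce5 is one byte long, so the code ends at 0x0ce6.
    print("padded size: %#x" % padded_size(base, 0x0CE6, 16))
```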


> > sync_buffer+0x35/0x40
> > __wait_on_bit+0x45/0x70
> > out_of_line_wait_on_bit+0x6c/0x80
> > __wait_on_buffer+0x27/0x30
> > search_by_key+0x15e/0x1250 [reiserfs]
> > reiserfs_read_locked_inode+0x64/0x570 [reiserfs]
> > reiserfs_iget+0x7e/0xa0 [reiserfs]
> > reiserfs_lookup+0xc7/0x120 [reiserfs]
> > do_lookup+0x138/0x180
> > __link_path_walk+0x787/0xce0
> > link_path_walk+0x44/0xc0
> > path_walk+0x18/0x20
> > do_path_lookup+0x88/0x210
> > __path_lookup_intent_open+0x4d/0x90
> > path_lookup_open+0x1f/0x30
> > open_exec+0x28/0xb0
> > do_execve+0x36/0x1d0
> > sys_execve+0x2e/0x80
> > sysenter_past_esp+0x5f/0x99
> >
> > 90clock D
> > __mutex_lock_slow_path+0xa1/0x290
> > mutex_lock+0x21/0x30
> > do_lookup+0xa1/0x180
> > __link_path_walk+0x44/0xc0
> > path_walk+0x18/0x20
> > do_path_lookup+0x78/0x210
> > __user_walk_fd+0x38/0x50
> > vfs_stat_fd+0x21/0x50
> > vfs_stat+0x11/0x20
> > sys_stat64+0x14/0x30
> > sysenter_past_esp+0x5f/0x99
> >
> > alsactl D
> > io_schedule+0xe/0x20
>
> Same here.  Hmm.
>
> > sync_page+0x35/0x40
> > __wait_on_bit_lock+0x3f/0x70
> > __lock_page+0x68/0x70
> > filemap_nopage+0x16c/0x300
> > __handle_mm_fault+0x1d7/0x610
> > do_page_fault+0x1d7/0x610
> > error_code+0x6a/0x70
> > padzero+0x1f/0x30
> > load_elf_binary+0x743/0x1ab0
> > search_binary_handler+0x7b/0x1f0
> > do_execve+0x137/0x1d0
> > sys_execve+0x2e/0x80
> > sysenter_past_esp+0x5f/0x90
> >
> > After that I could remount, sync and reboot using SysRq (well, after
> > reboot it still insisted on replaying insane number of transactions so
> > maybe it did *not* remount / ro after all). Before reboot there was
> > brief output that resembled lockdep warnings, but it went too fast to be
> > readable.
> >
> > usual stuff follows
>
> I see you're using CFQ as the default IO scheduler.  Can you please switch
> to AS and see if that changes anything?
>

Sure, but given that I have no idea how to reproduce the lockup, we may never 
know whether it actually helped.




Re: [possible regression] 2.6.22 reiserfs/libata sporadically hangs on resume from hibernation

2007-06-30 Thread Rafael J. Wysocki
On Saturday, 30 June 2007 06:59, Andrey Borzenkov wrote:
> Since 2.6.18 I do not have suspend to RAM; now I am starting to lose suspend 
> to disk :)
> 
> Environment - vanilla kernel (2.6.22-rc6 currently + squashfs + single 
> pata_ali patch to switch off DMA on CD-ROM), single root on reiserfs, libata 
> with pata_ali driver.
> 
> Until 2.6.22-rc I never had problems with hibernation. With 2.6.22-rc system 
> hung at least once in every rcX. Up to rc6 those lockups were absolutely 
> silent (black screen without reaction to any key). In rc6 I just got 
> something different. After resume I got on screen:
> 
> swsusp: Marking nosave pages: 0009f000-0010
> swsusp: Basic memory bitmaps created
> swsusp: Basic memory bitmaps freed
> 
> After that it just sits there doing nothing. There was brief sound of HDD but I
> suspect it was related more to power-on. System was responding to power-on 
> button press:
> 
> ACPI Error (event-0305): No installed handler for fixed event [0002 
> 20070125]
> 
> And SysRq was functioning.

That probably means that there's a deadlock somewhere in there.

> Unfortunately I do not have serial console so I  
> copy manually stacks from several last screens of output; I have tried to 
> make a photo but right now my kbluetooth is refusing to work at all so I 
> cannot transfer them :( (but I suspect quality would be too bad anyway)
> 
> laptop_mode D
>   io_schedule+0xe/0x20

Looks suspicious to me.  Can you identify what line of code this points to?

>   sync_buffer+0x35/0x40
>   __wait_on_bit+0x45/0x70
>   out_of_line_wait_on_bit+0x6c/0x80
>   __wait_on_buffer+0x27/0x30
>   search_by_key+0x15e/0x1250 [reiserfs]
>   reiserfs_read_locked_inode+0x64/0x570 [reiserfs]
>   reiserfs_iget+0x7e/0xa0 [reiserfs]
>   reiserfs_lookup+0xc7/0x120 [reiserfs]
>   do_lookup+0x138/0x180
>   __link_path_walk+0x787/0xce0
>   link_path_walk+0x44/0xc0
>   path_walk+0x18/0x20
>   do_path_lookup+0x88/0x210
>   __path_lookup_intent_open+0x4d/0x90
>   path_lookup_open+0x1f/0x30
>   open_exec+0x28/0xb0
>   do_execve+0x36/0x1d0
>   sys_execve+0x2e/0x80
>   sysenter_past_esp+0x5f/0x99
> 
> 90clock D
>   __mutex_lock_slow_path+0xa1/0x290
>   mutex_lock+0x21/0x30
>   do_lookup+0xa1/0x180
>   __link_path_walk+0x44/0xc0
>   path_walk+0x18/0x20
>   do_path_lookup+0x78/0x210
>   __user_walk_fd+0x38/0x50
>   vfs_stat_fd+0x21/0x50
>   vfs_stat+0x11/0x20
>   sys_stat64+0x14/0x30
>   sysenter_past_esp+0x5f/0x99
> 
> alsactl D
>   io_schedule+0xe/0x20

Same here.  Hmm.

>   sync_page+0x35/0x40
>   __wait_on_bit_lock+0x3f/0x70
>   __lock_page+0x68/0x70
>   filemap_nopage+0x16c/0x300
>   __handle_mm_fault+0x1d7/0x610
>   do_page_fault+0x1d7/0x610
>   error_code+0x6a/0x70
>   padzero+0x1f/0x30
>   load_elf_binary+0x743/0x1ab0
>   search_binary_handler+0x7b/0x1f0
>   do_execve+0x137/0x1d0
>   sys_execve+0x2e/0x80
>   sysenter_past_esp+0x5f/0x90
> 
> After that I could remount, sync and reboot using SysRq (well, after reboot it
> still insisted on replaying insane number of transactions so maybe it did
> *not* remount / ro after all). Before reboot there was brief output that
> resembled lockdep warnings, but it went too fast to be readable.
> 
> usual stuff follows
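
To compare the three blocked tasks at a glance, the hand-copied traces
can be reduced to their innermost frame. The Python sketch below does
this for an abbreviated copy of the traces quoted above (only the
first two frames of each task are kept; the function names are
hypothetical). It shows laptop_mode and alsactl both sleeping in
io_schedule, while 90clock waits on a mutex.

```python
from collections import defaultdict

# Abbreviated from the quoted report: task header plus its first two frames.
TRACE = """\
laptop_mode D
  io_schedule+0xe/0x20
  sync_buffer+0x35/0x40
90clock D
  __mutex_lock_slow_path+0xa1/0x290
  mutex_lock+0x21/0x30
alsactl D
  io_schedule+0xe/0x20
  sync_page+0x35/0x40
"""


def innermost_frames(text):
    """Map each 'name D' task header to the symbol of its first frame."""
    tasks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(" D"):
            current = line[:-2].strip()
            tasks[current] = None
        elif current is not None and tasks[current] is None:
            tasks[current] = line.split("+", 1)[0]
    return tasks


def group_by_frame(tasks):
    """Group task names by the symbol they are blocked in."""
    groups = defaultdict(list)
    for task, frame in tasks.items():
        groups[frame].append(task)
    return dict(groups)


if __name__ == "__main__":
    for frame, names in group_by_frame(innermost_frames(TRACE)).items():
        print("%-24s %s" % (frame, ", ".join(names)))
```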

I see you're using CFQ as the default IO scheduler.  Can you please switch to
AS and see if that changes anything?

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

