Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-24 Thread Alan Stern
On Tue, 23 Jan 2007, Florin Iucha wrote:

> > It would be nice to learn exactly why the keyboard stopped working.  Try
> > using the usbmon facility (instructions in Documentation/usb/usbmon.txt)
> > to see what happens when you type on the dead keyboard.  Be sure to turn
> > on CONFIG_USB_DEBUG as well.  And also check /proc/interrupts; each time
> > you hit a key the USB controller should get an interrupt.
> 
> Attached is the output from usbmon; unfortunately, this kernel did not
> have CONFIG_USB_DEBUG set.  This is kernel 2.6.20-rc5.
> 
> So, the bus sees some traffic when the keyboard is used, but gdm does
> not receive any keystrokes.

So it's possible that the USB drivers are working correctly but the 
keystrokes are getting lost somewhere in the X server.  Can you switch to 
a VT (or kill the X server entirely) and see if the keyboard works then?

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-23 Thread Florin Iucha
On Mon, Jan 15, 2007 at 10:58:29AM -0500, Alan Stern wrote:
> On Sun, 14 Jan 2007, Florin Iucha wrote:
> 
> > Jiri and Trond,
> > 
> > On Mon, Jan 15, 2007 at 01:14:09AM +0100, Jiri Kosina wrote:
> > > On Sun, 14 Jan 2007, Florin Iucha wrote:
> > > 
> > > > All the testing was done via a ssh into the workstation.  The console 
> > > > was left as booted into, with the gdm running.  The remote nfs4 
> > > > directory was mounted on "/mnt". After copying the 60+ GB and testing 
> > > > that the keyboard was still functioning, I did not reboot but stayed in 
> > > > the same kernel and pulled the latest git then started bisecting.  
> > > 
> > > Hi Florin,
> > > 
> > > thanks a lot for the testing. Just to verify - what kernel is 'the same 
> > > kernel' mentioned above? (just to isolate whether the problem is really 
> > > somewhere between 2.6.19 and 2.6.20-rc2, as you stated in previous posts, 
> > > or the situation has changed).
> > 
> > This happened with 2.6.19.  It worked last time, but I wanted to test
> > again, to make sure.  This time, it bombed, but half an hour after the 
> > transfer finished.
> > 
> > > > > After recompiling, I moved over to the workstation to reboot it, but the 
> > > > > keyboard was not functioning ;(
> > > 
> > > > So this time the hang occurred when the system was idle, not during the 
> > > transfers, right?
> > 
> > Yes it was idle.  Immediately after the transfer finished, the keyboard was
> > > still functioning.  It "hung" minutes later, after the first bisected kernel
> > was compiled and installed.
> > 
> > > > I ran "lsusb" and it displayed all the devices. "dmesg" did not show
> > > > any oops, anything for that matter.  I have unplugged the keyboard and
> > > > > run "lsusb" again, but it hung.  I ran "ls /mnt" and it hung as well.
> > > > > Stracing "lsusb" showed it hung (entered the kernel) at opening the device
> > > > > that used to be the keyboard.  Stracing "ls /mnt" showed that it
> > > > > hung at "stat(/mnt)".  Both processes were in "D" state.  "ls /root"
> > > > worked without problem, so it appears that crossing mountpoints causes
> > > > some hang in the kernel.
> > > 
> > > Could you please do alt-sysrq-t (or "echo t > /proc/sysrq-trigger" via 
> > > ssh, when your keyboard is dead) to see the calltraces of the processes 
> > > which are stuck inside kernel?
> > > 
> > > You will probably get a lot of output after the sysrq, so please either 
> > > put it somewhere on the web if possible, or just extract the interesting 
> > > processes out of it (mainly the ones which are stuck).
> > 
> > Will do.
> 
> It would be nice to learn exactly why the keyboard stopped working.  Try
> using the usbmon facility (instructions in Documentation/usb/usbmon.txt)
> to see what happens when you type on the dead keyboard.  Be sure to turn
> on CONFIG_USB_DEBUG as well.  And also check /proc/interrupts; each time
> you hit a key the USB controller should get an interrupt.

Attached is the output from usbmon; unfortunately, this kernel did not
have CONFIG_USB_DEBUG set.  This is kernel 2.6.20-rc5.

So, the bus sees some traffic when the keyboard is used, but gdm does
not receive any keystrokes.
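
For readers who want to check the "bus sees some traffic" observation against the attached trace, the lines can be tallied with a short script. This is only a minimal sketch, assuming the plain text layout described in Documentation/usb/usbmon.txt for kernels of this era (URB tag, timestamp in microseconds, event type, address word, status, length, data tag, data words). The names `UsbmonEvent`, `parse_usbmon_line` and `summarize` are hypothetical helpers, not part of usbmon, and only simple interrupt-transfer lines like the ones in this trace are handled. Data words are kept verbatim and not decoded.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class UsbmonEvent(NamedTuple):
    urb_tag: str        # kernel address of the URB; pairs S and C lines
    timestamp_us: int   # timestamp in microseconds
    event: str          # "S" submission, "C" completion, "E" submission error
    address: str        # e.g. "Ii:003:01": interrupt-in, device 3, endpoint 1
    status: int         # 0 on success; -115 is -EINPROGRESS on a submission
    length: int         # requested (S) or actual (C) length in bytes
    data: str           # data words as printed, "" if none were printed


def parse_usbmon_line(line: str) -> Optional[UsbmonEvent]:
    """Parse one text-format line; return None for truncated or other lines."""
    fields = line.split()
    if len(fields) < 6 or fields[2] not in ("S", "C", "E"):
        return None
    try:
        timestamp, status, length = int(fields[1]), int(fields[4]), int(fields[5])
    except ValueError:
        # Control transfers carry a setup packet here; this sketch skips them.
        return None
    # The data tag is "=" when data words follow, "<" or ">" when none do.
    data = " ".join(fields[7:]) if len(fields) > 7 and fields[6] == "=" else ""
    return UsbmonEvent(fields[0], timestamp, fields[2], fields[3],
                       status, length, data)


def summarize(lines: Iterable[str]) -> Dict[str, int]:
    """Count submissions, completions, and completions with nonzero status."""
    events = [e for e in map(parse_usbmon_line, lines) if e is not None]
    completions = [e for e in events if e.event == "C"]
    return {
        "submissions": sum(1 for e in events if e.event == "S"),
        "completions": len(completions),
        "failed_completions": sum(1 for e in completions if e.status != 0),
    }


# The first four lines of the attached trace, copied verbatim.
SAMPLE = [
    "81007fe91308 707039742 C Ii:003:01 0 8 = 2800 ",
    "81007fe91308 707039783 S Ii:003:01 -115 8 <",
    "81007fe91308 707103733 C Ii:003:01 0 8 =  ",
    "81007fe91308 707103749 S Ii:003:01 -115 8 <",
]
```

Calling `summarize(SAMPLE)`, or passing the lines of a saved capture, gives the counts. Completions with status 0 on the interrupt-in endpoint mean that the host controller is still delivering the keyboard's reports, so the loss would have to happen further up the stack.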

florin

-- 
Bruce Schneier expects the Spanish Inquisition.
  http://geekz.co.uk/schneierfacts/fact/163
81007fe91308 707039742 C Ii:003:01 0 8 = 2800 
81007fe91308 707039783 S Ii:003:01 -115 8 <
81007fe91308 707103733 C Ii:003:01 0 8 =  
81007fe91308 707103749 S Ii:003:01 -115 8 <
81007fe91308 707207728 C Ii:003:01 0 8 = 2800 
81007fe91308 707207744 S Ii:003:01 -115 8 <
81007fe91308 707271726 C Ii:003:01 0 8 =  
81007fe91308 707271741 S Ii:003:01 -115 8 <
81007fe91308 707375721 C Ii:003:01 0 8 = 2800 
81007fe91308 707375736 S Ii:003:01 -115 8 <
81007fe91308 707447718 C Ii:003:01 0 8 =  
81007fe91308 707447732 S Ii:003:01 -115 8 <
81007fe91308 707655708 C Ii:003:01 0 8 = 0f00 
81007fe91308 707655725 S Ii:003:01 -115 8 <
81007fe91308 707663708 C Ii:003:01 0 8 = 0f33 
81007fe91308 707663724 S Ii:003:01 -115 8 <
81007fe91308 707679708 C Ii:003:01 0 8 = 0f33 0e00
81007fe91308 707679724 S Ii:003:01 -115 8 <
81007fe91308 707719706 C Ii:003:01 0 8 = 0f33 0e0d
81007fe91308 707719721 S Ii:003:01 -115 8 <
81007fe91308 707727706 C Ii:003:01 0 8 = 0f0e 0d00
81007fe91308 707727721 S Ii:003:01 -115 8 <
81007fe91308 707743704 C Ii:003:01 0 8 = 0e0d 
81007fe91308 707743719 S Ii:003:01 -115 8 <
81007fe91308 707791703 C Ii:003:01 0 8 = 0d00 
81007fe91308 707791717 S Ii:003:01 -115 8 <
81007fe91308 707831702 C Ii:003:01 0 8 =  
81007fe91308 707831717 S Ii:003:01 -115 8 <
81007fe91308 707847701 C Ii:003:01 0 8 = 0900 
81007fe91308 707847716 S Ii:003:01 -115 8 <
81007fe91308 707879698 C Ii:003:01 0 8 = 090f 
81007fe91308 707879714 S Ii:003:01 -115 8 <
81007fe91308 707895700 C Ii:003:01 0 8 = 090f 3300
81007fe91308 707895715 S Ii:003:01 -115 8 <

Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-15 Thread Alan Stern
On Sun, 14 Jan 2007, Florin Iucha wrote:

> Jiri and Trond,
> 
> On Mon, Jan 15, 2007 at 01:14:09AM +0100, Jiri Kosina wrote:
> > On Sun, 14 Jan 2007, Florin Iucha wrote:
> > 
> > > All the testing was done via a ssh into the workstation.  The console 
> > > was left as booted into, with the gdm running.  The remote nfs4 
> > > directory was mounted on "/mnt". After copying the 60+ GB and testing 
> > > that the keyboard was still functioning, I did not reboot but stayed in 
> > > the same kernel and pulled the latest git then started bisecting.  
> > 
> > Hi Florin,
> > 
> > thanks a lot for the testing. Just to verify - what kernel is 'the same 
> > kernel' mentioned above? (just to isolate whether the problem is really 
> > somewhere between 2.6.19 and 2.6.20-rc2, as you stated in previous posts, 
> > or the situation has changed).
> 
> This happened with 2.6.19.  It worked last time, but I wanted to test
> again, to make sure.  This time, it bombed, but half an hour after the 
> transfer finished.
> 
> > > After recompiling, I moved over to the workstation to reboot it, but the 
> > > keyboard was not functioning ;(
> > 
> > So this time the hang occurred when the system was idle, not during the 
> > transfers, right?
> 
> Yes it was idle.  Immediately after the transfer finished, the keyboard was
> still functioning.  It "hung" minutes later, after the first bisected kernel
> was compiled and installed.
> 
> > > I ran "lsusb" and it displayed all the devices. "dmesg" did not show
> > > any oops, anything for that matter.  I have unplugged the keyboard and
> > > run "lsusb" again, but it hung.  I ran "ls /mnt" and it hung as well.
> > > Stracing "lsusb" showed it hung (entered the kernel) at opening the device
> > > that used to be the keyboard.  Stracing "ls /mnt" showed that it
> > > hung at "stat(/mnt)".  Both processes were in "D" state.  "ls /root"
> > > worked without problem, so it appears that crossing mountpoints causes
> > > some hang in the kernel.
> > 
> > Could you please do alt-sysrq-t (or "echo t > /proc/sysrq-trigger" via 
> > ssh, when your keyboard is dead) to see the calltraces of the processes 
> > which are stuck inside kernel?
> > 
> > You will probably get a lot of output after the sysrq, so please either 
> > put it somewhere on the web if possible, or just extract the interesting 
> > processes out of it (mainly the ones which are stuck).
> 
> Will do.

It would be nice to learn exactly why the keyboard stopped working.  Try
using the usbmon facility (instructions in Documentation/usb/usbmon.txt)
to see what happens when you type on the dead keyboard.  Be sure to turn
on CONFIG_USB_DEBUG as well.  And also check /proc/interrupts; each time
you hit a key the USB controller should get an interrupt.
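
The /proc/interrupts check can be scripted by taking one snapshot before typing and one after, then comparing the counters. The following is a minimal sketch, assuming the usual layout of that file (a header row of CPU columns, then one row per interrupt with per-CPU counts followed by a description). The helper names and the sample snapshots are hypothetical and made up for illustration; the substring "hci" is an assumption chosen to match host-controller driver names such as ohci_hcd, ehci_hcd and uhci_hcd.

```python
from typing import Dict, Tuple


def parse_interrupts(text: str) -> Dict[str, Tuple[int, str]]:
    """Map each IRQ label to (count summed over CPUs, description)."""
    lines = text.strip("\n").splitlines()
    ncpus = len(lines[0].split())          # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = []
        for field in fields[:ncpus]:       # rows such as ERR: have fewer columns
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        table[label.strip()] = (sum(counts), description)
    return table


def usb_irq_deltas(before: str, after: str, needle: str = "hci") -> Dict[str, int]:
    """Return the counter increase for every IRQ whose description has needle."""
    old, new = parse_interrupts(before), parse_interrupts(after)
    return {irq: new[irq][0] - old[irq][0]
            for irq in new
            if irq in old and needle in new[irq][1]}


# Made-up snapshots for illustration only; real input is the text of
# /proc/interrupts read before and after pressing a few keys.
BEFORE = """           CPU0       CPU1
  0:        100          0   IO-APIC-edge      timer
 21:       5000        200   IO-APIC-fasteoi   ohci_hcd:usb1
 22:         10          0   IO-APIC-fasteoi   ehci_hcd:usb2
ERR:          0
"""
AFTER = """           CPU0       CPU1
  0:        160          0   IO-APIC-edge      timer
 21:       5012        200   IO-APIC-fasteoi   ohci_hcd:usb1
 22:         10          0   IO-APIC-fasteoi   ehci_hcd:usb2
ERR:          0
"""
```

On a live system the two arguments would be the contents of /proc/interrupts read a few seconds apart. A delta of zero on the controller that owns the keyboard while keys are being pressed would point at the host controller or its interrupt routing rather than at the input layer.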

Alan Stern



Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-15 Thread Florin Iucha
On Sun, Jan 14, 2007 at 11:11:13PM -0300, Horst H. von Brand wrote:
> Florin Iucha <[EMAIL PROTECTED]> wrote:
> 
> [...]
> 
> > Based on this info, I think we can rule out any USB.  I will try
> > testing with NFS3 to see if the problem persists.  Unfortunately there
> > is no oops or anything in "dmesg".
> 
> Take a look at bz #7796, an NFS bug + fix. But my feeling is that this is
> older.

The reporter had an oops?  Luxury!  I get nothing ;)

I am testing again, this time on 2.6.20-rc5 compiled with extra debug
and I got a couple dozen of:

   "eth0: too many iterations (6) in nv_nic_irq."

in the kernel log.

florin

-- 
Bruce Schneier expects the Spanish Inquisition.
  http://geekz.co.uk/schneierfacts/fact/163


signature.asc
Description: Digital signature


Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-15 Thread Horst H. von Brand
Florin Iucha <[EMAIL PROTECTED]> wrote:

[...]

> Based on this info, I think we can rule out any USB.  I will try
> testing with NFS3 to see if the problem persists.  Unfortunately there
> is no oops or anything in "dmesg".

Take a look at bz #7796, an NFS bug + fix. But my feeling is that this is
older.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica              Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria       +56 32 2654239
Casilla 110-V, Valparaiso, Chile         Fax:  +56 32 2797513


Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-14 Thread Florin Iucha
Jiri and Trond,

On Mon, Jan 15, 2007 at 01:14:09AM +0100, Jiri Kosina wrote:
> On Sun, 14 Jan 2007, Florin Iucha wrote:
> 
> > All the testing was done via a ssh into the workstation.  The console 
> > was left as booted into, with the gdm running.  The remote nfs4 
> > directory was mounted on "/mnt". After copying the 60+ GB and testing 
> > that the keyboard was still functioning, I did not reboot but stayed in 
> > the same kernel and pulled the latest git then started bisecting.  
> 
> Hi Florin,
> 
> thanks a lot for the testing. Just to verify - what kernel is 'the same 
> kernel' mentioned above? (just to isolate whether the problem is really 
> somewhere between 2.6.19 and 2.6.20-rc2, as you stated in previous posts, 
> or the situation has changed).

This happened with 2.6.19.  It worked last time, but I wanted to test
again, to make sure.  This time, it bombed, but half an hour after the 
transfer finished.

> > After recompiling, I moved over to the workstation to reboot it, but the 
> > keyboard was not functioning ;(
> 
> So this time the hang occurred when the system was idle, not during the 
> transfers, right?

Yes it was idle.  Immediately after the transfer finished, the keyboard was
still functioning.  It "hung" minutes later, after the first bisected kernel
was compiled and installed.

> > I ran "lsusb" and it displayed all the devices. "dmesg" did not show
> > any oops, anything for that matter.  I have unplugged the keyboard and
> > run "lsusb" again, but it hung.  I ran "ls /mnt" and it hung as well.
> > Stracing "lsusb" showed it hung (entered the kernel) at opening the device
> > that used to be the keyboard.  Stracing "ls /mnt" showed that it
> > hung at "stat(/mnt)".  Both processes were in "D" state.  "ls /root"
> > worked without problem, so it appears that crossing mountpoints causes
> > some hang in the kernel.
> 
> Could you please do alt-sysrq-t (or "echo t > /proc/sysrq-trigger" via 
> ssh, when your keyboard is dead) to see the calltraces of the processes 
> which are stuck inside kernel?
> 
> You will probably get a lot of output after the sysrq, so please either 
> put it somewhere on the web if possible, or just extract the interesting 
> processes out of it (mainly the ones which are stuck).

Will do.

florin

-- 
Bruce Schneier expects the Spanish Inquisition.
  http://geekz.co.uk/schneierfacts/fact/163




Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-14 Thread Jiri Kosina
On Sun, 14 Jan 2007, Florin Iucha wrote:

> All the testing was done via a ssh into the workstation.  The console 
> was left as booted into, with the gdm running.  The remote nfs4 
> directory was mounted on "/mnt". After copying the 60+ GB and testing 
> that the keyboard was still functioning, I did not reboot but stayed in 
> the same kernel and pulled the latest git then started bisecting.  

Hi Florin,

thanks a lot for the testing. Just to verify - what kernel is 'the same 
kernel' mentioned above? (just to isolate whether the problem is really 
somewhere between 2.6.19 and 2.6.20-rc2, as you stated in previous posts, 
or the situation has changed).

> After recompiling, I moved over to the workstation to reboot it, but the 
> keyboard was not functioning ;(

So this time the hang occurred when the system was idle, not during the 
transfers, right?

> I ran "lsusb" and it displayed all the devices. "dmesg" did not show
> any oops, anything for that matter.  I have unplugged the keyboard and
> run "lsusb" again, but it hung.  I ran "ls /mnt" and it hung as well.
> Stracing "lsusb" showed it hung (entered the kernel) at opening the device
> that used to be the keyboard.  Stracing "ls /mnt" showed that it
> hung at "stat(/mnt)".  Both processes were in "D" state.  "ls /root"
> worked without problem, so it appears that crossing mountpoints causes
> some hang in the kernel.

Could you please do alt-sysrq-t (or "echo t > /proc/sysrq-trigger" via 
ssh, when your keyboard is dead) to see the calltraces of the processes 
which are stuck inside kernel?

You will probably get a lot of output after the sysrq, so please either 
put it somewhere on the web if possible, or just extract the interesting 
processes out of it (mainly the ones which are stuck).
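
To decide which processes to extract from the sysrq output, it can help to list the tasks that are currently in the "D" (uninterruptible sleep) state first. The sketch below is a minimal illustration, assuming the /proc/<pid>/stat layout documented in proc(5), where the third field is the state letter and the command name is wrapped in parentheses. The function names are hypothetical. It only identifies the stuck tasks; the kernel call traces still have to come from the sysrq-t dump.

```python
import os
from typing import List, Tuple


def parse_stat(stat_text: str) -> Tuple[int, str, str]:
    """Return (pid, command name, state letter) from /proc/<pid>/stat text."""
    # The command name may itself contain spaces or parentheses, so split
    # on the first "(" and the last ")" instead of on whitespace.
    lpar = stat_text.index("(")
    rpar = stat_text.rindex(")")
    pid = int(stat_text[:lpar])
    comm = stat_text[lpar + 1:rpar]
    state = stat_text[rpar + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root: str = "/proc") -> List[Tuple[int, str]]:
    """List (pid, command name) for every process currently in state D."""
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as stat_file:
                pid, comm, state = parse_stat(stat_file.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited or its stat file was unreadable
        if state == "D":
            stuck.append((pid, comm))
    return sorted(stuck)
```

Running `uninterruptible_tasks()` over ssh while the keyboard is dead should name the hung "lsusb" and "ls" processes, whose entries can then be picked out of the much longer sysrq-t dump in the kernel log.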

Thanks,

-- 
Jiri Kosina


Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-14 Thread Trond Myklebust
On Sun, 2007-01-14 at 17:58 -0600, Florin Iucha wrote:
> All the testing was done via a ssh into the workstation.  The console
> was left as booted into, with the gdm running.  The remote nfs4
> directory was mounted on "/mnt".
> 
> After copying the 60+ GB and testing that the keyboard was still
> functioning, I did not reboot but stayed in the same kernel and pulled
> the latest git then started bisecting.  After recompiling, I moved
> over to the workstation to reboot it, but the keyboard was not
> functioning ;(
> 
> I ran "lsusb" and it displayed all the devices. "dmesg" did not show
> any oops, anything for that matter.  I have unplugged the keyboard and
> run "lsusb" again, but it hung.  I ran "ls /mnt" and it hung as well.
> Stracing "lsusb" showed it hung (entered the kernel) at opening the device
> that used to be the keyboard.  Stracing "ls /mnt" showed that it
> hung at "stat(/mnt)".  Both processes were in "D" state.  "ls /root"
> worked without problem, so it appears that crossing mountpoints causes
> some hang in the kernel.
> 
> Based on this info, I think we can rule out any USB.  I will try
> testing with NFS3 to see if the problem persists.  Unfortunately there
> is no oops or anything in "dmesg".

Did you try an 'echo t > /proc/sysrq-trigger' in order to find out where
the stat process is hanging?

  Trond



heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-14 Thread Florin Iucha
On Sun, Jan 14, 2007 at 04:57:01PM -0600,  wrote:
> On Wed, Jan 10, 2007 at 10:54:34AM -0500, Alan Stern wrote:
> > It's still possible that this is hardware related; perhaps some component
> > just began to wear out.  If you return to an earlier kernel, does the 
> > problem go away?
> 
> As reported in my original e-mail and verified just minutes ago, the
> copy succeeds with 2.6.19 (kernel.org vanilla, compiled with the same
> config as 2.6.20-rcX).  I will begin bisecting between .19 and .20-rc1
> after re-reading Jiri's messages.

All the testing was done via a ssh into the workstation.  The console
was left as booted into, with the gdm running.  The remote nfs4
directory was mounted on "/mnt".

After copying the 60+ GB and testing that the keyboard was still
functioning, I did not reboot but stayed in the same kernel and pulled
the latest git then started bisecting.  After recompiling, I moved
over to the workstation to reboot it, but the keyboard was not
functioning ;(

I ran "lsusb" and it displayed all the devices. "dmesg" did not show
any oops, anything for that matter.  I have unplugged the keyboard and
run "lsusb" again, but it hung.  I ran "ls /mnt" and it hung as well.
Stracing "lsusb" showed it hung (entered the kernel) at opening the device
that used to be the keyboard.  Stracing "ls /mnt" showed that it
hung at "stat(/mnt)".  Both processes were in "D" state.  "ls /root"
worked without problem, so it appears that crossing mountpoints causes
some hang in the kernel.

Based on this info, I think we can rule out any USB.  I will try
testing with NFS3 to see if the problem persists.  Unfortunately there
is no oops or anything in "dmesg".

florin

-- 
Bruce Schneier expects the Spanish Inquisition.
  http://geekz.co.uk/schneierfacts/fact/163




heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-14 Thread Florin Iucha
On Sun, Jan 14, 2007 at 04:57:01PM -0600,  wrote:
 On Wed, Jan 10, 2007 at 10:54:34AM -0500, Alan Stern wrote:
  It's still possible that this is hardware related; perhaps some component
  just began to wear out.  If you return to an earlier kernel, does the 
  problem go away?
 
 As reported in my original e-mail and verified just minutes ago, the
 copy succeeds with 2.6.19 (kernel.org vanilla, compiled with the same
 config as 2.6.20-rcX).  I will begin bisecting between .19 and .20-rc1
 after re-reading Jiri's messages.

All the testing was done via a ssh into the workstation.  The console
was left as booted into, with the gdm running.  The remote nfs4
directory was mounted on /mnt.

After copying the 60+ GB and testing that the keyboard was still
functioning, I did not reboot but stayed in the same kernel and pulled
the latest git then started bisecting.  After recompiling, I moved
over to the workstation to reboot it, but the keyboard was not
functioning ;(

I ran lsusb and it displayed all the devices. dmesg did not show
any oops, anything for that matter.  I have unplugged the keyboard and
run lsusb again, but it hang.  I ran ls /mnt and it hang as well.
Stracing lsusb showed it hang (entered the kernel) at opening the device
that used to be the keyboard.  Stracing ls /mnt showed that it
hang at stat(/mnt).  Both processes were in D state.  ls /root
worked without problem, so it appears that crossing mountpoints causes
some hang in the kernel.

Based on this info, I think we can rule out any USB.  I will try
testing with NFS3 to see if the problem persists.  Unfortunately there
is no oops or anything in dmesg.

florin

-- 
Bruce Schneier expects the Spanish Inquisition.
  http://geekz.co.uk/schneierfacts/fact/163


signature.asc
Description: Digital signature


Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-14 Thread Trond Myklebust
On Sun, 2007-01-14 at 17:58 -0600, Florin Iucha wrote:
> All the testing was done via ssh into the workstation.  The console
> was left as booted, with gdm running.  The remote nfs4
> directory was mounted on /mnt.
>
> After copying the 60+ GB and testing that the keyboard was still
> functioning, I did not reboot but stayed in the same kernel and pulled
> the latest git then started bisecting.  After recompiling, I moved
> over to the workstation to reboot it, but the keyboard was not
> functioning ;(
>
> I ran lsusb and it displayed all the devices.  dmesg did not show
> any oops, or anything else for that matter.  I then unplugged the keyboard
> and ran lsusb again, but it hung.  I ran ls /mnt and it hung as well.
> Stracing lsusb showed it hung (entered the kernel) when opening the device
> that used to be the keyboard.  Stracing ls /mnt showed that it
> hung at stat(/mnt).  Both processes were in D state.  ls /root
> worked without problem, so it appears that crossing mountpoints causes
> some hang in the kernel.
>
> Based on this info, I think we can rule out USB.  I will try
> testing with NFS3 to see if the problem persists.  Unfortunately there
> is no oops or anything in dmesg.

Did you try an 'echo t > /proc/sysrq-trigger' in order to find out where
the stat process is hanging?

  Trond



Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-14 Thread Jiri Kosina
On Sun, 14 Jan 2007, Florin Iucha wrote:

> All the testing was done via ssh into the workstation.  The console
> was left as booted, with gdm running.  The remote nfs4
> directory was mounted on /mnt. After copying the 60+ GB and testing
> that the keyboard was still functioning, I did not reboot but stayed in
> the same kernel and pulled the latest git then started bisecting.

Hi Florin,

thanks a lot for the testing. Just to verify - what kernel is 'the same 
kernel' mentioned above? (just to isolate whether the problem is really 
somewhere between 2.6.19 and 2.6.20-rc2, as you stated in previous posts, 
or the situation has changed).

> After recompiling, I moved over to the workstation to reboot it, but the
> keyboard was not functioning ;(

So this time the hang occurred when the system was idle, not during the
transfers, right?

> I ran lsusb and it displayed all the devices.  dmesg did not show
> any oops, or anything else for that matter.  I then unplugged the keyboard
> and ran lsusb again, but it hung.  I ran ls /mnt and it hung as well.
> Stracing lsusb showed it hung (entered the kernel) when opening the device
> that used to be the keyboard.  Stracing ls /mnt showed that it
> hung at stat(/mnt).  Both processes were in D state.  ls /root
> worked without problem, so it appears that crossing mountpoints causes
> some hang in the kernel.

Could you please do alt-sysrq-t (or echo t > /proc/sysrq-trigger via
ssh, when your keyboard is dead) to see the calltraces of the processes
which are stuck inside the kernel?

You will probably get a lot of output after the sysrq, so please either 
put it somewhere on the web if possible, or just extract the interesting 
processes out of it (mainly the ones which are stuck).
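
A minimal sketch of the capture step described above, assuming root access over ssh and that the dmesg utility is installed.  The function names trigger_sysrq and save_kernel_log and the output file name are illustrative, not part of any existing tool.

```python
import subprocess


def trigger_sysrq(key="t", path="/proc/sysrq-trigger"):
    """Write one SysRq command character to the trigger file (needs root)."""
    if len(key) != 1:
        raise ValueError("SysRq key must be a single character")
    with open(path, "w") as f:
        f.write(key)


def save_kernel_log(out_path):
    """Dump the kernel ring buffer via dmesg into out_path; return line count."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    with open(out_path, "w") as f:
        f.write(result.stdout)
    return len(result.stdout.splitlines())

# Example use, as root on the affected machine:
#   trigger_sysrq("t")                       # ask the kernel to dump all tasks
#   print(save_kernel_log("sysrq-t.txt"))    # hypothetical output file name
```

If the task dump is larger than the kernel ring buffer, the oldest lines will be missing from the dmesg output; in that case the system log files may hold a more complete copy.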

Thanks,

-- 
Jiri Kosina


Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-14 Thread Florin Iucha
Jiri and Trond,

On Mon, Jan 15, 2007 at 01:14:09AM +0100, Jiri Kosina wrote:
> On Sun, 14 Jan 2007, Florin Iucha wrote:
>
> > All the testing was done via ssh into the workstation.  The console
> > was left as booted, with gdm running.  The remote nfs4
> > directory was mounted on /mnt. After copying the 60+ GB and testing
> > that the keyboard was still functioning, I did not reboot but stayed in
> > the same kernel and pulled the latest git then started bisecting.
>
> Hi Florin,
>
> thanks a lot for the testing. Just to verify - what kernel is 'the same
> kernel' mentioned above? (just to isolate whether the problem is really
> somewhere between 2.6.19 and 2.6.20-rc2, as you stated in previous posts,
> or the situation has changed).

This happened with 2.6.19.  It worked last time, but I wanted to test
again, to make sure.  This time, it bombed, but half an hour after the 
transfer finished.

> > After recompiling, I moved over to the workstation to reboot it, but the
> > keyboard was not functioning ;(
>
> So this time the hang occurred when the system was idle, not during the
> transfers, right?

Yes, it was idle.  Immediately after the transfer finished, the keyboard was
still functioning.  It hung minutes later, after the first bisected kernel
was compiled and installed.

> > I ran lsusb and it displayed all the devices.  dmesg did not show
> > any oops, or anything else for that matter.  I then unplugged the keyboard
> > and ran lsusb again, but it hung.  I ran ls /mnt and it hung as well.
> > Stracing lsusb showed it hung (entered the kernel) when opening the device
> > that used to be the keyboard.  Stracing ls /mnt showed that it
> > hung at stat(/mnt).  Both processes were in D state.  ls /root
> > worked without problem, so it appears that crossing mountpoints causes
> > some hang in the kernel.
>
> Could you please do alt-sysrq-t (or echo t > /proc/sysrq-trigger via
> ssh, when your keyboard is dead) to see the calltraces of the processes
> which are stuck inside the kernel?
>
> You will probably get a lot of output after the sysrq, so please either
> put it somewhere on the web if possible, or just extract the interesting
> processes out of it (mainly the ones which are stuck).

Will do.

florin

-- 
Bruce Schneier expects the Spanish Inquisition.
  http://geekz.co.uk/schneierfacts/fact/163

