Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-24 Thread Alan Stern
On Tue, 23 Jan 2007, Florin Iucha wrote:

> > It would be nice to learn exactly why the keyboard stopped working.  Try
> > using the usbmon facility (instructions in Documentation/usb/usbmon.txt)
> > to see what happens when you type on the dead keyboard.  Be sure to turn
> > on CONFIG_USB_DEBUG as well.  And also check /proc/interrupts; each time
> > you hit a key the USB controller should get an interrupt.
> 
> Attached is the output from usbmon, unfortunately this kernel did not
> have CONFIG_USB_DEBUG set.  This is kernel 2.6.20-rc5.
> 
> So, the bus sees some traffic when the keyboard is used, but gdm does
> not receive any keystrokes.

So it's possible that the USB drivers are working correctly but the 
keystrokes are getting lost somewhere in the X server.  Can you switch to 
a VT (or kill the X server entirely) and see if the keyboard works then?

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-23 Thread Florin Iucha
On Mon, Jan 15, 2007 at 10:58:29AM -0500, Alan Stern wrote:
> On Sun, 14 Jan 2007, Florin Iucha wrote:
> 
> > Jiri and Trond,
> > 
> > On Mon, Jan 15, 2007 at 01:14:09AM +0100, Jiri Kosina wrote:
> > > On Sun, 14 Jan 2007, Florin Iucha wrote:
> > > 
> > > > All the testing was done via a ssh into the workstation.  The console 
> > > > was left as booted into, with the gdm running.  The remote nfs4 
> > > > directory was mounted on "/mnt". After copying the 60+ GB and testing 
> > > > that the keyboard was still functioning, I did not reboot but stayed in 
> > > > the same kernel and pulled the latest git then started bisecting.  
> > > 
> > > Hi Florin,
> > > 
> > > thanks a lot for the testing. Just to verify - what kernel is 'the same 
> > > kernel' mentioned above? (just to isolate whether the problem is really 
> > > somewhere between 2.6.19 and 2.6.20-rc2, as you stated in previous posts, 
> > > or the situation has changed).
> > 
> > This happened with 2.6.19.  It worked last time, but I wanted to test
> > again, to make sure.  This time, it bombed, but half an hour after the 
> > transfer finished.
> > 
> > > > > After recompiling, I moved over to the workstation to reboot it, but
> > > > > the keyboard was not functioning ;(
> > > 
> > > > So this time the hang occurred when the system was idle, not during the
> > > transfers, right?
> > 
> > Yes it was idle.  Immediately after the transfer finished, the keyboard was
> > > still functioning.  It "hung" minutes later, after the first bisected kernel
> > was compiled and installed.
> > 
> > > > > I ran "lsusb" and it displayed all the devices. "dmesg" did not show
> > > > > any oops, or anything else for that matter.  I unplugged the keyboard and
> > > > > ran "lsusb" again, but it hung.  I ran "ls /mnt" and it hung as well.
> > > > > Stracing "lsusb" showed it hung (entered the kernel) at opening the
> > > > > device that used to be the keyboard.  Stracing "ls /mnt" showed that it
> > > > > hung at "stat(/mnt)".  Both processes were in "D" state.  "ls /root"
> > > > > worked without problem, so it appears that crossing mountpoints causes
> > > > > some hang in the kernel.
> > > 
> > > Could you please do alt-sysrq-t (or "echo t > /proc/sysrq-trigger" via 
> > > ssh, when your keyboard is dead) to see the calltraces of the processes 
> > > which are stuck inside kernel?
> > > 
> > > You will probably get a lot of output after the sysrq, so please either 
> > > put it somewhere on the web if possible, or just extract the interesting 
> > > processes out of it (mainly the ones which are stuck).
> > 
> > Will do.
> 
> It would be nice to learn exactly why the keyboard stopped working.  Try
> using the usbmon facility (instructions in Documentation/usb/usbmon.txt)
> to see what happens when you type on the dead keyboard.  Be sure to turn
> on CONFIG_USB_DEBUG as well.  And also check /proc/interrupts; each time
> you hit a key the USB controller should get an interrupt.

Attached is the output from usbmon, unfortunately this kernel did not
have CONFIG_USB_DEBUG set.  This is kernel 2.6.20-rc5.

So, the bus sees some traffic when the keyboard is used, but gdm does
not receive any keystrokes.

florin

-- 
Bruce Schneier expects the Spanish Inquisition.
  http://geekz.co.uk/schneierfacts/fact/163
81007fe91308 707039742 C Ii:003:01 0 8 = 2800 
81007fe91308 707039783 S Ii:003:01 -115 8 <
81007fe91308 707103733 C Ii:003:01 0 8 =  
81007fe91308 707103749 S Ii:003:01 -115 8 <
81007fe91308 707207728 C Ii:003:01 0 8 = 2800 
81007fe91308 707207744 S Ii:003:01 -115 8 <
81007fe91308 707271726 C Ii:003:01 0 8 =  
81007fe91308 707271741 S Ii:003:01 -115 8 <
81007fe91308 707375721 C Ii:003:01 0 8 = 2800 
81007fe91308 707375736 S Ii:003:01 -115 8 <
81007fe91308 707447718 C Ii:003:01 0 8 =  
81007fe91308 707447732 S Ii:003:01 -115 8 <
81007fe91308 707655708 C Ii:003:01 0 8 = 0f00 
81007fe91308 707655725 S Ii:003:01 -115 8 <
81007fe91308 707663708 C Ii:003:01 0 8 = 0f33 
81007fe91308 707663724 S Ii:003:01 -115 8 <
81007fe91308 707679708 C Ii:003:01 0 8 = 0f33 0e00
81007fe91308 707679724 S Ii:003:01 -115 8 <
81007fe91308 707719706 C Ii:003:01 0 8 = 0f33 0e0d
81007fe91308 707719721 S Ii:003:01 -115 8 <
81007fe91308 707727706 C Ii:003:01 0 8 = 0f0e 0d00
81007fe91308 707727721 S Ii:003:01 -115 8 <
81007fe91308 707743704 C Ii:003:01 0 8 = 0e0d 
81007fe91308 707743719 S Ii:003:01 -115 8 <
81007fe91308 707791703 C Ii:003:01 0 8 = 0d00 
81007fe91308 707791717 S Ii:003:01 -115 8 <
81007fe91308 707831702 C Ii:003:01 0 8 =  
81007fe91308 707831717 S Ii:003:01 -115 8 <
81007fe91308 707847701 C Ii:003:01 0 8 = 0900 
81007fe91308 707847716 S Ii:003:01 -115 8 <
81007fe91308 707879698 C Ii:
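
[Editorial sketch, not part of the original message.] The capture above can be read mechanically. The following is a minimal Python sketch, assuming the older usbmon text layout visible in the log: URB tag, timestamp in microseconds, event type (S = submission, C = callback), address word, status, length, data tag, data words. The names `UsbmonEvent` and `parse_usbmon_line` are made up for illustration. It handles only lines without a setup packet, which is the case for the interrupt-IN lines here. Note that the data words in the archived capture are shorter than the stated 8-byte length, so the sketch does not try to interpret them as complete HID reports.

```python
from typing import NamedTuple, Optional


class UsbmonEvent(NamedTuple):
    tag: str           # URB tag (kernel address of the URB)
    timestamp_us: int  # timestamp in microseconds
    event: str         # "S" submission, "C" callback, "E" submission error
    address: str       # e.g. "Ii:003:01" (interrupt, input, device 3, endpoint 1)
    status: int        # 0 on success; -115 is -EINPROGRESS on Linux
    length: int        # requested or actual transfer length in bytes
    data: bytes        # payload bytes, empty if none were logged


def parse_usbmon_line(line: str) -> Optional[UsbmonEvent]:
    """Parse one line of the old usbmon text format; None if it does not fit."""
    fields = line.split()
    if len(fields) < 6:
        return None  # blank or truncated line
    tag, ts, event, address, status, length = fields[:6]
    try:
        ts_i, status_i, length_i = int(ts), int(status), int(length)
    except ValueError:
        return None  # e.g. a control transfer with a setup packet
    data = b""
    # A data tag of "=" means data words follow; "<" and ">" mean none logged.
    if len(fields) > 7 and fields[6] == "=":
        try:
            data = bytes.fromhex("".join(fields[7:]))
        except ValueError:
            return None
    return UsbmonEvent(tag, ts_i, event, address, status_i, length_i, data)


def completed_with_data(lines):
    """Return the callbacks that completed successfully and carried data."""
    events = (parse_usbmon_line(l) for l in lines)
    return [e for e in events if e and e.event == "C" and e.status == 0 and e.data]
```

A non-empty result from `completed_with_data` on such a capture is consistent with the observation that the bus sees traffic while keys are pressed.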

Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-15 Thread Alan Stern
On Sun, 14 Jan 2007, Florin Iucha wrote:

> Jiri and Trond,
> 
> On Mon, Jan 15, 2007 at 01:14:09AM +0100, Jiri Kosina wrote:
> > On Sun, 14 Jan 2007, Florin Iucha wrote:
> > 
> > > All the testing was done via a ssh into the workstation.  The console 
> > > was left as booted into, with the gdm running.  The remote nfs4 
> > > directory was mounted on "/mnt". After copying the 60+ GB and testing 
> > > that the keyboard was still functioning, I did not reboot but stayed in 
> > > the same kernel and pulled the latest git then started bisecting.  
> > 
> > Hi Florin,
> > 
> > thanks a lot for the testing. Just to verify - what kernel is 'the same 
> > kernel' mentioned above? (just to isolate whether the problem is really 
> > somewhere between 2.6.19 and 2.6.20-rc2, as you stated in previous posts, 
> > or the situation has changed).
> 
> This happened with 2.6.19.  It worked last time, but I wanted to test
> again, to make sure.  This time, it bombed, but half an hour after the 
> transfer finished.
> 
> > > After recompiling, I moved over to the workstation to reboot it, but the 
> > > keyboard was not functioning ;(
> > 
> > > So this time the hang occurred when the system was idle, not during the
> > transfers, right?
> 
> Yes it was idle.  Immediately after the transfer finished, the keyboard was
> > still functioning.  It "hung" minutes later, after the first bisected kernel
> was compiled and installed.
> 
> > > > I ran "lsusb" and it displayed all the devices. "dmesg" did not show
> > > > any oops, or anything else for that matter.  I unplugged the keyboard and
> > > > ran "lsusb" again, but it hung.  I ran "ls /mnt" and it hung as well.
> > > > Stracing "lsusb" showed it hung (entered the kernel) at opening the device
> > > > that used to be the keyboard.  Stracing "ls /mnt" showed that it
> > > > hung at "stat(/mnt)".  Both processes were in "D" state.  "ls /root"
> > > > worked without problem, so it appears that crossing mountpoints causes
> > > > some hang in the kernel.
> > 
> > Could you please do alt-sysrq-t (or "echo t > /proc/sysrq-trigger" via 
> > ssh, when your keyboard is dead) to see the calltraces of the processes 
> > which are stuck inside kernel?
> > 
> > You will probably get a lot of output after the sysrq, so please either 
> > put it somewhere on the web if possible, or just extract the interesting 
> > processes out of it (mainly the ones which are stuck).
> 
> Will do.

It would be nice to learn exactly why the keyboard stopped working.  Try
using the usbmon facility (instructions in Documentation/usb/usbmon.txt)
to see what happens when you type on the dead keyboard.  Be sure to turn
on CONFIG_USB_DEBUG as well.  And also check /proc/interrupts; each time
you hit a key the USB controller should get an interrupt.

Alan Stern
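
[Editorial sketch, not part of the original message.] The /proc/interrupts check suggested above amounts to taking two snapshots and comparing the counters for the USB host controller. Below is a minimal Python sketch of that comparison. The function names are hypothetical, and the match on "usb" or "hci" in the description column is an assumption about how the controller is labelled on a given machine.

```python
import re
import time
from typing import Dict


def parse_interrupts(text: str) -> Dict[str, int]:
    """Map 'IRQ description' to the interrupt count summed over all CPUs."""
    totals: Dict[str, int] = {}
    lines = text.splitlines()
    if not lines:
        return totals
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        if ":" not in line:
            continue
        label, rest = line.split(":", 1)
        fields = rest.split()
        counts = []
        for field in fields[:ncpus]:
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        totals[f"{label.strip()} {description}".strip()] = sum(counts)
    return totals


def usb_deltas(before: str, after: str) -> Dict[str, int]:
    """Return the count increase for every USB-looking IRQ line that changed."""
    old, new = parse_interrupts(before), parse_interrupts(after)
    return {
        key: new[key] - old.get(key, 0)
        for key in new
        if re.search(r"usb|hci", key, re.IGNORECASE) and new[key] != old.get(key, 0)
    }


def sample_live(seconds: float = 5.0) -> Dict[str, int]:
    """Snapshot /proc/interrupts, wait while keys are pressed, and diff."""
    with open("/proc/interrupts") as f:
        before = f.read()
    time.sleep(seconds)
    with open("/proc/interrupts") as f:
        after = f.read()
    return usb_deltas(before, after)
```

Calling `sample_live()` over ssh while someone types on the dead keyboard shows whether the controller's counter moves. An empty result would suggest that no interrupts arrive at all.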



Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-15 Thread Florin Iucha
On Sun, Jan 14, 2007 at 11:11:13PM -0300, Horst H. von Brand wrote:
> Florin Iucha <[EMAIL PROTECTED]> wrote:
> 
> [...]
> 
> > Based on this info, I think we can rule out any USB.  I will try
> > testing with NFS3 to see if the problem persists.  Unfortunately there
> > is no oops or anything in "dmesg".
> 
> Take a look at bz #7796, an NFS bug + fix. But my feeling is that this is
> older.

The reporter had an oops?  Luxury!  I get nothing ;)

I am testing again, this time on 2.6.20-rc5 compiled with extra debugging,
and I got a couple dozen of:

   "eth0: too many iterations (6) in nv_nic_irq."

in the kernel log.
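
[Editorial sketch, not part of the original message.] To put a number on "a couple dozen", the warnings can be counted from saved kernel log text. This is a minimal Python sketch with a hypothetical function name. The pattern is modelled only on the single message quoted above, so other drivers' wording may not match.

```python
import re
from collections import Counter

# Modelled on: "eth0: too many iterations (6) in nv_nic_irq."
WARNING = re.compile(r"(\w+): too many iterations \((\d+)\) in (\w+)")


def count_irq_warnings(log_text: str) -> Counter:
    """Count the warnings per (interface, function) pair in kernel log text."""
    counts: Counter = Counter()
    for match in WARNING.finditer(log_text):
        interface, _iterations, function = match.groups()
        counts[(interface, function)] += 1
    return counts
```

The input can be the saved output of "dmesg" from the affected boot.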

florin

-- 
Bruce Schneier expects the Spanish Inquisition.
  http://geekz.co.uk/schneierfacts/fact/163




Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-15 Thread Horst H. von Brand
Florin Iucha <[EMAIL PROTECTED]> wrote:

[...]

> Based on this info, I think we can rule out any USB.  I will try
> testing with NFS3 to see if the problem persists.  Unfortunately there
> is no oops or anything in "dmesg".

Take a look at bz #7796, an NFS bug + fix. But my feeling is that this is
older.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                    Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria             +56 32 2654239
Casilla 110-V, Valparaiso, Chile               Fax:  +56 32 2797513


Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-14 Thread Florin Iucha
Jiri and Trond,

On Mon, Jan 15, 2007 at 01:14:09AM +0100, Jiri Kosina wrote:
> On Sun, 14 Jan 2007, Florin Iucha wrote:
> 
> > All the testing was done via a ssh into the workstation.  The console 
> > was left as booted into, with the gdm running.  The remote nfs4 
> > directory was mounted on "/mnt". After copying the 60+ GB and testing 
> > that the keyboard was still functioning, I did not reboot but stayed in 
> > the same kernel and pulled the latest git then started bisecting.  
> 
> Hi Florin,
> 
> thanks a lot for the testing. Just to verify - what kernel is 'the same 
> kernel' mentioned above? (just to isolate whether the problem is really 
> somewhere between 2.6.19 and 2.6.20-rc2, as you stated in previous posts, 
> or the situation has changed).

This happened with 2.6.19.  It worked last time, but I wanted to test
again, to make sure.  This time, it bombed, but half an hour after the 
transfer finished.

> > After recompiling, I moved over to the workstation to reboot it, but the 
> > keyboard was not functioning ;(
> 
> > So this time the hang occurred when the system was idle, not during the
> transfers, right?

Yes it was idle.  Immediately after the transfer finished, the keyboard was
still functioning.  It "hung" minutes later, after the first bisected kernel
was compiled and installed.

> > > I ran "lsusb" and it displayed all the devices. "dmesg" did not show
> > > any oops, or anything else for that matter.  I unplugged the keyboard and
> > > ran "lsusb" again, but it hung.  I ran "ls /mnt" and it hung as well.
> > > Stracing "lsusb" showed it hung (entered the kernel) at opening the device
> > > that used to be the keyboard.  Stracing "ls /mnt" showed that it
> > > hung at "stat(/mnt)".  Both processes were in "D" state.  "ls /root"
> > > worked without problem, so it appears that crossing mountpoints causes
> > > some hang in the kernel.
> 
> Could you please do alt-sysrq-t (or "echo t > /proc/sysrq-trigger" via 
> ssh, when your keyboard is dead) to see the calltraces of the processes 
> which are stuck inside kernel?
> 
> You will probably get a lot of output after the sysrq, so please either 
> put it somewhere on the web if possible, or just extract the interesting 
> processes out of it (mainly the ones which are stuck).

Will do.

florin

-- 
Bruce Schneier expects the Spanish Inquisition.
  http://geekz.co.uk/schneierfacts/fact/163




Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-14 Thread Jiri Kosina
On Sun, 14 Jan 2007, Florin Iucha wrote:

> All the testing was done via a ssh into the workstation.  The console 
> was left as booted into, with the gdm running.  The remote nfs4 
> directory was mounted on "/mnt". After copying the 60+ GB and testing 
> that the keyboard was still functioning, I did not reboot but stayed in 
> the same kernel and pulled the latest git then started bisecting.  

Hi Florin,

thanks a lot for the testing. Just to verify - what kernel is 'the same 
kernel' mentioned above? (just to isolate whether the problem is really 
somewhere between 2.6.19 and 2.6.20-rc2, as you stated in previous posts, 
or the situation has changed).

> After recompiling, I moved over to the workstation to reboot it, but the 
> keyboard was not functioning ;(

So this time the hang occurred when the system was idle, not during the
transfers, right?

> I ran "lsusb" and it displayed all the devices. "dmesg" did not show
> any oops, or anything else for that matter.  I unplugged the keyboard and
> ran "lsusb" again, but it hung.  I ran "ls /mnt" and it hung as well.
> Stracing "lsusb" showed it hung (entered the kernel) at opening the device
> that used to be the keyboard.  Stracing "ls /mnt" showed that it
> hung at "stat(/mnt)".  Both processes were in "D" state.  "ls /root"
> worked without problem, so it appears that crossing mountpoints causes
> some hang in the kernel.

Could you please do alt-sysrq-t (or "echo t > /proc/sysrq-trigger" via 
ssh, when your keyboard is dead) to see the calltraces of the processes 
which are stuck inside kernel?

You will probably get a lot of output after the sysrq, so please either 
put it somewhere on the web if possible, or just extract the interesting 
processes out of it (mainly the ones which are stuck).

Thanks,

-- 
Jiri Kosina
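
[Editorial sketch, not part of the original message.] When the sysrq-t dump is large, it helps to know beforehand which tasks are stuck. The following minimal Python sketch lists processes in "D" (uninterruptible sleep) state by reading /proc/<pid>/stat. It complements the sysrq output but does not replace it, since only sysrq-t prints the kernel call traces. The function names are hypothetical.

```python
import os
from typing import List, Optional, Tuple


def parse_stat(stat_text: str) -> Optional[Tuple[int, str, str]]:
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm sits in parentheses and may itself contain spaces or ")",
    # so split on the first "(" and the last ")".
    lpar = stat_text.find("(")
    rpar = stat_text.rfind(")")
    if lpar < 0 or rpar < lpar:
        return None
    try:
        pid = int(stat_text[:lpar])
    except ValueError:
        return None
    rest = stat_text[rpar + 1:].split()
    if not rest:
        return None
    return pid, stat_text[lpar + 1:rpar], rest[0]


def blocked_tasks(proc_root: str = "/proc") -> List[Tuple[int, str]]:
    """List (pid, comm) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            path = os.path.join(proc_root, entry, "stat")
            with open(path, errors="replace") as f:
                parsed = parse_stat(f.read())
        except OSError:
            continue  # the process exited while we were scanning
        if parsed and parsed[2] == "D":
            found.append((parsed[0], parsed[1]))
    return found
```

The PIDs returned by `blocked_tasks()` can then be used to pick the relevant task entries out of the sysrq-t output.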


Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-14 Thread Trond Myklebust
On Sun, 2007-01-14 at 17:58 -0600, Florin Iucha wrote:
> All the testing was done via a ssh into the workstation.  The console
> was left as booted into, with the gdm running.  The remote nfs4
> directory was mounted on "/mnt".
> 
> After copying the 60+ GB and testing that the keyboard was still
> functioning, I did not reboot but stayed in the same kernel and pulled
> the latest git then started bisecting.  After recompiling, I moved
> over to the workstation to reboot it, but the keyboard was not
> functioning ;(
> 
> I ran "lsusb" and it displayed all the devices. "dmesg" did not show
> any oops, or anything else for that matter.  I unplugged the keyboard and
> ran "lsusb" again, but it hung.  I ran "ls /mnt" and it hung as well.
> Stracing "lsusb" showed it hung (entered the kernel) at opening the device
> that used to be the keyboard.  Stracing "ls /mnt" showed that it
> hung at "stat(/mnt)".  Both processes were in "D" state.  "ls /root"
> worked without problem, so it appears that crossing mountpoints causes
> some hang in the kernel.
> 
> Based on this info, I think we can rule out any USB.  I will try
> testing with NFS3 to see if the problem persists.  Unfortunately there
> is no oops or anything in "dmesg".

Did you try an 'echo t > /proc/sysrq-trigger' in order to find out where
the stat process is hanging?

  Trond



heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)

2007-01-14 Thread Florin Iucha
On Sun, Jan 14, 2007 at 04:57:01PM -0600,  wrote:
> On Wed, Jan 10, 2007 at 10:54:34AM -0500, Alan Stern wrote:
> > It's still possible that this is hardware related; perhaps some component
> > just began to wear out.  If you return to an earlier kernel, does the 
> > problem go away?
> 
> As reported in my original e-mail and verified just minutes ago, the
> copy succeeds with 2.6.19 (kernel.org vanilla, compiled with the same
> config as 2.6.20-rcX).  I will begin bisecting between .19 and .20-rc1
> after re-reading Jiri's messages.

All the testing was done via a ssh into the workstation.  The console
was left as booted into, with the gdm running.  The remote nfs4
directory was mounted on "/mnt".

After copying the 60+ GB and testing that the keyboard was still
functioning, I did not reboot but stayed in the same kernel and pulled
the latest git then started bisecting.  After recompiling, I moved
over to the workstation to reboot it, but the keyboard was not
functioning ;(

I ran "lsusb" and it displayed all the devices. "dmesg" did not show
any oops, or anything else for that matter.  I unplugged the keyboard and
ran "lsusb" again, but it hung.  I ran "ls /mnt" and it hung as well.
Stracing "lsusb" showed it hung (entered the kernel) at opening the device
that used to be the keyboard.  Stracing "ls /mnt" showed that it
hung at "stat(/mnt)".  Both processes were in "D" state.  "ls /root"
worked without problem, so it appears that crossing mountpoints causes
some hang in the kernel.
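
[Editorial sketch, not part of the original message.] Checking whether a mount point such as "/mnt" still responds is risky from an interactive shell, because the shell itself can end up stuck. The following minimal Python sketch runs the stat() in a child process and gives up after a timeout. The function name and the return strings are made up for illustration. The parent deliberately does not wait for the child after a timeout, because a task in "D" state cannot be killed or reaped until the kernel releases it.

```python
import subprocess
import sys
import time


def probe_stat(path: str, timeout: float = 5.0) -> str:
    """Return "ok", "error" or "timeout" for a stat() of path in a child."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        returncode = child.poll()
        if returncode is not None:
            return "ok" if returncode == 0 else "error"
        time.sleep(0.05)
    # Best effort only: the signal stays pending while the child is in
    # uninterruptible sleep, so do not block here waiting for it to exit.
    child.kill()
    return "timeout"
```

For example, `probe_stat("/mnt")` returning "timeout" while `probe_stat("/root")` returns "ok" would match the behaviour described above.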

Based on this info, I think we can rule out any USB.  I will try
testing with NFS3 to see if the problem persists.  Unfortunately there
is no oops or anything in "dmesg".

florin

-- 
Bruce Schneier expects the Spanish Inquisition.
  http://geekz.co.uk/schneierfacts/fact/163

