Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)
On Tue, 23 Jan 2007, Florin Iucha wrote:

> > It would be nice to learn exactly why the keyboard stopped working. Try
> > using the usbmon facility (instructions in Documentation/usb/usbmon.txt)
> > to see what happens when you type on the dead keyboard. Be sure to turn
> > on CONFIG_USB_DEBUG as well. And also check /proc/interrupts; each time
> > you hit a key the USB controller should get an interrupt.
>
> Attached is the output from usbmon; unfortunately, this kernel did not
> have CONFIG_USB_DEBUG set. This is kernel 2.6.20-rc5.
>
> So, the bus sees some traffic when the keyboard is used, but gdm does
> not receive any keystrokes.

So it's possible that the USB drivers are working correctly but the keystrokes are getting lost somewhere in the X server. Can you switch to a VT (or kill the X server entirely) and see if the keyboard works then?

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
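The /proc/interrupts check mentioned in the quoted text (each keypress should raise an interrupt on the USB controller) can be approximated by diffing two snapshots of the file. The following is only a minimal sketch: the function names `parse_interrupts` and `interrupt_deltas` are made up for illustration, and the parser assumes the usual layout of a CPU header row followed by `label: count... description` rows.

```python
def parse_interrupts(text):
    """Map each IRQ label to (total count over all CPUs, description).

    Assumes the common /proc/interrupts layout: a header row naming the
    CPUs, then one row per interrupt source.
    """
    lines = text.strip().splitlines()
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:n_cpus]:
            if not field.isdigit():
                break  # short rows (e.g. ERR) carry fewer counters
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        table[label.strip()] = (sum(counts), description)
    return table


def interrupt_deltas(before, after):
    """Return {irq_label: increase} for every IRQ whose count changed."""
    old = parse_interrupts(before)
    new = parse_interrupts(after)
    deltas = {}
    for irq, (count, _description) in new.items():
        previous = old.get(irq, (0, ""))[0]
        if count != previous:
            deltas[irq] = count - previous
    return deltas
```

To use it, read /proc/interrupts once, press a few keys on the suspect keyboard, read the file again, and pass both strings to `interrupt_deltas`. If the row for the USB host controller does not move, the keypresses are not reaching the controller at all.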
Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)
On Mon, Jan 15, 2007 at 10:58:29AM -0500, Alan Stern wrote:
> On Sun, 14 Jan 2007, Florin Iucha wrote:
>
> > Jiri and Trond,
> >
> > On Mon, Jan 15, 2007 at 01:14:09AM +0100, Jiri Kosina wrote:
> > > On Sun, 14 Jan 2007, Florin Iucha wrote:
> > >
> > > > All the testing was done via ssh into the workstation. The console
> > > > was left as booted into, with the gdm running. The remote nfs4
> > > > directory was mounted on "/mnt". After copying the 60+ GB and testing
> > > > that the keyboard was still functioning, I did not reboot but stayed in
> > > > the same kernel and pulled the latest git then started bisecting.
> > >
> > > Hi Florin,
> > >
> > > thanks a lot for the testing. Just to verify - what kernel is 'the same
> > > kernel' mentioned above? (just to isolate whether the problem is really
> > > somewhere between 2.6.19 and 2.6.20-rc2, as you stated in previous posts,
> > > or the situation has changed).
> >
> > This happened with 2.6.19. It worked last time, but I wanted to test
> > again, to make sure. This time, it bombed, but half an hour after the
> > transfer finished.
> >
> > > > After recompiling, I moved over to the workstation to reboot it, but the
> > > > keyboard was not functioning ;(
> > >
> > > So this time the hang occurred when the system was idle, not during the
> > > transfers, right?
> >
> > Yes it was idle. Immediately after the transfer finished, the keyboard was
> > still functioning. It "hung" minutes later, after the first bisected kernel
> > was compiled and installed.
> >
> > > > I ran "lsusb" and it displayed all the devices. "dmesg" did not show
> > > > any oops, anything for that matter. I have unplugged the keyboard and
> > > > run "lsusb" again, but it hung. I ran "ls /mnt" and it hung as well.
> > > > Stracing "lsusb" showed it hung (entered the kernel) at opening the device
> > > > that used to be the keyboard. Stracing "ls /mnt" showed that it
> > > > hung at "stat(/mnt)". Both processes were in "D" state. "ls /root"
> > > > worked without problem, so it appears that crossing mountpoints causes
> > > > some hang in the kernel.
> > >
> > > Could you please do alt-sysrq-t (or "echo t > /proc/sysrq-trigger" via
> > > ssh, when your keyboard is dead) to see the calltraces of the processes
> > > which are stuck inside kernel?
> > >
> > > You will probably get a lot of output after the sysrq, so please either
> > > put it somewhere on the web if possible, or just extract the interesting
> > > processes out of it (mainly the ones which are stuck).
> >
> > Will do.
>
> It would be nice to learn exactly why the keyboard stopped working. Try
> using the usbmon facility (instructions in Documentation/usb/usbmon.txt)
> to see what happens when you type on the dead keyboard. Be sure to turn
> on CONFIG_USB_DEBUG as well. And also check /proc/interrupts; each time
> you hit a key the USB controller should get an interrupt.

Attached is the output from usbmon; unfortunately, this kernel did not have CONFIG_USB_DEBUG set. This is kernel 2.6.20-rc5.

So, the bus sees some traffic when the keyboard is used, but gdm does not receive any keystrokes.

florin

--
Bruce Schneier expects the Spanish Inquisition.
http://geekz.co.uk/schneierfacts/fact/163

81007fe91308 707039742 C Ii:003:01 0 8 = 2800
81007fe91308 707039783 S Ii:003:01 -115 8 <
81007fe91308 707103733 C Ii:003:01 0 8 =
81007fe91308 707103749 S Ii:003:01 -115 8 <
81007fe91308 707207728 C Ii:003:01 0 8 = 2800
81007fe91308 707207744 S Ii:003:01 -115 8 <
81007fe91308 707271726 C Ii:003:01 0 8 =
81007fe91308 707271741 S Ii:003:01 -115 8 <
81007fe91308 707375721 C Ii:003:01 0 8 = 2800
81007fe91308 707375736 S Ii:003:01 -115 8 <
81007fe91308 707447718 C Ii:003:01 0 8 =
81007fe91308 707447732 S Ii:003:01 -115 8 <
81007fe91308 707655708 C Ii:003:01 0 8 = 0f00
81007fe91308 707655725 S Ii:003:01 -115 8 <
81007fe91308 707663708 C Ii:003:01 0 8 = 0f33
81007fe91308 707663724 S Ii:003:01 -115 8 <
81007fe91308 707679708 C Ii:003:01 0 8 = 0f33 0e00
81007fe91308 707679724 S Ii:003:01 -115 8 <
81007fe91308 707719706 C Ii:003:01 0 8 = 0f33 0e0d
81007fe91308 707719721 S Ii:003:01 -115 8 <
81007fe91308 707727706 C Ii:003:01 0 8 = 0f0e 0d00
81007fe91308 707727721 S Ii:003:01 -115 8 <
81007fe91308 707743704 C Ii:003:01 0 8 = 0e0d
81007fe91308 707743719 S Ii:003:01 -115 8 <
81007fe91308 707791703 C Ii:003:01 0 8 = 0d00
81007fe91308 707791717 S Ii:003:01 -115 8 <
81007fe91308 707831702 C Ii:003:01 0 8 =
81007fe91308 707831717 S Ii:003:01 -115 8 <
81007fe91308 707847701 C Ii:003:01 0 8 = 0900
81007fe91308 707847716 S Ii:003:01 -115 8 <
81007fe91308 707879698 C Ii:
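For anyone who wants to pick apart a trace like the one above, here is a minimal sketch of a parser for the usbmon text records. It assumes the field order seen in this trace (URB tag, timestamp in microseconds, event type, address word, status, data length, data tag, data words), which matches my reading of Documentation/usb/usbmon.txt for kernels of this era. `UsbmonEvent`, `parse_usbmon_line` and `completions_with_data` are illustrative names, not part of any existing tool.

```python
from typing import NamedTuple, Optional, Tuple


class UsbmonEvent(NamedTuple):
    urb_tag: str
    timestamp_us: int
    event_type: str              # 'S' submission, 'C' callback, 'E' submission error
    address: str                 # e.g. "Ii:003:01" (interrupt-in, device 003, endpoint 01)
    status: int                  # -115 is -EINPROGRESS on a freshly submitted URB
    length: int
    data_tag: str                # '=' means data words follow; '<' or '>' means none
    data_words: Tuple[str, ...]


def parse_usbmon_line(line: str) -> Optional[UsbmonEvent]:
    """Parse one usbmon text record; return None for truncated or odd lines."""
    fields = line.split()
    if len(fields) < 7:
        return None  # e.g. the cut-off last record of the trace
    tag, stamp, event, address, status, length, data_tag = fields[:7]
    try:
        return UsbmonEvent(tag, int(stamp), event, address,
                           int(status), int(length), data_tag,
                           tuple(fields[7:]))
    except ValueError:
        return None


def completions_with_data(lines):
    """Keep only the callback events that actually carried payload words."""
    events = (parse_usbmon_line(line) for line in lines)
    return [e for e in events if e and e.event_type == "C" and e.data_words]
```

Filtering for callbacks with data is a quick way to confirm what the trace already suggests: the device keeps delivering non-empty reports on its interrupt endpoint, so the bus side is alive even though gdm sees nothing.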
Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)
On Sun, 14 Jan 2007, Florin Iucha wrote:

> Jiri and Trond,
>
> On Mon, Jan 15, 2007 at 01:14:09AM +0100, Jiri Kosina wrote:
> > On Sun, 14 Jan 2007, Florin Iucha wrote:
> >
> > > All the testing was done via ssh into the workstation. The console
> > > was left as booted into, with the gdm running. The remote nfs4
> > > directory was mounted on "/mnt". After copying the 60+ GB and testing
> > > that the keyboard was still functioning, I did not reboot but stayed in
> > > the same kernel and pulled the latest git then started bisecting.
> >
> > Hi Florin,
> >
> > thanks a lot for the testing. Just to verify - what kernel is 'the same
> > kernel' mentioned above? (just to isolate whether the problem is really
> > somewhere between 2.6.19 and 2.6.20-rc2, as you stated in previous posts,
> > or the situation has changed).
>
> This happened with 2.6.19. It worked last time, but I wanted to test
> again, to make sure. This time, it bombed, but half an hour after the
> transfer finished.
>
> > > After recompiling, I moved over to the workstation to reboot it, but the
> > > keyboard was not functioning ;(
> >
> > So this time the hang occurred when the system was idle, not during the
> > transfers, right?
>
> Yes it was idle. Immediately after the transfer finished, the keyboard was
> still functioning. It "hung" minutes later, after the first bisected kernel
> was compiled and installed.
>
> > > I ran "lsusb" and it displayed all the devices. "dmesg" did not show
> > > any oops, anything for that matter. I have unplugged the keyboard and
> > > run "lsusb" again, but it hung. I ran "ls /mnt" and it hung as well.
> > > Stracing "lsusb" showed it hung (entered the kernel) at opening the device
> > > that used to be the keyboard. Stracing "ls /mnt" showed that it
> > > hung at "stat(/mnt)". Both processes were in "D" state. "ls /root"
> > > worked without problem, so it appears that crossing mountpoints causes
> > > some hang in the kernel.
> >
> > Could you please do alt-sysrq-t (or "echo t > /proc/sysrq-trigger" via
> > ssh, when your keyboard is dead) to see the calltraces of the processes
> > which are stuck inside kernel?
> >
> > You will probably get a lot of output after the sysrq, so please either
> > put it somewhere on the web if possible, or just extract the interesting
> > processes out of it (mainly the ones which are stuck).
>
> Will do.

It would be nice to learn exactly why the keyboard stopped working. Try using the usbmon facility (instructions in Documentation/usb/usbmon.txt) to see what happens when you type on the dead keyboard. Be sure to turn on CONFIG_USB_DEBUG as well. And also check /proc/interrupts; each time you hit a key the USB controller should get an interrupt.

Alan Stern
Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)
On Sun, Jan 14, 2007 at 11:11:13PM -0300, Horst H. von Brand wrote:
> Florin Iucha <[EMAIL PROTECTED]> wrote:
>
> [...]
>
> > Based on this info, I think we can rule out any USB. I will try
> > testing with NFS3 to see if the problem persists. Unfortunately there
> > is no oops or anything in "dmesg".
>
> Take a look at bz #7796, an NFS bug + fix. But my feeling is that this is
> older.

The reporter had an oops? Luxury! I get nothing ;)

I am testing again, this time on 2.6.20-rc5 compiled with extra debug, and I got a couple dozen of:

"eth0: too many iterations (6) in nv_nic_irq."

in the kernel log.

florin

--
Bruce Schneier expects the Spanish Inquisition.
http://geekz.co.uk/schneierfacts/fact/163
Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)
Florin Iucha <[EMAIL PROTECTED]> wrote:

[...]

> Based on this info, I think we can rule out any USB. I will try
> testing with NFS3 to see if the problem persists. Unfortunately there
> is no oops or anything in "dmesg".

Take a look at bz #7796, an NFS bug + fix. But my feeling is that this is older.

--
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                    Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria             +56 32 2654239
Casilla 110-V, Valparaiso, Chile               Fax:  +56 32 2797513
Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)
Jiri and Trond,

On Mon, Jan 15, 2007 at 01:14:09AM +0100, Jiri Kosina wrote:
> On Sun, 14 Jan 2007, Florin Iucha wrote:
>
> > All the testing was done via ssh into the workstation. The console
> > was left as booted into, with the gdm running. The remote nfs4
> > directory was mounted on "/mnt". After copying the 60+ GB and testing
> > that the keyboard was still functioning, I did not reboot but stayed in
> > the same kernel and pulled the latest git then started bisecting.
>
> Hi Florin,
>
> thanks a lot for the testing. Just to verify - what kernel is 'the same
> kernel' mentioned above? (just to isolate whether the problem is really
> somewhere between 2.6.19 and 2.6.20-rc2, as you stated in previous posts,
> or the situation has changed).

This happened with 2.6.19. It worked last time, but I wanted to test again, to make sure. This time, it bombed, but half an hour after the transfer finished.

> > After recompiling, I moved over to the workstation to reboot it, but the
> > keyboard was not functioning ;(
>
> So this time the hang occurred when the system was idle, not during the
> transfers, right?

Yes it was idle. Immediately after the transfer finished, the keyboard was still functioning. It "hung" minutes later, after the first bisected kernel was compiled and installed.

> > I ran "lsusb" and it displayed all the devices. "dmesg" did not show
> > any oops, anything for that matter. I have unplugged the keyboard and
> > run "lsusb" again, but it hung. I ran "ls /mnt" and it hung as well.
> > Stracing "lsusb" showed it hung (entered the kernel) at opening the device
> > that used to be the keyboard. Stracing "ls /mnt" showed that it
> > hung at "stat(/mnt)". Both processes were in "D" state. "ls /root"
> > worked without problem, so it appears that crossing mountpoints causes
> > some hang in the kernel.
>
> Could you please do alt-sysrq-t (or "echo t > /proc/sysrq-trigger" via
> ssh, when your keyboard is dead) to see the calltraces of the processes
> which are stuck inside kernel?
>
> You will probably get a lot of output after the sysrq, so please either
> put it somewhere on the web if possible, or just extract the interesting
> processes out of it (mainly the ones which are stuck).

Will do.

florin

--
Bruce Schneier expects the Spanish Inquisition.
http://geekz.co.uk/schneierfacts/fact/163
Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)
On Sun, 14 Jan 2007, Florin Iucha wrote:

> All the testing was done via ssh into the workstation. The console
> was left as booted into, with the gdm running. The remote nfs4
> directory was mounted on "/mnt". After copying the 60+ GB and testing
> that the keyboard was still functioning, I did not reboot but stayed in
> the same kernel and pulled the latest git then started bisecting.

Hi Florin,

thanks a lot for the testing. Just to verify - what kernel is 'the same kernel' mentioned above? (just to isolate whether the problem is really somewhere between 2.6.19 and 2.6.20-rc2, as you stated in previous posts, or the situation has changed).

> After recompiling, I moved over to the workstation to reboot it, but the
> keyboard was not functioning ;(

So this time the hang occurred when the system was idle, not during the transfers, right?

> I ran "lsusb" and it displayed all the devices. "dmesg" did not show
> any oops, anything for that matter. I have unplugged the keyboard and
> run "lsusb" again, but it hung. I ran "ls /mnt" and it hung as well.
> Stracing "lsusb" showed it hung (entered the kernel) at opening the device
> that used to be the keyboard. Stracing "ls /mnt" showed that it
> hung at "stat(/mnt)". Both processes were in "D" state. "ls /root"
> worked without problem, so it appears that crossing mountpoints causes
> some hang in the kernel.

Could you please do alt-sysrq-t (or "echo t > /proc/sysrq-trigger" via ssh, when your keyboard is dead) to see the calltraces of the processes which are stuck inside kernel?

You will probably get a lot of output after the sysrq, so please either put it somewhere on the web if possible, or just extract the interesting processes out of it (mainly the ones which are stuck).

Thanks,

--
Jiri Kosina
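Since the sysrq-t dump covers every task, it helps to know in advance which PIDs are actually stuck. As a rough sketch (not something anyone in the thread posted), the processes in "D" (uninterruptible sleep) state can be listed by reading /proc/<pid>/stat. The helper names `parse_stat` and `uninterruptible_processes` are hypothetical, and the `proc_root` parameter exists only so the function can be pointed at a test directory.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat.

    The command name sits in parentheses and may itself contain spaces
    or parentheses, so split on the first '(' and the last ')'.
    """
    lpar = stat_line.index("(")
    rpar = stat_line.rindex(")")
    pid = int(stat_line[:lpar])
    comm = stat_line[lpar + 1:rpar]
    state = stat_line[rpar + 1:].split()[0]
    return pid, comm, state


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in 'D' state."""
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "net"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the file was unreadable
        if state == "D":
            stuck.append((pid, comm))
    return sorted(stuck)
```

Run over ssh on the affected machine, this should list the hung "lsusb" and "ls" processes described above, and their PIDs can then be used to find the matching call traces in the sysrq-t output.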
Re: heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)
On Sun, 2007-01-14 at 17:58 -0600, Florin Iucha wrote:
> All the testing was done via ssh into the workstation. The console
> was left as booted into, with the gdm running. The remote nfs4
> directory was mounted on "/mnt".
>
> After copying the 60+ GB and testing that the keyboard was still
> functioning, I did not reboot but stayed in the same kernel and pulled
> the latest git then started bisecting. After recompiling, I moved
> over to the workstation to reboot it, but the keyboard was not
> functioning ;(
>
> I ran "lsusb" and it displayed all the devices. "dmesg" did not show
> any oops, anything for that matter. I have unplugged the keyboard and
> run "lsusb" again, but it hung. I ran "ls /mnt" and it hung as well.
> Stracing "lsusb" showed it hung (entered the kernel) at opening the device
> that used to be the keyboard. Stracing "ls /mnt" showed that it
> hung at "stat(/mnt)". Both processes were in "D" state. "ls /root"
> worked without problem, so it appears that crossing mountpoints causes
> some hang in the kernel.
>
> Based on this info, I think we can rule out any USB. I will try
> testing with NFS3 to see if the problem persists. Unfortunately there
> is no oops or anything in "dmesg".

Did you try an 'echo t > /proc/sysrq-trigger' in order to find out where the stat process is hanging?

Trond
heavy nfs[4]] causes fs badness Was: 2.6.20-rc4: known unfixed regressions (v2)
On Sun, Jan 14, 2007 at 04:57:01PM -0600, wrote:
> On Wed, Jan 10, 2007 at 10:54:34AM -0500, Alan Stern wrote:
> > It's still possible that this is hardware related; perhaps some component
> > just began to wear out. If you return to an earlier kernel, does the
> > problem go away?
>
> As reported in my original e-mail and verified just minutes ago, the
> copy succeeds with 2.6.19 (kernel.org vanilla, compiled with the same
> config as 2.6.20-rcX). I will begin bisecting between .19 and .20-rc1
> after re-reading Jiri's messages.

All the testing was done via ssh into the workstation. The console was left as booted into, with the gdm running. The remote nfs4 directory was mounted on "/mnt".

After copying the 60+ GB and testing that the keyboard was still functioning, I did not reboot but stayed in the same kernel and pulled the latest git then started bisecting. After recompiling, I moved over to the workstation to reboot it, but the keyboard was not functioning ;(

I ran "lsusb" and it displayed all the devices. "dmesg" did not show any oops, anything for that matter. I have unplugged the keyboard and run "lsusb" again, but it hung. I ran "ls /mnt" and it hung as well. Stracing "lsusb" showed it hung (entered the kernel) at opening the device that used to be the keyboard. Stracing "ls /mnt" showed that it hung at "stat(/mnt)". Both processes were in "D" state. "ls /root" worked without problem, so it appears that crossing mountpoints causes some hang in the kernel.

Based on this info, I think we can rule out any USB. I will try testing with NFS3 to see if the problem persists. Unfortunately there is no oops or anything in "dmesg".

florin

--
Bruce Schneier expects the Spanish Inquisition.
http://geekz.co.uk/schneierfacts/fact/163