Bug#689268: linux-image-3.2.0-3-amd64: Intel HD 4000 (Ivy Bridge) graphics freeze
On 2012-11-28 16:45, Riku Voipio wrote:
> Are there any updates since early November? I now have an Ivy Bridge PC with a PH8H77-V LE motherboard and a 3570K CPU showing the mentioned symptoms. I can work on bisecting the issue if nobody else is already on it.

I have been running the kernel mentioned above (3.3 with drm from 3.2) for 25 days now without any problems.

/Per

--
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/50ba1f53.7040...@foreby.se
Bug#689268: Intel HD 4000 (Ivy Bridge) graphics freeze
On 2012-11-03 09:14, Jonathan Nieder wrote:
> Could you try 3.3~rc6-1~experimental.1? (I expect it will also work fine, but there's always a chance that we could get lucky and narrow down the range by a lot.)

As you expected, two days without problems.

> If it works ok, here are instructions for testing 3.3-rc6 with drm code from 3.2:
>
>  0. prerequisites
>
>     apt-get install git build-essential
>
>  1. get the kernel history, if you don't already have it
>
>     git clone \
>       git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>
>  2. configure, build, test
>
>     cd linux
>     git checkout v3.3-rc6
>     cp /boot/config-$(uname -r) .config; # current configuration
>     scripts/config --disable DEBUG_INFO
>     make localmodconfig; # optional: minimize configuration
>     make deb-pkg; # optionally with -j for parallel build
>     dpkg -i ../; # as root
>     reboot
>
>     Hopefully it works fine. So
>
>  3. try drm code from 3.2
>
>     cd linux
>     git checkout v3.2 -- include/drm drivers/gpu/drm
>     make deb-pkg; # maybe with -j4
>     dpkg -i ..; # as root
>     reboot
>
>     Hopefully it reproduces the bug.

Unfortunately this has so far been stable for two days. I'll keep running it to see if the bug just takes longer to show with this combo.

/Per
Bug#689268: Intel HD 4000 (Ivy Bridge) graphics freeze
On 2012-10-24 02:39, Jonathan Nieder wrote:
> Hi Per,
>
> Per Foreby wrote:
>> Just to clear some confusion: In a previous comment you suggested trying 3.2.30-1 first (which seems to have been replaced by 3.2.32-1 a few days ago).
>>
>> So what should I try, and in what order? I have downloaded the following packages:
>>
>> linux-image-3.2.0-4-amd64_3.2.30-1_amd64.deb
>> linux-image-3.2.0-4-amd64_3.2.32-1_amd64.deb
>> linux-image-3.3.0-trunk-amd64_3.3.6-1~experimental.1_amd64.deb
>> linux-image-3.4-trunk-amd64_3.4.4-1~experimental.1_amd64.deb
>
> Good question, thanks. 3.3.6 first. If it works, the newest 3.2.y kernel available would come next. If it doesn't work, 3.4.4 would come next.

I've been running 3.3.6-1~experimental.1 for 10 days without problems. Just tried 3.2.32-1 and the system froze after 18 minutes.

Now back on 3.3.

/Per
Bug#689268: Intel HD 4000 (Ivy Bridge) graphics freeze
On 2012-10-23 23:45, Per Foreby wrote:
> Thanks again for your help and patience. Could you try 3.3 next? That would narrow down the search for the fix by quite a bit.

Just to clear some confusion: In a previous comment you suggested trying 3.2.30-1 first (which seems to have been replaced by 3.2.32-1 a few days ago).

So what should I try, and in what order? I have downloaded the following packages:

linux-image-3.2.0-4-amd64_3.2.30-1_amd64.deb
linux-image-3.2.0-4-amd64_3.2.32-1_amd64.deb
linux-image-3.3.0-trunk-amd64_3.3.6-1~experimental.1_amd64.deb
linux-image-3.4-trunk-amd64_3.4.4-1~experimental.1_amd64.deb

/Per
Bug#689268: Intel HD 4000 (Ivy Bridge) graphics freeze
On 2012-10-23 22:10, Jonathan Nieder wrote:
> Per Foreby wrote:
>> On 2012-10-22 00:00, Jonathan Nieder wrote:
>>> Oh, right --- I had forgotten. I think we should still move upstream after the experiment with vesa, though, and just be sure to mention which kernels were tried and what happened with each.
> [...]
>> https://bugs.freedesktop.org/show_bug.cgi?id=56333
>
> Thanks again for your help and patience. Could you try 3.3 next? That would narrow down the search for the fix by quite a bit.

The upstream comments are just what I expected, and what I'm used to in my 25 years as a system manager: don't bother reporting anything unless you're on the latest and greatest release or have paid support. Which is very understandable.

I'll give the suggested kernels a try to narrow down the problem. And I'll stay on 64 MB graphics memory since that seems to trigger the problem faster.

/Per
Bug#689268: Intel HD 4000 (Ivy Bridge) graphics freeze
On 2012-10-22 00:00, Jonathan Nieder wrote:
> Oh, right --- I had forgotten. I think we should still move upstream after the experiment with vesa, though, and just be sure to mention which kernels were tried and what happened with each.
>
> Instructions for reporting are here: http://intellinuxgraphics.org/how_to_report_bug.html
>
> When doing so, please let us know the bug number so we can track it.
>
> I hate to sound like a broken record, but I still have no reason to believe the mtrr stuff is not a red herring.

https://bugs.freedesktop.org/show_bug.cgi?id=56333

/Per
Bug#689268: Intel HD 4000 (Ivy Bridge) graphics freeze
On 2012-10-21 21:02, Jonathan Nieder wrote:
> Per Foreby wrote:
>> [22.177] (II) VESA(0): Total Memory: 1023 64KB banks (65472kB)
>> [22.200] (II) VESA(0): VESA VBE Total Mem: 65472 kB
>
> Good, it looks like the vesa driver is loading instead of the i915 driver. Can you reproduce the bug in this setup?

Everything is OK so far, and I hardly notice any difference (apart from having to enable software scaling in mplayer2, but that's not a problem with an i7 3770 :)

I'll give it a few days, and then I'll try "enable_mtrr_cleanup".

> If not, we will have shown the bug is probably in the i915 driver and we can take this upstream.

Or maybe not, since I didn't experience any problems on the 3.5 kernel. With 256 MB GPU memory though.

/Per
Bug#689268: Intel HD 4000 (Ivy Bridge) graphics freeze
On 2012-10-21 04:48, Jonathan Nieder wrote:
> Per Foreby wrote:
>> Next thing to try is to blacklist the i915 module, but it doesn't seem to work. This is what I did:
>>
>> # echo "blacklist i915" > /etc/modprobe.d/i915-blacklist.conf
>> # depmod -ae -F /boot/System.map-3.2.0-3-amd64
>> # update-initramfs -u -k all
>>
>> Module still loads.
>
> Does the i915 module still load if you boot in recovery mode (kernel 'single' or 'text')? If so, please attach lsmod output and output from "grep . /etc/modprobe.d/*".

The module was not loaded in recovery, and when booting multiuser, lsmod shows no dependencies.

I did however find a working solution in the Arch wiki (https://wiki.archlinux.org/index.php/Kernel_modules#Using_files_in_.2Fetc.2Fmodprobe.d.2F_2). So using "install i915 /bin/false" in blacklist.conf, the i915 module is finally gone.

The output from lspci/dmesg/Xorg.0.log is still identical and indicates 256 MB, but the vesa driver also added this to Xorg.0.log:

[22.177] (II) VESA(0): Total Memory: 1023 64KB banks (65472kB)
[22.200] (II) VESA(0): VESA VBE Total Mem: 65472 kB

Note that this is 1023 memory banks, not 1024, so it's not exactly 64 MB (65536 kB). Maybe the reason why it almost works with 256 MB is that the kernel always thinks that we have exactly 256 MB but the mobo supplies one memory bank less. Just a theory.

/Per
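The workaround described above can be sketched as a small shell function. This is only a sketch under assumptions: the function name `disable_module` and the `MODPROBE_DIR` override are made up for illustration, and the file name follows the one used in the earlier blacklist attempt rather than the `blacklist.conf` mentioned in the message. An `install` line tells modprobe to run the given command instead of loading the module, so `/bin/false` makes every load attempt fail.

```shell
#!/bin/sh
# Sketch only: disable_module and MODPROBE_DIR are hypothetical names.
# MODPROBE_DIR defaults to the real config directory; override it to try
# the function without touching the system.
disable_module() (
    module=$1
    dir=${MODPROBE_DIR:-/etc/modprobe.d}
    # "install <module> /bin/false" makes modprobe run /bin/false
    # instead of loading the module, so loading always fails.
    printf 'install %s /bin/false\n' "$module" > "$dir/$module-blacklist.conf"
)

# As root, followed by rebuilding the initramfs as in the thread:
#   disable_module i915
#   update-initramfs -u -k all
```

Unlike a plain `blacklist` line, which only stops alias-based autoloading, the `install` override also applies when something requests the module by name.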
Bug#689268: Intel HD 4000 (Ivy Bridge) graphics freeze
New freeze after a few hours (vanilla kernel, 64 MB GPU Memory).

Next thing to try is to blacklist the i915 module, but it doesn't seem to work. This is what I did:

# echo "blacklist i915" > /etc/modprobe.d/i915-blacklist.conf
# depmod -ae -F /boot/System.map-3.2.0-3-amd64
# update-initramfs -u -k all

Module still loads.

Tried adding modprobe.blacklist=i915 to the kernel command line, but still no success.

What am I doing wrong here?

/Per
Bug#689268: Intel HD 4000 (Ivy Bridge) graphics freeze
On 2012-10-20 00:39, Per Foreby wrote:
> I noticed something strange with allocation of GPU RAM. In the old BIOS, the default was 64 MB, but in the new BIOS, "Auto" was default. So I set it explicitly to 64 MB. However, this is what the OS reports:
>
> # lspci -vv
> ...
> 00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller (rev 09) (prog-if 00 [VGA controller])
> ...
> Region 0: Memory at f780 (64-bit, non-prefetchable) [size=4M]
> Region 2: Memory at e000 (64-bit, prefetchable) [size=256M]
> ...
>
> # dmesg | grep aperture
> [0.00] Checking aperture...
> [1.644376] agpgart-intel :00:00.0: AGP aperture is 256M @ 0xe000
>
> # grep -i mem /var/log/Xorg.0.log
> [18.557] (--) PCI:*(0:0:2:0) 8086:0162:1043:84ca rev 9, Mem @ 0xf780/4194304, 0xe000/268435456, I/O @ 0xf000/64
>
> (268435456 is 256*1024*1024.)

I changed the "iGPU memory" setting to 64 MB once more, and rebooted with the 3.5.5 kernel to see how much memory was reported. No change from 3.2.0 - everything still says 256 MB.

Could this be the problem? As Ingo reported, his problem disappeared with 256 MB GPU memory, and for me it took much longer for the next crash to occur, so maybe it is not entirely correct when using 256 MB, but enough to make the freeze very rare?

And for my success (for 11 days) with the 3.5.5 kernel, maybe memory allocation works differently so it never (or more seldom) tries to access GPU memory that doesn't exist. Haven't tried 3.5.5 with 64 MB though.

Back on 3.2.0 with 64 MB now to test this theory.

/Per
Bug#689268: Intel HD 4000 (Ivy Bridge) graphics freeze
On 2012-10-18 12:37, Ingo wrote:
> Per, I am still watching this issue for interest (my case I do consider as "wrong" BIOS setting which solved it for me).

Hmm, BIOS you said. I just checked my BIOS version and found that the MB was delivered with the initial BIOS version from February. Since then ASUS has released five versions, all with "Improve system stability" among the few lines in the changelog.

Now I'm on the latest BIOS (from August) and back on the 3.2.0 kernel.

> To my knowledge mtrr's are still used (not by i915 as Ben Hutchings stated) and probably here certain manufacturers of boards/BIOS probably set up different configurations.
>
> Probably it is worth trying this kernel parameter with Wheezy's standard kernel: "enable_mtrr_cleanup" to allow the kernel to re-arrange them and see if it has any influence in your case?

I will try that if I get another freeze, and if that also fails, try the vesa driver, and then work my way up in the kernel versions as suggested by Jonathan.

I noticed something strange with allocation of GPU RAM. In the old BIOS, the default was 64 MB, but in the new BIOS, "Auto" was default. So I set it explicitly to 64 MB. However, this is what the OS reports:

# lspci -vv
...
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller (rev 09) (prog-if 00 [VGA controller])
...
Region 0: Memory at f780 (64-bit, non-prefetchable) [size=4M]
Region 2: Memory at e000 (64-bit, prefetchable) [size=256M]
...

# dmesg | grep aperture
[0.00] Checking aperture...
[1.644376] agpgart-intel :00:00.0: AGP aperture is 256M @ 0xe000

# grep -i mem /var/log/Xorg.0.log
[18.557] (--) PCI:*(0:0:2:0) 8086:0162:1043:84ca rev 9, Mem @ 0xf780/4194304, 0xe000/268435456, I/O @ 0xf000/64

(268435456 is 256*1024*1024.)

I also found this Ubuntu bug report: "lspci reports wrong video memory size" (https://bugs.launchpad.net/ubuntu/+source/pciutils/+bug/607991).

Could the same thing that fools lspci also make the kernel and the i915 driver think I have more GPU memory than I actually have? On the other hand, the last freeze I had was with 256 MB configured in BIOS.

*UPDATE*

BIOS update did no good. Just had a freeze while typing this email. It only took minutes from reboot, and the trigger this time was double-clicking the URI in firefox in an attempt to copy the launchpad link above.

Now I'm back up with 256 MB GPU RAM, still on the vanilla wheezy kernel. The output from lspci/dmesg/Xorg.0.log looks exactly the same as with 64 MB. Strange.

/Per
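The byte counts in the Xorg.0.log line can be checked against the sizes lspci prints with plain shell arithmetic. A minimal sketch; the helper name `bytes_to_mib` is made up for illustration:

```shell
#!/bin/sh
# Sketch only: bytes_to_mib is a hypothetical helper name.
bytes_to_mib() {
    # 1 MiB = 1024 * 1024 bytes; shell arithmetic is integer-only.
    echo $(( $1 / 1024 / 1024 ))
}

bytes_to_mib 268435456   # region 2 size from Xorg.0.log
bytes_to_mib 4194304     # region 0 size from Xorg.0.log
```

This prints 256 and 4, matching the [size=256M] and [size=4M] regions in the lspci output, so the three tools agree with each other regardless of the BIOS setting.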
Bug#689268: [wheezy] Intel HD 4000 (Ivy Bridge) graphics freeze
On 2012-10-18 00:02, Jonathan Nieder wrote:
> Per Foreby wrote:
>> Correct, apart from the timing. It's been from a few hours to five days between the freezes. But with very little interactive use during the five days.
>
> Right --- how much interactive use does it take?

Very little. I've had freezes so far, and all of them with almost no activity on the screen. Clicking a link, typing an email or closing an rxvt window.

> That should be cosmetic with your card, though there's always the possibility of something subtle happening.

As a rule of thumb, every bugfix in a complex enough system introduces another bug. So I suppose that sometimes a bugfix might accidentally fix another bug :)

>> With up to five days (so far) for a freeze to occur, it might take long to narrow down the change, and the computer in question is semi production (working from home), so the freezes are very annoying. But I'll give it a try in the name of the good cause.
>>
>> Maybe I should start with running 3.5.5 for a few weeks, just to make sure that the freezes really are gone?
>
> Selfishly, I would suggest first trying to reproduce it on a known-bad kernel and then trying 3.3 or disabling the intel driver.

OK, I'll go back to 3.2.23 and work my way up. It might take some time though if the freeze doesn't behave...

> If you blacklist the i915 kernel module and use the vesa X driver, does that avoid trouble?

Does the vesa driver support 1920x1200 these days?

/Per
Bug#689268: [wheezy] Intel HD 4000 (Ivy Bridge) graphics freeze
On 2012-10-17 05:21, Jonathan Nieder wrote:
> Per Foreby wrote:
>> However my computer has been running without any problems for 11 days, so whatever caused this bug seems to be fixed in the 3.5.5 kernel.
>
> Drat. Ok. To recap:
>
> * Asus P8Z77-V LE.
> * Newish system. Works fine under load (e.g., Folding@Home) but when you started normal interactive use it started to freeze a few times a day while you were interacting with it (in particular, the freeze happens around the same time as a keyboard or mouse action).
> * The freeze is a bad one --- the fan spins down, the NIC stops responding, caps lock doesn't light up, ctrl+alt+del and magic sysrq have no effect. No messages about it in netconsole.
> * Happens reliably (how reliably? >80% of the time?) after a few hours of sustained use (?)
> * Logs available in the bug log. No obvious smoking guns. ;-)
> * Changing the amount of memory allocated to the integrated GPU in BIOS doesn't change anything.
> * The above describes 3.2.23-1. Based on a week and a half of running 3.5.5-1~experimental.1 it doesn't seem to be affected.

Correct, apart from the timing. It's been from a few hours to five days between the freezes. But with very little interactive use during the five days.

I can add that stressing the graphics doesn't seem to trigger the problem.

> None of the changes from 3.2 to 3.5.5 are jumping out as likely candidates for the fix, but that's a pretty wide range.

What about the MTRR patch that Ben Hutchings mentioned above (commit 9e984bc1dffd405138ff22356188b6a1677c64c8)? Or maybe that's just a cosmetic change? According to https://bugs.freedesktop.org/show_bug.cgi?id=41648, this patch was added in 3.5-rc1.

> How reliably can you reproduce the hang on a known-bad kernel? If you have time to try 3.2.30-1 from sid, 3.4.4-1~experimental.1 from http://snapshot.debian.org/package/linux/ and 3.3.6-1~experimental.1 from http://snapshot.debian.org/package/linux-2.6/ then that could help narrow down the search.

With up to five days (so far) for a freeze to occur, it might take long to narrow down the change, and the computer in question is semi production (working from home), so the freezes are very annoying. But I'll give it a try in the name of the good cause.

Maybe I should start with running 3.5.5 for a few weeks, just to make sure that the freezes really are gone?

Jonathan Nieder wrote:
>> * Asus P8Z77-V LE.
>
> This makes as good a keyword for a web search as any. It found [1] which is not too encouraging. Maybe memtest86+ could be worth a try to rule some problems out.
>
> [1] http://thread.gmane.org/gmane.linux.debian.user.french/176707/focus=176710

Memory and cpu cooling were of course the first suspects. But I'm using a large Arctic heatpipe cooler and have never seen higher temperatures than +57.0°C on any core according to lm_sensors. And memtest86+ has been happy. I ran an extra pass today (about 2.5 hours) just to make sure, and everything was blue and white.

The French discussion talks about RAM timing and voltage, but I'm not an overclocker (vanilla i7 3770, not 3770K) so that hardly applies.

And please note that with FAH running 24/7, the computer is always under heavy load, but has never frozen while running unattended. Even when running interactively, the freezes have never been random, but have happened *exactly* when I was clicking or typing something.

Btw, I found http://www.linuxbsdos.com/2012/10/06/ubuntu-12-04-lts-and-12-10-beta-2-on-intel-ivy-bridge-powered-computer/. One of the comments indicates that 3.4 fixes some sort of freeze problem on Ivy Bridge/HD4000. And the partiallysanedeveloper page that I linked to in the initial bug report says that the freezes are gone in 3.3.

Here are some other discussions of freezes on similar hardware on Debian derivatives with the same kernel generation:

http://forums.linuxmint.com/viewtopic.php?f=90&t=114382
http://phoronix.com/forums/showthread.php?71895-Intel-Ivy-Bridge-On-Linux-Two-Month-Redux
http://forums.linuxmint.com/viewtopic.php?f=198&t=113070
http://ubuntuforums.org/showthread.php?t=1995945

/Per
Bug#689268: [wheezy] Intel HD 4000 (Ivy Bridge) graphics freeze
On 2012-10-06 04:29, Per Foreby wrote:
> New freeze. Last entry in the debug log was more than 10 minutes before the freeze.
>
> Now running 3.5-trunk-amd64 #1 SMP Debian 3.5.5-1~experimental.1 (still with 256 MB iGPU Memory).

I was going to give it two weeks before reporting, but today we had a power outage so I didn't quite reach two weeks.

However my computer has been running without any problems for 11 days, so whatever caused this bug seems to be fixed in the 3.5.5 kernel.

/Per
Bug#689268: Intel HD 4000 (Ivy Bridge) graphics freeze
Ingo wrote:
> With me all is still fine since 1 week, however that does not mean its fixed. I am right now trying to stress my machine with high memory loads and graphics to verify the workaround.

I have also tried the stress tactics, but it doesn't seem to have anything to do with load.

> a) it happens when browsing with iceweasel - Javier can you also confirm this?

Not iceweasel in my case. I like my browsers bleeding edge, so I use the 64-bit version of vanilla firefox. And my freezes have happened on various mouse or keyboard input, not only clicking a link in the web browser (see my original bug report).

/Per
Bug#689268: Intel HD 4000 (Ivy Bridge) graphics freeze
On 2012-10-05 23:53, Jonathan Nieder wrote:
> Per Foreby wrote:
>> So far I'm running with the default wheezy kernel but with the iGPU memory set to 256 MB. My plan was to run with this setting, and if I had another crash, try the experimental kernel.
>
> That seems like a good plan.

New freeze. Last entry in the debug log was more than 10 minutes before the freeze.

Now running 3.5-trunk-amd64 #1 SMP Debian 3.5.5-1~experimental.1 (still with 256 MB iGPU Memory).

/Per
Bug#689268: Intel HD 4000 (Ivy Bridge) graphics freeze
On 2012-10-05 18:50, Jonathan Nieder wrote:
> Javier Cantero wrote:
>> If it helps, I am using now linux-image-3.5-trunk-amd64 (3.5.2-1~experimental.1) kernel with no freezes since the change.
>
> That's good to hear. Per, Ingo, does that work around trouble on your machines, too?

I had two freezes last Saturday, and two the day after. The next freeze was yesterday (five days later). So testing different options isn't that easy.

So far I'm running with the default wheezy kernel but with the iGPU memory set to 256 MB. My plan was to run with this setting, and if I had another crash, try the experimental kernel.

But let me know if you'd rather have me reset the video memory to the default 64 MB and try the 3.5 kernel.

Btw, my mobo is an Asus P8Z77-V LE.

/Per
Bug#689268: linux-image-3.2.0-3-amd64: Intel HD 4000 (Ivy Bridge) graphics freeze
On 2012-10-04 13:30, Ingo wrote:

I just had a freeze, the first one since Sunday. Actually while browsing your comment :)

Jonathan: netconsole didn't log anything interesting.

> These freezes happened most of the time when hitting a link in Iceweasel. They are so severe that even the MoBo reset button does not respond immediately,

Same here. I press reset and the computer reboots 20 seconds later.

> I had to hit it several times. Additionally when PC stalls, monitor continues to display the frozen desktop (via displayport)

I also use displayport. It seems like the last screen is cached in the monitor (HP ZR24w) because if I turn the monitor off and on again, I just get a black screen.

> power consumption rises from idle 38 watts to constant 86 watts - very dangerous if also fan regulation fails.

Don't know about power, but if this is true, the reset delay may very well be the temperature rising until the temperature protection resets the computer.

> Upon next boot I get "orphaned inodes"

Me too, but this is of course expected on a cold reset.

> [drm] MTRR allocation failed. Graphics performance may suffer.

I've got this one too.

> Checking mtrr's showed:
>
> cat /proc/mtrr
> reg00: base=0x0 (0MB), size= 8192MB, count=1: write-back
> reg01: base=0x2 ( 8192MB), size= 512MB, count=1: write-back
> reg02: base=0x0e000 ( 3584MB), size= 512MB, count=1: uncachable
> reg03: base=0x0dc00 ( 3520MB), size= 64MB, count=1: uncachable
> reg04: base=0x0db80 ( 3512MB), size=8MB, count=1: uncachable
> reg05: base=0x21f80 ( 8696MB), size=8MB, count=1: uncachable
> reg06: base=0x21f60 ( 8694MB), size=2MB, count=1: uncachable

In my case (with 32 GB):

reg00: base=0x0 (0MB), size=32768MB, count=1: write-back
reg01: base=0x8 (32768MB), size= 512MB, count=1: write-back
reg02: base=0x0e000 ( 3584MB), size= 512MB, count=1: uncachable
reg03: base=0x0d000 ( 3328MB), size= 256MB, count=1: uncachable
reg04: base=0x0cf00 ( 3312MB), size= 16MB, count=1: uncachable
reg05: base=0x81fe0 (33278MB), size=2MB, count=1: uncachable

> Default setting in the BIOS of the DH77EB for video agp-aperture is "max" (values of 64, 128, 256 and 512MB are offered as options). I played around with different BIOS settings and observed that these settings are not respected by the i915 module. Dmesg always reports 256MB for the aperture:
>
> So the problem seems to be that i915 ignores (or cannot read) the BIOS setting.

My BIOS setting is called "iGPU-Memory". It was by default at 64 MB.

> dmesg | grep agp
> Linux agpgart interface v0.103
> agpgart-intel :00:00.0: Intel Ivybridge Chipset
> agpgart-intel :00:00.0: detected gtt size: 2097152K total, 262144K mappable
> agpgart-intel :00:00.0: detected 65536K stolen memory
> agpgart-intel :00:00.0: AGP aperture is 256M @ 0xe000

Identical to my logs.

> So I decided to set the BIOS AGP-aperture to 256MB as well and removed the kernel parameter 'enable_mtrr_cleanup'.

Changed my setting to 256 MB as well.

> still suffering graphics performance and "mtrr mismatch" according to dmesg:
>
> mtrr: type mismatch for e000,1000 old: write-back new: write-combining
> [drm] MTRR allocation failed. Graphics performance may suffer.

Me too.

> But since then I have never observed any crash/freeze for days now.

Keeping my fingers crossed for the same outcome.

/Per
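Comparing two /proc/mtrr listings by eye is error-prone, so a condensed view of register, size and type can help. The following is only a sketch: the function name `mtrr_summary` is made up, and the pattern assumes the line layout shown in this message (sizes given in MB or KB, type as the last field).

```shell
#!/bin/sh
# Sketch only: mtrr_summary is a hypothetical helper name.
# Reads /proc/mtrr-style text on stdin and prints one line per register:
#   <reg> <size> <unit> <type>
# Lines that do not match the assumed layout are skipped.
mtrr_summary() {
    sed -n 's/^\(reg[0-9]*\):.*size= *\([0-9]*\)\([KM]B\),.*: *\([a-z-]*\)$/\1 \2 \3 \4/p'
}

# On a live system:
#   mtrr_summary < /proc/mtrr
# To list only the uncachable ranges:
#   mtrr_summary < /proc/mtrr | grep ' uncachable$'
```

With the output reduced to four columns, the two machines in this message can be compared with an ordinary diff.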
Bug#689268: Intel HD 4000 (Ivy Bridge) graphics freeze
On 2012-10-02 00:45, Bjørn Mork wrote:
> Per Foreby writes:
>> On 2012-10-01 23:29, Jonathan Nieder wrote:
>>> Per Foreby wrote:
>>>> The debug logging from drm isn't forwarded via netconsole, so I suppose it isn't supposed to?
>>>
>>> Oh, that's because of the console_loglevel setting[1]. You can change it by running "dmesg -n 8" (or by adding the word "debug" or a loglevel= parameter to the kernel command line).
>>
>> Ah, RTFM :)
>>
>> However, it looks like the netconsole documentation and the dmesg man page should be updated:
>>
>> # dmesg -n 8
>> dmesg: unknown level '8'
>>
>> Instead I tried
>>
>> # dmesg -n debug
>> # dmesg -E
>>
>> but still nothing at the remote end.
>
> Yes, I vaguely remember having struggled with the same issue the last time I used netconsole. Try
>
> echo 8 >/proc/sys/kernel/printk
>
> instead. I believe the bug is in the dmesg utility. It should shift all values by one. Setting "dmesg -n debug" will currently log all messages with a level *higher* than debug.

You're probably right about the bug. I don't know what the four values in /proc/sys/kernel/printk are, but the first value was 7, not 8:

# cat /proc/sys/kernel/printk
7 4 1 7
# echo 8 > /proc/sys/kernel/printk
# cat /proc/sys/kernel/printk
8 4 1 7

However, this did not affect the remote logging, so I'm back to the remote syslog approach.

/Per
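For reference, the four values in /proc/sys/kernel/printk are, in order, console_loglevel, default_message_loglevel, minimum_console_loglevel and default_console_loglevel, as described in the kernel's sysctl documentation. Only messages with a level numerically lower than console_loglevel are printed to the console, and debug messages have level 7, which is why the first value needs to be 8. A minimal sketch that labels the fields; the function name `describe_printk` is made up for illustration:

```shell
#!/bin/sh
# Sketch only: describe_printk is a hypothetical helper name.
# Usage: describe_printk "$(cat /proc/sys/kernel/printk)"
describe_printk() {
    # Split the four whitespace-separated fields into $1..$4.
    set -- $1
    echo "console_loglevel=$1"
    echo "default_message_loglevel=$2"
    echo "minimum_console_loglevel=$3"
    echo "default_console_loglevel=$4"
    # Only messages with a level numerically below console_loglevel
    # reach the console; KERN_DEBUG is level 7.
    if [ "$1" -gt 7 ]; then
        echo "debug messages reach the console"
    else
        echo "debug messages are filtered from the console"
    fi
}
```

This only interprets the setting; as the message above notes, raising the value did not by itself make the drm debug output appear on the remote end.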
Bug#689268: Intel HD 4000 (Ivy Bridge) graphics freeze
On 2012-10-01 23:29, Jonathan Nieder wrote:
> Per Foreby wrote:
>> The debug logging from drm isn't forwarded via netconsole, so I suppose it isn't supposed to?
>
> Oh, that's because of the console_loglevel setting[1]. You can change it by running "dmesg -n 8" (or by adding the word "debug" or a loglevel= parameter to the kernel command line).

Ah, RTFM :)

However, it looks like the netconsole documentation and the dmesg man page should be updated:

# dmesg -n 8
dmesg: unknown level '8'

Instead I tried

# dmesg -n debug
# dmesg -E

but still nothing at the remote end.

Instead I added remote logging of kern.* in rsyslog.conf, so now I have everything at the server.

/Per
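The rsyslog rule itself is not shown in the message. A rule of the following shape is one way to forward kernel messages; this is a sketch under assumptions, with the helper name `rsyslog_kern_forward` and the host `logserver.example` made up for illustration. In rsyslog's selector syntax a single `@` forwards over UDP and `@@` over TCP.

```shell
#!/bin/sh
# Sketch only: rsyslog_kern_forward and logserver.example are hypothetical.
# Usage: rsyslog_kern_forward <host> [port]
rsyslog_kern_forward() {
    # "kern.*" selects every priority of the kernel facility.
    # A single "@" forwards over UDP; port 514 is the usual syslog port.
    printf 'kern.*\t@%s:%s\n' "$1" "${2:-514}"
}

# Example (as root), then restart rsyslog:
#   rsyslog_kern_forward logserver.example >> /etc/rsyslog.conf
```

Unlike netconsole, this path depends on userspace still running, so the very last messages before a hard freeze may never leave the machine.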
Bug#689268: Intel HD 4000 (Ivy Bridge) graphics freeze
On 2012-10-01 19:52, Jonathan Nieder wrote:
>> So far the only log messages on the remote server are from netconsole itself. Which leads me to this question: Are the default kernel debugging options OK, or do I need to enable more debugging?
>
> Does that mean it didn't capture the boot messages?

netconsole is compiled as a module, so I never rebooted, just loaded it with modprobe.

> Either way, if you can handle the log spew then drm.debug=0xe would be great. But a log without that would already be interesting since it would catch the basic setup at boot time and symptoms such as assertion failures (kernel BUG or WARNING) near the time of the freeze.

OK, now freshly rebooted with this config:

GRUB_CMDLINE_LINUX="netconsole=@/,514@192.168.201.1/ drm.debug=0xe"

The debug logging from drm isn't forwarded via netconsole, so I suppose it isn't supposed to? (Even though I've been working with unix/linux for the last 25 years, kernel debugging isn't my everyday trade.)

But I have removed the "minus" from /var/log/kern.log in the syslog config, so hopefully everything should stick to the local log file. Or at least everything but the last crucial line, which netconsole hopefully will catch.

So far it doesn't spew that much, it typically looks like this over and over again:

[drm:i915_driver_open],
[drm:i915_getparam], Unknown parameter 16
[drm:i915_getparam], Unknown parameter 17
[drm:i915_getparam], Unknown parameter 17
[drm:intel_crtc_cursor_set],
[drm:intel_crtc_cursor_set], cursor off
[drm:intel_crtc_cursor_set],
[drm:intel_crtc_cursor_set],
[drm:intel_crtc_cursor_set], cursor off

Now we'll just have to wait and see. This is my workstation at home, and during the weekdays I don't use it as much as on weekends, so it might take longer to trigger the bug the next time.

/Per
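The netconsole value in the GRUB line above follows the layout given in the kernel's netconsole documentation, roughly [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr], where every field except the target IP may be left empty. A minimal sketch that builds such a string; the function name `netconsole_param` is made up for illustration:

```shell
#!/bin/sh
# Sketch only: netconsole_param is a hypothetical helper name.
# Usage: netconsole_param <target-ip> [target-port]
netconsole_param() {
    # Source port, source IP, device and target MAC are left empty so the
    # kernel falls back to its defaults, as in the GRUB line in the thread.
    # An empty target port likewise leaves the kernel default in place.
    echo "netconsole=@/,${2:-}@$1/"
}

netconsole_param 192.168.201.1 514
```

The call prints netconsole=@/,514@192.168.201.1/, the same string used in GRUB_CMDLINE_LINUX above; the same value can also be passed as a module parameter when loading netconsole with modprobe.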
Bug#689268: Intel HD 4000 (Ivy Bridge) graphics freeze
Hi,

serial ports are rare these days, but I have netconsole running now (logging to syslogd on my server).

So far the only log messages on the remote server are from netconsole itself. Which leads me to this question: Are the default kernel debugging options OK, or do I need to enable more debugging?

I'll wait for the freeze to happen once more, and if it does, I will try the latest kernel from experimental.

/Per
Bug#689268: linux-image-3.2.0-3-amd64: Intel HD 4000 (Ivy Bridge) graphics freeze
Package: src:linux
Version: 3.2.23-1
Severity: important

I'm using the built-in graphics (HD4000) on an i7 3770 Ivy Bridge processor with Z77 chipset. The computer has been running just fine under heavy load (Folding at Home) for some weeks, but a few days ago I started using it as a workstation, and since then it totally freezes a few times a day.

This always happens on interactive input. So far these four events:
- close a window
- click a link in firefox
- ctrl-r to reload a page in firefox
- ctrl-k to delete a line in thunderbird's composer

The computer is completely frozen. The CPU probably stops working, since the fan spins down to its lowest rpm. I can't ping the interface, caps lock doesn't light up, and the machine has to be power cycled. And no clues in the logs.

After reboot, redoing the same actions doesn't trigger the bug. But about 3-4 hours later the computer hangs again. Maybe some data structure that fills up with time?

Googling for solutions, I found these pages:
http://partiallysanedeveloper.blogspot.se/2012/05/ivy-bridge-hd4000-linux-freeze.html
http://askubuntu.com/questions/155458/ubuntu-12-04-randomly-freezes-on-ivy-bridge-intel-hd-graphics-4000
http://askubuntu.com/questions/163890/weird-system-freeze-nothing-works-keyboard-mouse-reset-button-ubuntu-12-04-64

I haven't tried with another kernel yet.
-- Package-specific info:
** Version:
Linux version 3.2.0-3-amd64 (Debian 3.2.23-1) (debian-kernel@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-8) ) #1 SMP Mon Jul 23 02:45:17 UTC 2012

** Command line:
BOOT_IMAGE=/boot/vmlinuz-3.2.0-3-amd64 root=/dev/md0 ro

** Not tainted

** Kernel log:
[5.621585] EXT4-fs (md0): ext4_orphan_cleanup: deleting unreferenced inode 5636493
[5.621624] EXT4-fs (md0): ext4_orphan_cleanup: deleting unreferenced inode 5636605
[5.621636] EXT4-fs (md0): ext4_orphan_cleanup: deleting unreferenced inode 5636595
[5.621651] EXT4-fs (md0): ext4_orphan_cleanup: deleting unreferenced inode 5636593
[5.621659] EXT4-fs (md0): ext4_orphan_cleanup: deleting unreferenced inode 5636592
[5.621667] EXT4-fs (md0): ext4_orphan_cleanup: deleting unreferenced inode 5636584
[5.621673] EXT4-fs (md0): ext4_orphan_cleanup: deleting unreferenced inode 5636520
[5.621680] EXT4-fs (md0): ext4_orphan_cleanup: deleting unreferenced inode 5636514
[5.621687] EXT4-fs (md0): ext4_orphan_cleanup: deleting unreferenced inode 5636501
[5.621694] EXT4-fs (md0): ext4_orphan_cleanup: deleting unreferenced inode 5636496
[5.621700] EXT4-fs (md0): ext4_orphan_cleanup: deleting unreferenced inode 5636491
[5.621706] EXT4-fs (md0): ext4_orphan_cleanup: deleting unreferenced inode 5636191
[5.621718] EXT4-fs (md0): ext4_orphan_cleanup: deleting unreferenced inode 5636508
[5.621743] EXT4-fs (md0): ext4_orphan_cleanup: deleting unreferenced inode 4064137
[5.621757] EXT4-fs (md0): 15 orphan inodes deleted
[5.621808] EXT4-fs (md0): recovery complete
[5.925942] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)
[7.181722] udevd[467]: starting version 175
[7.619643] ACPI: Requesting acpi_cpufreq
[7.619661] input: Power Button as /devices/LNXSYSTM:00/device:00/PNP0C0C:00/input/input4
[7.619667] ACPI: Power Button [PWRB]
[7.620061] Monitor-Mwait will be used to enter C-1 state
[7.620080] Monitor-Mwait will be used to enter C-2 state
[7.620099] Monitor-Mwait will be used to enter C-3 state
[7.620111] ACPI: acpi_idle registered with cpuidle
[7.622036] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input5
[7.622104] ACPI: Power Button [PWRF]
[7.687301] i801_smbus :00:1f.3: PCI INT C -> GSI 18 (level, low) -> IRQ 18
[7.687369] ACPI: resource :00:1f.3 [io 0xf040-0xf05f] conflicts with ACPI region SMBI [io 0xf040-0xf04f]
[7.687436] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[7.700685] input: PC Speaker as /devices/platform/pcspkr/input/input6
[7.720941] wmi: Mapper loaded
[7.722731] iTCO_vendor_support: vendor-support=0
[7.740352] alg: No test for __gcm-aes-aesni (__driver-gcm-aes-aesni)
[7.759663] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.07
[7.759776] iTCO_wdt: Found a Panther Point TCO device (Version=2, TCOBASE=0x0460)
[7.759880] iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)
[7.869045] [drm] Initialized drm 1.1.0 20060810
[7.890031] i915 :00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[7.890087] i915 :00:02.0: setting latency timer to 64
[7.921302] mtrr: type mismatch for e000,1000 old: write-back new: write-combining
[7.921372] [drm] MTRR allocation failed. Graphics performance may suffer.
[7.921596] i915 :00:02.0: irq 55 for MSI/MSI-X
[7.921599] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
[7.921652] [drm] Driver supports precise vblank timestamp
Bug#516374: Just one more question
One simple(?) question before you close the bug:
- Which kernel should be upgraded to avoid this bug? dom0, domU or both?

/Per

Archive: http://lists.debian.org/4ba10466.1060...@ddg.lth.se
Bug#516374: Me too, and it just got worse
I'm also having this problem. Dom0 running 2.6.26-2-xen-amd64, domU on 2.6.26-2-686-bigmem. All domUs are using all available processor cores.

I've been having problems on and off since the machine was installed last summer. Typically it would be days or weeks between lockups. I've been keeping up with the latest stable kernel version, but so far the upgrades haven't made any difference.

The latest upgrade (2.6.26-21lenny4) unfortunately made things worse. Now one of my domUs locks up in less than an hour.

I get two different error messages in kern.log. Lots of "task xx blocked for more than 120 seconds" and fewer "BUG: soft lockup - CPU#n stuck...". Once a domU is stuck, there is no way to reboot it other than using xm/virsh destroy.

Here are examples of the two types of output in the log:

INFO: task nfsd:1700 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
nfsd D 7caf7bc7 0 1700 2
ec94ce60 0246 7caf7bc7 1c5b ec94cfec c2578020 c0130978 ebb87e64 a760 c041a1bc c0130a8b 0200 0086759a 0086759a ec528600 e4c3675c c02c9057 eb8f3e64 c041b220 0086759a
Call Trace:
[] lock_timer_base+0x19/0x35
[] __mod_timer+0x99/0xa3
[] schedule_timeout+0x6b/0x86
[] process_timeout+0x0/0x5
[] schedule_timeout+0x66/0x86
[] journal_stop+0x7e/0x151 [jbd]
[] __writeback_single_inode+0x15a/0x251
[] write_inode_now+0x63/0x9a
[] nfsd_setattr+0x3ae/0x3cb [nfsd]
[] nfsd3_proc_setattr+0x74/0x7d [nfsd]
[] nfsd_dispatch+0xca/0x192 [nfsd]
[] svc_process+0x3a1/0x620 [sunrpc]
[] nfsd+0x171/0x268 [nfsd]
[] nfsd+0x0/0x268 [nfsd]
[] kernel_thread_helper+0x7/0x10
===
BUG: soft lockup - CPU#3 stuck for 71s! [swapper:0]
Modules linked in: autofs4 nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc ipv6 nf_conntrack_ipv4 xt_state nf_conntrack xt_limit ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables loop evdev xen_netfront pcspkr ext3 jbd mbcache xen_blkfront thermal_sys
Pid: 0, comm: swapper Not tainted (2.6.26-2-686-bigmem #1)
EIP: 0061:[] EFLAGS: 0246 CPU: 3
EIP is at _stext+0x3a7/0x1000
EAX: EBX: 0001 ECX: EDX: 00867599
ESI: 0003 EDI: EBP: ESP: ed049fa0
DS: 007b ES: 007b FS: 00d8 GS: SS: 0069
CR0: 8005003b CR2: b620d034 CR3: 2c1a CR4: 0660
DR0: DR1: DR2: DR3:
DR6: 0ff0 DR7: 0400
[] xen_safe_halt+0xd/0x17
[] xen_idle+0x0/0x3a
[] xen_idle+0x2b/0x3a
[] cpu_idle+0xb0/0xd0
===

Archive: http://lists.debian.org/4b9fab28.4000...@ddg.lth.se
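The message above reports "lots" of hung-task messages and "fewer" soft lockups in kern.log. A minimal Python sketch for putting numbers on that is shown here; it assumes only the two message formats quoted in the report, and tally_lockups is a hypothetical helper name, not an existing tool.

```python
import re
from collections import Counter
from typing import Iterable

# Patterns modelled on the two sample messages quoted in the report.
HUNG_TASK = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s!"
)


def tally_lockups(lines: Iterable[str]) -> Counter:
    """Count hung-task and soft-lockup messages in an iterable of log lines."""
    counts: Counter = Counter()
    for line in lines:
        if HUNG_TASK.search(line):
            counts["hung_task"] += 1
        elif SOFT_LOCKUP.search(line):
            counts["soft_lockup"] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "INFO: task nfsd:1700 blocked for more than 120 seconds.",
        "BUG: soft lockup - CPU#3 stuck for 71s! [swapper:0]",
    ]
    print(tally_lockups(sample))
```

On a real system the lines would come from the log file itself, for example by passing an open file handle for /var/log/kern.log to tally_lockups.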
Bug#501742: linux-image-2.6.26-1-amd64: Random hangs/slowness and forcedeth problem
Package: linux-image-2.6.26-1-amd64
Version: 2.6.26-5
Severity: important

On versions before 2.6.26 I have been getting lots of messages like this:

eth0: too many iterations (6) in nv_nic_irq

Apart from filling up the log, there has been no noticeable impact on the system.

After upgrading to 2.6.26, the system started to misbehave. It would work for a few hours, and then it would slow down to the degree where a simple command could take several minutes to complete. Finally, it would become totally unresponsive, leaving the reset button as the only option.

Browsing through the bug reports, it looked like the hpet problem, so I tried booting with hpet=disable. With this kernel option the system worked for an hour and then the network stopped working with this message in the log:

eth0: too many iterations (6) in nv_nic_irq.
NETDEV WATCHDOG: eth0: transmit timed out
eth0: Got tx_timeout. irq: 0032
eth0: Ring at 7d084000
eth0: Dumping tx registers
eth0: Dumping tx ring
eth0: tx_timeout: dead entries
[ cut here ]
WARNING: at net/sched/sch_generic.c:222 dev_watchdog+0xa6/0xfb()
Modules linked in: xt_limit xt_state ipt_REJECT xt_tcpudp ipt_MASQUERADE iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack iptable_filter ip_tables x_tables video output ac battery nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc ipv6 it87 hwmon_vid loop parport_pc parport snd_hda_intel pcspkr k8temp usblp snd_pcm snd_timer snd soundcore snd_page_alloc i2c_nforce2 i2c_core button evdev ext3 jbd mbcache raid1 md_mod ide_cd_mod cdrom sd_mod ide_pci_generic jmicron usb_storage amd74xx ide_core floppy ahci ohci1394 ieee1394 forcedeth ata_generic sata_nv libata scsi_mod ehci_hcd dock ohci_hcd thermal processor fan thermal_sys
Pid: 0, comm: swapper Not tainted 2.6.26-1-amd64 #1
Call Trace:
[] warn_on_slowpath+0x51/0x7a
[] :forcedeth:reg_delay+0x40/0x8a
[] :forcedeth:nv_drain_tx+0xb4/0x186
[] :forcedeth:nv_tx_timeout+0x1fb/0x2a4
[] dev_watchdog+0x0/0xfb
[] dev_watchdog+0xa6/0xfb
[] dev_watchdog+0x0/0xfb
[] run_timer_softirq+0x16a/0x1e2
[] ktime_get+0xc/0x41
[] __do_softirq+0x5c/0xd1
[] call_softirq+0x1c/0x28
[] do_softirq+0x3c/0x81
[] irq_exit+0x3f/0x83
[] smp_apic_timer_interrupt+0x8c/0xa4
[] default_idle+0x0/0x49
[] apic_timer_interrupt+0x72/0x80
[] lapic_next_event+0x0/0x13
[] native_safe_halt+0x2/0x3
[] native_safe_halt+0x2/0x3
[] default_idle+0x2a/0x49
[] cpu_idle+0x89/0xb3
---[ end trace 314e3fb7eb127ca0 ]---

I don't know whether the behaviour with and without hpet=disable are symptoms of the same problem, or two different bugs.

The other network interface on this MB (Asus M2N-SLI Deluxe) also uses forcedeth, but doesn't report any problems.

This is a production server/firewall, and I wasn't able to take any more downtime, so when hpet=disable didn't work, I reverted to a previous kernel (2.6.24-7). Apart from the "normal" error messages ("too many iterations...") the system has been stable for three days now.

-- Package-specific info:

-- System Information:
Debian Release: lenny/sid
APT prefers testing
APT policy: (500, 'testing')
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.24-1-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages linux-image-2.6.26-1-amd64 depends on:
ii debconf [debconf-2.0] 1.5.22 Debian configuration management sy
ii initramfs-tools [linux-initra 0.92j tools for generating an initramfs
ii module-init-tools 3.4-1 tools for managing Linux kernel mo

linux-image-2.6.26-1-amd64 recommends no packages.
Versions of packages linux-image-2.6.26-1-amd64 suggests:
ii grub 0.97-47 GRand Unified Bootloader (Legacy v
pn linux-doc-2.6.26 (no description available)

-- debconf information:
linux-image-2.6.26-1-amd64/postinst/create-kimage-link-2.6.26-1-amd64: true
shared/kernel-image/really-run-bootloader: true
linux-image-2.6.26-1-amd64/postinst/kimage-is-a-directory:
linux-image-2.6.26-1-amd64/preinst/bootloader-initrd-2.6.26-1-amd64: true
linux-image-2.6.26-1-amd64/postinst/old-initrd-link-2.6.26-1-amd64: true
linux-image-2.6.26-1-amd64/preinst/initrd-2.6.26-1-amd64:
linux-image-2.6.26-1-amd64/postinst/old-system-map-link-2.6.26-1-amd64: true
linux-image-2.6.26-1-amd64/postinst/depmod-error-initrd-2.6.26-1-amd64: false
linux-image-2.6.26-1-amd64/preinst/overwriting-modules-2.6.26-1-amd64: true
linux-image-2.6.26-1-amd64/preinst/elilo-initrd-2.6.26-1-amd64: true
linux-image-2.6.26-1-amd64/postinst/bootloader-error-2.6.26-1-amd64:
linux-image-2.6.26-1-amd64/preinst/abort-install-2.6.26-1-amd64:
linux-image-2.6.26-1-amd64/preinst/lilo-initrd-2.6.26-1-amd64: true
linux-image-2.6.26-1-amd64/postinst/depmod-error-2.6.26-1-amd64: false
linux-image-2.6.26-1-amd64/prerm/re