Bug#875621: ditto
Hi,

Without this kernel module shipped, users of X1 gen6 are forced to compile it themselves, which is significantly more taxing than just having to unblacklist the module.

Can't it be shipped, yet added to the default blacklist, until the Yoga X11e issue is resolved?

(It goes without saying that it's already most annoying to have to fiddle with anything at all to get basic mouse support on this machine, it's like it's 1998 all over again and I'm too old for this...)

TIA.

--
2. That which causes joy or happiness.
Bug#719958: traffic control simple token bucket filter within prio broken in wheezy
On Sat, Aug 17, 2013 at 06:30:48PM +0200, Josip Rodin wrote:
> LOCATION OFFSET COUNT
> net_tx_action 0 1
>
> qdisc tbf 20: parent 1:2 rate 2Kbit burst 20Kb lat 4295.0s
>  Sent 1235809 bytes 6051 pkt (dropped 182, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0

JFTR I worked around this problem by giving up on sch_tbf - I replaced it with an equivalent simple sch_htb setup (htb qdisc, htb class, sfq qdisc).

--
2. That which causes joy or happiness.

--
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20130821182927.ga18...@entuzijast.net
Bug#719958: traffic control simple token bucket filter within prio broken in wheezy
Package: linux-image-3.2.0-4-amd64
Version: 3.2.46-1

Hi,

I have a gateway machine, with $iface_Internet == xenbr2 and $iface_intranet == xenbr0, running these traffic control rules on the outside interface, which are supposed to be a trivial ToS match and a limit of 20 Mbps:

  tc qdisc del dev $iface_Internet root || true
  tc qdisc add dev $iface_Internet root handle 1: prio
  tc qdisc add dev $iface_Internet parent 1:1 handle 10: sfq
  tc qdisc add dev $iface_Internet parent 1:2 handle 20: tbf rate 20mbit buffer 20480 limit 16384
  tc qdisc add dev $iface_Internet parent 1:3 handle 30: sfq

This has worked just fine for about seven years on a machine running squeeze, and a fair few distro+kernel versions before that. I changed the rate from 10 to 20 on 2012-10-12, and everything kept working fine.

However, the upgrade to this new kernel appears to have killed it - the tbf rule is causing outgoing HTTP connections to max out at around 8 Kbps. When I remove tbf, everything is fine.

I think there's a software problem there - even if these rules were somehow broken to begin with, this is a poor way of telling me that. Please fix it. TIA.

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20130817080030.ga31...@entuzijast.net
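For readers who want to sanity-check the tbf parameters in the report, here is a minimal Python sketch of the token-bucket arithmetic. It only restates the numbers from the tc command (rate 20mbit, buffer 20480, limit 16384). The 65226-byte packet is a hypothetical GRO-aggregated frame chosen for illustration, and the idea that oversized aggregated packets are what trips tbf is one possible reading of the GRO workaround discussed in this bug, not something the thread confirms.

```python
# Token-bucket arithmetic for the tbf line in the report (a sketch, not kernel code).
RATE_BITS_PER_S = 20_000_000  # "rate 20mbit": tc's mbit means 10^6 bits per second
BURST_BYTES = 20480           # "buffer 20480": bucket size in bytes
LIMIT_BYTES = 16384           # "limit 16384": bytes that may wait for tokens


def burst_drain_ms(burst_bytes: int, rate_bits_per_s: int) -> float:
    """Milliseconds needed to send one full bucket at the configured rate."""
    return burst_bytes * 8 / rate_bits_per_s * 1000


def fits_in_bucket(packet_bytes: int, burst_bytes: int = BURST_BYTES) -> bool:
    """A packet can only ever conform if the bucket is able to hold that many tokens."""
    return packet_bytes <= burst_bytes


if __name__ == "__main__":
    print("bucket drains in %.3f ms" % burst_drain_ms(BURST_BYTES, RATE_BITS_PER_S))
    print("1514-byte frame fits:", fits_in_bucket(1514))
    # Hypothetical size of a GRO-aggregated packet, for illustration only.
    print("65226-byte aggregate fits:", fits_in_bucket(65226))
```

With these values a normal Ethernet frame fits the bucket many times over, while an aggregate larger than 20480 bytes never can, which would be consistent with the workaround helping on one machine.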
Bug#719958: traffic control simple token bucket filter within prio broken in wheezy
On Sat, Aug 17, 2013 at 12:33:02PM +0200, Josip Rodin wrote:
> On Sat, Aug 17, 2013 at 12:23:21PM +0200, Ben Hutchings wrote:
> > > tc qdisc add dev $iface_Internet parent 1:2 handle 20: tbf rate 20mbit buffer 20480 limit 16384
> > >
> > > However, the upgrade to this new kernel appears to have killed it - the tbf rule is causing outgoing HTTP connections to max out at around 8 Kbps.
> > [...]
> >
> > This might be the same as bug #708995. Does turning off GRO on the internal interface (not the bridge but the physical interface) work around it?
>
> Yes, it looks like
>
>   ifenslave -c bond0 eth0
>   ethtool -K eth0 gro off
>
> makes TBF precise again, and vice versa.

That's on one machine. But on another wheezy machine with the same setup but somewhat different hardware, turning off GRO didn't help.

How do I debug this further?

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20130817105659.ga32...@entuzijast.net
Bug#719958: traffic control simple token bucket filter within prio broken in wheezy
On Sat, Aug 17, 2013 at 12:23:21PM +0200, Ben Hutchings wrote:
> > tc qdisc add dev $iface_Internet parent 1:2 handle 20: tbf rate 20mbit buffer 20480 limit 16384
> >
> > However, the upgrade to this new kernel appears to have killed it - the tbf rule is causing outgoing HTTP connections to max out at around 8 Kbps.
> [...]
>
> This might be the same as bug #708995. Does turning off GRO on the internal interface (not the bridge but the physical interface) work around it?

Yes, it looks like

  ifenslave -c bond0 eth0
  ethtool -K eth0 gro off

makes TBF precise again, and vice versa.

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20130817103302.ga26...@entuzijast.net
Bug#719958: traffic control simple token bucket filter within prio broken in wheezy
On Sat, Aug 17, 2013 at 02:06:57PM +0200, Ben Hutchings wrote:
> On Sat, 2013-08-17 at 12:56 +0200, Josip Rodin wrote:
> > On Sat, Aug 17, 2013 at 12:33:02PM +0200, Josip Rodin wrote:
> > > On Sat, Aug 17, 2013 at 12:23:21PM +0200, Ben Hutchings wrote:
> > > > > tc qdisc add dev $iface_Internet parent 1:2 handle 20: tbf rate 20mbit buffer 20480 limit 16384
> > > > >
> > > > > However, the upgrade to this new kernel appears to have killed it - the tbf rule is causing outgoing HTTP connections to max out at around 8 Kbps.
> > > > [...]
> > > >
> > > > This might be the same as bug #708995. Does turning off GRO on the internal interface (not the bridge but the physical interface) work around it?
> > >
> > > Yes, it looks like
> > >
> > >   ifenslave -c bond0 eth0
> > >   ethtool -K eth0 gro off
> > >
> > > makes TBF precise again, and vice versa.
> >
> > That's on one machine. But on another wheezy machine with the same setup but somewhat different hardware, turning off GRO didn't help.
> >
> > How do I debug this further?
>
> You could try using the perf dropmonitor script as I described on my bug report.

Didn't you say that was also broken? :)

> The other machine might also have LRO enabled on the internal interface, although this is supposed to be disabled for bridged interfaces. If the other machine is also passing traffic from another VM on the same physical host, it might be necessary to disable TSO on the interface within the other VM.

There's no distinction here between physical interfaces; I receive traffic on a bond0 through several VLANs. On one machine there's eth0 and eth2 behind that bond0, and that's the one where the workaround works. On the other one, there's only eth0 behind that bond0 (by accident), and the workaround doesn't make tbf work, oddly enough. I also tried removing other offload options, but that didn't make a dent.

The machines have different hardware but identical netfilter and tc rules, and I shift traffic between them by moving the IP addresses, using keepalived.

--
2. That which causes joy or happiness.
Archive: http://lists.debian.org/20130817125423.ga20...@entuzijast.net
Bug#719958: traffic control simple token bucket filter within prio broken in wheezy
On Sat, Aug 17, 2013 at 02:58:07PM +0200, Ben Hutchings wrote:
> > > > How do I debug this further?
> > >
> > > You could try using the perf dropmonitor script as I described on my bug report.
> >
> > Didn't you say that was also broken? :)
> [...]
>
> It's fixed now.

Hmm. Googling says it was fixed in May, so it doesn't sound like something that's going to come close to entering 3.2...

So I took the new script and placed it into /usr/share/perf_3.2-core/scripts/python/net_dropmonitor.py

But I still can't seem to run it:

  % perf script net_dropmonitor
  invalid or unsupported event: 'skb:kfree_skb'

Help?

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20130817133342.ga27...@entuzijast.net
Bug#719958: traffic control simple token bucket filter within prio broken in wheezy
On Sat, Aug 17, 2013 at 04:08:12PM +0200, Ben Hutchings wrote:
> On Sat, 2013-08-17 at 15:33 +0200, Josip Rodin wrote:
> > On Sat, Aug 17, 2013 at 02:58:07PM +0200, Ben Hutchings wrote:
> > > > > > How do I debug this further?
> > > > >
> > > > > You could try using the perf dropmonitor script as I described on my bug report.
> > > >
> > > > Didn't you say that was also broken? :)
> > > [...]
> > >
> > > It's fixed now.
> >
> > Hmm. Googling says it was fixed in May, so it doesn't sound like something that's going to come close to entering 3.2...
> >
> > So I took the new script and placed into /usr/share/perf_3.2-core/scripts/python/net_dropmonitor.py
> >
> > But I still can't seem to run it:
> >
> >   % perf script net_dropmonitor
> >   invalid or unsupported event: 'skb:kfree_skb'
> >
> > Help?
>
> Try running it as root...

Well, that was stupid.

Anyway, my test file transfer that drags along like this:

  Length: 2586317 (2,5M) [application/octet-stream]
  Saving to: /dev/null
  8% [==] 207.064 13,7K/s eta 2m 18s ^C

Results in this:

  Starting trace (Ctrl-C to dump results)
  ^CGathering kallsyms data
  LOCATION OFFSET COUNT
  net_tx_action 0 1

At the same time, the tc output changes from:

  qdisc prio 1: root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
   Sent 948738 bytes 5358 pkt (dropped 117, overlimits 0 requeues 0)
   backlog 0b 0p requeues 0
  qdisc sfq 10: parent 1:1 limit 127p quantum 1514b divisor 1024
   Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
   backlog 0b 0p requeues 0
  qdisc tbf 20: parent 1:2 rate 2Kbit burst 20Kb lat 4295.0s
   Sent 948738 bytes 5358 pkt (dropped 117, overlimits 0 requeues 0)
   backlog 0b 0p requeues 0
  qdisc sfq 30: parent 1:3 limit 127p quantum 1514b divisor 1024
   Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
   backlog 0b 0p requeues 0

to this:

  qdisc prio 1: root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
   Sent 1235809 bytes 6051 pkt (dropped 182, overlimits 0 requeues 0)
   backlog 0b 0p requeues 0
  qdisc sfq 10: parent 1:1 limit 127p quantum 1514b divisor 1024
   Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
   backlog 0b 0p requeues 0
  qdisc tbf 20: parent 1:2 rate 2Kbit burst 20Kb lat 4295.0s
   Sent 1235809 bytes 6051 pkt (dropped 182, overlimits 0 requeues 0)
   backlog 0b 0p requeues 0
  qdisc sfq 30: parent 1:3 limit 127p quantum 1514b divisor 1024
   Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
   backlog 0b 0p requeues 0

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20130817163048.ga23...@entuzijast.net
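The two tc readouts above are easier to compare as a delta. The following Python sketch parses the "Sent ..." counter lines for the tbf qdisc exactly as quoted in this message and subtracts them. The regular expression and the helper name parse_stats are my own, written against the output format shown here rather than any documented tc interface.

```python
import re

# Matches the counter line printed for each qdisc in the tc output quoted above.
STATS_RE = re.compile(
    r"Sent (\d+) bytes (\d+) pkt \(dropped (\d+), overlimits (\d+) requeues (\d+)\)"
)
KEYS = ("bytes", "packets", "dropped", "overlimits", "requeues")


def parse_stats(line: str) -> dict:
    """Turn one 'Sent ... bytes ... pkt (...)' line into a dict of integers."""
    match = STATS_RE.search(line)
    if match is None:
        raise ValueError("not a qdisc counter line: %r" % line)
    return dict(zip(KEYS, (int(value) for value in match.groups())))


# The tbf counters before and after the test transfer, copied from the message.
before = parse_stats("Sent 948738 bytes 5358 pkt (dropped 117, overlimits 0 requeues 0)")
after = parse_stats("Sent 1235809 bytes 6051 pkt (dropped 182, overlimits 0 requeues 0)")
delta = {key: after[key] - before[key] for key in KEYS}

if __name__ == "__main__":
    print(delta)
    offered = delta["packets"] + delta["dropped"]
    print("share of packets dropped: %.1f%%" % (100 * delta["dropped"] / offered))
```

The drop counter grows during the transfer while overlimits stays at zero, which is the detail that stands out in the quoted output.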
Bug#700755: huge slab_unreclaimable in Xen domU
On Wed, Feb 20, 2013 at 10:27:02AM +0000, Ian Campbell wrote:
> On Sun, 2013-02-17 at 00:22 +0100, Josip Rodin wrote:
> > Package: linux-image-2.6.32-5-xen-amd64
>
> This is in a guest, right? Is it possible to try the non-Xen amd64 flavour? I forget the exact status in Squeeze but IIRC most of the domU functionality is present in the -amd64 flavour with the -xen-amd64 flavour only being required for dom0 and some of the more advanced domU features.
>
> The reason I ask this is that the non-xen flavour is closer to mainline and therefore should be easier to track down the issue with.
>
> If you are also able separately to try this with the Wheezy kernel that would be very useful too.

OK, I can install both (it's got PV-GRUB), which do you prefer to test first? I'm asking because it'll likely take a few weeks for the bug to appear, judging by what it did before.

> > The thing I noticed was the slab_unreclaimable explosion, by a factor of 122. That... doesn't sound like something that should be happening.
>
> Indeed. Is the system responsive enough to login and examine /proc/slabinfo? There is probably one which has exploded in size, it may even be sufficient to observe this over time and see if one seems to be slowly creeping upwards towards $doom.
>
> > I'm going to try to run slabtop the next time I catch it in this state, in order to try to glean some more information.
>
> That would be great.

I did post two consecutive slabtop results... I thought they had all the relevant info from /proc/slabinfo.

The two large elements that grew both in the total number of objects and the active number were (extracted from my previous message):

    OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
  first readout:
   65419  65419 100%    4.00K  14179        8    453728K kmalloc-4096
   65390  65390 100%    2.06K  13338       15    426816K net_namespace
  second readout:
   65428  65428 100%    4.00K  14181        8    453792K kmalloc-4096
   65391  65391 100%    2.06K  13339       15    426848K net_namespace

How do I trace which process is calling this?

In comparison, now, under seemingly normal circumstances, slabtop looks like this on that machine:

    OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
   56124  25272  45%    0.11K   1559       36      6236K buffer_head
   24843  12898  51%    0.19K   1183       21      4732K dentry
   23100  16107  69%    1.01K   1540       15     24640K nfs_inode_cache
   11456   6403  55%    0.06K    179       64       716K kmalloc-64
   10208   8864  86%    0.12K    319       32      1276K kmalloc-128
    7308   5275  72%    0.55K    522       14      4176K radix_tree_node
    4947   4940  99%    0.08K     97       51       388K sysfs_dir_cache
    3584   3573  99%    0.01K      7      512        28K kmalloc-8
    3200   2016  63%    0.79K    160       20      2560K ext3_inode_cache
    2068   1981  95%    0.18K     94       22       376K vm_area_struct
    1792   1790  99%    0.02K      7      256        28K kmalloc-16
    1692   1631  96%    0.63K    141       12      1128K proc_inode_cache
    1632   1588  97%    1.00K    102       16      1632K kmalloc-1024
    1472   1442  97%    0.25K     92       16       368K kmalloc-256
    1428   1129  79%    0.19K     68       21       272K kmalloc-192
    1296   1284  99%    4.00K    162        8      5184K kmalloc-4096
    1275   1270  99%    2.06K     85       15      2720K net_namespace
  [...]

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20130220111856.ga20...@entuzijast.net
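To put a number on the difference between the broken and the normal slabtop readouts in this message, here is a small Python sketch. The object counts are the OBJS column values for kmalloc-4096 and net_namespace as quoted above; the helper name growth is mine, and the sketch makes no claim about why the two caches grow in lockstep.

```python
# OBJS column from the slabtop readouts quoted in this message.
NORMAL = {"kmalloc-4096": 1296, "net_namespace": 1275}
BROKEN = {"kmalloc-4096": 65419, "net_namespace": 65390}  # first broken readout


def growth(before: int, after: int) -> float:
    """How many times larger the object count is in the broken state."""
    return after / before


if __name__ == "__main__":
    for cache in NORMAL:
        print("%-14s %6d -> %6d objects (%.1fx)" % (
            cache, NORMAL[cache], BROKEN[cache], growth(NORMAL[cache], BROKEN[cache])))
```

Both caches end up roughly fifty times larger than in the normal state, and their object counts in the broken state differ by only a few dozen.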
Bug#700755: huge slab_unreclaimable in Xen domU
On Wed, Feb 20, 2013 at 11:35:44AM +0000, Ian Campbell wrote:
> > OK, I can install both (it's got PV-GRUB), which do you prefer to test first? I'm asking because it'll likely take a few weeks for the bug to appear, judging by what it did before.
>
> Probably at this stage I would be more interested in making sure Wheezy was going to be OK first.

ACK

> I'm not sure. The net_namespace one should be easy enough to track in the code since:
>
>   net_cachep = kmem_cache_create("net_namespace", sizeof(struct net),
>
> and therefore users of net_cachep must be responsible, I'd expect there to be not all that many of those.
>
> Are you actually using network namespaces in the guest?

If I am, I'm not doing it intentionally :) I'd assume this was part of the LXC functionality, but as Ben noticed before, there's code in vsftpd that triggers their use...?

> The Debian kernels have SLUB:
>
>   /boot/config-2.6.32-5-xen-amd64:CONFIG_SLUB_DEBUG=y
>   /boot/config-2.6.32-5-xen-amd64:CONFIG_SLUB=y
>
> (same as native). Documentation/vm/slub.txt has some info on adding debugging stuff there, e.g. adding slub_debug to the command line.
>
> It doesn't look like rebuilding with the other two options would initially be useful (the first is equivalent to the command line option anyway)

I'm wary of enabling slub_debug by default; the document says it's okay to enable individual items at runtime:

  % ls -l /sys/kernel/slab/net_namespace/ | grep rw
  -rw-r--r-- 1 root root 4096 2013-02-20 15:41 min_partial
  -rw-r--r-- 1 root root 4096 2013-02-20 15:41 order
  -rw-r--r-- 1 root root 4096 2013-02-20 15:41 poison
  -rw-r--r-- 1 root root 4096 2013-02-20 15:41 reclaim_account
  -rw-r--r-- 1 root root 4096 2013-02-20 15:41 red_zone
  -rw-r--r-- 1 root root 4096 2013-02-20 15:41 remote_node_defrag_ratio
  -rw-r--r-- 1 root root 4096 2013-02-20 15:41 sanity_checks
  -rw-r--r-- 1 root root 4096 2013-02-20 15:41 shrink
  -rw-r--r-- 1 root root 4096 2013-02-20 15:41 store_user
  -rw-r--r-- 1 root root 4096 2013-02-20 15:41 trace
  -rw-r--r-- 1 root root 4096 2013-02-20 15:41 validate

From the documentation, I probably want:

  U  User tracking (free and alloc)

But which of the above files corresponds to that, 'store_user' or? I'll have to go look at the source.

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20130220144349.ga15...@entuzijast.net
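As a possible follow-up to the question about user tracking, here is a hedged Python sketch for summarising SLUB's per-cache alloc_calls file. It rests on assumptions the thread does not confirm: that the 'U' flag corresponds to the store_user attribute, that /sys/kernel/slab/net_namespace/alloc_calls exists on this kernel and is only populated while user tracking is on, and that each line starts with an allocation count followed by the call site. The function names are mine.

```python
from pathlib import Path


def parse_calls(text: str) -> list:
    """Return (count, call_site) pairs, largest count first.

    Assumes each useful line begins with an integer count followed by the
    call site symbol; any other line is skipped.
    """
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        pairs.append((int(fields[0]), fields[1]))
    return sorted(pairs, reverse=True)


def top_allocators(cache: str = "net_namespace", root: str = "/sys/kernel/slab") -> list:
    """Read <root>/<cache>/alloc_calls if present; return [] when it is missing."""
    path = Path(root) / cache / "alloc_calls"
    if not path.exists():
        return []
    return parse_calls(path.read_text())
```

Calling top_allocators() as root on the affected guest would then list which call sites hold the most net_namespace objects, provided tracking was enabled before the objects were allocated.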
Bug#519586: Huge Slab Unreclaimable and continually growing
On Sat, Feb 16, 2013 at 03:13:06AM +0000, Ben Hutchings wrote:
> On Fri, 2013-02-15 at 08:56 +0100, Josip Rodin wrote:
> > I appear to be experiencing a serious problem with a 768 MB RAM Xen domU machine running an NFS client - every now and then (for months now), often in the middle of the night, it enters some kind of a broken state where a few semi-random processes (mainly apache2's and vsftpd's which are told to serve files from the NFS mount)
> > [...]
> > I caught it earlier just now, at:
> >
> > [950084.590733] active_anon:2805 inactive_anon:11835 isolated_anon:0
> > [950084.590735] active_file:76 inactive_file:516 isolated_file:32
> > [950084.590737] unevictable:783 dirty:1 writeback:0 unstable:0
> > [950084.590739] free:26251 slab_reclaimable:15733 slab_unreclaimable:128868
> > [950084.590741] mapped:938 shmem:75 pagetables:651 bounce:0
> >
> > And snuck in a few slabtops (even some -o invocations were getting killed, along with my shell and pretty much everything else):
> > [...]
> >  65390  65390 100%    2.06K  13338       15    426816K net_namespace
> > [...]
>
> Looks like CVE-2011-2189, for which there was a fix/workaround in:
>
> vsftpd (2.3.2-3+squeeze2) stable-security; urgency=high
>
>   * Non-maintainer upload by the Security Team.
>   * Disable network isolation due to a problem with cleaning up network
>     namespaces fast enough in kernels 2.6.35 (CVE-2011-2189).
>     Thanks Ben Hutchings for the patch!
>   * Fix possible DoS via globa expressions in STAT commands by
>     limiting the matching loop (CVE-2011-0762; Closes: #622741).
>
>  -- Nico Golde n...@debian.org  Wed, 07 Sep 2011 20:39:59 +0000
>
> Do you have an old version of vsftpd, or perhaps an upstream version which doesn't include the workaround?

No, 2.3.2-3+squeeze2 is there, has been since 2012-03-22.

> Anyway, I'm closing the bug report; please don't hijack closed bugs.

Eh? It was not closed for being fixed, it was closed en masse on a procedural reason that could easily be wrong, and I don't believe I was hijacking it; you just confirmed that this is a kernel problem above, so how could this possibly be improper?!

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20130216213705.ga13...@entuzijast.net
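The slab_unreclaimable figures in these reports are page counts, so a quick conversion shows how much of the guest's memory they represent. This Python sketch assumes 4 KiB pages (the usual size on amd64) and treats the "768 MB" of RAM as 768 MiB; both are assumptions on my part. The value 128868 is from the readout quoted in this message and 143165 is from the SysRq+M output in the original January report on this bug.

```python
PAGE_KIB = 4       # assumed page size on amd64
GUEST_MIB = 768    # "768 MB RAM Xen domU", treated as MiB here


def pages_to_mib(pages: int, page_kib: int = PAGE_KIB) -> float:
    """Convert a page count from the meminfo dump into MiB."""
    return pages * page_kib / 1024


def share_of_ram(pages: int, ram_mib: int = GUEST_MIB) -> float:
    """Fraction of the guest's RAM taken up by that many pages."""
    return pages_to_mib(pages) / ram_mib


if __name__ == "__main__":
    for pages in (128868, 143165):
        print("slab_unreclaimable:%d = %.1f MiB (%.0f%% of RAM)" % (
            pages, pages_to_mib(pages), 100 * share_of_ram(pages)))
```

Under those assumptions, unreclaimable slab alone accounts for roughly two thirds to three quarters of the machine's memory, which fits the OOM-killer behaviour described.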
Bug#519586: Huge Slab Unreclaimable and continually growing
On Tue, Jan 22, 2013 at 10:59:17AM +0100, Josip Rodin wrote:
> I appear to be experiencing a serious problem with a 768 MB RAM Xen domU machine running an NFS client - every now and then (for months now), often in the middle of the night, it enters some kind of a broken state where a few semi-random processes (mainly apache2's and vsftpd's which are told to serve files from the NFS mount) start battling it out for the memory, and everything including sshd starts invoking the OOM killer, over and over again.
>
> Nothing seems to halt the downward spiral; manual invocation of the OOM killer does nothing of any use. Terminating all processes is the only thing that makes it go quiet, but then that's effectively the same as a reboot.
>
> This is the SysRq+M output on the machine once it's been in the broken state for a while:
> [...]
> active_anon:394 inactive_anon:3197 isolated_anon:0
> active_file:25 inactive_file:176 isolated_file:32
> unevictable:2659 dirty:1 writeback:0 unstable:0
> free:21456 slab_reclaimable:16177 slab_unreclaimable:143165
> mapped:677 shmem:76 pagetables:455 bounce:0
> [...]
>
> The thing I noticed was the slab_unreclaimable explosion, by a factor of 122. That... doesn't sound like something that should be happening.
>
> Googling for slab_unreclaimable found me this old bug report about slab_unreclaimable domU problems that was mass-closed with the switch to the new paravirtops Xen release. Granted, our use case is not Samba like with the original reporter, but the pattern of a file server was close enough for me to be uncomfortable with it :|

I caught it earlier just now, at:

  [950084.590733] active_anon:2805 inactive_anon:11835 isolated_anon:0
  [950084.590735] active_file:76 inactive_file:516 isolated_file:32
  [950084.590737] unevictable:783 dirty:1 writeback:0 unstable:0
  [950084.590739] free:26251 slab_reclaimable:15733 slab_unreclaimable:128868
  [950084.590741] mapped:938 shmem:75 pagetables:651 bounce:0

And snuck in a few slabtops (even some -o invocations were getting killed, along with my shell and pretty much everything else):

   Active / Total Objects (% used)    : 555753 / 587128 (94.7%)
   Active / Total Slabs (% used)      : 49430 / 49430 (100.0%)
   Active / Total Caches (% used)     : 65 / 76 (85.5%)
   Active / Total Size (% used)       : 546613.78K / 553025.01K (98.8%)
   Minimum / Average / Maximum Object : 0.01K / 0.94K / 8.00K

    OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
   90993  66836  73%    0.19K   4333       21     17332K dentry
   75840  73664  97%    0.12K   2370       32      9480K kmalloc-128
   68096  68092  99%    0.01K    133      512       532K kmalloc-8
   65888  65655  99%    0.25K   4118       16     16472K kmalloc-256
   65820  65778  99%    1.00K   4767       16     76272K kmalloc-1024
   65436  65414  99%    0.63K   5454       12     43632K proc_inode_cache
   65419  65419 100%    4.00K  14179        8    453728K kmalloc-4096
   65390  65390 100%    2.06K  13338       15    426816K net_namespace
    4998   4990  99%    0.08K     98       51       392K sysfs_dir_cache
    4224   2018  47%    0.06K     66       64       264K kmalloc-64
    2288   2107  92%    0.18K    104       22       416K vm_area_struct
    1792   1789  99%    0.02K      7      256        28K kmalloc-16
    1470   1203  81%    0.19K     70       21       280K kmalloc-192
    1300    402  30%    0.79K     65       20      1040K ext3_inode_cache
     896    731  81%    0.03K      7      128        28K anon_vma
     784    532  67%    0.55K     56       14       448K radix_tree_node

A bit later:

   Active / Total Objects (% used)    : 555403 / 586704 (94.7%)
   Active / Total Slabs (% used)      : 49394 / 49394 (100.0%)
   Active / Total Caches (% used)     : 65 / 76 (85.5%)
   Active / Total Size (% used)       : 546552.82K / 552827.43K (98.9%)
   Minimum / Average / Maximum Object : 0.01K / 0.94K / 8.00K

    OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
   90993  66779  73%    0.19K   4333       21     17332K dentry
   75840  73654  97%    0.12K   2370       32      9480K kmalloc-128
   68096  68092  99%    0.01K    133      512       532K kmalloc-8
   65888  65601  99%    0.25K   4118       16     16472K kmalloc-256
   65852  65741  99%    1.00K   4760       16     76160K kmalloc-1024
   65436  65409  99%    0.63K   5454       12     43632K proc_inode_cache
   65428  65428 100%    4.00K  14181        8    453792K kmalloc-4096
   65391  65391 100%    2.06K  13339       15    426848K net_namespace
    4998   4986  99%    0.08K     98       51       392K sysfs_dir_cache
    4224   2017  47%    0.06K     66       64       264K kmalloc-64
    2134   2108  98%    0.18K     97       22       388K vm_area_struct
    1792   1789  99%    0.02K      7      256        28K kmalloc-16
    1449   1078  74%    0.19K     69       21       276K kmalloc-192
    1100    376  34%    0.79K     55       20       880K ext3_inode_cache
     896    639  71%    0.03K      7      128        28K anon_vma
     714    554  77%    0.55K     51       14
Bug#685360: [PATCH 1/1] HID: Fix missing Unifying device issue
On Mon, Sep 24, 2012 at 11:30:28AM +0200, Nestor Lopez Casado wrote:
> Josip, this is a different issue from the one addressed with the patch.
>
> 1) Can you try it on a 3.2 kernel ?

I can try that too, I'll let you know how it went. (Unfortunately the machine is in the same room with a crib, so I don't get a lot of time slots for testing. *shrug* :)

> 2) The problem you describe, does it happen all the time ?

Yes. The keyboard simply stopped working after I upgraded to Linux 3.2. It works fine under 3.1 and earlier, and under Windows.

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20120927070426.ga27...@entuzijast.net
Bug#685360: [PATCH 1/1] HID: Fix missing Unifying device issue
On Thu, Sep 27, 2012 at 09:04:26AM +0200, Josip Rodin wrote:
> On Mon, Sep 24, 2012 at 11:30:28AM +0200, Nestor Lopez Casado wrote:
> > Josip, this is a different issue from the one addressed with the patch.
> >
> > 1) Can you try it on a 3.2 kernel ?
>
> I can try that too, I'll let you know how it went.

Same thing, it doesn't work.

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20120927185628.ga30...@entuzijast.net
Bug#685360: [PATCH 1/1] HID: Fix missing Unifying device issue
On Fri, Sep 21, 2012 at 12:21:34PM +0200, Nestor Lopez Casado wrote:
> This patch fixes an issue introduced after commit 4ea5454203d991ec
>
> After that commit, hid-core silently discards any incoming packet that arrives while any hid driver's probe function is being executed.

I managed to test this now, on top of Linux 3.5, but it didn't fix my keyboard. I still get the same sequence of messages with hid.debug=1:

+usb 5-2: new full-speed USB device number 3 using ohci_hcd
+drivers/hid/usbhid/hid-core.c: HID probe called for ifnum 0
+drivers/hid/hid-logitech-dj.c: logi_dj_probe called for ifnum 0
+drivers/hid/hid-logitech-dj.c: logi_dj_probe: ignoring ifnum 0
+drivers/hid/usbhid/hid-core.c: HID probe called for ifnum 1
+drivers/hid/hid-logitech-dj.c: logi_dj_probe called for ifnum 1
+drivers/hid/hid-logitech-dj.c: logi_dj_probe: ignoring ifnum 1
+drivers/hid/usbhid/hid-core.c: HID probe called for ifnum 2
+drivers/hid/hid-logitech-dj.c: logi_dj_probe called for ifnum 2
+drivers/hid/usbhid/hid-core.c: submitting ctrl urb: Get_Report wValue=0x0110 wIndex=0x0002 wLength=7
+drivers/hid/usbhid/hid-core.c: submitting ctrl urb: Get_Report wValue=0x0111 wIndex=0x0002 wLength=20
+drivers/hid/usbhid/hid-core.c: submitting ctrl urb: Get_Report wValue=0x0120 wIndex=0x0002 wLength=15
+drivers/hid/usbhid/hid-core.c: submitting ctrl urb: Get_Report wValue=0x0121 wIndex=0x0002 wLength=32
+logitech-djreceiver 0003:046D:C52B.0005: claimed by neither input, hiddev nor hidraw
+logitech-djreceiver 0003:046D:C52B.0005: logi_dj_probe:hid_hw_start returned error

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20120923102318.ga12...@entuzijast.net
Bug#685360: Logitech USB keyboard broken with Linux 3.2 (regression from 3.1)
On Mon, Sep 17, 2012 at 12:57:06PM +0200, Jiri Kosina wrote:
> On Wed, 12 Sep 2012, Nestor Lopez Casado wrote:
> > Take a look at this thread ... where a patch was published ...
> >
> > https://bugs.launchpad.net/ubuntu/+bug/958174
> >
> > Your issue may come from the same problem.
> >
> > I will get back to you next week. I am OOO until monday.
>
> So, what is the progress here, please? Nestor, Josip?

There is no progress because for some reason I did not receive Nestor's previous e-mail; I'll try to test it this week.

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20120917110008.ga16...@entuzijast.net
Bug#685360: Acknowledgement (AMD SB 750 + Logitech USB keyboard broken and system unbootable with Linux 3.2 (regression from 2.6.38))
Control: retitle -1 AMD SB 750 + Logitech USB keyboard brokenness with Linux 3.2 (regression from 3.1)

On Mon, Aug 20, 2012 at 06:26:42PM +0200, Josip Rodin wrote:
> I'll try to bisect this now with my config.

It looks like it's definitely in some way related with the introduction of CONFIG_HID_LOGITECH_DJ in 3.2+, because 3.1.0 works fine...

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20120911224629.ga32...@entuzijast.net
Bug#685360: Acknowledgement (AMD SB 750 + Logitech USB keyboard broken and system unbootable with Linux 3.2 (regression from 2.6.38))
On Wed, Sep 12, 2012 at 12:46:29AM +0200, Josip Rodin wrote:
> On Mon, Aug 20, 2012 at 06:26:42PM +0200, Josip Rodin wrote:
> > I'll try to bisect this now with my config.
>
> It looks like it's definitely in some way related with the introduction of CONFIG_HID_LOGITECH_DJ in 3.2+, because 3.1.0 works fine...

The dmesg difference between 3.1.0 (working) and 3.2.0 (broken) is a bit confusing - on one USB port, there's no change, but on the other the new module reports a failure (this output is with hid.debug=1 and is a bit fuzzy because of random harmless changes like spelling fixes or device indices 5 vs 3):

-usb 2-5: new high speed USB device number 2 using ehci_hcd
-usb 5-1: new low speed USB device number 2 using ohci_hcd
+usb 2-5: new high-speed USB device number 2 using ehci_hcd
+usb 3-1: new low-speed USB device number 2 using ohci_hcd
 drivers/hid/usbhid/hid-core.c: HID probe called for ifnum 0
-input: Logitech USB Receiver as /devices/pci:00/:00:12.0/usb5/5-1/5-1:1.0/input/input3
+input: Logitech USB Receiver as /devices/pci:00/:00:12.0/usb3/3-1/3-1:1.0/input/input3
 generic-usb 0003:046D:C51B.0001: input: USB HID v1.11 Mouse [Logitech USB Receiver] on usb-:00:12.0-1/input
 drivers/hid/usbhid/hid-core.c: HID probe called for ifnum 1
 drivers/hid/usbhid/hid-core.c: submitting ctrl urb: Get_Report wValue=0x0110 wIndex=0x0001 wLength=7
 generic-usb 0003:046D:C51B.0002: claimed by neither input, hiddev nor hidraw
-usb 5-2: new full speed USB device number 3 using ohci_hcd
+usb 3-2: new full-speed USB device number 3 using ohci_hcd
 drivers/hid/usbhid/hid-core.c: HID probe called for ifnum 0
-drivers/hid/usbhid/hid-core.c: submitting ctrl urb: Set_Report wValue=0x0200 wIndex=0x wLength=1
-input: Logitech USB Receiver as /devices/pci:00/:00:12.0/usb5/5-2/5-2:1.0/input/input4
-generic-usb 0003:046D:C52B.0003: input: USB HID v1.11 Keyboard [Logitech USB Receiver] on usb-:00:12.0-2/in
+drivers/hid/hid-logitech-dj.c: logi_dj_probe called for ifnum 0
+drivers/hid/hid-logitech-dj.c: logi_dj_probe: ignoring ifnum 0
 drivers/hid/usbhid/hid-core.c: HID probe called for ifnum 1
-input: Logitech USB Receiver as /devices/pci:00/:00:12.0/usb5/5-2/5-2:1.1/input/input5
-generic-usb 0003:046D:C52B.0004: input: USB HID v1.11 Mouse [Logitech USB Receiver] on usb-:00:12.0-2/input
+drivers/hid/hid-logitech-dj.c: logi_dj_probe called for ifnum 1
+drivers/hid/hid-logitech-dj.c: logi_dj_probe: ignoring ifnum 1
 drivers/hid/usbhid/hid-core.c: HID probe called for ifnum 2
+drivers/hid/hid-logitech-dj.c: logi_dj_probe called for ifnum 2
 drivers/hid/usbhid/hid-core.c: submitting ctrl urb: Get_Report wValue=0x0110 wIndex=0x0002 wLength=7
 drivers/hid/usbhid/hid-core.c: submitting ctrl urb: Get_Report wValue=0x0111 wIndex=0x0002 wLength=20
 drivers/hid/usbhid/hid-core.c: submitting ctrl urb: Get_Report wValue=0x0120 wIndex=0x0002 wLength=15
 drivers/hid/usbhid/hid-core.c: submitting ctrl urb: Get_Report wValue=0x0121 wIndex=0x0002 wLength=32
-generic-usb 0003:046D:C52B.0005: claimed by neither input, hiddev nor hidraw
-usb 2-5: reset high speed USB device number 2 using ehci_hcd
+logitech-djreceiver 0003:046D:C52B.0005: claimed by neither input, hiddev nor hidraw
+logitech-djreceiver 0003:046D:C52B.0005: logi_dj_probe:hid_hw_start returned error:-19

Now where did the devices 0003:046D:C52B.000[34] go with the new kernel? Are they the ones that logi_dj_probe sees as 0 and 1? The code says:

	/* Ignore interfaces 0 and 1, they will not carry any data, dont create
	 * any hid_device for them */
	if (intf->cur_altsetting->desc.bInterfaceNumber !=
	    LOGITECH_DJ_INTERFACE_NUMBER) {
		dbg_hid("%s: ignoring ifnum %d\n", __func__,
			intf->cur_altsetting->desc.bInterfaceNumber);
		return -ENODEV;
	}

Well, that probably explains it. But why does it do that?

lsusb -v says the following about the hardware:

Bus 005 Device 003: ID 046d:c52b Logitech, Inc. Unifying Receiver
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0         8
  idVendor           0x046d Logitech, Inc.
  idProduct          0xc52b Unifying Receiver
  bcdDevice           12.01
  iManufacturer           1 Logitech
  iProduct                2 USB Receiver
  iSerial                 0
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           84
    bNumInterfaces          3
    bConfigurationValue     1
    iConfiguration          4 RQR12.01_B0019
    bmAttributes         0xa0
      (Bus Powered)
      Remote Wakeup
    MaxPower               98mA
    Interface Descriptor:
      bLength                 9
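The "returned error:-19" in the dmesg output above is a raw kernel errno. As a small aid, this Python sketch decodes such a value with the standard errno module. The helper name describe is mine, and the sketch only translates the number; it says nothing about why hid_hw_start fails on interface 2.

```python
import errno
import os


def describe(kernel_rc: int) -> str:
    """Translate a negative kernel-style return code into 'NAME (message)'."""
    code = -kernel_rc
    name = errno.errorcode.get(code, "unknown errno %d" % code)
    return "%s (%s)" % (name, os.strerror(code))


if __name__ == "__main__":
    # The value printed by logi_dj_probe in the log quoted above.
    print(describe(-19))
```

On Linux, 19 is ENODEV, the same code the quoted driver snippet returns for the interfaces it ignores.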
Bug#666386: more info
On Sun, Sep 02, 2012 at 02:42:38PM +0100, Ben Hutchings wrote:
On Sat, 2012-09-01 at 10:45 +0200, Bastian Blank wrote:
On Fri, Aug 31, 2012 at 05:52:03PM +0200, Josip Rodin wrote:

auto vlan2
iface vlan2 inet manual
    vlan-raw-device xenbr0

Is vlan-over-bridge documented to be supported?

If it was not supported then bridge devices would have NETIF_F_VLAN_CHALLENGED and you would not be able to create VLAN devices on top of them. But I don't expect this to work *well* at present. Rebooting, however... something is very wrong here.

Usually I would use:
| iface xenbr2 inet static
|   bridge-ports bond0.2

But as soon as I generate any traffic to or from 192.168.54.0/24 and that virtual machine (notice - not the right VLAN), the whole system instantly reboots, with no messages in syslog.

Does it work without bond? I would switch to openvswitch. It documents bond/vlan setups, so they most likely work. (I don't use bond yet, but the rest works pretty flawless, however I have to submit the openvswitch support.)

Definitely worth trying.

Ah, good catch, I do actually seem to want to use the underlying device for vlan, rather than the bridge device. All my other setups are like that.

It shouldn't crash anyway...

JFTR the other thing I noticed, before I read the mail, was that the remote machine was actually visible on that L2 segment through ARP, but the outside world can't ping it - it's as if xen-netback/front don't let that through, for no apparent reason. And then, when I initiate traffic towards the outside world from the machine, it all goes poof.

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20120903162712.ga22...@entuzijast.net
Bug#666386: more info
On Mon, Sep 03, 2012 at 06:27:12PM +0200, Josip Rodin wrote:
On Sun, Sep 02, 2012 at 02:42:38PM +0100, Ben Hutchings wrote:
On Sat, 2012-09-01 at 10:45 +0200, Bastian Blank wrote:
On Fri, Aug 31, 2012 at 05:52:03PM +0200, Josip Rodin wrote:

auto vlan2
iface vlan2 inet manual
    vlan-raw-device xenbr0

Is vlan-over-bridge documented to be supported?

If it was not supported then bridge devices would have NETIF_F_VLAN_CHALLENGED and you would not be able to create VLAN devices on top of them. But I don't expect this to work *well* at present. Rebooting, however... something is very wrong here.

Usually I would use:
| iface xenbr2 inet static
|   bridge-ports bond0.2

But as soon as I generate any traffic to or from 192.168.54.0/24 and that virtual machine (notice - not the right VLAN), the whole system instantly reboots, with no messages in syslog.

Does it work without bond?

Ah, good catch, I do actually seem to want to use the underlying device for vlan, rather than the bridge device. All my other setups are like that.

Confirming it works with fixed vlan-raw-device, pointed to eth2. I should test with bond now, that was probably it...

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20120903163909.ga26...@entuzijast.net
Bug#666386: more info
Hi,

I had removed the igb-based eth0 from the bonding interface, and the machine was running fine with it, but when the time had come to get some Xen domUs running on it, it failed miserably on me once again.

The updated setup is:

auto bond0
iface bond0 inet manual
    slaves eth2
    bond_mode active-backup
    bond_miimon 100

auto xenbr0
iface xenbr0 inet static
    bridge-ports bond0
    bridge-fd 0
    address 192.168.54.2
    netmask 255.255.255.0

auto vlan2
iface vlan2 inet manual
    vlan-raw-device xenbr0

auto xenbr2
iface xenbr2 inet static
    bridge-ports vlan2
    bridge-fd 0
    address 213.202.97.156
    netmask 255.255.255.240
    gateway 213.202.97.145

And the virtual machine has simply this:

vif = [ 'mac=00:16:3e:7a:32:9b, bridge=xenbr2', ]

But as soon as I generate any traffic to or from 192.168.54.0/24 and that virtual machine (notice - not the right VLAN), the whole system instantly reboots, with no messages in syslog.

I should probably use the hypervisor's noreboot option, but I don't have a connection to its IPMI out-of-band access controller, and I'm off-site, so I'm SOL.

This is with linux-image-3.2.0-0.bpo.2-amd64 and with latest .bpo.3.

I'm going to try fiddling with ethtool -K eth2 gro/lro off, but with the reboots taking 3min on this hardware, this is most annoying...

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20120831155203.ga6...@entuzijast.net
Bug#666386: igb + bnx2 + ifenslave + brctl + vconfig = largely broken
On Sat, Apr 07, 2012 at 04:29:38AM +0100, Ben Hutchings wrote:

I would like to take this upstream now, but first I need to check whether it has already been fixed after 2.6.32. Please can you test the current kernel package from testing, unstable or squeeze-backports (linux-image-3.2.0-2-amd64 or linux-image-3.2.0-0.bpo.2-amd64)?

I installed linux-image-3.2.0-0.bpo.2-amd64, plus the upgraded linux-base and initramfs-tools, plus the indicated firmware-bnx2 upgrade -- and then rebooted into that kernel, but the machine wouldn't respond to ping over the xenbr2 interface (the one with the default gateway).

I logged into it fine through the xenbr54 interface, and tried to ping the default gateway, and it didn't work. This was with the workaround - only bnx2/eth2 in the bonding interface.

Then I removed the default gateway and added it back just to see if it'll work, and then it started pinging. Weird.

After that, I tried to reproduce this bug, but failed; it looks like the bug is fixed there. I noticed a significant lag with some of those bonding --detach/--change-active actions, but after a few seconds everything continued to work fine.

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20120411122453.ga29...@entuzijast.net
Bug#666386: igb + bnx2 + ifenslave + brctl + vconfig = largely broken
On Mon, Apr 02, 2012 at 05:22:37AM +0100, Ben Hutchings wrote:
On Sun, 2012-04-01 at 12:40 +0200, Josip Rodin wrote:
On Sun, Apr 01, 2012 at 03:09:56AM +0100, Ben Hutchings wrote:

I bet this is due to the combination of LRO plus bridging. We try to turn off LRO in devices under a bridge, but that won't work if there's an intermediate bonding device. If you run:

# ethtool -K eth0 lro off
# ethtool -K eth2 lro off

does the bridge start working?

Err...

% sudo ethtool -K eth0 lro off
Cannot set large receive offload settings: Operation not supported
% sudo ethtool -K eth2 lro off
Cannot set large receive offload settings: Operation not supported

Hmm. Well it shouldn't be a problem but you could try also turning off GRO (similar commands).

Ah, there we go. Once I ran sudo ethtool -K eth0 gro off, sudo ifenslave bond54 eth0 produced a still-working bond54.

That's with eth0 removed from bonding, and eth2 inside.

So the bonding device has only one slave now?

Yes, it was like that.

What if you take the bonding device out completely and add eth2 directly to the bridge?

I think I had already tested that and everything was fine, too. Do you want me to test that or is the GRO removal conclusive?

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20120404075557.ga3...@entuzijast.net
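The GRO workaround from this exchange could be scripted for every slave of the bond. What follows is a minimal sketch that only builds the command lines and does not execute anything. The helper name gro_off_commands is hypothetical, the slave list is taken from the bond54 stanza in the original report, and actually running the commands would need root and the real hardware.

```python
def gro_off_commands(slaves):
    """Return one 'ethtool -K <dev> gro off' argv list per bonding slave."""
    return [["ethtool", "-K", dev, "gro", "off"] for dev in slaves]


# Slaves as listed for bond54 in the original report.
commands = gro_off_commands(["eth0", "eth2"])
for argv in commands:
    # Printing only; hand argv to subprocess.run(argv, check=True) to apply it.
    print(" ".join(argv))
```

Keeping the commands as argv lists rather than shell strings avoids quoting problems if an interface name ever contains unusual characters.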
Bug#666386: igb + bnx2 + ifenslave + brctl + vconfig = largely broken
On Sun, Apr 01, 2012 at 03:09:56AM +0100, Ben Hutchings wrote:

I bet this is due to the combination of LRO plus bridging. We try to turn off LRO in devices under a bridge, but that won't work if there's an intermediate bonding device. If you run:

# ethtool -K eth0 lro off
# ethtool -K eth2 lro off

does the bridge start working?

Err...

% sudo ethtool -K eth0 lro off
Cannot set large receive offload settings: Operation not supported
% sudo ethtool -K eth2 lro off
Cannot set large receive offload settings: Operation not supported

That's with eth0 removed from bonding, and eth2 inside.

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20120401104044.ga28...@entuzijast.net
Bug#666386: igb + bnx2 + ifenslave + brctl + vconfig = largely broken
Package: linux-image-2.6.32-5-xen-amd64
Version: 2.6.32-41

Hi,

The machine is a new IBM x3550 M3, with this network hardware:

% lspci | grep Ethernet
0b:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
0b:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
1a:00.0 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
1a:00.1 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)

One port of each brand (eth0 and eth2) has a working cable plugged into a working Ethernet switch that's set up so that it serves a native VLAN (otherwise known as ID 54) and VLAN ID 2 trunked (tagged), among others.

The devices are:

lrwxrwxrwx 1 root root 0 Mar 19 15:42 /sys/class/net/eth0 -> ../../devices/pci:00/:00:07.0/:1a:00.0/net/eth0/
lrwxrwxrwx 1 root root 0 Mar 19 15:42 /sys/class/net/eth2 -> ../../devices/pci:00/:00:01.0/:0b:00.0/net/eth2/

So, if I read that right, eth0 is Intel, and eth2 is Broadcom.

The desired network setup is, in interfaces(5) format:

iface bond54 inet manual
    slaves eth0 eth2
    bond_mode active-backup
    bond_miimon 100

iface xenbr54 inet static
    bridge-ports bond54
    bridge-fd 0
    address 192.168.54.2
    netmask 255.255.255.0

iface vlan2 inet manual
    vlan-raw-device xenbr54

iface xenbr2 inet static
    bridge-ports vlan2
    bridge-fd 0
    address 213.202.97.156
    netmask 255.255.255.240
    gateway 213.202.97.145

This used to work for me elsewhere, however, on this machine it's broken as follows:

Everything starts up fine, and the machine is perfectly usable (albeit I only used SSH) over the xenbr54 interface.

However, over the xenbr2 interface, all the small network packets pass, such as ICMP, or the bringup and teardown of HTTP connections, but as soon as I try to actually GET something non-trivial over a seemingly established HTTP connection, the machine pretends it doesn't see that incoming traffic.
Like this:

% wget -O /dev/null http://ftp.hr.debian.org/debian/ls-lR.gz
--2012-03-30 11:15:23-- http://ftp.hr.debian.org/debian/ls-lR.gz
Resolving ftp.hr.debian.org... 161.53.160.11, 2001:b68:ff:1::11
Connecting to ftp.hr.debian.org|161.53.160.11|:80... connected.
HTTP request sent, awaiting response...

In parallel, the trace shows:

% sudo tshark -n -i xenbr2
0.00 213.202.97.156 -> 161.53.160.11 TCP 51657 80 [SYN] Seq=0 Win=5840 Len=0 MSS=1460 TSV=232632046 TSER=0 WS=1
0.001797 161.53.160.11 -> 213.202.97.156 TCP 80 51657 [SYN, ACK] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=643552423 TSER=232632046 WS=8
0.001816 213.202.97.156 -> 161.53.160.11 TCP 51657 80 [ACK] Seq=1 Ack=1 Win=5840 Len=0 TSV=232632046 TSER=643552423
0.001906 213.202.97.156 -> 161.53.160.11 HTTP GET /debian/ls-lR.gz HTTP/1.0
0.003625 161.53.160.11 -> 213.202.97.156 TCP 80 51657 [ACK] Seq=1 Ack=131 Win=6912 Len=0 TSV=643552423 TSER=232632046

And then it sits there. The server machine (which I happen to have control over) says:

0.00 213.202.97.156 -> 161.53.160.11 TCP 51660 80 [SYN] Seq=0 Win=5840 Len=0 MSS=1460 TSV=232668023 TSER=0 WS=1
0.23 161.53.160.11 -> 213.202.97.156 TCP 80 51660 [SYN, ACK] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=643588400 TSER=232668023 WS=8
0.003117 213.202.97.156 -> 161.53.160.11 TCP 51660 80 [ACK] Seq=1 Ack=1 Win=5840 Len=0 TSV=232668024 TSER=643588400
0.003125 213.202.97.156 -> 161.53.160.11 HTTP GET /debian/ls-lR.gz HTTP/1.0
0.003145 161.53.160.11 -> 213.202.97.156 TCP 80 51660 [ACK] Seq=1 Ack=131 Win=6912 Len=0 TSV=643588401 TSER=232668024
0.003480 161.53.160.11 -> 213.202.97.156 TCP [TCP segment of a reassembled PDU]
0.003500 161.53.160.11 -> 213.202.97.156 TCP [TCP segment of a reassembled PDU]
0.204965 161.53.160.11 -> 213.202.97.156 TCP [TCP Retransmission] [TCP segment of a reassembled PDU]
0.613959 161.53.160.11 -> 213.202.97.156 TCP [TCP Retransmission] [TCP segment of a reassembled PDU]
1.428964 161.53.160.11 -> 213.202.97.156 TCP [TCP Retransmission] [TCP segment of a reassembled PDU]
3.061959 161.53.160.11 -> 213.202.97.156 TCP [TCP Retransmission] [TCP segment of a reassembled PDU]
6.329958 161.53.160.11 -> 213.202.97.156 TCP [TCP Retransmission] [TCP segment of a reassembled PDU]
12.853960 161.53.160.11 -> 213.202.97.156 TCP [TCP Retransmission] [TCP segment of a reassembled PDU]

And then I Ctrl+C that wget, and the traces show:

(on the client)
8.017451 213.202.97.156 -> 161.53.160.11 TCP 51664 80 [FIN, ACK] Seq=131 Ack=1 Win=5840 Len=0 TSV=232696067 TSER=643614440
8.057740 161.53.160.11 -> 213.202.97.156 TCP [TCP Previous segment lost] 80 51664 [ACK] Seq=4345 Ack=132 Win=6912 Len=0 TSV=643616454 TSER=232696067

(on the server)
8.017218 213.202.97.156 -> 161.53.160.11 TCP 51664 80 [FIN, ACK] Seq=131 Ack=1 Win=5840 Len=0 TSV=232696067 TSER=643614440
8.055647 161.53.160.11 -> 213.202.97.156 TCP 80 51664 [ACK] Seq=4345 Ack=132 Win=6912 Len=0
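As a side note on the server-side trace above: the spacing of the retransmissions can be checked with a few lines of Python. This is only a sketch over the six retransmission timestamps copied from that trace; the variable names are made up for illustration.

```python
# Timestamps (seconds) of the "[TCP Retransmission]" lines in the server trace.
retransmit_times = [0.204965, 0.613959, 1.428964, 3.061959, 6.329958, 12.853960]

# Gap between each retransmission and the next one.
gaps = [round(later - earlier, 6)
        for earlier, later in zip(retransmit_times, retransmit_times[1:])]

# Ratio of each gap to the previous gap.
ratios = [round(g2 / g1, 2) for g1, g2 in zip(gaps, gaps[1:])]

print("gaps:  ", gaps)
print("ratios:", ratios)
```

Each gap comes out at roughly double the previous one, which is what TCP's exponential retransmission backoff looks like when the sender never receives an ACK for its data. That is consistent with the client-side trace, where the data segments do not appear at all.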
Bug#599161: ditto
On Tue, Jan 03, 2012 at 01:42:38PM +, Ian Campbell wrote:
On Wed, 2011-12-28 at 01:49 +0100, Josip Rodin wrote:

This clock jump by 2999 seconds also happened here, so per:
http://old-list-archives.xen.org/archives/html/xen-devel/2011-02/msg01557.html
we switched to clocksource=pit in /etc/default/grub's $GRUB_CMDLINE_XEN on the dom0.

This seemed to have avoided the problem, but since then, the clock jumps started happening like this:

Dec 21 19:42:23 dom0machine kernel: [6034768.658836] Clocksource tsc unstable (delta = -811538856601 ns)

In addition, now I checked what the said machine thinks is its clocksource:

% cat /sys/devices/system/clocksource/clocksource0/current_clocksource /sys/devices/system/clocksource/clocksource0/available_clocksource
xen
xen

So there's neither pit nor tsc in the available list :)

A PV kernel will (or should) always use xen as it's clocksource. This is a PV timesource based around the TSC + correction factors (to account for drift and PCPU migration).

The clocksource=pit on the hypervisor command line controls the hypervisor's own timesource and not the dom0 kernels. I'm not sure how you query the hypervisor for its timesource but I guess it'll be in xl dmesg somewhere (Platform timer is ...).

Ah, d'oh :) sorry, I wasn't really thinking.

The xm dmesg output on HP DL360 machines that we have set to clocksource=pit, and that have nevertheless happened to shift by more than 35996 seconds in at least five incidents in the last six months, says:

(XEN) Platform timer is 1.193MHz PIT

On a couple of FS RX300's that happened not to have clocksource=pit set but had time shift by 2999.69 seconds it's this:

(XEN) Platform timer is 14.318MHz HPET

Both also show the following message after the time shift:

(XEN) Platform timer appears to have unexpectedly wrapped 10 or more times.

The message you quote above says *tsc* unstable. Prior to that was the system actually using the tsc clocksource? It really shouldn't have been... Before that message did available_clocksource contain TSC? What about current_clocksource? (Before here ~= on a freshly booted system)

The dom0 machines where we set clocksource=pit do see the sole xen clocksource. That didn't stop the time from going awry.

On the dom0 machines that don't have the hypervisor fixated on clocksource=pit:

* one dom0 that sees both xen and tsc in available_clocksource, but uses xen as current_clocksource. Not sure what it used at the time of the failure in September, probably the same because we didn't touch that.

* one that recently failed has:

% dmesg | grep unstable
[4613030.883101] Clocksource tsc unstable (delta = -2999660301416 ns)
% cat /sys/devices/system/clocksource/clocksource0/*
xen
xen

What are your exact hypervisor and kernel command lines? Other than clocksource=pit are you overriding anything else in this regard?

Most of the machines now seem to have:

GRUB_CMDLINE_LINUX=console=tty0 console=ttyS1,115200n1 elevator=deadline
GRUB_CMDLINE_XEN=dom0_mem=512M clocksource=pit cpuidle=0

The machines without clocksource=pit only had dom0_mem=512M for the hypervisor and nothing for the dom0 kernel.

Can you press the 's' hypervisor debug key and report the resulting text from dmesg. (press a debug key == xl debug-key s + xl dmesg or press Ctrl-A 3 times on serial then press 's').

(Note that I used xm for both of those commands, I don't have xl.)

This is the output on a couple of the DL360's with clocksource=pit:

(XEN) TSC has constant rate, deep Cstates possible, so not reliable, warp=3066 (count=1)
(XEN) dom2: mode=0,ofs=0x21e231c896,khz=2333479,inc=1,vtsc count: 10647611967 kernel, 454486411 user
(XEN) dom12: mode=0,ofs=0x21a01e68ddeb,khz=2333479,inc=1,vtsc count: 2478607037 kernel, 199833427 user
(XEN) dom17: mode=0,ofs=0x8d12c3820bf0b,khz=2333479,inc=1,vtsc count: 918220049 kernel, 56818086 user
(XEN) dom18: mode=0,ofs=0x8d1334e2f635f,khz=2333479,inc=1,vtsc count: 4707785417 kernel, 197043637 user
(XEN) dom21: mode=0,ofs=0x1004cc1e5bf801,khz=2333479,inc=1,vtsc count: 6386763431 kernel, 166512523 user
(XEN) dom22: mode=0,ofs=0x14b5955232a7e1,khz=2333479,inc=1,vtsc count: 2218555643 kernel, 88962103 user

(XEN) TSC has constant rate, deep Cstates possible, so not reliable, warp=1715 (count=1)
(XEN) dom1: mode=0,ofs=0x149170bd5f,khz=2333479,inc=1,vtsc count: 36234921552 kernel, 294922844 user

This is the output on an RX300 without clocksource=pit:

(XEN) TSC marked as reliable, warp = 0 (count=2)
(XEN) dom1: mode=0,ofs=0x59e046806,khz=2400116,inc=1
(XEN) No domains have emulated TSC

And finally this is the output on the odd machine that has tsc as an available clock source:

(XEN) TSC marked as reliable, warp = 0 (count=2)
(XEN) dom1: mode=0,ofs=0x593b1f9e8,khz=2400190,inc=1
(XEN) dom4: mode=0,ofs=0xf3c77d49e41e6,khz=2400190,inc=1
(XEN) No domains have emulated TSC

In the latter case, I've no idea why the domU with the ID 4
Bug#599161: ditto
This clock jump by 2999 seconds also happened here, so per:
http://old-list-archives.xen.org/archives/html/xen-devel/2011-02/msg01557.html
we switched to clocksource=pit in /etc/default/grub's $GRUB_CMDLINE_XEN on the dom0.

This seemed to have avoided the problem, but since then, the clock jumps started happening like this:

Dec 21 19:42:23 dom0machine kernel: [6034768.658836] Clocksource tsc unstable (delta = -811538856601 ns)

In addition, now I checked what the said machine thinks is its clocksource:

% cat /sys/devices/system/clocksource/clocksource0/current_clocksource /sys/devices/system/clocksource/clocksource0/available_clocksource
xen
xen

So there's neither pit nor tsc in the available list :)

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20111228004915.ga21...@entuzijast.net
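For readers comparing the "Clocksource tsc unstable" deltas with the jump sizes mentioned in this thread, the nanosecond values convert as sketched below. The two deltas are copied from the kernel messages quoted in the thread; delta_seconds is a hypothetical helper name.

```python
def delta_seconds(delta_ns):
    """Convert a 'delta = ... ns' value from the kernel log into seconds."""
    return delta_ns / 1e9


# Deltas taken from the two "Clocksource tsc unstable" messages in the thread.
deltas_ns = [-811538856601, -2999660301416]

for ns in deltas_ns:
    print(f"{ns} ns = {delta_seconds(ns):.2f} s")
```

The second value works out to about -2999.66 seconds, in line with the roughly 2999 second jump that opened this report; the first is a different, smaller jump of about 811.5 seconds.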
Bug#622779: sparc config missing SERIAL_8250{,_PCI}
On Fri, Apr 15, 2011 at 04:14:12AM +0100, Ben Hutchings wrote:
On Thu, 2011-04-14 at 17:10 +0200, Josip Rodin wrote:

Package: linux-image-2.6.32-5-sparc64
Version: 2.6.32-31

Hi,

/boot/config-2.6.32-5-sparc64 does not include CONFIG_SERIAL_8250 or SERIAL_8250_PCI, so it's impossible to use PCI cards with serial ports on them, which is useful for accessing e.g. serial consoles of other machines from a sparc machine.

[...]

Is a module OK or do you want it built-in for some reason? (That would be necessary for a serial console, but you can presumably already use the built-in serial port.)

Yes, in fact I only tested it as a module :)

For the serial console of the sparc machine itself, the ttyS* of sun* serial modules are used, and on server hardware these usually hardwire to ALOM. This is to be able to get a usable physical port to connect to another machine.

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20110415072354.ga7...@entuzijast.net
Bug#622779: sparc config missing SERIAL_8250{,_PCI}
Package: linux-image-2.6.32-5-sparc64
Version: 2.6.32-31

Hi,

/boot/config-2.6.32-5-sparc64 does not include CONFIG_SERIAL_8250 or SERIAL_8250_PCI, so it's impossible to use PCI cards with serial ports on them, which is useful for accessing e.g. serial consoles of other machines from a sparc machine.

This used to be impossible (upstream), but the issue that had caused that has long been fixed. (We got it to work in March 2009, judging by a /boot/config-2.6.28.7 of mine.)

Please add this so that the PCI serial cards in lebrun.debian.org and schroeder.debian.org stop being useless with the default kernel.

TIA :)

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20110414151032.ga7...@entuzijast.net
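Checking a kernel config for the options named above can be automated. The following is a rough sketch that parses config text in the usual "CONFIG_X=y" / "# CONFIG_X is not set" form. The helper names option_states and missing_options are hypothetical, and sample_config is an invented two-line excerpt, not the contents of the real /boot/config-2.6.32-5-sparc64.

```python
def option_states(config_text):
    """Map each CONFIG_ option in the text to 'y', 'm', another value, or 'n'."""
    states = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            states[name] = value
        elif line.startswith("# CONFIG_") and line.endswith(" is not set"):
            states[line[2:-len(" is not set")]] = "n"
    return states


def missing_options(config_text, wanted):
    """Return the wanted options that are absent or explicitly not set."""
    states = option_states(config_text)
    return [name for name in wanted if states.get(name, "n") == "n"]


# Hypothetical excerpt, for illustration only.
sample_config = """\
CONFIG_SERIAL_SUNSU=y
# CONFIG_SERIAL_8250 is not set
"""

wanted = ["CONFIG_SERIAL_8250", "CONFIG_SERIAL_8250_PCI"]
print(missing_options(sample_config, wanted))
```

On a real system the text would come from reading the file under /boot; an option built as a module ("=m") counts as present here, which matches the later remark in this bug that the driver was only tested as a module.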
Bug#610118: swapper: page allocation failure. order:0, mode:0x4020
Package: linux-image-2.6.32-5-xen-amd64 Version: 2.6.32-23~bpo50+1 Hi, Something like this was mentioned misplaced in #592497, and about a different network driver, so I'm filing a new bug because it should be unrelated to both issues over there :) I've just seen something similar with a tg3, randomly during normal operation (after 100 days of uptime), but also on a Xen dom0. Nothing very bad seems to have happened, the domUs haven't complained at all, but it's suspicious. Jan 15 13:39:10 virgo kernel: [9658533.157616] swapper: page allocation failure. order:0, mode:0x4020 Jan 15 13:39:10 virgo kernel: [9658533.157634] Pid: 0, comm: swapper Not tainted 2.6.32-bpo.5-xen-amd64 #1 Jan 15 13:39:10 virgo kernel: [9658533.157642] Call Trace: Jan 15 13:39:10 virgo kernel: [9658533.157649] IRQ [810baed4] ? __alloc_pages_nodemask+0x55b/0x5cf Jan 15 13:39:10 virgo kernel: [9658533.157677] [81295624] ? tcp_rcv_established+0x688/0x6d9 Jan 15 13:39:10 virgo kernel: [9658533.157690] [810e70d2] ? new_slab+0x5b/0x1ca Jan 15 13:39:10 virgo kernel: [9658533.157700] [810e7431] ? __slab_alloc+0x1f0/0x39b Jan 15 13:39:10 virgo kernel: [9658533.157712] [81258103] ? __netdev_alloc_skb+0x29/0x43 Jan 15 13:39:10 virgo kernel: [9658533.157723] [810e7e63] ? __kmalloc_node_track_caller+0xbb/0x11b Jan 15 13:39:10 virgo kernel: [9658533.157734] [81258103] ? __netdev_alloc_skb+0x29/0x43 Jan 15 13:39:10 virgo kernel: [9658533.157744] [81257518] ? __alloc_skb+0x69/0x15a Jan 15 13:39:10 virgo kernel: [9658533.157754] [81258103] ? __netdev_alloc_skb+0x29/0x43 Jan 15 13:39:10 virgo kernel: [9658533.157785] [a0019c25] ? tg3_alloc_rx_skb+0xd2/0x146 [tg3] Jan 15 13:39:10 virgo kernel: [9658533.157805] [a0021292] ? tg3_poll+0x484/0x93d [tg3] Jan 15 13:39:10 virgo kernel: [9658533.157818] [8100e5b5] ? xen_force_evtchn_callback+0x9/0xa Jan 15 13:39:10 virgo kernel: [9658533.157829] [8100ec72] ? check_events+0x12/0x20 Jan 15 13:39:10 virgo kernel: [9658533.157840] [8125e633] ? 
net_rx_action+0xae/0x1c9 Jan 15 13:39:10 virgo kernel: [9658533.157852] [810548ca] ? __do_softirq+0xdd/0x19f Jan 15 13:39:10 virgo kernel: [9658533.157863] [81012cac] ? call_softirq+0x1c/0x30 Jan 15 13:39:10 virgo kernel: [9658533.157873] [8101422b] ? do_softirq+0x3f/0x7c Jan 15 13:39:10 virgo kernel: [9658533.157883] [81054739] ? irq_exit+0x36/0x76 Jan 15 13:39:10 virgo kernel: [9658533.157894] [811f14b1] ? xen_evtchn_do_upcall+0x33/0x42 Jan 15 13:39:10 virgo kernel: [9658533.157905] [81012cfe] ? xen_do_hypervisor_callback+0x1e/0x30 Jan 15 13:39:10 virgo kernel: [9658533.157913] EOI [810093aa] ? hypercall_page+0x3aa/0x1001 Jan 15 13:39:10 virgo kernel: [9658533.157929] [810093aa] ? hypercall_page+0x3aa/0x1001 Jan 15 13:39:10 virgo kernel: [9658533.157940] [8100e633] ? xen_safe_halt+0xc/0x15 Jan 15 13:39:10 virgo kernel: [9658533.157950] [8100bf3f] ? xen_idle+0x37/0x40 Jan 15 13:39:10 virgo kernel: [9658533.157959] [81010eb1] ? cpu_idle+0xa2/0xda Jan 15 13:39:10 virgo kernel: [9658533.157977] [81502cd1] ? start_kernel+0x3dc/0x3e8 Jan 15 13:39:10 virgo kernel: [9658533.157987] [81504c7d] ? 
xen_start_kernel+0x57c/0x581 Jan 15 13:39:10 virgo kernel: [9658533.157995] Mem-Info: Jan 15 13:39:10 virgo kernel: [9658533.158000] Node 0 DMA per-cpu: Jan 15 13:39:10 virgo kernel: [9658533.158009] CPU0: hi:0, btch: 1 usd: 0 Jan 15 13:39:10 virgo kernel: [9658533.158017] CPU1: hi:0, btch: 1 usd: 0 Jan 15 13:39:10 virgo kernel: [9658533.158024] CPU2: hi:0, btch: 1 usd: 0 Jan 15 13:39:10 virgo kernel: [9658533.158031] CPU3: hi:0, btch: 1 usd: 0 Jan 15 13:39:10 virgo kernel: [9658533.158038] Node 0 DMA32 per-cpu: Jan 15 13:39:10 virgo kernel: [9658533.158046] CPU0: hi: 186, btch: 31 usd: 197 Jan 15 13:39:10 virgo kernel: [9658533.158053] CPU1: hi: 186, btch: 31 usd: 182 Jan 15 13:39:10 virgo kernel: [9658533.158061] CPU2: hi: 186, btch: 31 usd: 140 Jan 15 13:39:10 virgo kernel: [9658533.158068] CPU3: hi: 186, btch: 31 usd: 85 Jan 15 13:39:10 virgo kernel: [9658533.158080] active_anon:5831 inactive_anon:8962 isolated_anon:0 Jan 15 13:39:10 virgo kernel: [9658533.158082] active_file:36954 inactive_file:24010 isolated_file:0 Jan 15 13:39:10 virgo kernel: [9658533.158084] unevictable:5 dirty:2821 writeback:0 unstable:0 Jan 15 13:39:10 virgo kernel: [9658533.158086] free:753 slab_reclaimable:25777 slab_unreclaimable:3367 Jan 15 13:39:10 virgo kernel: [9658533.158088] mapped:4481 shmem:151 pagetables:489 bounce:0 Jan 15 13:39:10 virgo kernel: [9658533.158109] Node 0 DMA free:1976kB min:80kB low:100kB high:120kB active_anon:308kB inactive_anon:512kB active_file:5712kB inactive_file:3932kB unevictable:0kB isolated(anon):0kB
Bug#598057: our xen-netfront in featureset=xen kernels has smartpoll enabled, but probably shouldn't
Package: linux-image-2.6.32-5-xen-amd64
Version: 2.6.32-23

Hi,

I just witnessed a strange situation - a domU had its kernel updated from 2.6.32-4-amd64 to 2.6.32-bpo.5-xen-amd64, and all seemed well, but after two hours it stopped responding on its (statically configured) eth0 device.

tcpdump of the bridge and the vif on the dom0 said that everything was all right, the domU was making ARP requests for its gateway, and the gateway was responding, but then the domU would just repeat the same request, over and over again.

That sounded to me like a xen-netfront problem. I happen to watch xen.git, so I know about this recent back-and-forth:

http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=shortlog;h=refs/heads/xen/netfront

After September 10th this year, several bugs were identified in this new smartpoll logic. I then checked in our package, and we seem to be using an August 13th copy, so we're missing those.

I'm not at all sure that this was the issue in my problem, because I don't completely grok all that stuff - I don't exactly know if xennet_interrupt() or smart_poll_function() are what's getting stuck in my use case - but debian/patches/features/all/xen/pvops.patch includes the smartpoll changes and plain drivers/net/xen-netfront.c doesn't, and the latter has worked fine for many months here while the former screwed us shortly after installation, so that's suspicious enough for me.

Please update the kernel pvops patch to include the more recent xen/netfront branch - where the benefit is both in that known smartpoll bugs are fixed, and this new feature is turned off by default until upstream is more comfortable with it.

TIA.

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20100925230436.ga3...@entuzijast.net
Bug#598057: our xen-netfront in featureset=xen kernels has smartpoll enabled, but probably shouldn't
forcemerge 596635 598057
thanks

On Sun, Sep 26, 2010 at 12:34:28AM +0100, Ben Hutchings wrote:

We know; this is already going to be fixed:

* [x86/xen] Disable netfront's smartpoll mode by default. (Closes: #596635)

Sorry, I didn't check the applicable bug list before sending, because of the simple fact that linux-2.6's bug page wouldn't load within half a dozen times 8 seconds, so I gave up and just checked linux-image-2.6.32-5-xen-amd64's bug page, which didn't have anything that resembled this. Oh well.

--
2. That which causes joy or happiness.

Archive: http://lists.debian.org/20100926003128.ga24...@entuzijast.net
Bug#597276: qla2xxx_eh_abort(5) - kernel NULL pointer dereference
On Sun, Sep 19, 2010 at 11:44:50PM -0700, Giridhar Malavali wrote: Thanks for letting us know about this problem. Can u please provide logs with ql2xextended_error_logging enabled. Also, can u please provide more details about the test case. OK. The machine has this hardware: % sudo lspci -v [...] 0b:00.0 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 02) Subsystem: Hewlett-Packard Company Device 7041 Flags: bus master, fast devsel, latency 0, IRQ 16 I/O ports at 5000 [size=256] Memory at fdef (64-bit, non-prefetchable) [size=16K] [virtual] Expansion ROM at d000 [disabled] [size=256K] Capabilities: [44] Power Management version 2 Capabilities: [4c] Express Endpoint, MSI 00 Capabilities: [64] Message Signalled Interrupts: Mask- 64bit+ Queue=0/4 Enable- Capabilities: [74] Vital Product Data ? Capabilities: [7c] MSI-X: Enable- Mask- TabSize=16 Capabilities: [100] Advanced Error Reporting ? Capabilities: [138] Power Budgeting ? Kernel driver in use: qla2xxx Kernel modules: qla2xxx 0b:00.1 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 02) Subsystem: Hewlett-Packard Company Device 7041 Flags: bus master, fast devsel, latency 0, IRQ 17 I/O ports at 5400 [size=256] Memory at fdee (64-bit, non-prefetchable) [size=16K] [virtual] Expansion ROM at d004 [disabled] [size=256K] Capabilities: [44] Power Management version 2 Capabilities: [4c] Express Endpoint, MSI 00 Capabilities: [64] Message Signalled Interrupts: Mask- 64bit+ Queue=0/4 Enable- Capabilities: [74] Vital Product Data ? Capabilities: [7c] MSI-X: Enable- Mask- TabSize=16 Capabilities: [100] Advanced Error Reporting ? Capabilities: [138] Power Budgeting ? Kernel driver in use: qla2xxx Kernel modules: qla2xxx 13:00.0 Fibre Channel: QLogic Corp. 
ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 02) Subsystem: Hewlett-Packard Company Device 7041 Flags: bus master, fast devsel, latency 0, IRQ 17 I/O ports at 6000 [size=256] Memory at fdff (64-bit, non-prefetchable) [size=16K] [virtual] Expansion ROM at d020 [disabled] [size=256K] Capabilities: [44] Power Management version 2 Capabilities: [4c] Express Endpoint, MSI 00 Capabilities: [64] Message Signalled Interrupts: Mask- 64bit+ Queue=0/4 Enable- Capabilities: [74] Vital Product Data ? Capabilities: [7c] MSI-X: Enable- Mask- TabSize=16 Capabilities: [100] Advanced Error Reporting ? Capabilities: [138] Power Budgeting ? Kernel driver in use: qla2xxx Kernel modules: qla2xxx 13:00.1 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 02) Subsystem: Hewlett-Packard Company Device 7041 Flags: bus master, fast devsel, latency 0, IRQ 18 I/O ports at 6400 [size=256] Memory at fdfe (64-bit, non-prefetchable) [size=16K] [virtual] Expansion ROM at d024 [disabled] [size=256K] Capabilities: [44] Power Management version 2 Capabilities: [4c] Express Endpoint, MSI 00 Capabilities: [64] Message Signalled Interrupts: Mask- 64bit+ Queue=0/4 Enable- Capabilities: [74] Vital Product Data ? Capabilities: [7c] MSI-X: Enable- Mask- TabSize=16 Capabilities: [100] Advanced Error Reporting ? Capabilities: [138] Power Budgeting ? Kernel driver in use: qla2xxx Kernel modules: qla2xxx Anyway, we had been running an earlier 2.6.32 kernel up until a few days ago, which gave us this on boot: [2.656008] QLogic Fibre Channel HBA Driver: 8.03.01-k6-debug [2.656188] qla2xxx :0b:00.0: PCI INT A - GSI 16 (level, low) - IRQ 16 [2.710842] qla2xxx :0b:00.0: Found an ISP2432, irq 16, iobase 0xc9c6c000 [2.719526] qla2xxx :0b:00.0: MSI-X: Unsupported ISP2432 (0x2, 0x0). [2.727776] alloc irq_desc for 61 on node -1 [2.727778] alloc kstat_irqs on node -1 [2.728002] qla2xxx :0b:00.0: irq 61 for MSI/MSI-X [2.728184] qla2xxx :0b:00.0: MSI: Enabled. 
[2.732040] IRQ 59/cciss0: IRQF_DISABLED is not guaranteed on shared IRQs [2.732058] cciss0: 0x3230 at PCI :06:00.0 IRQ 59 using DAC [2.747326] qla2xxx :0b:00.0: Configuring PCI space... [2.747479] cciss/c0d0: p1 [2.755773] qla2xxx :0b:00.0: setting latency timer to 64 [2.756280] p2 [2.760467] qla2xxx :0b:00.0: FLTL[DEF] = 0x11400. [2.773807] qla2xxx :0b:00.0: FLT[DEF]: boot=0x0 fw=0x2 vpd_nvram=0x48000 vpd=0x0 nvram=0x0 fdt=0x11000 flt=0x11400 [2.787143] qla2xxx :0b:00.0: FDT[MID]: (0xbf/0x80) erase=0x7ffd0352 pro=0 upro=0 wrtd=0x9c blk=0x8000. [2.789701] qla2xxx :0b:00.0:
Bug#597276: qla2xxx_eh_abort(5) - kernel NULL pointer dereference
Package: linux-2.6 Version: 2.6.32-21~bpo50+1 Hi, Got this in dmesg on a server: Sep 18 02:46:52 birdun kernel: [387093.744649] qla2xxx_eh_abort(5): aborting sp 8801b58013c0 from RISC. pid=46881441. Sep 18 02:46:56 birdun kernel: [387093.836909] BUG: unable to handle kernel NULL pointer dereference at 0040 Sep 18 02:46:56 birdun kernel: [387093.924511] IP: [812f8ea1] _spin_lock_irqsave+0x1a/0x34 Sep 18 02:46:56 birdun kernel: [387093.996511] PGD 22d846067 PUD 22d678067 PMD 0 Sep 18 02:46:56 birdun kernel: [387094.048511] Oops: 0002 [#1] SMP Sep 18 02:46:56 birdun kernel: [387094.086651] last sysfs file: /sys/devices/pci:00/:00:04.0/:13:00.0/host4/rport-4:0-3/target4:0:3/fc_transport/target4:0:3/node_name Sep 18 02:46:56 birdun kernel: [387094.236007] CPU 4 Sep 18 02:46:56 birdun kernel: [387094.260007] Modules linked in: ipmi_devintf nf_conntrack_ipv6 ip6t_LOG ip6table_filter ip6_tables xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT ipt_LOG iptable_filter ip_tables x_tables bonding xfs exportfs dm_round_robin dm_multipath scsi_dh loop snd_pcsp snd_pcm snd_timer psmouse ipmi_si rng_core snd soundcore i5000_edac serio_raw hpilo ipmi_msghandler snd_page_alloc edac_core evdev container i5k_amb button processor shpchp pci_hotplug ext3 jbd mbcache dm_mirror dm_region_hash dm_log dm_snapshot dm_mod st ch osst sd_mod crc_t10dif sg sr_mod cdrom ata_piix ata_generic qla2xxx scsi_transport_fc libata scsi_tgt cciss usbhid hid bnx2 ehci_hcd uhci_hcd floppy usbcore nls_base scsi_mod thermal fan thermal_sys Sep 18 02:46:56 birdun kernel: [387095.008511] Pid: 763, comm: scsi_eh_5 Not tainted 2.6.32-bpo.5-amd64 #1 ProLiant DL360 G5 Sep 18 02:46:56 birdun kernel: [387095.104511] RIP: 0010:[812f8ea1] [812f8ea1] _spin_lock_irqsave+0x1a/0x34 Sep 18 02:46:56 birdun kernel: [387095.204007] RSP: 0018:88022b1c5d70 EFLAGS: 00010082 Sep 18 02:46:56 birdun kernel: [387095.264511] RAX: 0282 RBX: 0040 RCX: 381d Sep 18 02:46:56 birdun kernel: [387095.348511] RDX: 
0001 RSI: 0282 RDI: 0040 Sep 18 02:46:56 birdun kernel: [387095.432258] RBP: 8801b58013c0 R08: 000a26c8 R09: 000a Sep 18 02:46:56 birdun kernel: [387095.512512] R10: R11: 81673868 R12: 0001 Sep 18 02:46:56 birdun kernel: [387095.596512] R13: 88014066e100 R14: 8801b5801e80 R15: Sep 18 02:46:56 birdun kernel: [387095.684513] FS: () GS:880008d0() knlGS: Sep 18 02:46:56 birdun kernel: [387095.780002] CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b Sep 18 02:46:56 birdun kernel: [387095.844512] CR2: 0040 CR3: 00022d42b000 CR4: 06e0 Sep 18 02:46:56 birdun kernel: [387095.928512] DR0: DR1: DR2: Sep 18 02:46:56 birdun kernel: [387096.012511] DR3: DR6: 0ff0 DR7: 0400 Sep 18 02:46:56 birdun kernel: [387096.096005] Process scsi_eh_5 (pid: 763, threadinfo 88022b1c4000, task 88022ba39c40) Sep 18 02:46:56 birdun kernel: [387096.192511] Stack: Sep 18 02:46:56 birdun kernel: [387096.216511] 381d a014cb8b 0286 Sep 18 02:46:56 birdun kernel: [387096.300959] 0 ff10 8801b58013c0 2002 0286 Sep 18 02:46:56 birdun kernel: [387096.390206] 0 88022df0a900 88022b1c 88022b881840 a01407e4 Sep 18 02:46:56 birdun kernel: [387096.480511] Call Trace: Sep 18 02:46:56 birdun kernel: [387096.508511] [a014cb8b] ? qla24xx_abort_command+0x3f/0x1db [qla2xxx] Sep 18 02:46:56 birdun kernel: [387096.592513] [a01407e4] ? qla2xxx_eh_abort+0xf2/0x250 [qla2xxx] Sep 18 02:46:56 birdun kernel: [387096.672511] [a001ccde] ? scsi_error_handler+0x302/0x5b5 [scsi_mod] Sep 18 02:46:56 birdun kernel: [387096.756512] [a001c9dc] ? scsi_error_handler+0x0/0x5b5 [scsi_mod] Sep 18 02:46:56 birdun kernel: [387096.836513] [81063601] ? kthread+0x79/0x81 Sep 18 02:46:56 birdun kernel: [387096.896512] [81011baa] ? child_rip+0xa/0x20 Sep 18 02:46:56 birdun kernel: [387096.956511] [81063588] ? kthread+0x0/0x81 Sep 18 02:46:56 birdun kernel: [387097.012512] [81011ba0] ? 
child_rip+0x0/0x20 Sep 18 02:46:56 birdun kernel: [387097.072511] Code: 31 d2 89 d0 c3 f0 83 2f 01 79 05 e8 ca ae e9 ff c3 48 83 ec 08 9c 58 0f 1f 44 00 00 48 89 c6 fa 66 0f 1f 44 00 00 ba 00 00 01 00 f0 0f c1 17 0f b7 ca c1 ea 10 39 d1 74 07 f3 90 0f b7 0f eb f5 Sep 18 02:46:56 birdun kernel: [387097.292511] RIP [812f8ea1] _spin_lock_irqsave+0x1a/0x34 Sep 18 02:46:56 birdun kernel: [387097.364514] RSP 88022b1c5d70 Sep 18 02:46:56 birdun kernel: [387097.404511] CR2: 0040 Sep 18 02:46:56 birdun kernel:
Bug#594604: linux-image-2.6.32-5-sparc64-smp: Kernel panic - not syncing: Irrecoverable deferred error trap.
On Fri, Aug 27, 2010 at 07:29:13PM -0700, David Miller wrote: From: Josip Rodin j...@debbugs.entuzijast.net Date: Fri, 27 Aug 2010 21:31:37 +0200 David, can you please queue this sunxvr500.c post-2.6.32 bugfix to sta...@kernel.org? http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=bdd32ce95f79fb5cc964cd789d7ae4500bba7c6f Same story as the last one :) As you asked me to last week, I submitted this fix to all of the active branches of -stable earlier this week, so it should show up in the next round of -stable releases. :-) Oh, I'm sorry about that, I actually thought it was a different one. Oh well, better safe than sorry :) Thanks. -- 2. That which causes joy or happiness. -- To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20100828150515.ga24...@entuzijast.net
Bug#594604: linux-image-2.6.32-5-sparc64-smp: Kernel panic - not syncing: Irrecoverable deferred error trap.
On Fri, Aug 27, 2010 at 11:49:06PM +0800, James Andrewartha wrote: Package: linux-2.6 Version: 2.6.32-21 Severity: important Tags: patch I'm getting the same error and kernel log as mentioned in http://thread.gmane.org/gmane.linux.ports.sparc/13092 for which there is a patch at http://thread.gmane.org/gmane.linux.ports.sparc/13092/focus=13101 The hardware is a SunBlade 2000 with an XVR-1200. I've tested the patch against linux-source-2.6.32 version 2.6.32-21 and it boots successfully. This patch was included in 2.6.34 bdd32ce95f79fb5cc964cd789d7ae4500bba7c6f. David, can you please queue this sunxvr500.c post-2.6.32 bugfix to sta...@kernel.org? http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=bdd32ce95f79fb5cc964cd789d7ae4500bba7c6f Same story as the last one :) -- 2. That which causes joy or happiness. -- To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20100827193137.ga22...@entuzijast.net
Bug#574243: please restore Sun XVR video drivers, and add the latest one [was Re: Debian Sparc 5.4 on Sun blade 2500]
On Sun, Aug 01, 2010 at 09:45:45PM -0400, Moritz Muehlenhoff wrote: On Wed, Mar 17, 2010 at 01:13:45AM +0100, Josip Rodin wrote: Package: linux-2.6 Version: 2.6.32-9 On Tue, Mar 16, 2010 at 03:24:37PM -0700, David Miller wrote: Josip, please make sure this gets fixed, please get my sunxvr1000 driver added (attached) and then add: CONFIG_FB_XVR500=y CONFIG_FB_XVR2500=y CONFIG_FB_XVR1000=y to the config for sparc64. Ok, further checking shows that lenny has XVR500 and XVR2500 enabled (doing a test install with a XVR-500 card right now) but testing doesn't. Indeed, they seem to have gone missing somehow from debian/config/sparc/config where there's just: # CONFIG_FB_LEO is not set # CONFIG_FB_S1D13XXX is not set whereas in the same source we have arch/sparc/configs/sparc64_defconfig where there's: # CONFIG_FB_LEO is not set CONFIG_FB_XVR500=y CONFIG_FB_XVR2500=y # CONFIG_FB_S1D13XXX is not set I'm filing a bug report with this message, thanks for the exact hint. As for the new driver that Dave mentioned as an attachment, it's already in mainline at: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2d378b9179881b46a0faf11430efb421fe03ddd8 That should apply pretty easily to .32 stable. I've added the patch and activated it in the Sparc config. Do you have the hardware, can you test a build? Me personally, no, but if you post a link I'm sure someone will crop up :) Just yesterday we had someone ask about XVR-1200. -- 2. That which causes joy or happiness. -- To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20100802074356.ga29...@orion.carnet.hr
upgrade to new xen domU on old xen dom0?
Hi, If I try to boot 2.6.32-4-xen-amd64 on a 2.6.26-2-xen-amd64 (lenny) dom0, it gets stuck at: [0.120653] XENBUS: Device with no driver: device/vbd/769 [0.120658] XENBUS: Device with no driver: device/vif/0 [0.120663] XENBUS: Device with no driver: device/console/0 [0.120679] /build/mattems-linux-2.6_2.6.32-10-amd64-Ff7Wwa/linux-2.6-2.6.32-10/debian/build/source_amd64_xen/drivers/rtc/hctosys.c: unable to open rtc device (rtc0) [0.120822] Freeing unused kernel memory: 588k freed [0.121088] Write protecting the kernel read-only data: 4264k Loading, please wait... Begin: Loading essential drivers ... done. Begin: Running /scripts/init-premount ... FATAL: Error inserting fan (/lib/modules/2.6.32-4-xen-amd64/kernel/drivers/acpi/fan.ko): No such device FATAL: Error inserting thermal (/lib/modules/2.6.32-4-xen-amd64/kernel/drivers/acpi/thermal.ko): No such device [0.610445] blkfront: xvda1: barriers enabled done. Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done. Begin: Waiting for root file system ... Can anything be done? I thought the domUs were supposed to be a safe upgrade? -- 2. That which causes joy or happiness. -- To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20100327105625.ga20...@orion.carnet.hr
Re: upgrade to new xen domU on old xen dom0?
On Sat, Mar 27, 2010 at 12:02:01PM +, Ian Campbell wrote: xen-blkfront is a module in the pvops-based 2.6.32-x-xen-amd64, whereas it was statically linked in the non-pvops 2.6.26-x-xen-amd64 images. This already happened in Lenny for 32 bit guests (sort of) since the -686-bigmem kernel (which supports Xen) also uses modules for the drivers. I think the change is generally a step in the right direction. Perhaps running mkinitramfs within the 2.6.26 environment causes the 2.6.32 initrd to not contain the correct module? (since it can't detect the requirement for the module because the current kernel has it statically linked?) This should be fixable with some configuration in the guest (e.g. add the modules to /etc/initramfs-tools/modules). I ran the default install of the image package on the guest running .18, and then copied the image and initrd over to the parent. I extracted that initrd image now and I see lib/modules/2.6.32-4-xen-amd64/kernel/drivers/block/xen-blkfront.ko in it. Are you saying it could have gotten missed by the initrd init scripts even though it's there? Couldn't we fix that automatism? I diffed the trees and noticed that kernel/drivers/net/xen-netfront.ko is missing from the initrd, but that's probably non-fatal. -- 2. That which causes joy or happiness. -- Archive: http://lists.debian.org/20100327161006.ga25...@orion.carnet.hr
Re: upgrade to new xen domU on old xen dom0?
On Sat, Mar 27, 2010 at 11:56:25AM +0100, joy wrote: [0.610445] blkfront: xvda1: barriers enabled done. Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done. Begin: Waiting for root file system ... Can anything be done? I thought the domUs were supposed to be a safe upgrade? I missed Bastian's message as I'm not subscribed - please keep me in Cc:. What was the last known working version? The one from lenny. Well, for some values of working at least :) You are supposed to provide the complete log. There are several possible pitfalls. If you are upgrading from an old-style image and followed old documentation it is most likely a wrong root device. I just replaced the kernel and ramdisk settings on the old dom0. The relevant settings, that work with our .26 and .18, are: root= '/dev/hda1 ro' disk= [ 'phy:pavo/lastovo,hda1,w' ] What do I need to change? -- 2. That which causes joy or happiness. -- Archive: http://lists.debian.org/20100327161510.ga27...@orion.carnet.hr
Re: upgrade to new xen domU on old xen dom0?
On Sat, Mar 27, 2010 at 05:28:28PM +0100, Bastian Blank wrote: What was the last known working version? The one from lenny. Well, for some values of working at least :) Well, Lenny has two variants. The early pv-ops and the oldstyle one. We had early pvops in lenny? Where? :) If you are upgrading from an old-style image and followed old documentation it is most likely a wrong root device. I just replaced the kernel and ramdisk settings on the old dom0. The relevant settings, that work with our .26 and .18, are: root= '/dev/hda1 ro' disk= [ 'phy:pavo/lastovo,hda1,w' ] Yeah, using [hs]d[a-z]* was already deprecated in Lenny. What do I need to change? Use xvda as device name. OK, that works, thanks. We have got to get this documented somewhere now that the deprecated option is broken. There is no mention of it at http://wiki.debian.org/Xen and simple googling is far from conclusive. -- 2. That which causes joy or happiness. -- Archive: http://lists.debian.org/20100327170726.ga6...@orion.carnet.hr
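For reference, the change discussed in this message comes down to the device name in the guest configuration. Below is a minimal sketch of the two settings quoted above, rewritten for the new kernel. The thread confirms that xvda works but does not show the final file, so the exact `xvda1` spelling is an assumption; the volume path is the one quoted in the message. xm-style guest configs use Python syntax, so the fragment is valid Python.

```python
# Hypothetical domU config fragment (xm-style configs are Python syntax).
# Assumption: only the device name changes, from hda1 to xvda1.
# The volume path 'pavo/lastovo' is the one quoted in the thread.

# Old, deprecated naming that worked with the .18 and .26 guest kernels:
# root = '/dev/hda1 ro'
# disk = ['phy:pavo/lastovo,hda1,w']

# New naming for the pvops 2.6.32 guest kernel:
root = '/dev/xvda1 ro'
disk = ['phy:pavo/lastovo,xvda1,w']
```

If the guest's /etc/fstab still refers to hda1, it would presumably need the same rename, though the thread does not cover that.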
Re: upgrade to new xen domU on old xen dom0?
On Sat, Mar 27, 2010 at 06:11:15PM +, Ian Campbell wrote: OK, that works, thanks. We have got to get this documented somewhere now that the deprecated option is broken. There is no mention of it at http://wiki.debian.org/Xen and simple googling is far from conclusive. Would you mind updating the wiki with your findings? OK, done. Speaking of the new domU, does anyone know anything about this: [0.00] Calgary: detecting Calgary via BIOS EBDA area [0.00] Calgary: Unable to locate Rio Grande table in EBDA - bailing! -- 2. That which causes joy or happiness. -- Archive: http://lists.debian.org/20100327210925.ga29...@orion.carnet.hr
Re: Xen dom0 2.6.32 stable branch
On Thu, Mar 18, 2010 at 01:52:34AM +0100, joy wrote: On Wed, Feb 24, 2010 at 01:05:53PM +0100, joy wrote: http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=shortlog;h=refs/heads/xen/stable It's great to see the new packages :) I didn't want to rain on the parade by instantly filing bug reports, but I must point out a bit of a problem with the .32 kernel that may have something to do with (the lack of) the NX bit: http://lists.xensource.com/archives/html/xen-devel/2010-03/msg00243.html http://lists.xensource.com/archives/html/xen-devel/2010-03/msg00658.html I'm getting ready to start bisecting. Sadly that didn't help, but regardless, that problem was fixed yesterday with http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=de67ec8b23629776f786d62c3109552ea7f8cc27 Please update the package with the up-to-date xen/stable. Want a critical bug report as a reminder? :) -- 2. That which causes joy or happiness. -- Archive: http://lists.debian.org/20100324092424.ga16...@orion.carnet.hr
Bug#575183: fails to boot on SGI C2108-F6 server under Xen 3.4 hypervisor
Ian Campbell wrote: IIRC these kernels require a newer hypervisor than is in stable at the moment, at a minimum you need 3.4.3, RC's are available in testing. I'd just like to confirm this, I distinctly recall seeing the mention of the exact Mercurial changeset on the xen-devel list for a new hypercall. Russell also mentioned it implicitly at http://etbe.coker.com.au/2010/03/21/xen-debian-squeeze/ William, did you try the hypervisor upgrade, does it work then? The new paravirt_ops Xen dom0 kernel packages should probably simply have a: Conflicts: xen-hypervisor-3.4-$ARCH (<< 3.4.3~rc3), xen-hypervisor-3.2-$ARCH, xen-hypervisor-3.0-$ARCH -- 2. That which causes joy or happiness. -- Archive: http://lists.debian.org/20100324094420.ga19...@orion.carnet.hr
Bug#575183: fails to boot on SGI C2108-F6 server under Xen 3.4 hypervisor
On Wed, Mar 24, 2010 at 11:09:45AM +0100, Bastian Blank wrote: IIRC these kernels require a newer hypervisor than is in stable at the moment, at a minimum you need 3.4.3, RC's are available in testing. The new paravirt_ops Xen dom0 kernel packages should probably simply have a: Conflicts: xen-hypervisor-3.4-$ARCH (<< 3.4.3~rc3), xen-hypervisor-3.2-$ARCH, xen-hypervisor-3.0-$ARCH No. Kernels are co-installable. Oh, crap, I forgot, yes. Maybe postinst messages then? Since the alternative is usually an instant reboot loop, which will inevitably result in people complaining. -- 2. That which causes joy or happiness. -- Archive: http://lists.debian.org/20100324114720.ga17...@orion.carnet.hr
Re: Xen dom0 2.6.32 stable branch
On Wed, Feb 24, 2010 at 01:05:53PM +0100, joy wrote: http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=shortlog;h=refs/heads/xen/stable It's great to see the new packages :) I didn't want to rain on the parade by instantly filing bug reports, but I must point out a bit of a problem with the .32 kernel that may have something to do with (the lack of) the NX bit: http://lists.xensource.com/archives/html/xen-devel/2010-03/msg00243.html http://lists.xensource.com/archives/html/xen-devel/2010-03/msg00658.html I'm getting ready to start bisecting. -- 2. That which causes joy or happiness. -- Archive: http://lists.debian.org/20100318005234.ga28...@orion.carnet.hr
Bug#574243: please restore Sun XVR video drivers, and add the latest one [was Re: Debian Sparc 5.4 on Sun blade 2500]
Package: linux-2.6 Version: 2.6.32-9 On Tue, Mar 16, 2010 at 03:24:37PM -0700, David Miller wrote: Josip, please make sure this gets fixed, please get my sunxvr1000 driver added (attached) and then add: CONFIG_FB_XVR500=y CONFIG_FB_XVR2500=y CONFIG_FB_XVR1000=y to the config for sparc64. Ok, further checking shows that lenny has XVR500 and XVR2500 enabled (doing a test install with a XVR-500 card right now) but testing doesn't. Indeed, they seem to have gone missing somehow from debian/config/sparc/config where there's just: # CONFIG_FB_LEO is not set # CONFIG_FB_S1D13XXX is not set whereas in the same source we have arch/sparc/configs/sparc64_defconfig where there's: # CONFIG_FB_LEO is not set CONFIG_FB_XVR500=y CONFIG_FB_XVR2500=y # CONFIG_FB_S1D13XXX is not set I'm filing a bug report with this message, thanks for the exact hint. As for the new driver that Dave mentioned as an attachment, it's already in mainline at: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2d378b9179881b46a0faf11430efb421fe03ddd8 That should apply pretty easily to .32 stable. I'm not sure offhand what the Debian policy is about tracking linux-stable vs. adding new code, but either way this seems pretty uncontroversial - it's a separate new driver which won't hurt any existing users, because its OF match of SUNW,gfb simply does not overlap with anything else in drivers/video/. -- 2. That which causes joy or happiness. -- To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20100317001345.ga14...@orion.carnet.hr
Bug#572442: sparc 2.6.29+ NMI watchdog deadlock on Sun Fire V240 etc
Package: linux-2.6 Severity: serious Tags: upstream patch Hi there, Ever since kernel 2.6.29 came out, several classes of sparc machines have been unable to upgrade, because they would get stuck while initializing the new NMI watchdog code. The process of trying to figure it out is mostly documented in this long-running mailing list thread that spanned many months: http://lists.debian.org/debian-sparc/2009/08/msg5.html http://lists.debian.org/debian-sparc/2009/09/msg00018.html http://lists.debian.org/debian-sparc/2009/10/msg00015.html http://lists.debian.org/debian-sparc/2009/11/msg00034.html http://lists.debian.org/debian-sparc/2009/12/msg0.html Had this gone unattended, sparc release requalification might have been in trouble, because the bug affects the Fire V240 sparc buildd machines as well as Jurij Smakov's test machine, and that's a lot in our little universe :) Fortunately David Miller came to the rescue and personally debugged the problem on one of the buildds, and fixed the problem. His solution, that we are currently running on schroeder.debian.org, is attached. Please include the patch in the sparc kernel package so that we can test it widely, preferably ASAP. TIA. - Forwarded message from David Miller da...@davemloft.net - Date: Wed, 03 Mar 2010 09:11:41 -0800 (PST) Subject: Re: Sparc release requalification Ok, I think I fixed it. Attached are two versions of the fix, the first attachment is for 2.6.33 and the second one is for any kernel 2.6.32 and previous. Give it a good test on any machine you've seen this problem on and let me know how it goes. Thanks. From 8a4fd1e4922413cfdfa6c51a59efb720d904a5eb Mon Sep 17 00:00:00 2001 From: David S. Miller da...@davemloft.net Date: Wed, 3 Mar 2010 09:06:03 -0800 Subject: [PATCH] sparc64: Make prom entry spinlock NMI safe. If we do something like try to print to the OF console from an NMI while we're already in OpenFirmware, we'll deadlock on the spinlock. Use a raw spinlock and disable NMIs when we take it. 
Signed-off-by: David S. Miller da...@davemloft.net --- arch/sparc/prom/p1275.c | 12 +++- 1 files changed, 7 insertions(+), 5 deletions(-) diff --git a/arch/sparc/prom/p1275.c b/arch/sparc/prom/p1275.c index 4b7c937..2d8b70d 100644 --- a/arch/sparc/prom/p1275.c +++ b/arch/sparc/prom/p1275.c @@ -32,10 +32,9 @@ extern void prom_cif_interface(void); extern void prom_cif_callback(void); /* - * This provides SMP safety on the p1275buf. prom_callback() drops this lock - * to allow recursuve acquisition. + * This provides SMP safety on the p1275buf. */ -DEFINE_SPINLOCK(prom_entry_lock); +DEFINE_RAW_SPINLOCK(prom_entry_lock); long p1275_cmd(const char *service, long fmt, ...) { @@ -47,7 +46,9 @@ long p1275_cmd(const char *service, long fmt, ...) p = p1275buf.prom_buffer; - spin_lock_irqsave(prom_entry_lock, flags); + raw_local_save_flags(flags); + raw_local_irq_restore(PIL_NMI); + raw_spin_lock(prom_entry_lock); p1275buf.prom_args[0] = (unsigned long)p; /* service */ strcpy (p, service); @@ -139,7 +140,8 @@ long p1275_cmd(const char *service, long fmt, ...) va_end(list); x = p1275buf.prom_args [nargs + 3]; - spin_unlock_irqrestore(prom_entry_lock, flags); + raw_spin_unlock(prom_entry_lock); + raw_local_irq_restore(flags); return x; } -- 1.6.6.1 sparc64: Make prom entry spinlock NMI safe. If we do something like try to print to the OF console from an NMI while we're already in OpenFirmware, we'll deadlock on the spinlock. Disable NMIs when we take it. Signed-off-by: David S. Miller da...@davemloft.net diff --git a/arch/sparc/prom/p1275.c b/arch/sparc/prom/p1275.c index 4b7c937..815cab6 100644 --- a/arch/sparc/prom/p1275.c +++ b/arch/sparc/prom/p1275.c @@ -32,8 +32,7 @@ extern void prom_cif_interface(void); extern void prom_cif_callback(void); /* - * This provides SMP safety on the p1275buf. prom_callback() drops this lock - * to allow recursuve acquisition. + * This provides SMP safety on the p1275buf. 
*/ DEFINE_SPINLOCK(prom_entry_lock); @@ -47,7 +46,9 @@ long p1275_cmd(const char *service, long fmt, ...) p = p1275buf.prom_buffer; - spin_lock_irqsave(prom_entry_lock, flags); + raw_local_save_flags(flags); + raw_local_irq_restore(PIL_NMI); + spin_lock(prom_entry_lock); p1275buf.prom_args[0] = (unsigned long)p; /* service */ strcpy (p, service); @@ -139,7 +140,8 @@ long p1275_cmd(const char *service, long fmt, ...) va_end(list); x = p1275buf.prom_args [nargs + 3]; - spin_unlock_irqrestore(prom_entry_lock, flags); + spin_unlock(prom_entry_lock); + raw_local_irq_restore(flags); return x; } - End forwarded message - -- 2. That which causes joy or happiness. -- To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a
Bug#534978: clock drift in Xen domU with clocksource=xen
On Thu, Mar 04, 2010 at 05:21:31PM +0100, Markus Hochholdinger wrote: In my case this manifested itself when some PHP profiling via microtime() suddenly became useless, and it also caused occasional PostgreSQL errors with tables that had timestamp columns as keys, since it became possible for two independent transactions to come in at the exact same time. have you any service like ntp running on these boxes!? What will the app do if ntp corrects the time!? NTP always corrects time in a very subtle manner (see its documentation). I guess it's possible for it to screw with this, yet it never has. -- 2. That which causes joy or happiness. -- Archive: http://lists.debian.org/20100304162452.ga22...@orion.carnet.hr
Bug#534978: clock drift in Xen domU with clocksource=xen
Hi, Markus Hochholdinger wrote: Here is my solution to this problem, lenny xen kernel: * dom0 with clocksource=jiffies and /proc/sys/xen/independent_wallclock=0 * domU with clocksource=jiffies and /proc/sys/xen/independent_wallclock=0 Using jiffies as a clock source is not a solution, it's a workaround, because its resolution (CONFIG_HZ^-1) is not good enough for reading microseconds, that is, time with microseconds will become just monotonic. This will cause problems for any program that wants its time readouts to be strictly increasing, as real-world time usually is :) In other words you will get this (real example from a while ago): % i=0; while :; do i=$((i+1)); if [ $i = 20 ]; then break; fi; date --rfc-3339=ns; done 2009-09-23 13:35:13.123400807+02:00 2009-09-23 13:35:13.127400857+02:00 2009-09-23 13:35:13.131400906+02:00 2009-09-23 13:35:13.135400956+02:00 2009-09-23 13:35:13.139401005+02:00 2009-09-23 13:35:13.143401055+02:00 2009-09-23 13:35:13.147401104+02:00 2009-09-23 13:35:13.151401154+02:00 2009-09-23 13:35:13.151401154+02:00 2009-09-23 13:35:13.155401203+02:00 2009-09-23 13:35:13.155401203+02:00 2009-09-23 13:35:13.159401253+02:00 2009-09-23 13:35:13.163401302+02:00 2009-09-23 13:35:13.167401352+02:00 2009-09-23 13:35:13.171401401+02:00 2009-09-23 13:35:13.171401401+02:00 2009-09-23 13:35:13.175401451+02:00 2009-09-23 13:35:13.179401500+02:00 2009-09-23 13:35:13.183401550+02:00 In my case this manifested itself when some PHP profiling via microtime() suddenly became useless, and it also caused occasional PostgreSQL errors with tables that had timestamp columns as keys, since it became possible for two independent transactions to come in at the exact same time. Having said that, if one doesn't switch to jiffies but wants to use live migration, that ends up being hampered by the "Time went backwards" problem. -- 2. That which causes joy or happiness. -- Archive: http://lists.debian.org/20100225111854.ga18...@orion.carnet.hr
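The repeated readouts in the message above are what one would expect from a clock that only advances once per tick. The following is a simplified sketch of that effect, not a model of the real jiffies clocksource: it assumes HZ = 250 (4 ms per tick) and a hypothetical reader that samples every 1.5 ms, and it simply rounds each "true" time down to the last tick boundary.

```python
# Sketch: what a clock that only ticks every 1/HZ seconds does to
# closely spaced readings. HZ = 250 is an assumed value (4 ms per tick).
HZ = 250
TICK_NS = 1_000_000_000 // HZ  # 4_000_000 ns per tick


def coarse_read(true_ns: int) -> int:
    """Round a 'true' time down to the last tick boundary."""
    return (true_ns // TICK_NS) * TICK_NS


# Hypothetical true times: one reading every 1.5 ms, starting at zero.
true_times = [i * 1_500_000 for i in range(8)]
readings = [coarse_read(t) for t in true_times]

for t, r in zip(true_times, readings):
    print(f"true={t / 1e6:5.1f} ms  read={r / 1e6:5.1f} ms")

# Count how many readings repeat the previous value.
repeats = sum(1 for a, b in zip(readings, readings[1:]) if a == b)
print("repeated readings:", repeats)  # 5 of the 7 neighbouring pairs
```

Any two events closer together than one tick can end up with the same timestamp, which is the failure mode described for microtime() profiling.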
Bug#534978: clock drift in Xen domU with clocksource=xen
On Thu, Feb 25, 2010 at 01:33:05PM +0100, Bastian Blank wrote: On Thu, Feb 25, 2010 at 12:18:54PM +0100, Josip Rodin wrote: Using jiffies as a clock source is not a solution, it's a workaround, because its resolution (CONFIG_HZ^-1) is not good enough for reading microseconds, that is, time with microseconds will become just monotonic. This will cause problems for any program that wants its time readouts to be strictly increasing, as real-world time usually is :) No. The time resolution is not defined and within one step it will always provide the same value. What? :) The problem here is that a time readout function provides the same value across *two* steps. A monotonic function is one which allows for that. A strictly increasing function is one which does not. Most of the time, just monotonic is okay, but not always. and it also caused occasional PostgreSQL errors with tables that had timestamp columns as keys, since it became possible for two independent transactions to come in at the exact same time. Er, where is it documented that this is supposed to work anyway? The key column has a unique constraint and a default value of current timestamp. Even if two perfectly concurrent writers come in to add a new record, it's still logical to expect them to be serialized to a minimal extent, because the database itself is explicitly instructed to input all values and maintain their uniqueness. The expectation that all updates take at least one minimal unit of time is perhaps not theoretically valid, but it's certainly like that in the real world (every action takes *some* perceivable time). -- 2. That which causes joy or happiness. -- Archive: http://lists.debian.org/20100225132755.ga11...@orion.carnet.hr
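The distinction drawn above between "monotonic" and "strictly increasing" can be stated as two one-line checks. This is a small illustrative sketch; the function names are made up for the example, and the sample values are the nanosecond fields of four consecutive readouts from the date loop quoted earlier in this bug (all within second 13:35:13).

```python
from typing import Sequence


def is_monotonic(xs: Sequence[int]) -> bool:
    """Non-decreasing: equal neighbours are allowed."""
    return all(a <= b for a, b in zip(xs, xs[1:]))


def is_strictly_increasing(xs: Sequence[int]) -> bool:
    """Every value must be larger than the one before it."""
    return all(a < b for a, b in zip(xs, xs[1:]))


# Nanosecond fields of four consecutive readouts; the third repeats the second.
readouts = [147401104, 151401154, 151401154, 155401203]

print(is_monotonic(readouts))            # True
print(is_strictly_increasing(readouts))  # False
```

A clock can pass the first check and still break any consumer that relies on the second.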
Bug#534978: clock drift in Xen domU with clocksource=xen
On Thu, Feb 25, 2010 at 03:01:41PM +0100, Bastian Blank wrote: No. The time resolution is not defined and within one step it will always provide the same value. What? :) The problem here is that a time readout function provides the same value across *two* steps. A monotonic function is one which allows for that. A strictly increasing function is one which does not. Most of the time, just monotonic is okay, but not always. No, the time is only monotone, not strictly monotone. (With discreet values, it is not possible to make it strictly monotone.) You mean discrete. It's impossible to make it strictly monotone in the resolution that is smaller than the smallest unit of time (or one that converges into zero). But anyway, the problem isn't just the monotonicity, it's simply that e.g. with HZ of 250, a jiffie takes 4ms, so if you need to do anything with something that takes a comparable amount of time, you're shit outta luck. and it also caused occasional PostgreSQL errors with tables that had timestamp columns as keys, since it became possible for two independent transactions to come in at the exact same time. Äh, where is documented, that this supposed to work anyway? The key column has a unique constraint and a default value of current timestamp. Even if two perfectly concurrent writers come in to add a new record, it's still logical to expect for them to be serialized to a minimal extent, because the database itself is explicitly instructed to input all values and maintain their uniqueness. The expectation that all updates take at least one minimal unit of time is perhaps not theoretically valid, but it's certainly like that in the real world (every action takes *some* perceivable time). Wrong answer. Where is this documented as working in the postgresql documentation? I have no idea. Why would I need an exact documentation of this use case? The unique and default key parameters, and the definition of the timestamp data type are documented. 
Indeed, I just checked and the resolution of a timestamp is explicitly documented as 1 microsecond, so if the underlying system has a resolution of 4000 microseconds, that simply precludes it. If you're trying to argue that nobody should be using anything using microseconds because they're not supported by clocksource=jiffies, well, then we might as well cease this pointless discussion. -- 2. That which causes joy or happiness. -- To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20100225150156.ga...@orion.carnet.hr
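To put numbers on the resolution argument above, here is a minimal sketch that is not taken from the thread. It assumes a clock that only advances once per jiffy at HZ=250, and two hypothetical transactions arriving 1.5 ms apart; the names `jiffy_us`, `quantize` and `arrivals_us` are made up for illustration and do not correspond to any kernel or PostgreSQL interface.

```python
# Sketch: a jiffies-based clock at HZ=250 versus a microsecond timestamp column.
HZ = 250                          # timer ticks per second, as in the thread
jiffy_us = 1_000_000 // HZ        # length of one jiffy in microseconds


def quantize(t_us: int) -> int:
    """Round a true time (in microseconds) down to the last jiffy boundary."""
    return (t_us // jiffy_us) * jiffy_us


# Two hypothetical transactions, 1.5 ms apart in real time (assumed values).
arrivals_us = [10_000_500, 10_002_000]
stamps = [quantize(t) for t in arrivals_us]

# The readout is monotonic (it never goes backwards) ...
is_monotonic = all(a <= b for a, b in zip(stamps, stamps[1:]))
# ... but not strictly increasing, so a unique timestamp key would collide.
is_strict = all(a < b for a, b in zip(stamps, stamps[1:]))

print(jiffy_us, stamps, is_monotonic, is_strict)
```

With a 4000-microsecond step, both arrivals fall into the same jiffy and receive the same stamp, which is the collision a unique constraint on a timestamp column would reject.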
Xen dom0 2.6.32 stable branch
Hi, Just in case I'm the first to notice, we now have: http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=shortlog;h=refs/heads/xen/stable This is the new upstream default branch, with paravirt_ops dom0, and based on 2.6.32-stable, so it's presumably suitable for inclusion as a new patch that would restore our packages linux-image-2.6-xen-{686,amd64} which were previously based on forward-ported .18 dom0 patches. Yay. -- 2. That which causes joy or happiness.
Re: [Users] New Kernel Patch
On Sat, Jan 16, 2010 at 12:17:19PM +0100, Suno Ano wrote: currently (January 2010) mainline is in development for the .33 release, .32 is stable and used by most Linux Distributions like for example Debian, Ubuntu, Suse, etc. From what it looks now Debian and Ubuntu are going into freeze for their next stable release in March 2010. Will there be an up-to-date OpenVZ kernel patch available by then? Debian is targeting to ship .32 with their next stable release called squeeze. In case OpenVZ will not be available on at least one of the major Linux distributions and its offsprings, no need to mention how horrid that would be ... http://lists.debian.org/debian-devel-announce/2009/10/msg3.html said OpenVZ will remain supported, but and http://lists.debian.org/debian-release/2009/08/msg00233.html had previously gone unanswered and I don't see anything new at http://packages.debian.org/linux-image-2.6-openvz-686 I'm thinking the most usable compromise would be if someone volunteered to maintain the Debian packages of the actual kernel stable release 2.6.27 - where the meaning of stable more closely corresponds to the Debian stable release concept. For off-the-shelf usage, mainline releases can satisfy the same definition, but for corner cases it's doubtful because they tend to move too fast for people to track them reliably. I have to mention that Xen has a similar problem - there are XCI 2.6.27 patches which seem to be maintained, whereas it's doubtful anyone really wants to continue forward-porting the old branch to .32. Xen upstream do have an advanced paravirt_ops dom0 branch (it's much further along than LXC vs. OpenVZ, judging by the LXC description in this thread), but it would still be a regression compared to the old branch for some people who use some of those still-unimplemented features, so it's not a drop-in replacement yet. 
I'm Cc:ing Adrian Bunk - given that you initiated the marking of .27 as the real stable, and Greg KH is still maintaining .27 upstream, I can't help but wonder if you might be willing to maintain those packages? :) Also Cc:'ing the debian-kernel mailing list. -- 2. That which causes joy or happiness.
Bug#559035: fyi .27 stable has it
stable/linux-2.6.27.y has this patch since 2009-10-12: http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.27.y.git;a=commitdiff;h=2578cf95969936c372db29ee2bbc21c9b6a299aa it's included in the release since v2.6.27.37. -- 2. That which causes joy or happiness.
Bug#535331: ditto
On Fri, Oct 23, 2009 at 11:54:21AM +0200, Josip Rodin wrote: I've experienced the same problem. I've got two lenny machines which have [...] FWIW Here's the last upgrade output pasted exactly as it just happened: % sudo apt-get upgrade Reading package lists... Done Building dependency tree Reading state information... Done The following packages will be upgraded: linux-image-2.6.26-2-686 1 upgraded, 0 newly installed, 0 to remove and 0 not upgraded. Need to get 20,2MB of archives. After this operation, 0B of additional disk space will be used. Do you want to continue [Y/n]? Get:1 http://security.debian.org lenny/updates/main linux-image-2.6.26-2-686 2.6.26-19lenny2 [20,2MB] Fetched 20,2MB in 21s (929kB/s) Preconfiguring packages ... (Reading database ... 24703 files and directories currently installed.) Preparing to replace linux-image-2.6.26-2-686 2.6.26-19lenny1 (using .../linux-image-2.6.26-2-686_2.6.26-19lenny2_i386.deb) ... The directory /lib/modules/2.6.26-2-686 still exists. Continuing as directed. Done. Unpacking replacement linux-image-2.6.26-2-686 ... Setting up linux-image-2.6.26-2-686 (2.6.26-19lenny2) ... Running depmod. Running mkinitramfs-kpkg. Not updating initrd symbolic links since we are being updated/reinstalled (2.6.26-19lenny1 was configured last, according to dpkg) Not updating image symbolic links since we are being updated/reinstalled (2.6.26-19lenny1 was configured last, according to dpkg) % % sudo perl -pi -e 's,^(my \$loader\s+=\s+),$1lilo,' /var/lib/dpkg/info/linux-image-2.6.26-2-686.postinst % sudo dpkg-reconfigure linux-image-2.6.26-2-686 Running depmod. Running mkinitramfs-kpkg. 
Not updating initrd symbolic links since we are being updated/reinstalled (2.6.26-19lenny2 was configured last, according to dpkg) Not updating image symbolic links since we are being updated/reinstalled (2.6.26-19lenny2 was configured last, according to dpkg) You already have a LILO configuration in /etc/lilo.conf Running boot loader as requested Testing lilo.conf ... Testing successful. Installing the partition boot sector... Running /sbin/lilo ... Installation successful. % [...] after upgrading linux-image-2.6.26-2-686, I just get [...] FWIW it also happens on the amd64 version, exactly the same: % sudo apt-get upgrade Reading package lists... Done Building dependency tree Reading state information... Done The following packages will be upgraded: linux-image-2.6.26-2-amd64 1 upgraded, 0 newly installed, 0 to remove and 0 not upgraded. Need to get 20,9MB of archives. After this operation, 4096B of additional disk space will be used. Do you want to continue [Y/n]? Get:1 http://security.debian.org lenny/updates/main linux-image-2.6.26-2-amd64 2.6.26-19lenny2 [20,9MB] Fetched 20,9MB in 20s (1013kB/s) Preconfiguring packages ... (Reading database ... 20849 files and directories currently installed.) Preparing to replace linux-image-2.6.26-2-amd64 2.6.26-19lenny1 (using .../linux-image-2.6.26-2-amd64_2.6.26-19lenny2_amd64.deb) ... The directory /lib/modules/2.6.26-2-amd64 still exists. Continuing as directed. Done. Unpacking replacement linux-image-2.6.26-2-amd64 ... Setting up linux-image-2.6.26-2-amd64 (2.6.26-19lenny2) ... Running depmod. Running mkinitramfs-kpkg. 
Not updating initrd symbolic links since we are being updated/reinstalled (2.6.26-19lenny1 was configured last, according to dpkg) Not updating image symbolic links since we are being updated/reinstalled (2.6.26-19lenny1 was configured last, according to dpkg) % % sudo perl -pi -e 's,^(my \$loader\s+=\s+),$1lilo,' /var/lib/dpkg/info/linux-image-2.6.26-2-amd64.postinst % sudo dpkg-reconfigure linux-image-2.6.26-2-amd64 Running depmod. Running mkinitramfs-kpkg. Not updating initrd symbolic links since we are being updated/reinstalled (2.6.26-19lenny2 was configured last, according to dpkg) Not updating image symbolic links since we are being updated/reinstalled (2.6.26-19lenny2 was configured last, according to dpkg) You already have a LILO configuration in /etc/lilo.conf Running boot loader as requested Testing lilo.conf ... Testing successful. Installing the partition boot sector... Running /sbin/lilo ... Installation successful. % -- 2. That which causes joy or happiness.
Bug#535331: ditto
Hi, I've experienced the same problem. I've got two lenny machines which have GPT partition tables and Linux root on LVM, and they can't use anything but LILO (there are some novelty hacks for GRUB but I haven't been able to test them yet because this is in production). I have kernel-img.conf set up right, but after upgrading linux-image-2.6.26-2-686, I just get the "not updating symbolic links" messages and no triggers or boot loaders are run. If left unattended, this typically renders these two systems unbootable. It really looks like a failure to define the $loader variable in the predefined variables section. If I just put 'lilo' in there and re-run dpkg-reconfigure linux-image-2.6.26-2-686, the output changes to: Running depmod. Running mkinitramfs-kpkg. Not updating initrd symbolic links since we are being updated/reinstalled (2.6.26-19lenny1 was configured last, according to dpkg) Not updating image symbolic links since we are being updated/reinstalled (2.6.26-19lenny1 was configured last, according to dpkg) You already have a LILO configuration in /etc/lilo.conf Running boot loader as requested Testing lilo.conf ... Testing successful. Installing the partition boot sector... Running /sbin/lilo ... Installation successful. This is what would be expected. The run_lilo() function goes out of its way to determine whether the existence of /etc/lilo.conf is sufficient reason to run lilo, so there doesn't appear to be any reason to completely omit it. -- 2. That which causes joy or happiness.
Bug#535331: ditto
On Fri, Oct 23, 2009 at 01:15:52PM +0200, maximilian attems wrote: On Fri, Oct 23, 2009 at 11:54:21AM +0200, Josip Rodin wrote: Hi, I've experienced the same problem. I've got two lenny machines which have GPT partition tables and Linux root on LVM, and they can't use anything but LILO (there are some novelty hacks for GRUB but I haven't been able to test them yet because this is in production). I have kernel-img.conf set up right, but after upgrading linux-image-2.6.26-2-686, I just get the "not updating symbolic links" messages and no triggers or boot loaders are run. If left unattended, this typically renders these two systems unbootable. It really looks like a failure to define the $loader variable in the predefined variables section. If I just put 'lilo' in there and re-run dpkg-reconfigure linux-image-2.6.26-2-686, the output changes to: Running depmod. Running mkinitramfs-kpkg. Not updating initrd symbolic links since we are being updated/reinstalled (2.6.26-19lenny1 was configured last, according to dpkg) Not updating image symbolic links since we are being updated/reinstalled (2.6.26-19lenny1 was configured last, according to dpkg) You already have a LILO configuration in /etc/lilo.conf Running boot loader as requested Testing lilo.conf ... Testing successful. Installing the partition boot sector... Running /sbin/lilo ... Installation successful. This is what would be expected. The run_lilo() function goes out of its way to determine whether the existence of /etc/lilo.conf is sufficient reason to run lilo, so there doesn't appear to be any reason to completely omit it. from the affected box: cat /etc/kernel-img.conf I fail to see the benefit, but here goes - on both it's identical: % cat /etc/kernel-img.conf # Kernel image management overrides # See kernel-img.conf(5) for details do_symlinks = yes relative_links = yes do_bootloader = yes do_bootfloppy = no do_initrd = yes link_in_boot = yes -- 2. That which causes joy or happiness. 
Bug#525958: Sparc release requalification
On Tue, Sep 15, 2009 at 08:38:22PM +0100, Jurij Smakov wrote: If the PROM console driver still has some utility, maybe the boot option is the way to go... does it? Does anyone still manufacture new machines with new and strange console types that we don't support? :) The PROM console driver has no relevance today at all. It should simply never be used. OK, cool, please remove it, and also please propagate that to the stable branches so we don't miss anyone who's not on the bleeding edge. Please feel free to follow up on http://bugs.debian.org/525958 which I've filed in April to have the CONFIG_PROM_CONSOLE removed. Yeah, definitely, Cc:ed. Dave has in the meantime killed it completely in his stable tree: http://git.kernel.org/?p=linux/kernel/git/davem/sparc-2.6.git;a=commit;h=09d3f3f0e02c8a900d076c302c5c02227f33572d There's another commit that takes it out of the defconfigs completely - it was already unset there for a while now. So it's pretty much official now (well, Linus still has to take it in, but that should be a formality). -- 2. That which causes joy or happiness.
Bug#514418: [FIX]: ultra45 boot failing...
On Sun, Feb 08, 2009 at 04:58:08PM -0800, David Miller wrote: So you're saying that X working is more important than machines actually booting at all? These priorities are wrong. When N (where N > 0) users complain about dead X, and 0 users complain about not being able to boot, the priorities are fairly clear... -- 2. That which causes joy or happiness.
Bug#514418: [FIX]: ultra45 boot failing...
On Mon, Feb 09, 2009 at 12:28:18AM -0800, David Miller wrote: So you're saying that X working is more important than machines actually booting at all? These priorities are wrong. When N (where N > 0) users complain about dead X, and 0 users complain about not being able to boot, the priorities are fairly clear... If their machine won't even boot into the installer they are unlikely to even make a report. Nobody (before you that is) reported an installer failure on these machines, so the situation is still clear from our point of view - it's certainly not perfect or even good, but the system as a whole depends on user input. Furthermore the point remains that you put a change into the kernel that I would never have advocated had you presented the bug to me. I would have suggested ways to fix the X server and even worked on the patch. But since nobody contacted me about this, a broken change went into the kernel instead. That is true, someone should have contacted you (sparclinux list at least) about that. But then, it would have been completely your prerogative to respond to that simply by saying - DTRT and go upgrade X, patching old X is a waste of my time, and I guess nobody wanted to risk hearing that answer? :) -- 2. That which causes joy or happiness.
Bug#500358: Fix found
On Sun, Nov 09, 2008 at 10:30:38PM +0100, Bastian Blank wrote: SPARC is a traditionally brand architecture. This case affects Ultra 5 and may be several other workstation. So if something doesn't function on one box it doesn't function on a whole generation of boxes. I think this is quite a big part of all Debian SPARC users. This still does not qualify for the severity grave: | makes the package in question unusable or mostly so, It still runs. And the Sparc machines I use don't show such problems. Seeing how you're interested in this kind of bureaucratic nitpicking :p I should point out that grave is actually too light a severity for this bug, and critical should be used instead - the kernel upgrade broke the X server, so it's a critical bug by definition (makes unrelated software on the system break). The part that fit the grave severity was makes the package in question mostly unusable, which is what any typical X user would say in this situation. In any case this is a pointless exercise, let's just make sure the bug is fixed and go forward. I hope I'll be verifying the submitted patch on my Ultra 5 soon. -- 2. That which causes joy or happiness.
Bug#498536: Acknowledgement (.26 breaks firmware loading for qla2xxx on sparc)
This was fixed a couple of days ago by Andrew Vasquez with some help from Dave Miller. The patch has been sent to the linux-scsi list/maintainers for inclusion in -next as well as in -stable (Message-Id: [EMAIL PROTECTED]). -- 2. That which causes joy or happiness.
Bug#498536: .26 breaks firmware loading for qla2xxx on sparc
Package: linux-image-2.6.26-1-sparc64-smp
Version: 2.6.26-4
Severity: grave
Tags: upstream

Hi, qla2xxx's firmware loading thingy got hosed between .25 and .26, I've already reported something along these lines to upstream, but I just verified it with our kernel image so I'm filing it here too.

Begin: Loading essential drivers... ...
[ 49.444722] SCSI subsystem initialized
[ 49.524563] QLogic Fibre Channel HBA Driver: 8.02.01-k4-debug
[ 49.602779] qla2xxx 0001:00:04.0: Found an ISP2200, irq 19, iobase 0x07fd [010
[ 49.714344] qla2xxx 0001:00:04.0: Configuring PCI space...
[ 49.789041] scsi(0): No matching ROM signature.
[ 49.851178] qla2xxx 0001:00:04.0: Configure NVRAM parameters...
[ 50.025526] qla2xxx 0001:00:04.0: Inconsistent NVRAM detected: checksum=0x0 i L=4
qla2xxx 0001:00:04.0: Falling back to functioning (yet invalid -- WWPN) defaults.
[ 50.230463] scsi(0): NVRAM configuration failed!
[ 50.293695] qla2xxx 0001:00:04.0: Verifying loaded RISC code...
[ 50.373898] scsi(0): Load RISC code
[ 50.449209] firmware: requesting ql2200_fw.bin
[ 110.508456] scsi(0): Failed to load firmware image (ql2200_fw.bin).
[ 110.593119] qla2xxx 0001:00:04.0: Firmware image unavailable.
[ 110.671025] qla2xxx 0001:00:04.0: Firmware images can be retrieved from: ftp://ftp.qlogic.com/outgoing/linux/firmware/.
[ 110.819988] scsi(0): Setup chip FAILED .
[ 110.884362] qla2xxx 0001:00:04.0: Failed to initialize adapter
[ 110.963425] scsi(0): Failed to initialize adapter - Adapter flags 10.
Done.

-- 2. That which causes joy or happiness.
Bug#439072: closed by maximilian attems [EMAIL PROTECTED] (Re: snd-intel8x0 line-in not working in later 2.6.x kernels)
On Tue, May 20, 2008 at 04:54:06PM +, Debian Bug Tracking System wrote: closing as according to upstream not a driver issue. marked as resolved thus closing. I would appreciate it if you could first answer the question which I asked in August last year (which was the reason I didn't close the bug myself): Can someone tell me the proper steps to test the default value to see if the bug was really just some local mishap? It's a bit annoying to see people ignore you for nine months, and then close the bug report rather than answering it. -- 2. That which causes joy or happiness.
Bug#439072: closed by maximilian attems [EMAIL PROTECTED] (Re: snd-intel8x0 line-in not working in later 2.6.x kernels)
reopen 439072 thanks On Tue, May 20, 2008 at 07:08:59PM +0200, Josip Rodin wrote: On Tue, May 20, 2008 at 04:54:06PM +, Debian Bug Tracking System wrote: closing as according to upstream not a driver issue. marked as resolved thus closing. I would appreciate it if you could first answer the question which I asked in August last year (which was the reason I didn't close the bug myself): Can someone tell me the proper steps to test the default value to see if the bug was really just some local mishap? It's a bit annoying to see people ignore you for nine months, and then close the bug report rather than answering it. I almost forgot - that's disregarding the simple fact that even if the driver made no mistake with this, this 2ch vs 4ch issue is not documented anywhere, neither in kernel (linux-2.6/Documentation/sound/alsa/?) nor in userland (alsamixer(1), arecord(1), ...?), so if anyone else sees this behaviour, whether due to one's own change or a possible bad default, there is no recourse. Closing bugs just like that may well be acceptable elsewhere, but I thought we had a bit higher standards in Debian. :| -- 2. That which causes joy or happiness.
Bug#439072: closed by maximilian attems [EMAIL PROTECTED] (Re: snd-intel8x0 line-in not working in later 2.6.x kernels)
On Tue, May 20, 2008 at 08:01:05PM +0200, maximilian attems wrote: closing as according to upstream not a driver issue. marked as resolved thus closing. I would appreciate it if you could first answer the question which I asked in August last year (which was the reason I didn't close the bug myself): Can someone tell me the proper steps to test the default value to see if the bug was really just some local mishap? It's a bit annoying to see people ignore you for nine months, and then close the bug report rather than answering it. I almost forgot - that's disregarding the simple fact that even if the driver made no mistake with this, this 2ch vs 4ch issue is not documented anywhere, neither in kernel (linux-2.6/Documentation/sound/alsa/?) nor in userland (alsamixer(1), arecord(1), ...?), so if anyone else sees this behaviour, whether due to one's own change or a possible bad default, there is no recourse. Closing bugs just like that may well be acceptable elsewhere, but I thought we had a bit higher standards in Debian. :| great. firstly this is *not* a kernel bug. Well, assertions are nice, but useless. It's a bug in the kernel module if it produces completely unexpected results after an option is changed - if such results are expected, then that expectation needs to be documented *somewhere*. secondly this i not the way you'll get alsa userland support You're free to clone and/or reassign the bug to the right alsa userland packages. thirdly your message did mention that you even *tried* current 2.6.25 The Documentation/sound/alsa/ that I checked was with 2.6.25.4, so, yes, the problem applies to the current version. candidate to closure unless you provide info that is is a kernel bug. Again with the closure... You really should read the manual regarding how to deal with bug reports. For example, http://www.debian.org/doc/developers-reference/ch-pkgs.en.html#s-bug-handling -- 2. That which causes joy or happiness. 
Re: sparc and testing migration
On Fri, Nov 30, 2007 at 01:18:29AM +0100, Josip Rodin wrote: On Sun, Sep 02, 2007 at 07:30:24PM +0200, Andreas Barth wrote: as you all are probably aware, we currently have some quite bad issues with the sparc buildds for some times, especially http://bugs.debian.org/433187 unkillable processes on the buildds. I hope that the mentioned RC bug can be fixed soon - if so, we're happy to stop ignoring issues on sparc (or rather: we probably will find us in the situation that such cases cease to exist). I haven't seen a reply to this mail, so JFTR - that bug was fixed. Though I'm still not sure if packages built on the new buildd are getting uploaded, I haven't been able to contact James about it. I happened to run into him last night on IRC - he confirmed that things are all right on lebrun, whereas we still need to upgrade the kernel on spontini before that machine can also build unstable. -- 2. That which causes joy or happiness.
Re: sparc and testing migration
On Sun, Sep 02, 2007 at 07:30:24PM +0200, Andreas Barth wrote: as you all are probably aware, we currently have some quite bad issues with the sparc buildds for some times, especially http://bugs.debian.org/433187 unkillable processes on the buildds. I hope that the mentioned RC bug can be fixed soon - if so, we're happy to stop ignoring issues on sparc (or rather: we probably will find us in the situation that such cases cease to exist). I haven't seen a reply to this mail, so JFTR - that bug was fixed. Though I'm still not sure if packages built on the new buildd are getting uploaded, I haven't been able to contact James about it. -- 2. That which causes joy or happiness.
Bug#433187: Installing Debian on Ultrasparc III machines
On Mon, Sep 24, 2007 at 09:57:23PM +0200, Josip Rodin wrote: kernel which works well - at least on our US III machine. We've applied 179c85ea53bef807621f335767e41e23f86f01df to make sure that the system doesn't create unkillable processes anymore if you use the libc6 from _lenny_. BTW, lebrun.d.o, also an USIII, running 2.6.23-rc6 plus the aforementioned patch still created unkillable dpkg-query processes. BTW, I got around to changing the input/output-device on lebrun today, so I'll be able to get register dumps in case it goes dead. I'm not sure if those problems are related :) The register dumps would be needed if the kernel fails to initialize the CPU Fabio told me that break+p output might be useful in this case too, I'm just repeating :) In any case, I let it run some more, and then when it went more or less dead, I tried to press the said key combination on the keyboard - to no avail. Break+p would be Ctrl+Pause+p? Didn't work, and Alt+Pause+p also didn't work. What was even more annoying was the fact that Stop+a got me the PROM shell, but I wasn't able to type anything in it (including 'go'), so that effectively freezes the machine. Please tell me if I did something stunningly stupid... -- 2. That which causes joy or happiness.
Bug#433187: Installing Debian on Ultrasparc III machines
On Wed, Sep 19, 2007 at 12:10:26AM +0200, Josip Rodin wrote: kernel which works well - at least on our US III machine. We've applied 179c85ea53bef807621f335767e41e23f86f01df to make sure that the system doesn't create unkillable processes anymore if you use the libc6 from _lenny_. BTW, lebrun.d.o, also an USIII, running 2.6.23-rc6 plus the aforementioned patch still created unkillable dpkg-query processes. BTW, I got around to changing the input/output-device on lebrun today, so I'll be able to get register dumps in case it goes dead. Right now its buildd has been building for over 3.5 hours, and it has created this one process: buildd 20263 100 0.5 1941872 11472 ? RN 18:25 192:03 dpkg-query --search libc.so.6 But it keeps moving! The load was around 5 when I checked this. I went to run 'less buildd.log', but that process just stopped responding instantly. I tried stracing it, and that strace stopped responding :) The load went up to 7 after that. -- 2. That which causes joy or happiness.
Bug#433187: Installing Debian on Ultrasparc III machines
On Mon, Sep 24, 2007 at 09:53:44PM +0200, Bernd Zeimetz wrote: kernel which works well - at least on our US III machine. We've applied 179c85ea53bef807621f335767e41e23f86f01df to make sure that the system doesn't create unkillable processes anymore if you use the libc6 from _lenny_. BTW, lebrun.d.o, also an USIII, running 2.6.23-rc6 plus the aforementioned patch still created unkillable dpkg-query processes. BTW, I got around to changing the input/output-device on lebrun today, so I'll be able to get register dumps in case it goes dead. I'm not sure if those problems are related :) The register dumps would be needed if the kernel fails to initialize the CPU Fabio told me that break+p output might be useful in this case too, I'm just repeating :) -- 2. That which causes joy or happiness.
Bug#433187: Installing Debian on Ultrasparc III machines
On Tue, Sep 18, 2007 at 11:40:19PM +0200, Bernd Zeimetz wrote: kernel which works well - at least on our US III machine. We've applied 179c85ea53bef807621f335767e41e23f86f01df to make sure that the system doesn't create unkillable processes anymore if you use the libc6 from _lenny_. Please read the following WARNING: using the libc6 from Etch on an US III machine results in a freeze (badly, as in not reacting to stop+a/break) of your system if you do things like using aptitude after becoming root by the use of su/sudo. This is not that bad with the libc6 from testing, but this is definitely NOT fixed. BTW, lebrun.d.o, also an USIII, running 2.6.23-rc6 plus the aforementioned patch still created unkillable dpkg-query processes. [EMAIL PROTECTED]:/home/buildd/build/chroot-unstable/lib]# ./libc-2.6.1.so GNU C Library stable release version 2.6.1, by Roland McGrath et al. Copyright (C) 2007 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. Compiled by GNU CC version 4.2.1 (Debian 4.2.1-5). Compiled on a Linux 2.6.17-rc1 system on 2007-09-04. Available extensions: crypt add-on version 2.1 by Michael Glad and others GNU Libidn by Simon Josefsson Native POSIX Threads Library by Ulrich Drepper et al BIND-8.2.3-T5B software FPU emulation by Richard Henderson, Jakub Jelinek and others For bug reporting instructions, please see: http://www.gnu.org/software/libc/bugs.html. Outside the chroot it's etch. -- 2. That which causes joy or happiness.
Bug#433187: linux-2.6 - [sparc64-smp] produces unkillable processes
On Tue, Sep 04, 2007 at 10:46:47AM +0200, Fabio Massimo Di Nitto wrote: We (David Miller and I) are already working on this. We finally got some info dump from a debugging patched kernel and I expect we will have a fix within the next 3/4 weeks. From our first look it seems like a futex bug and some users have reported that the latest 2.6.23-rcX do not show this behavior. Clearly we also want to figure out a fix for .22. Fabio I should mention that lebrun.d.o is still dead since the last attempt (ssh unresponsive since 2007-08-30 ~21:25), when it was running a 2.6.22.5 with one davem patch applied (one line in kernel/futex_compat.c). If you need something more done to lebrun, such as kicking it back to life, just tell me... If you have console access, it would be good to get a processor dump by break + p. I can easily reproduce that with my Sparc Ultra60 here, which is running as buildd for experimental. The machines has the very same problem. I will try that tonight. It is also worth checking with .23-rcX since it has been reported to be working. lebrun.d.o exploded again after a few hours of building under 2.6.23-rc5. :( -- 2. That which causes joy or happiness.
Bug#433187: linux-2.6 - [sparc64-smp] produces unkillable processes
On Tue, Sep 04, 2007 at 06:16:05AM +0200, Fabio Massimo Di Nitto wrote: #433187 is the bug that has killed the buildds on lebrun and spontini, right? AIUC, yes. At least I can reproduce that on my buildd. Hi guys, We (David Miller and I) are already working on this. We finally got an info dump from a debug-patched kernel and I expect we will have a fix within the next 3-4 weeks. From our first look it seems like a futex bug and some users have reported that the latest 2.6.23-rcX kernels do not show this behavior. Clearly we also want to figure out a fix for .22. Fabio I should mention that lebrun.d.o is still dead since the last attempt (ssh unresponsive since 2007-08-30 ~21:25), when it was running a 2.6.22.5 with one davem patch applied (one line in kernel/futex_compat.c). If you need something more done to lebrun, such as kicking it back to life, just tell me...
Bug#433187: linux-2.6 - [sparc64-smp] produces unkillable processes
On Tue, Sep 04, 2007 at 10:17:33AM +0200, Fabio Massimo Di Nitto wrote: I should mention that lebrun.d.o is still dead since the last attempt (ssh unresponsive since 2007-08-30 ~21:25), when it was running a 2.6.22.5 with one davem patch applied (one line in kernel/futex_compat.c). If you need something more done to lebrun, such as kicking it back to life, just tell me... If you have console access, it would be good to get a processor dump by break + p. Unfortunately it's impossible to get that remotely on lebrun, I could never get its RSC to work right. Running consolehistory only gets me as far as the point where the first getty prints the issue file, and then zilch. :( I personally have no say on how buildds should be managed.. I guess it's up to you guys if you want to kick it back. That question was for James :) If you do so just make sure you can grab CPU register dumps from console. At this point I'm not sure if it would be possible to see them even if I plugged the monitor into the VGA port, because I redirected output to rsc-console. sigh
Bug#433187: linux-2.6 - [sparc64-smp] produces unkillable processes
Hi, #433187 is the bug that has killed the buildds on lebrun and spontini, right?
Bug#439072: snd-intel8x0 line-in not working in later 2.6.x kernels
On Wed, Aug 22, 2007 at 04:13:19AM +0200, Josip Rodin wrote: On Wed, Aug 22, 2007 at 02:52:59AM +0200, Josip Rodin wrote: I'm reporting this bug that I have been seeing for a while and which is a regression from a few months/years ago - the line-in input simply doesn't work right. arecord(1) just doesn't record anything with it, it doesn't show any errors, it records silence. The recording from the same external source works just fine with the microphone input. This works just fine with the same hardware in MS Windows (ugh), and it also worked fine with an earlier 2.6.x kernel version that I had been using when I was still running sarge on this machine. But, I removed it in the meantime so I don't know which one it was. I think it was 2.6.16 or so, but I'm not sure. It's definitely not working with >= 2.6.18 (I still have one of those and it behaves the same as 2.6.21). Oh, I might have been too vague there. I can't exactly reproduce the old state because I changed much of my other hardware in this machine since and my old kernel images won't boot; and then I also noticed that there was once an old OSS driver and then I switched to ALSA, but I don't have backups of my ancient /etc/modules file so I don't know when that was. I re-selected the old-style i810_audio driver in 2.6.21 and compiled it, unloaded the ALSA driver, loaded the old driver, and voila, everything went back to normal, I can hear the TV sound just fine. So, it might be that this is an OSS-ALSA regression that slipped through the cracks? After an upstream developer helped debug it, it seems that it works if I use alsamixer to change the mixer from the 4 channel mode to the 2 channel mode. The 2ch mode is supposed to be the default; I have no idea why my alsamixer setup used the 4ch mode, at least I certainly don't remember ever fiddling with that setting (because I have no idea what it really means :).
Can someone tell me the proper steps to test the default value to see if the bug was really just some local mishap? Move away /var/lib/alsa/asound.state and reload the modules?
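The test proposed in the message above (move the saved mixer state out of the way, then reload the driver) could be sketched roughly as follows. This is not a confirmed procedure, just one reading of the question: the function name reset_alsa_state and the STATE_FILE, DRIVER and RUN variables are made up for the example, and by default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: moves the saved ALSA mixer state aside and reloads the driver,
# so that the driver's own defaults (rather than the stored settings) apply.
# All names below are illustrative assumptions, not part of any package.

STATE_FILE=${STATE_FILE:-/var/lib/alsa/asound.state}
DRIVER=${DRIVER:-snd_intel8x0}
# Dry run by default: each command is echoed. Set RUN to the empty string
# (RUN= ./script, as root) to actually execute the commands.
RUN=${RUN-echo}

reset_alsa_state() {
    # Keep the old state under a different name so it can be restored later.
    if [ -e "$STATE_FILE" ]; then
        $RUN mv "$STATE_FILE" "$STATE_FILE.saved"
    fi
    # Unloading fails if the device is still in use (e.g. by a mixer applet).
    $RUN modprobe -r "$DRIVER"
    $RUN modprobe "$DRIVER"
}
```

Calling reset_alsa_state in the default dry-run mode shows what would be done. Whether the channel mode then comes up as 2ch is exactly the open question of this message; the distribution's init scripts may also write a new state file at shutdown, so the check would have to happen before that.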
Bug#439072: snd-intel8x0 line-in not working in later 2.6.x kernels
On Thu, Aug 23, 2007 at 07:04:13PM +0200, Adrian Bunk wrote: While browsing kernel options, I noticed: Please contact Adrian Bunk [EMAIL PROTECTED] if you had to say Y here because your hardware is not properly supported by ALSA. ...in the description of CONFIG_OSS_OBSOLETE, so, here I am :) This is Debian bug #439072 (and #384933 also looks suspiciously similar, if I might add). Please do the following: - check at the ALSA bug tracking system [1] whether your problem was already reported - if there isn't already a bug for it, open a new bug - in any case, tell me the bug number so that I can track this issue I found several bug reports that sounded familiar, but none of them described this exact issue. This is the new ticket I just filed: https://bugtrack.alsa-project.org/alsa-bug/view.php?id=3335
Bug#439072: snd-intel8x0 line-in not working in later 2.6.x kernels
Hi Adrian, On Wed, Aug 22, 2007 at 04:13:19AM +0200, Josip Rodin wrote: I'm reporting this bug that I have been seeing for a while and which is a regression from a few months/years ago - the line-in input simply doesn't work right. arecord(1) just doesn't record anything with it, it doesn't show any errors, it records silence. The recording from the same external source works just fine with the microphone input. I re-selected the old-style i810_audio driver in 2.6.21 and compiled it, unloaded the ALSA driver, loaded the old driver, and voila, everything went back to normal, I can hear the TV sound just fine. So, it might be that this is an OSS-ALSA regression that slipped through the cracks? While browsing kernel options, I noticed: Please contact Adrian Bunk [EMAIL PROTECTED] if you had to say Y here because your hardware is not properly supported by ALSA. ...in the description of CONFIG_OSS_OBSOLETE, so, here I am :) This is Debian bug #439072 (and #384933 also looks suspiciously similar, if I might add).
Bug#439072: snd-intel8x0 line-in not working in later 2.6.x kernels
Package: linux-2.6 Hi, I'm reporting this bug that I have been seeing for a while and which is a regression from a few months/years ago - the line-in input simply doesn't work right. arecord(1) just doesn't record anything with it, it doesn't show any errors, it records silence. The recording from the same external source works just fine with the microphone input. I and apparently many other people noticed the same bug indirectly, by noticing that the sound input from the TV card isn't working. (Many TV cards ship with a line-out and a short cable which connects the TV sound output with the regular sound card input.) I also tried changing the external sources - I tried to record using a microphone via the mic in, and that still works just fine, whereas the microphone via the line in also fails to record anything. And finally, the TV card is heard just fine when plugged into the mic in. This works just fine with the same hardware in MS Windows (ugh), and it also worked fine with an earlier 2.6.x kernel version that I had been using when I was still running sarge on this machine. But, I removed it in the meantime so I don't know which one it was. I think it was 2.6.16 or so, but I'm not sure. It's definitely not working with >= 2.6.18 (I still have one of those and it behaves the same as 2.6.21). I have previously recorded my problem in the Ubuntu bug report #29789[1], which includes a bit of a convoluted description at the beginning which also talks about an unrelated module, but it appears that several other people are seeing the same problem as I am. The Debian bug report #384933 mentions that snd-emu10k1 also has a dysfunctional line-in. The bug report #374545 mentions something vaguely similar, but even though they have an explicit error message there, the suspicious bit was that the regression happened from .15 to .16, which should be around the same time as this.
I grepped the kernel patch files for .13, .14, .15, .16, .17, and they rarely ever mention snd-intel8x0. The quirks option was added in .14, but offhand that doesn't seem to be related to the line-in. (The snd-emu10k1 driver is even more rarely mentioned in the said patches.) Any help would be appreciated. TIA. [1] https://bugs.launchpad.net/ubuntu/+bug/29789
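For anyone wanting to repeat the grep over the kernel patch files mentioned above, a minimal sketch might look like this. The function name count_driver_mentions and the patch file names in the usage note are assumptions for illustration; the original message does not say exactly which files or patterns were used.

```shell
#!/bin/sh
# Sketch only: prints "FILE: N" for each patch file, where N is the number of
# lines in that file matching PATTERN. Handles plain, gzip and bzip2 patches.
# Usage: count_driver_mentions PATTERN FILE...

count_driver_mentions() {
    pattern=$1
    shift
    for patch in "$@"; do
        # grep -c exits non-zero when the count is 0 but still prints "0",
        # hence the "|| true" so that a zero count is not treated as an error.
        case "$patch" in
        *.gz)  count=$(gzip -dc "$patch" | grep -c -e "$pattern" || true) ;;
        *.bz2) count=$(bzip2 -dc "$patch" | grep -c -e "$pattern" || true) ;;
        *)     count=$(grep -c -e "$pattern" "$patch" || true) ;;
        esac
        printf '%s: %s\n' "$patch" "$count"
    done
}
```

With the patches downloaded locally under hypothetical names, something like count_driver_mentions intel8x0 patch-2.6.13.bz2 patch-2.6.14.bz2 would give a per-release count. Matching on intel8x0 rather than snd-intel8x0 also catches file paths such as the driver's source file, which is probably what one wants when looking for changes to the driver.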
Bug#439072: snd-intel8x0 line-in not working in later 2.6.x kernels
On Wed, Aug 22, 2007 at 02:52:59AM +0200, Josip Rodin wrote: I'm reporting this bug that I have been seeing for a while and which is a regression from a few months/years ago - the line-in input simply doesn't work right. arecord(1) just doesn't record anything with it, it doesn't show any errors, it records silence. The recording from the same external source works just fine with the microphone input. This works just fine with the same hardware in MS Windows (ugh), and it also worked fine with an earlier 2.6.x kernel version that I had been using when I was still running sarge on this machine. But, I removed it in the meantime so I don't know which one it was. I think it was 2.6.16 or so, but I'm not sure. It's definitely not working with >= 2.6.18 (I still have one of those and it behaves the same as 2.6.21). Oh, I might have been too vague there. I can't exactly reproduce the old state because I changed much of my other hardware in this machine since and my old kernel images won't boot; and then I also noticed that there was once an old OSS driver and then I switched to ALSA, but I don't have backups of my ancient /etc/modules file so I don't know when that was. I re-selected the old-style i810_audio driver in 2.6.21 and compiled it, unloaded the ALSA driver, loaded the old driver, and voila, everything went back to normal, I can hear the TV sound just fine. So, it might be that this is an OSS-ALSA regression that slipped through the cracks?
Bug#409244: bug
Hi, Thinking about this, there's actually another thing bothering me - you can't use firmware-qlogic without the 'MODULES=most' option, IOW, there seems to be no way to build a minimal initrd with just the qla2xxx module and the firmware file.
Bug#409244: initramfs doesn't include the udev firmware helper
Package: initramfs-tools Version: 0.85e Hi, The other day I tried to boot a Sun Fire 280R that works nicely with kernel 2.4.30; however, it didn't work, because the qla2xxx driver can't find the firmware image, and it fails to load properly, meaning I can't access the hard disks in the machine, and... flop. :) I worked around this by including the proprietary file downloaded from the URL provided in kernel config help, ql2200_fw.bin, using a hook file. It was necessary to load qla2xxx *after* init-premount, because it needs udev to load in order to access the firmware helper. But, for udev to actually use the firmware helper, it sounds like this is also needed:

    copy_exec /lib/udev/firmware.agent /lib/udev/

After that, the hook file that installs into /lib/firmware also needed:

    mkdir -p ${DESTDIR}/lib/firmware

Those two problems are more general; another cp/copy_exec for the actual file is probably a matter for another package, with sorting out the license issues and all that. Cf. http://lists.debian.org/debian-sparc/2007/01/msg00074.html and http://lists.debian.org/debian-sparc/2007/02/msg2.html
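As a rough illustration of the workaround described above, the two general steps (shipping the udev firmware helper and creating /lib/firmware in the image) plus the copy of the firmware file could be collected into a single initramfs-tools hook along these lines. This is only a sketch under stated assumptions: the function name install_firmware_support and the FW_AGENT and FW_IMAGE variables are made up for the example, the location of ql2200_fw.bin on the host is a guess, the fallback copy_exec exists only so the script can be tried outside mkinitramfs, and the ordering issue (loading qla2xxx after init-premount) is not handled here.

```shell
#!/bin/sh
# Sketch of an initramfs-tools hook (illustrative, not the packaged one).

PREREQ=""
prereqs() { echo "$PREREQ"; }
case "${1:-}" in
prereqs)
    prereqs
    exit 0
    ;;
esac

# hook-functions provides copy_exec when run from mkinitramfs. The fallback
# below is a plain copy, an assumption made so the sketch runs standalone.
if [ -r /usr/share/initramfs-tools/hook-functions ]; then
    . /usr/share/initramfs-tools/hook-functions
else
    copy_exec() { mkdir -p "${DESTDIR}$2" && cp "$1" "${DESTDIR}$2"; }
fi

install_firmware_support() {
    fw_agent=${FW_AGENT:-/lib/udev/firmware.agent}
    fw_image=${FW_IMAGE:-/lib/firmware/ql2200_fw.bin}   # assumed host path

    # 1. The udev firmware helper, so udev can answer firmware requests.
    mkdir -p "${DESTDIR}/lib/udev"
    if [ -e "$fw_agent" ]; then
        copy_exec "$fw_agent" /lib/udev/
    fi

    # 2. The firmware directory must exist before anything is copied into it.
    mkdir -p "${DESTDIR}/lib/firmware"

    # 3. The firmware image itself, if it is present on the host.
    if [ -e "$fw_image" ]; then
        cp "$fw_image" "${DESTDIR}/lib/firmware/"
    fi
}

# mkinitramfs exports DESTDIR; do nothing when run by hand without it.
if [ -n "${DESTDIR:-}" ]; then
    install_firmware_support
fi
```

A script like this would go into /etc/initramfs-tools/hooks/ and be made executable, after which the initramfs has to be regenerated. Keeping the three steps in one function makes it easy to see which parts are generic (steps 1 and 2) and which part belongs to a firmware package (step 3).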
Bug#409244: initramfs doesn't include the udev firmware helper
On Thu, Feb 01, 2007 at 01:41:36PM +0100, maximilian attems wrote: The other day I tried to boot a Sun Fire 280R that works nicely with kernel 2.4.30; however, it didn't work, because the qla2xxx driver can't find the firmware image, and it fails to load properly, meaning I can't access the hard disks in the machine, and... flop. :) did you try to use firmware-qlogic? afaik it has the necessary hooks. Oh, nice, thanks. It does all the same. Someone should have told me about it; I should file a bug on that package for having a completely useless description: Description: Binary firmware for QLOGIC This package contains the binary firmware for QLOGIC. apt-cache search qla2xxx returns nothing, and that's really easily fixable. But, the existence of that package doesn't really help the installation system, or any other driver which uses udev firmware.agent. It would be good if at least the firmware helper was added by default - it's trivially small overhead, and it helps reduce confusion for people trying to roll in their own firmware images and whatnot.
Bug#409244: initramfs doesn't include the udev firmware helper
clone 409244 -1
retitle -1 [sparc] Sun Fire 280R with disks on qla2xxx not bootable by default any more due to lack of firmware-qlogic
reassign -1 debian-installer
clone 409244 -2
retitle -2 firmware-qlogic description is inadequate
reassign -2 firmware-qlogic
severity 409244 wishlist
merge 355881 409244
thanks

On Thu, Feb 01, 2007 at 09:33:10PM +0100, maximilian attems wrote: On Thu, Feb 01, 2007 at 04:33:09PM +0100, Josip Rodin wrote: On Thu, Feb 01, 2007 at 01:41:36PM +0100, maximilian attems wrote: The other day I tried to boot a Sun Fire 280R that works nicely with kernel 2.4.30; however, it didn't work, because the qla2xxx driver can't find the firmware image, and it fails to load properly, meaning I can't access the hard disks in the machine, and... flop. :) did you try to use firmware-qlogic? afaik it has the necessary hooks. Oh, nice, thanks. It does all the same. Someone should have told me about it; I should file a bug on that package for having a completely useless description: Description: Binary firmware for QLOGIC This package contains the binary firmware for QLOGIC. apt-cache search qla2xxx returns nothing, and that's really easily fixable. you want me to reassign this as wishlist against it? Hm, I've performed the bug surgery above :) But, the existence of that package doesn't really help the installation system, or any other driver which uses udev firmware.agent. It would be good if at least the firmware helper was added by default - it's trivially small overhead, and it helps reduce confusion for people trying to roll in their own firmware images and whatnot. well module-init-tools modinfo will get info about modules needing firmware, then the helper gets added. that is a post-etch item, there is an open bug report against initramfs-tools tracking that. Oh, I see, #355881. Merged.