Bug#516785: Bug #516785: linux-image-2.6.26-1-sparc64-smp: [sparc] SunFire480R cassini network driver kernel panic
On Tue, Jul 09, 2013 at 05:42:20PM +0200, Moritz Muehlenhoff wrote:
> > No, a second machine of the same type is available now for testing - and
> > also crashing after loading of the cassini driver. Here lspci and cpuinfo:
...
> > 0002:00:02.0 Ethernet controller: Oracle Corporation Cassini 10/100/1000 (rev 11)
> > 0003:00:01.0 Ethernet controller: Oracle Corporation Cassini 10/100/1000 (rev 11)
...
> Does this work with the wheezy release or later kernels?

Nope, tried 3.10.0 today - network worked for a short time, then a Hardware FATAL RESET occurred.

Last suspicion was a chip issue with rev 11 cassini - so far there is only one working report, and that one is with rev 20 chips.

For the record: this was a 4 CPU 480R. As usual, console output is saved and could be provided; any other ideas are welcome.

Thanks,
Hermann

--
Netzwerkadministration/Zentrale Dienste, Interdiziplinaeres Zentrum fuer wissenschaftliches Rechnen der Universitaet Heidelberg
IWR; INF 368; 69120 Heidelberg; Tel: (06221)54-8236 Fax: -5224
Email: hermann.la...@iwr.uni-heidelberg.de

--
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#516785: Bug #516785: linux-image-2.6.26-1-sparc64-smp: [sparc] SunFire480R cassini network driver kernel panic
notforwarded 516785
reassign 516785 src:linux
thanks

On Wed, Jun 27, 2012 at 10:31:24AM +0200, Hermann Lauer wrote:
> On Mon, Jun 04, 2012 at 04:35:57PM +0200, Hermann Lauer wrote:
> > On Sat, Jun 02, 2012 at 03:57:54AM +0800, Aron Xu wrote:
> > > I have remote ssh access (root) to that running SunFire 408R, what can
> > > I do to help you?
> > ...
> > > PS: I've disabled the rename function of udev and set hwaddress in
> > > /etc/network/interfaces directly to work around the always changing
> > > mac address.
> >
> > How to disable the renaming ? Will try to set the hwaddr during the next
> > test.
> ...
> > Wondering now if having only one cpu board with 2 cpus may be the problem.
>
> No, a second machine of the same type is available now for testing - and also
> crashing after loading of the cassini driver. Here lspci and cpuinfo:
>
> :00:06.0 IDE interface: Silicon Image, Inc. PCI0646 (rev 07)
> 0002:00:01.0 Bridge: Oracle Corporation RIO EBUS (rev 01)
> 0002:00:01.3 USB Controller: Oracle Corporation RIO USB (rev 01)
> 0002:00:02.0 Ethernet controller: Oracle Corporation Cassini 10/100/1000 (rev 11)
> 0003:00:01.0 Ethernet controller: Oracle Corporation Cassini 10/100/1000 (rev 11)
> 0003:00:02.0 SCSI storage controller: QLogic Corp. QLA2200 64-bit Fibre Channel Adapter (rev 05)

Does this work with the wheezy release or later kernels?

Cheers,
Moritz
Bug#516785: Bug #516785: linux-image-2.6.26-1-sparc64-smp: [sparc] SunFire480R cassini network driver kernel panic
On Mon, Jun 04, 2012 at 04:35:57PM +0200, Hermann Lauer wrote:
> On Sat, Jun 02, 2012 at 03:57:54AM +0800, Aron Xu wrote:
> > I have remote ssh access (root) to that running SunFire 408R, what can
> > I do to help you?
> ...
> > PS: I've disabled the rename function of udev and set hwaddress in
> > /etc/network/interfaces directly to work around the always changing
> > mac address.
>
> How to disable the renaming ? Will try to set the hwaddr during the next test.
...
> Wondering now if having only one cpu board with 2 cpus may be the problem.

No, a second machine of the same type is now available for testing, and it also crashes after the cassini driver is loaded. Here are lspci and cpuinfo:

:00:06.0 IDE interface: Silicon Image, Inc. PCI0646 (rev 07)
0002:00:01.0 Bridge: Oracle Corporation RIO EBUS (rev 01)
0002:00:01.3 USB Controller: Oracle Corporation RIO USB (rev 01)
0002:00:02.0 Ethernet controller: Oracle Corporation Cassini 10/100/1000 (rev 11)
0003:00:01.0 Ethernet controller: Oracle Corporation Cassini 10/100/1000 (rev 11)
0003:00:02.0 SCSI storage controller: QLogic Corp. QLA2200 64-bit Fibre Channel Adapter (rev 05)

cpu : TI UltraSparc III+ (Cheetah+)
fpu : UltraSparc III+ integrated FPU
pmu : ultra3+
prom: OBP 4.17.1 2005/04/11 14:27
type: sun4u
ncpus probed: 4
ncpus active: 4
D$ parity tl1 : 0
I$ parity tl1 : 0
cpucaps : flush,stbar,swap,muldiv,v9,ultra3,mul32,div32,v8plus,vis,vis2
Cpu0ClkTck : 35a4e900
Cpu1ClkTck : 35a4e900
Cpu2ClkTck : 35a4e900
Cpu3ClkTck : 35a4e900
MMU Type: Cheetah+
State:
CPU0: online
CPU1: online
CPU2: online
CPU3: online

This machine has 24G RAM, 16 on one board and 8 on the other. The first machine has one board with 16G. So it may be a memory issue, as Aron has only 14G.

The other possibility may be the OBP version - here the latest one is installed on all machines.

Just to rule out a firmware issue:

$ md5sum /lib/firmware/sun/cassini.bin
fd11e09e8e61694353f12b3de376292a

Any further ideas to debug deeper?

Thanks,
Hermann
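The CpuNClkTck values in the cpuinfo output above are hexadecimal. A minimal sketch for reading them, assuming the field holds the CPU tick rate in Hz (the helper name clktck_to_mhz is made up for illustration and is not part of any tool mentioned in this thread):

```python
def clktck_to_mhz(hex_value: str) -> float:
    """Convert a hexadecimal CpuNClkTck value to MHz.

    Assumption: the value is the CPU tick rate in Hz.
    """
    return int(hex_value, 16) / 1_000_000


# Value reported for all four CPUs of the second test machine.
print(clktck_to_mhz("35a4e900"))  # 900.0
```

Under that assumption, the CPUs in this machine run at 900 MHz. The same helper can be applied to the ClkTck values of the other machines in this report to compare CPU speeds.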
Bug#516785: Bug #516785: linux-image-2.6.26-1-sparc64-smp: [sparc] SunFire480R cassini network driver kernel panic
On Sat, Jun 02, 2012 at 03:57:54AM +0800, Aron Xu wrote:
> I have remote ssh access (root) to that running SunFire 408R, what can
> I do to help you?
...
> PS: I've disabled the rename function of udev and set hwaddress in
> /etc/network/interfaces directly to work around the always changing
> mac address.

How to disable the renaming? Will try to set the hwaddr during the next test.

Appended is the lspci and /proc/cpuinfo output. With the 2.6.32-45 squeeze default image the machine simply hung again when loading the cassini driver.

Wondering now if having only one cpu board with 2 cpus may be the problem.

Many thanks for your help,
Hermann

:00:03.0 SCSI storage controller: QLogic Corp. QLA2200 64-bit Fibre Channel Adapter (rev 05)
:00:06.0 IDE interface: Silicon Image, Inc. PCI0646 (rev 07)
0001:00:01.0 PCI bridge: Digital Equipment Corporation DECchip 21154 (rev 05)
0001:00:02.0 Ethernet controller: Oracle Corporation GEM 10/100/1000 Ethernet [ge] (rev 01)
0001:01:04.0 SCSI storage controller: QLogic Corp. QLA2200 64-bit Fibre Channel Adapter (rev 05)
0001:01:05.0 SCSI storage controller: QLogic Corp. QLA2200 64-bit Fibre Channel Adapter (rev 05)
0002:00:01.0 Bridge: Oracle Corporation RIO EBUS (rev 01)
0002:00:01.3 USB Controller: Oracle Corporation RIO USB (rev 01)
0002:00:02.0 Ethernet controller: Oracle Corporation Cassini 10/100/1000 (rev 11)
0003:00:01.0 Ethernet controller: Oracle Corporation Cassini 10/100/1000 (rev 11)
0003:00:02.0 SCSI storage controller: QLogic Corp. QLA2200 64-bit Fibre Channel Adapter (rev 05)

cpu : TI UltraSparc III+ (Cheetah+)
fpu : UltraSparc III+ integrated FPU
pmu : ultra3+
prom: OBP 4.22.34 2007/07/23 13:01
type: sun4u
ncpus probed: 2
ncpus active: 2
D$ parity tl1 : 0
I$ parity tl1 : 0
cpucaps : flush,stbar,swap,muldiv,v9,ultra3,mul32,div32,v8plus,vis,vis2
Cpu0ClkTck : 35a4e900
Cpu2ClkTck : 35a4e900
MMU Type: Cheetah+
State:
CPU0: online
CPU2: online
Bug#516785: Bug #516785: linux-image-2.6.26-1-sparc64-smp: [sparc] SunFire480R cassini network driver kernel panic
Hi Hermann,

Could you describe the detailed physical configuration of both CPU/Memory boards? I would like to know which slots your CPUs are placed in, and which memory module groups are installed (and how much memory) on each of the CPU/Memory boards.

--
Regards,
Aron Xu
Bug#516785: Bug #516785: linux-image-2.6.26-1-sparc64-smp: [sparc] SunFire480R cassini network driver kernel panic
Here is some more information about our machine, FYI.

Attached dmesg_20120322T184612.txt is the dmesg generated during boot of the machine on Mar 22, 2012, which is the date we did our last reboot and put it into production.

The machine has 14GB RAM, that is 512MB*(16+12). Because one RAM module is broken, we had to remove its whole group of four to make the machine boot, so 2GB is lost.

The network is dual-stacked IPv4 and IPv6. Both NICs have multiple IPv6 addresses configured; only one of them has an IPv4 address. But I can confirm it has no problems with no connection or with only one NIC configured.

The operating system was installed as 6.0.4; user space programs were later updated to 6.0.5, but the kernel wasn't. The kernel is linux-image-2.6.32-5-sparc64-smp, version 2.6.32-41squeeze2. I can confirm that 2.6.32-41 worked because it's the default of 6.0.4 CD1.

d-i prompted for firmware during installation, and I supplied the complete firmware.tar.gz as well as an unpacked directory on a USB flash disk. Then the installation continued.

The disk is configured as software RAID1 (the hardware RAID card does not seem to be recognized), though /boot, as a separate partition, isn't part of the RAID due to a glitch in d-i. /boot is ext3, and / is ext4.

The Sun remote control card is installed but not configured.

The server's load is normally not very high, but it usually has a load average of 2 to 3. Most of the load is processing network requests, with very little disk I/O.
$ cat /proc/cpuinfo
cpu : TI UltraSparc III+ (Cheetah+)
fpu : UltraSparc III+ integrated FPU
pmu : ultra3+
prom: OBP 4.13.0 2004/01/19 18:26
type: sun4u
ncpus probed: 4
ncpus active: 4
D$ parity tl1 : 0
I$ parity tl1 : 0
Cpu0ClkTck : 47868c00
Cpu1ClkTck : 47868c00
Cpu2ClkTck : 47868c00
Cpu3ClkTck : 47868c00
MMU Type: Cheetah+
State:
CPU0: online
CPU1: online
CPU2: online
CPU3: online

$ lspci
:00:03.0 VGA compatible controller: ATI Technologies Inc Radeon RV100 QY [Radeon 7000/VE]
:00:06.0 IDE interface: Silicon Image, Inc. PCI0646 (rev 07)
0001:00:01.0 Fibre Channel: QLogic Corp. ISP2422-based 4Gb Fibre Channel to PCI-X HBA (rev 02)
0002:00:01.0 Bridge: Oracle Corporation RIO EBUS (rev 01)
0002:00:01.3 USB Controller: Oracle Corporation RIO USB (rev 01)
0002:00:02.0 Ethernet controller: Oracle Corporation Cassini 10/100/1000 (rev 20)
0003:00:01.0 Ethernet controller: Oracle Corporation Cassini 10/100/1000 (rev 20)
0003:00:02.0 SCSI storage controller: QLogic Corp. QLA2200 64-bit Fibre Channel Adapter (rev 05)

--
Regards,
Aron Xu

[0.00] PROMLIB: Sun IEEE Boot Prom 'OBP 4.13.0 2004/01/19 18:26'
[0.00] PROMLIB: Root node compatible:
[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Linux version 2.6.32-5-sparc64-smp (Debian 2.6.32-41squeeze2) (da...@debian.org) (gcc version 4.3.5 (Debian 4.3.5-4) ) #1 SMP Thu Mar 22 18:46:12 UTC 2012
[0.00] bootconsole [earlyprom0] enabled
[0.00] ARCH: SUN4U
[0.00] Ethernet address: 00:03:ba:8c:1f:11
[0.00] Kernel: Using 2 locked TLB entries for main kernel image.
[0.00] Remapping the kernel... done.
[0.00] OF stdout device is: /pci@8,70/SUNW,XVR-100@3
[0.00] PROM: Built device tree with 119498 bytes of memory.
[0.00] Top of RAM: 0xb17fa6e000, Total RAM: 0x37fa5c000
[0.00] Memory hole size: 712704MB
[0.00] [01014000-f8a00040] page_structs=131072 node=0 entry=1280/0
[0.00] [01014000-f8a00080] page_structs=131072 node=0 entry=1281/0
[0.00] [01014080-f8a000c0] page_structs=131072 node=0 entry=1282/0
[0.00] [01014080-f8a00100] page_structs=131072 node=0 entry=1283/0
[0.00] [01014100-f8a00140] page_structs=131072 node=0 entry=1284/0
[0.00] [01014100-f8a00180] page_structs=131072 node=0 entry=1285/0
[0.00] [01014180-f8a001c0] page_structs=131072 node=0 entry=1286/0
[0.00] [01014180-f8a00200] page_structs=131072 node=0 entry=1287/0
[0.00] [01014200-f8a00240] page_structs=131072 node=0 entry=1288/0
[0.00] [01014200-f8a00280] page_structs=131072 node=0 entry=1289/0
[0.00] [01014280-f8a002c0] page_structs=131072 node=0 entry=1290/0
[0.00] [01014280-f8a00300] page_structs=131072 node=0 entry=1291/0
[0.00] [01014300-f8a00340] page_structs=131072 node=0 entry=1292/0
[0.00] [01014300-f8a00380] page_structs=131072 node=0 entry=1293/0
[0.00] [01014380-f8a003c0] page_structs=131072 node=0 entry=129
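The memory figure in this message, 512MB*(16+12), can be cross-checked against the Total RAM value in the dmesg output. A small sketch of that arithmetic, assuming Total RAM is a byte count and that the module sizes are binary megabytes (the variable names are illustrative only):

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# 28 modules of 512 MB each, as described in the message.
dimm_total = 512 * MIB * (16 + 12)

# "Total RAM" as printed by the kernel at boot, assumed to be in bytes.
kernel_total = 0x37fa5c000

print(dimm_total / GIB)              # 14.0
print(round(kernel_total / GIB, 2))  # 13.99
```

Under these assumptions the two numbers agree to within a few megabytes, which is consistent with the 14GB stated in the message.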
Bug#516785: Bug #516785: linux-image-2.6.26-1-sparc64-smp: [sparc] SunFire480R cassini network driver kernel panic
On Fri, Jun 1, 2012 at 11:05 PM, Hermann Lauer wrote:
>
> Aron, do you have a Sun Fire 480R ? If yes, I'm interested in getting a
> running binary kernel from you to rule out configuration and compiler issues.
>

I have remote ssh access (root) to that running SunFire 480R, what can I do to help you? Note I need to keep the service running, so don't expect me to try out something new...

PS: I've disabled the rename function of udev and set hwaddress in /etc/network/interfaces directly to work around the always changing mac address.

--
Regards,
Aron Xu
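The hwaddress workaround mentioned in the PS can be pictured as an ifupdown stanza. The sketch below only renders such a stanza as text; the helper name interfaces_stanza is hypothetical, the MAC is the "Ethernet address" from the dmesg posted in this thread, and the IP address is a placeholder from the documentation range, not a value from the report. The actual configuration used on that machine is not shown in the thread.

```python
def interfaces_stanza(iface: str, mac: str, address: str, netmask: str) -> str:
    """Render an /etc/network/interfaces stanza that pins the MAC address."""
    lines = [
        f"auto {iface}",
        f"iface {iface} inet static",
        f"    address {address}",
        f"    netmask {netmask}",
        # ifupdown's hwaddress option sets the link-layer address at ifup time.
        f"    hwaddress ether {mac}",
    ]
    return "\n".join(lines)


# Placeholder IP (192.0.2.0/24 is reserved for documentation).
print(interfaces_stanza("eth0", "00:03:ba:8c:1f:11", "192.0.2.10", "255.255.255.0"))
```

This covers only the hwaddress half of the workaround; how the udev renaming was disabled is not described in the thread.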
Bug#516785: Bug #516785: linux-image-2.6.26-1-sparc64-smp: [sparc] SunFire480R cassini network driver kernel panic
On Tue, Mar 27, 2012 at 03:22:38PM +0100, Ben Hutchings wrote:
> On Tue, 2012-03-27 at 15:42 +0800, Aron Xu wrote:
> > I can confirm that Debian Squeeze 6.0.4, with kernel
> > linux-image-2.6.32-5-sparc64-smp, version 2.6.32-41 or
> > 2.6.32-41squeeze2, does not crash anymore. The installation process is
>
> Well I can't see any changes that might have fixed this. Maybe there's
> a difference between your machine and Hermann's?

Tried vanilla 3.4.0 and 3.3.7 today: 3.4.0 crashes in the md code, most probably unrelated; on 3.3.7, setting up the cassini driver hangs the machine and after a while it resets itself, see below.

Aron, do you have a Sun Fire 480R? If yes, I'm interested in getting a running binary kernel from you to rule out configuration and compiler issues.

Thanks,
Hermann

tantalus:~# modprobe -v cassini cassini_debug=-1
WARNING: All config files need .conf: cassini: cassini.c:v1.6 (21 May 2008)
/etc/modprobe.d/local, it will be ignored in a cassini 0002:00:02.0: eth0: Sun Cassini+ (64bit/33MHz PCI/Cu) Ethernet[24] 00:03:ba:29:7c:a0
future release.
insmod /lib/modules/3.3.7/kernel/drivers/net/ethernet/sun/cassini.ko cassini_debug=-1
cassini 0003:00:01.0: eth1: Sun Cassini+ (64bit/66MHz PCI/Cu) Ethernet[30] 00:03:ba:29:7c:9f
tantudev[913]: renamed network interface eth0 to eth19
alus:~# udev[914]: renamed network interface eth1 to eth20
tantalus:~# ifconfig eth19 129.206.xxx.xxx netmask 255.255.255.0 broadcast 129.206.xxx.255 up
cassini 0002:00:02.0: eth19: Link up at 1000 Mbps, full-duplex
cassini 0002:00:02.0: eth19: TX pause enabled
tantalus:~# route add default gw 129.206.xxx.xxx
tantalus:~#
Sun Fire 480R, No Keyboard
Copyright 2007 Sun Microsystems, Inc. All rights reserved.
OpenBoot 4.22.34, 16384 MB memory installed, Serial #
Bug#516785: Bug #516785: linux-image-2.6.26-1-sparc64-smp: [sparc] SunFire480R cassini network driver kernel panic
On Tue, Mar 27, 2012 at 03:22:38PM +0100, Ben Hutchings wrote:
> On Tue, 2012-03-27 at 15:42 +0800, Aron Xu wrote:
> > Hi,
> >
> > I can confirm that Debian Squeeze 6.0.4, with kernel
> > linux-image-2.6.32-5-sparc64-smp, version 2.6.32-41 or
> > 2.6.32-41squeeze2, does not crash anymore. The installation process is
> > smooth (d-i prompts for a firmware), and the system is working well.
> > But don't run lshw with this kernel, it may cause panic (#665932).
>
> Well I can't see any changes that might have fixed this. Maybe there's
> a difference between your machine and Hermann's?
>
> Hermann, what was the last kernel version where the cassini driver
> worked on this system? You originally reported that the problem started
> with 2.6.24 in 'etch-and-a-half'.

The short answer is: never, the driver always crashes the machine after a short time. As far as I remember from discussions with davem, the driver only worked on UP machines, which I can't simulate with the Sun Fire 480R as it has 2 CPUs per board. A dual cassini Gigabit RJ45 is built in.

I'm on vanilla 3.2.12 at the moment and will test after Easter in the (not so much) spare time. If anybody has such a machine running, I'm interested to hear.

Thanks,
Hermann

--
Netzwerkadministration/Zentrale Dienste, Interdiziplinaeres Zentrum fuer wissenschaftliches Rechnen der Universitaet Heidelberg
IWR; INF 368; 69120 Heidelberg; Tel: (06221)54-8236 Fax: -5224
Email: hermann.la...@iwr.uni-heidelberg.de
Bug#516785: Bug #516785: linux-image-2.6.26-1-sparc64-smp: [sparc] SunFire480R cassini network driver kernel panic
On Tue, 2012-03-27 at 15:42 +0800, Aron Xu wrote:
> Hi,
>
> I can confirm that Debian Squeeze 6.0.4, with kernel
> linux-image-2.6.32-5-sparc64-smp, version 2.6.32-41 or
> 2.6.32-41squeeze2, does not crash anymore. The installation process is
> smooth (d-i prompts for a firmware), and the system is working well.
> But don't run lshw with this kernel, it may cause panic (#665932).

Well I can't see any changes that might have fixed this. Maybe there's a difference between your machine and Hermann's?

Hermann, what was the last kernel version where the cassini driver worked on this system? You originally reported that the problem started with 2.6.24 in 'etch-and-a-half'.

Ben.

--
Ben Hutchings
Horngren's Observation: Among economists, the real world is often a special case.
Bug#516785: Bug #516785: linux-image-2.6.26-1-sparc64-smp: [sparc] SunFire480R cassini network driver kernel panic
Hi,

I can confirm that Debian Squeeze 6.0.4, with kernel linux-image-2.6.32-5-sparc64-smp, version 2.6.32-41 or 2.6.32-41squeeze2, does not crash anymore. The installation process is smooth (d-i prompts for a firmware), and the system is working well. But don't run lshw with this kernel, it may cause panic (#665932).

--
Regards,
Aron Xu