Any news regarding this bug?

* Giuseppe Sacco <[EMAIL PROTECTED]> [2008-02-06 09:48]:
> Hi Thomas and Martin
>
> On Tue, 05/02/2008 at 20.17 +0100, Thomas Bogendoerfer wrote:
> > On Tue, Feb 05, 2008 at 11:04:28AM -0700, Martin Michlmayr wrote:
> > > CCing Thomas Bogendoerfer
> >
> > this looks like a userspace program accessing some mmap-ped data,
> > which is triggering a dbe. The first question is which application it is.
> > My first thought was an X server, but I don't even know if there is
> > one for the O2. What programs are running when the machine crashes?
>
> Crash happened in all scenarios: using X11 (via framebuffer) and
> mozilla, or X11 and sylpheed, or tty and compiling the kernel, or tty
> and managing email via courier-imapd-ssl.
>
> Based on this[0] comment, I think that the problem is in libc6.
> The machine was working very well, with all software from etch.
>
> A few days ago I decided to use the latest kernel from unstable (this is
> 2.6.24-2) but that kernel isn't compilable on etch because it requires
> gcc-4.2. So, I updated gcc, g++, binutils, libstdc++, libgcc1 and libc6
> from unstable.
>
> After this update nothing worked.
> When I switched back to libc6/gcc/g++/libstdc++/libgcc1/binutils from
> etch, everything went well.
>
> Currently the machine is running etch. The actual mapping of those
> libraries is:
>
> [EMAIL PROTECTED]:~$ ldd /usr/bin/sylpheed | grep -E '(libc\.|libstdc\+\+\.|libgcc)'
>         libc.so.6 => /lib/libc.so.6 (0x2c2a4000)
>         libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x2c5c8000)
>         libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x2c718000)
>
> When trying to get back to my original configuration, I switched kernel
> to 2.6.23.1 (leaving libraries from unstable) and tried to recompile the
> kernel source using make-kpkg or dpkg-buildpackage. What happened was
> that many programs started segfaulting (signal 11 and 10). Programs like
> "diff", "patch", and so on.
>
> > > > Got dbe at 0x2ac2bffc
> > > > [...]
> > > > Index: 20 pgmask=4kb va=0007fd06000 asid=6b
> > > >   [pa=0000c67b000 c=3 d=1 v=1 g=0] [pa=000032bf000 c=3 d=0 v=0 g=0]
> > > > Index: 21 pgmask=4kb va=0002acf0000 asid=6b
> > > >   [pa=0000d1c1000 c=3 d=0 v=1 g=0] [pa=0000c188000 c=3 d=0 v=0 g=0]
> > > > Index: 25 pgmask=4kb va=0002acf0000 asid=6b
> > > >   [pa=0000d1c1000 c=3 d=0 v=1 g=0] [pa=0000c188000 c=3 d=0 v=0 g=0]
> > > > Index: 41 pgmask=4kb va=0002aac8000 asid=6b
> > > >   [pa=0000368e000 c=3 d=0 v=0 g=0] [pa=000506c0000 c=3 d=1 v=1 g=0]
> >
> > these are the only TLBs which are user space. What looks strange to me
> > is Index 21 and Index 25. Both are mapping the same page. Looking at
> > the R4600/4700 manual (didn't have a R5k manual handy) indicates no
> > problem with duplicate TLB entries. I can't check if the physical addresses
> > are correct, because I don't know how memory is mapped. I need to see the
> > CRIME MC messages from bootup. Even better is a complete boot log.
>
> This is a boot log when using the kernel shipped with Debian unstable.
>
> Feb 4 02:25:44 sgi kernel: Linux version 2.6.24-1-r5k-ip32 (Debian 2.6.24-2) ([EMAIL PROTECTED]) (gcc version 4.1.3 20080114 (prerelease) (Debian 4.1.2-19)) #1 Fri Feb 1 07:29:41 UTC 2008
> Feb 4 02:25:44 sgi kernel: ARCH: SGI-IP32
> Feb 4 02:25:44 sgi kernel: PROMLIB: ARC firmware Version 1 Revision 10
> Feb 4 02:25:44 sgi kernel: CRIME id a rev 1 at 0x0000000014000000
> Feb 4 02:25:44 sgi kernel: CRIME MC: bank 0 base 0x0000000000000000 size 128MiB
> Feb 4 02:25:44 sgi kernel: CRIME MC: bank 1 base 0x0000000008000000 size 128MiB
> Feb 4 02:25:44 sgi kernel: CRIME MC: bank 2 base 0x0000000050000000 size 32MiB
> Feb 4 02:25:44 sgi kernel: CRIME MC: bank 3 base 0x0000000052000000 size 32MiB
> Feb 4 02:25:44 sgi kernel: CRIME MC: bank 4 base 0x0000000054000000 size 32MiB
> Feb 4 02:25:44 sgi kernel: CRIME MC: bank 5 base 0x0000000056000000 size 32MiB
> Feb 4 02:25:44 sgi kernel: CPU revision is: 00002321 (R5000)
> Feb 4 02:25:44 sgi kernel: FPU revision is: 00002310
> Feb 4 02:25:44 sgi kernel: Determined physical RAM map:
> Feb 4 02:25:44 sgi kernel: memory: 0000000010000000 @ 0000000000000000 (usable)
> Feb 4 02:25:44 sgi kernel: memory: 0000000008000000 @ 0000000050000000 (usable)
> Feb 4 02:25:44 sgi kernel: Entering add_active_range(0, 0, 65536) 0 entries of 256 used
> Feb 4 02:25:44 sgi kernel: Entering add_active_range(0, 327680, 360448) 1 entries of 256 used
> Feb 4 02:25:44 sgi kernel: Initrd not found or empty - disabling initrd
> Feb 4 02:25:44 sgi kernel: Zone PFN ranges:
> Feb 4 02:25:44 sgi kernel: Normal 0 -> 360448
> Feb 4 02:25:44 sgi kernel: Movable zone start PFN for each node
> Feb 4 02:25:44 sgi kernel: early_node_map[2] active PFN ranges
> Feb 4 02:25:44 sgi kernel: 0: 0 -> 65536
> Feb 4 02:25:44 sgi kernel: 0: 327680 -> 360448
> Feb 4 02:25:44 sgi kernel: On node 0 totalpages: 98304
> Feb 4 02:25:44 sgi kernel: Normal zone: 4928 pages used for memmap
> Feb 4 02:25:44 sgi kernel: Normal zone: 0 pages reserved
> Feb 4 02:25:44 sgi kernel: Normal zone: 93376 pages, LIFO batch:31
> Feb 4 02:25:44 sgi kernel: Movable zone: 0 pages used for memmap
> Feb 4 02:25:44 sgi kernel: Built 1 zonelists in Zone order, mobility grouping on. Total pages: 93376
> Feb 4 02:25:44 sgi kernel: Kernel command line: root=/dev/sda1 ro video=gbefb:[EMAIL PROTECTED] console=tty0 console=ttyS1,115200
> Feb 4 02:25:44 sgi kernel: Primary instruction cache 32kB, VIPT, 2-way, linesize 32 bytes.
> Feb 4 02:25:44 sgi kernel: Primary data cache 32kB, 2-way, VIPT, cache aliases, linesize 32 bytes
> Feb 4 02:25:44 sgi kernel: R5000 SCACHE size 1024kB, linesize 32 bytes.
> Feb 4 02:25:44 sgi kernel: Synthesized clear page handler (15 instructions).
> Feb 4 02:25:44 sgi kernel: Synthesized copy page handler (24 instructions).
> Feb 4 02:25:44 sgi kernel: Synthesized TLB refill handler (38 instructions).
> Feb 4 02:25:44 sgi kernel: Synthesized TLB load handler fastpath (51 instructions).
> Feb 4 02:25:44 sgi kernel: Synthesized TLB store handler fastpath (51 instructions).
> Feb 4 02:25:44 sgi kernel: Synthesized TLB modify handler fastpath (50 instructions).
> Feb 4 02:25:44 sgi kernel: PID hash table entries: 2048 (order: 11, 16384 bytes)
> Feb 4 02:25:44 sgi kernel: Calibrating system timer... 200 MHz CPU detected
> Feb 4 02:25:44 sgi kernel: CRIME memory error at 0x3fffffe0 ST 0x0400a828<INV,RE,REID=0x28,NONFATAL>
> Feb 4 02:25:44 sgi kernel: Console: colour dummy device 80x25
> Feb 4 02:25:44 sgi kernel: console [tty0] enabled
> Feb 4 02:25:44 sgi kernel: Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)
> Feb 4 02:25:44 sgi kernel: Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
> Feb 4 02:25:44 sgi kernel: Memory: 367492k/393216k available (3196k kernel code, 25216k reserved, 1019k data, 200k init, 0k highmem)
> Feb 4 02:25:44 sgi kernel: Calibrating delay loop... 199.16 BogoMIPS (lpj=398336)
> Feb 4 02:25:44 sgi kernel: Security Framework initialized
> Feb 4 02:25:44 sgi kernel: SELinux: Disabled at boot.
> Feb 4 02:25:44 sgi kernel: Capability LSM initialized
> Feb 4 02:25:44 sgi kernel: Mount-cache hash table entries: 256
> Feb 4 02:25:44 sgi kernel: Initializing cgroup subsys ns
> Feb 4 02:25:44 sgi kernel: Initializing cgroup subsys cpuacct
> Feb 4 02:25:44 sgi kernel: Checking for the multiply/shift bug... no.
> Feb 4 02:25:44 sgi kernel: Checking for the daddi bug... no.
> Feb 4 02:25:44 sgi kernel: Checking for the daddiu bug... no.
> Feb 4 02:25:44 sgi kernel: net_namespace: 120 bytes
> Feb 4 02:25:44 sgi kernel: NET: Registered protocol family 16
> Feb 4 02:25:44 sgi kernel: MACE PCI rev 1
> Feb 4 02:25:44 sgi kernel: SCSI subsystem initialized
> Feb 4 02:25:44 sgi kernel: PCI: Bridge: 0000:00:03.0
> Feb 4 02:25:44 sgi kernel: IO window: 1000-1fff
> Feb 4 02:25:44 sgi kernel: MEM window: 80000000-800fffff
> Feb 4 02:25:44 sgi kernel: PREFETCH window: 80100000-801fffff
> Feb 4 02:25:44 sgi kernel: PCI: Enabling device 0000:00:03.0 (0000 -> 0003)
> Feb 4 02:25:44 sgi kernel: PCI: Setting latency timer of device 0000:00:03.0 to 64
> Feb 4 02:25:44 sgi kernel: Time: MIPS clocksource has been installed.
> Feb 4 02:25:44 sgi kernel: NET: Registered protocol family 2
> Feb 4 02:25:44 sgi kernel: IP route cache hash table entries: 4096 (order: 3, 32768 bytes)
> Feb 4 02:25:44 sgi kernel: TCP established hash table entries: 16384 (order: 6, 262144 bytes)
> Feb 4 02:25:44 sgi kernel: TCP bind hash table entries: 16384 (order: 5, 131072 bytes)
> Feb 4 02:25:44 sgi kernel: TCP: Hash tables configured (established 16384 bind 16384)
> Feb 4 02:25:44 sgi kernel: TCP reno registered
> Feb 4 02:25:44 sgi kernel: audit: initializing netlink socket (disabled)
> Feb 4 02:25:44 sgi kernel: audit(1202088249.428:1): initialized
> Feb 4 02:25:44 sgi kernel: VFS: Disk quotas dquot_6.5.1
> Feb 4 02:25:44 sgi kernel: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
> Feb 4 02:25:44 sgi kernel: Installing knfsd (copyright (C) 1996 [EMAIL PROTECTED]).
> Feb 4 02:25:44 sgi kernel: io scheduler noop registered
> Feb 4 02:25:44 sgi kernel: io scheduler anticipatory registered
> Feb 4 02:25:44 sgi kernel: io scheduler deadline registered
> Feb 4 02:25:44 sgi kernel: io scheduler cfq registered (default)
> Feb 4 02:25:44 sgi kernel: Console: switching to colour frame buffer device 160x64
> Feb 4 02:25:44 sgi kernel: fb0: SGI GBE rev 1 @ 0x16000000 using 4096kB memory
> Feb 4 02:25:44 sgi kernel: Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
> Feb 4 02:25:44 sgi kernel: serial8250.0: ttyS0 at MMIO 0x1f390000 (irq = 60) is a 16550A
> Feb 4 02:25:44 sgi kernel: serial8250.0: ttyS1 at MMIO 0x1f398000 (irq = 66) is a 16550A
> Feb 4 02:25:44 sgi kernel: console [ttyS1] enabled
> Feb 4 02:25:44 sgi kernel: RAMDISK driver initialized: 16 RAM disks of 8192K size 1024 blocksize
> Feb 4 02:25:44 sgi kernel: eth0: SGI MACE Ethernet rev. 1
> Feb 4 02:25:44 sgi kernel: PCI: Enabling device 0000:00:01.0 (0046 -> 0047)
> Feb 4 02:25:44 sgi kernel: ahc_pci:0:1:0: Using left over BIOS settings
> Feb 4 02:25:44 sgi kernel: scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
> Feb 4 02:25:44 sgi kernel: <Adaptec aic7880 Ultra SCSI adapter>
> Feb 4 02:25:44 sgi kernel: aic7880: Wide Channel A, SCSI Id=0, 16/253 SCBs
> Feb 4 02:25:44 sgi kernel:
> Feb 4 02:25:44 sgi kernel: PCI: Enabling device 0000:00:02.0 (0046 -> 0047)
> Feb 4 02:25:44 sgi kernel: ahc_pci:0:2:0: Using left over BIOS settings
> Feb 4 02:25:44 sgi kernel: scsi 0:0:1:0: Direct-Access ModusLnk PQ: 0 ANSI: 3
> Feb 4 02:25:44 sgi kernel: scsi0:A:1:0: Tagged Queuing enabled. Depth 32
> Feb 4 02:25:44 sgi kernel: target0:0:1: Beginning Domain Validation
> Feb 4 02:25:44 sgi kernel: target0:0:1: wide asynchronous
> Feb 4 02:25:44 sgi kernel: target0:0:1: FAST-10 WIDE SCSI 20.0 MB/s ST (100 ns, offset 8)
> Feb 4 02:25:44 sgi kernel: target0:0:1: Domain Validation skipping write tests
> Feb 4 02:25:44 sgi kernel: target0:0:1: Ending Domain Validation
> Feb 4 02:25:44 sgi kernel: scsi 0:0:2:0: Direct-Access FUJITSU MAG3182LC 5210 PQ: 0 ANSI: 2
> Feb 4 02:25:44 sgi kernel: scsi0:A:2:0: Tagged Queuing enabled. Depth 32
> Feb 4 02:25:44 sgi kernel: target0:0:2: Beginning Domain Validation
> Feb 4 02:25:44 sgi kernel: target0:0:2: wide asynchronous
> Feb 4 02:25:44 sgi kernel: target0:0:2: FAST-10 WIDE SCSI 20.0 MB/s ST (100 ns, offset 8)
> Feb 4 02:25:44 sgi kernel: target0:0:2: Domain Validation skipping write tests
> Feb 4 02:25:44 sgi kernel: target0:0:2: Ending Domain Validation
> Feb 4 02:25:44 sgi kernel: scsi 0:0:4:0: CD-ROM TOSHIBA CD-ROM XM-5701TA 0167 PQ: 0 ANSI: 2
> Feb 4 02:25:44 sgi kernel: target0:0:4: Beginning Domain Validation
> Feb 4 02:25:44 sgi kernel: target0:0:4: FAST-10 SCSI 10.0 MB/s ST (100 ns, offset 8)
> Feb 4 02:25:44 sgi kernel: target0:0:4: Domain Validation skipping write tests
> Feb 4 02:25:44 sgi kernel: target0:0:4: Ending Domain Validation
> Feb 4 02:25:44 sgi kernel: scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
> Feb 4 02:25:44 sgi kernel: <Adaptec aic7880 Ultra SCSI adapter>
> Feb 4 02:25:44 sgi kernel: aic7880: Wide Channel A, SCSI Id=0, 16/253 SCBs
> Feb 4 02:25:44 sgi kernel:
> Feb 4 02:25:44 sgi kernel: Driver 'sd' needs updating - please use bus_type methods
> Feb 4 02:25:44 sgi kernel: (scsi0:A:1:0): data overrun detected in Data-in phase. Tag == 0x2.
> Feb 4 02:25:44 sgi kernel: (scsi0:A:1:0): Have seen Data Phase. Length = 0. NumSGs = 1.
> Feb 4 02:25:44 sgi kernel: sg[0] - Addr 0x017e0c040 : Length 32
> Feb 4 02:25:44 sgi kernel: sd 0:0:1:0: [sda] 143638992 512-byte hardware sectors (73543 MB)
> Feb 4 02:25:44 sgi kernel: sd 0:0:1:0: [sda] Write Protect is off
> Feb 4 02:25:44 sgi kernel: sd 0:0:1:0: [sda] Mode Sense: b3 00 00 08
> Feb 4 02:25:44 sgi kernel: sd 0:0:1:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Feb 4 02:25:44 sgi kernel: sd 0:0:1:0: [sda] 143638992 512-byte hardware sectors (73543 MB)
> Feb 4 02:25:44 sgi kernel: sd 0:0:1:0: [sda] Write Protect is off
> Feb 4 02:25:44 sgi kernel: sd 0:0:1:0: [sda] Mode Sense: b3 00 00 08
> Feb 4 02:25:44 sgi kernel: sd 0:0:1:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Feb 4 02:25:44 sgi kernel: sda: sda1 sda9 sda11
> Feb 4 02:25:44 sgi kernel: sd 0:0:1:0: [sda] Attached SCSI disk
> Feb 4 02:25:44 sgi kernel: target0:0:2: FAST-10 WIDE SCSI 20.0 MB/s ST (100 ns, offset 8)
> Feb 4 02:25:44 sgi kernel: (scsi0:A:2:0): data overrun detected in Data-in phase. Tag == 0x2.
> Feb 4 02:25:44 sgi kernel: (scsi0:A:2:0): Have seen Data Phase. Length = 0. NumSGs = 1.
> Feb 4 02:25:44 sgi kernel: sg[0] - Addr 0x017e0c040 : Length 32
> Feb 4 02:25:44 sgi kernel: sd 0:0:2:0: [sdb] 35694860 512-byte hardware sectors (18276 MB)
> Feb 4 02:25:44 sgi kernel: sd 0:0:2:0: [sdb] Write Protect is off
> Feb 4 02:25:44 sgi kernel: sd 0:0:2:0: [sdb] Mode Sense: a7 00 10 08
> Feb 4 02:25:44 sgi kernel: sd 0:0:2:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA
> Feb 4 02:25:44 sgi kernel: sd 0:0:2:0: [sdb] 35694860 512-byte hardware sectors (18276 MB)
> Feb 4 02:25:44 sgi kernel: sd 0:0:2:0: [sdb] Write Protect is off
> Feb 4 02:25:44 sgi kernel: sd 0:0:2:0: [sdb] Mode Sense: a7 00 10 08
> Feb 4 02:25:44 sgi kernel: sd 0:0:2:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA
> Feb 4 02:25:44 sgi kernel: sdb: sdb1 sdb2 sdb3 sdb9 sdb11
> Feb 4 02:25:44 sgi kernel: sd 0:0:2:0: [sdb] Attached SCSI disk
> Feb 4 02:25:44 sgi kernel: Driver 'sr' needs updating - please use bus_type methods
> Feb 4 02:25:44 sgi kernel: sr0: scsi-1 drive
> Feb 4 02:25:44 sgi kernel: Uniform CD-ROM driver Revision: 3.20
> Feb 4 02:25:44 sgi kernel: sr 0:0:4:0: Attached scsi CD-ROM sr0
> Feb 4 02:25:44 sgi kernel: mice: PS/2 mouse device common for all mice
> Feb 4 02:25:44 sgi kernel: input: AT Raw Set 2 keyboard as /class/input/input0
> Feb 4 02:25:44 sgi kernel: TCP bic registered
> Feb 4 02:25:44 sgi kernel: Initializing XFRM netlink socket
> Feb 4 02:25:44 sgi kernel: NET: Registered protocol family 1
> Feb 4 02:25:44 sgi kernel: NET: Registered protocol family 17
> Feb 4 02:25:44 sgi kernel: NET: Registered protocol family 15
> Feb 4 02:25:44 sgi kernel: RPC: Registered udp transport module.
> Feb 4 02:25:44 sgi kernel: RPC: Registered tcp transport module.
> Feb 4 02:25:44 sgi kernel: registered taskstats version 1
> Feb 4 02:25:44 sgi kernel: scsi: waiting for bus probes to complete ...
> Feb 4 02:25:44 sgi kernel: input: PS/2 Logitech Mouse as /class/input/input1
> Feb 4 02:25:44 sgi kernel: EXT3-fs: INFO: recovery required on readonly filesystem.
> Feb 4 02:25:44 sgi kernel: EXT3-fs: write access will be enabled during recovery.
> Feb 4 02:25:44 sgi kernel: kjournald starting. Commit interval 5 seconds
> Feb 4 02:25:44 sgi kernel: EXT3-fs: sda1: orphan cleanup on readonly fs
> Feb 4 02:25:44 sgi kernel: ext3_orphan_cleanup: deleting unreferenced inode 6276169
> Feb 4 02:25:44 sgi kernel: EXT3-fs: sda1: 1 orphan inode deleted
> Feb 4 02:25:44 sgi kernel: EXT3-fs: recovery complete.
> Feb 4 02:25:44 sgi kernel: EXT3-fs: mounted filesystem with ordered data mode.
> Feb 4 02:25:44 sgi kernel: VFS: Mounted root (ext3 filesystem) readonly.
> Feb 4 02:25:44 sgi kernel: Freeing unused kernel memory: 200k freed
> Feb 4 02:25:44 sgi kernel: sd 0:0:1:0: Attached scsi generic sg0 type 0
> Feb 4 02:25:44 sgi kernel: sd 0:0:2:0: Attached scsi generic sg1 type 0
> Feb 4 02:25:44 sgi kernel: sr 0:0:4:0: Attached scsi generic sg2 type 5
> Feb 4 02:25:44 sgi kernel: Adding 233464k swap on /dev/sdb3. Priority:-1 extents:1 across:233464k
> Feb 4 02:25:44 sgi kernel: EXT3 FS on sda1, internal journal
> Feb 4 02:25:44 sgi kernel: device-mapper: uevent: version 1.0.3
> Feb 4 02:25:44 sgi kernel: device-mapper: ioctl: 4.12.0-ioctl (2007-10-02) initialised: [EMAIL PROTECTED]
> Feb 4 02:25:44 sgi kernel: process `syslogd' is using obsolete setsockopt SO_BSDCOMPAT
> Feb 4 02:25:49 sgi kernel: NET: Registered protocol family 10
> Feb 4 02:25:49 sgi kernel: lo: Disabled Privacy Extensions
> Feb 4 02:25:51 sgi kernel: lp: driver loaded but no devices found
> Feb 4 02:25:59 sgi kernel: eth0: no IPv6 routers present
>
> Bye,
> Giuseppe
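
With the CRIME MC lines from the boot log in hand, the check Thomas asked
for (do the physical addresses in the TLB dump fall inside installed
memory?) can be sketched roughly as below. This is only an illustration:
it assumes the pa= fields in the dump are plain physical byte addresses,
and the names BANKS, TLB_PAS, find_bank and RESULTS are made up for this
example, not taken from the kernel.

```python
MIB = 1024 * 1024

# (bank number, base address, size) copied from the "CRIME MC: bank N"
# lines of the boot log quoted above.
BANKS = [
    (0, 0x00000000, 128 * MIB),
    (1, 0x08000000, 128 * MIB),
    (2, 0x50000000, 32 * MIB),
    (3, 0x52000000, 32 * MIB),
    (4, 0x54000000, 32 * MIB),
    (5, 0x56000000, 32 * MIB),
]

# pa= fields copied from TLB entries 20, 21 (same as 25) and 41.
# Assumption: these are physical byte addresses printed in hex.
TLB_PAS = [
    "0000c67b000", "000032bf000",
    "0000d1c1000", "0000c188000",
    "0000368e000", "000506c0000",
]


def find_bank(pa):
    """Return the number of the bank containing pa, or None if no bank does."""
    for number, base, size in BANKS:
        if base <= pa < base + size:
            return number
    return None


# Map each pa string from the dump to the bank it falls into.
RESULTS = {text: find_bank(int(text, 16)) for text in TLB_PAS}

for text, bank in RESULTS.items():
    where = "outside all banks" if bank is None else "bank %d" % bank
    print("pa=%s -> %s" % (text, where))
```

Under that assumption, every pa in the dump lands inside one of the
reported banks, so this alone does not point at a bad mapping.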
-- 
Martin Michlmayr
http://www.cyrius.com/

