On Thu, 2009-04-30 at 00:20 -0400, [email protected] wrote:
>
> I'm running centos 5.2 2.6.18-128.1.1.el5, mythtv mythtv-0.21-203.el5
>
> I'm getting system lockups that appear to be the interaction of my
> two video capture cards.

Can you be more specific?  Does the lockup leave an Oops or Bug in
/var/log/messages that you can read later?  Is an Oops or panic dumped to
the screen, and if so, can you transcribe all the info?  When the lockup
happens, can you do a VT switch (Ctrl-Alt-F2) and log in to look at the
logs or dmesg?

> I'm running a PVR-350 and a LMLBT44. when
> I'm running mythtv AND zoneminder, I get errors along the lines of:
>
> Apr 23 21:44:59 glutton kernel: bttv0: OCERR @ 7a807014,bits: HSYNC
> OFLOW FBUS FDSR OCERR* (which seem to be tied to video capture from
> the PVR-350.

Well, no, that would be the bttv driver, as indicated by the "bttv0:" in
the message.  "OCERR" is a "RISC instruction error", meaning the BT878
chip tried to execute a garbage instruction.  See page 130 of:

http://www.datasheetarchive.com/pdf-datasheets/Datasheets-10/DSA-187671.pdf

You can load the bttv module with the bttv_debug=1 option (or set the
option in /etc/modprobe.conf) to get a more detailed dump of the bad
instructions.

This sort of thing is likely caused by PCI bus errors when communicating
with the BT878 chip in question.  A busy PCI bus and poor connections
could cause errors.  It could also just be a failing BT878 chip.

From past experience I know the BT878 seems a little sensitive to the
quality of PCI bus signal voltages, so the best course of action for you
is to remove all your PCI cards, blow the dust out of the slots, and
reseat all the cards.  The PCI bus uses reflected voltage waves to get
the proper voltage levels, so it is not sufficient to clean the dust/crud
out of only the slot with the BT878.

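To see which bttvN device is logging these errors and how often, something
like the following could be run over /var/log/messages.  It is only a rough
sketch: the regular expression is modelled on the single log line quoted
above, so other bttv messages may well not match it, and the function names
(parse_bttv_line, count_by_device) are made up for illustration.

```python
import re
from collections import Counter

# Pattern modelled on the one log line quoted in this thread, e.g.
#   ... kernel: bttv0: OCERR @ 7a807014,bits: HSYNC OFLOW FBUS FDSR OCERR*
# Other bttv messages may be formatted differently (assumption).
BTTV_LINE = re.compile(
    r"kernel: (?P<dev>bttv\d+): (?P<event>\w+) @ "
    r"(?P<addr>[0-9a-fA-F]+),\s*bits: (?P<bits>.*)$"
)


def parse_bttv_line(line):
    """Return a dict for a matching bttv error line, or None otherwise."""
    m = BTTV_LINE.search(line.rstrip())
    if m is None:
        return None
    return {
        "dev": m.group("dev"),           # e.g. "bttv0"
        "event": m.group("event"),       # e.g. "OCERR"
        "addr": m.group("addr"),         # hex address as printed by the driver
        "bits": m.group("bits").split(), # status flags, "*" suffix kept as printed
    }


def count_by_device(lines):
    """Count matching error lines per bttv device."""
    counts = Counter()
    for line in lines:
        rec = parse_bttv_line(line)
        if rec is not None:
            counts[rec["dev"]] += 1
    return counts


if __name__ == "__main__":
    # Reading /var/log/messages usually needs root.
    with open("/var/log/messages", errors="replace") as log:
        for dev, n in sorted(count_by_device(log).items()):
            print(dev, n)
```

If only one bttvN device ever shows up in the counts, that would point at
that chip or its slot rather than at the whole board.
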
> In /proc/interrupts I see:
>
>   0:     306846          0    IO-APIC-edge   timer
>   1:         10          0    IO-APIC-edge   i8042
>   8:          1          0    IO-APIC-edge   rtc
>   9:          0          0    IO-APIC-level  acpi
>  12:        471          0    IO-APIC-edge   i8042
> 169:       1986      14619    IO-APIC-level  ehci_hcd:usb1, eth0
> 177:        551          0    IO-APIC-level  ohci_hcd:usb2, HDA Intel
> 185:      22482      33199    IO-APIC-level  sata_nv
> 193:        419         15    IO-APIC-level  sata_nv
> 201:          9          0    IO-APIC-level  bttv0
> 209:      76360      84342    IO-APIC-level  bttv1, ivtv0
> 217:          4          0    IO-APIC-level  bttv2
> 225:          1          0    IO-APIC-level  bttv3
> NMI:        163        103
> LOC:     306690     306645
>
> Is it possibly bad that one of the channels of the lmlbt44 and the
> ivtv0 (pvr-350) are on the same interrupt (and thus causing lockups)?

The error is for bttv0, which doesn't share an interrupt with ivtv0.  So,
no, interrupt service shouldn't be the problem here.  It doesn't look
like bttv0, bttv2, and bttv3 had been doing much work at all when you
looked at the interrupt counts.

> I've dabbled to try to make them not use the same IRQ or change how
> they load, but it doesn't seem to change anything or fix it.

It probably wouldn't matter.

> Any ideas what could be making the system lock? Things seem
> completely stable when zoneminder isn't running, and seem to be
> (still not 100% confident) as stable with ZM but not myth. It's the
> two together that look to be killing me.

It sure looks like a PCI bus error on writing RISC instructions to the
BT878 for it to execute.  I assume ZM only looks at the bttvN devices and
myth does not, so it would make sense to have no bttv-related problems
when ZM was not running.

Regards,
Andy

PS.  The Bt879 data sheet I pointed to is interesting to me for 2 reasons:

1. It's got a Rockwell logo in the footer, indicating it's from the time
   when Conexant was still the (switching?) circuits part of
   Rockwell-Collins and hadn't been spun off yet, but a time after they
   had bought Brooktree or Brooktree's intellectual property.

2. It's amazing how much of the Bt879 ended up inside the CX25840/1/2/3
   video digitizer/decoder that's on many PVR-150 boards.

> Rick
>
>
> Rick Steeves
> http://www.sinister.net

_______________________________________________
ivtv-users mailing list
[email protected]
http://ivtvdriver.org/mailman/listinfo/ivtv-users
