On Wed, 2008-05-07 at 19:07 -0400, JP Fournier wrote:
> Andy Walls wrote:
> > On Sun, 2008-05-04 at 09:32 -0400, JP Fournier wrote:
> > Andy Walls wrote:
> > I still am inclined to believe some sort of VIA chipset errata is to
> > blame, but since VIA Tech doesn't appear to make their datasheets and
> > errata sheets readily available, I'm not hopeful that you'll ever find
> > the root cause or a solution.
> >
> >
> > But if you're interested in looking around for problem indicators,
> > there's still a chance to observe things:
> >
> > 1. Before running any captures, does the "Status:" line for any device
> > in the output of "lspci -vv" have a "+" next to SERR, PERR, ParErr, or
> > any of the Aborts?
>
> Yes:
>
>
> 00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP]
> Host Bridge (rev 80)
> Subsystem: Giga-byte Technology GA-7VAX Mainboard
> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR- FastB2B- DisINTx-
> Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium
> >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
> Latency: 8
> Region 0: Memory at d0000000 (32-bit, prefetchable) [size=128M]
> Capabilities: [80] AGP version 3.5
> Status: RQ=32 Iso- ArqSz=0 Cal=2 SBA+ ITACoh- GART64-
> HTrans- 64bit- FW+ AGP3+ Rate=x4,x8
> Command: RQ=1 ArqSz=0 Cal=2 SBA+ AGP+ GART64- 64bit-
> FW- Rate=x8
> Capabilities: [c0] Power Management version 2
> Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
> PME(D0-,D1-,D2-,D3hot-,D3cold-)
> Status: D0 PME-Enable- DSel=0 DScale=0 PME-
> Kernel driver in use: agpgart-via
> Kernel modules: via-agp
>
> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI Bridge (prog-if 00
> [Normal decode])
> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR+ FastB2B- DisINTx-
> Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium
> >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
> Latency: 0
> Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
> Memory behind bridge: ec000000-edffffff
> Prefetchable memory behind bridge: d8000000-dfffffff
> Secondary status: 66MHz+ FastB2B- ParErr- DEVSEL=medium
> >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
> BridgeCtl: Parity- SERR- NoISA+ VGA+ MAbort- >Reset- FastB2B-
> PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
> Capabilities: [80] Power Management version 2
> Flags: PMEClk- DSI- D1+ D2- AuxCurrent=0mA
> PME(D0-,D1-,D2-,D3hot-,D3cold-)
> Status: D0 PME-Enable- DSel=0 DScale=0 PME-
> Kernel modules: shpchp
>
> 00:0c.0 PCI bridge: Hint Corp HB6 Universal PCI-PCI bridge
> (non-transparent mode) (rev 11) (prog-if 00 [Normal decode])
> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR+ FastB2B- DisINTx-
> Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium
> >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> Latency: 32, Cache Line Size: 32 bytes
> Bus: primary=00, secondary=02, subordinate=02, sec-latency=32
> Prefetchable memory behind bridge: e0000000-e7ffffff
> Secondary status: 66MHz- FastB2B+ ParErr- DEVSEL=medium
> >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
> BridgeCtl: Parity- SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-
> PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
> Capabilities: [80] Power Management version 2
> Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA
> PME(D0-,D1+,D2+,D3hot+,D3cold+)
> Status: D0 PME-Enable- DSel=0 DScale=0 PME-
> Bridge: PM- B3+
> Capabilities: [90] CompactPCI hot-swap <?>

OK. Master Aborts are flagged on the Host bridge, on both sides of the
AGP bridge, and on the CX23416 side of the PVR-500's PCI-PCI bridge.

Master aborts aren't terrible. What they mean is that when that device
was acting as a master to transfer data, the target it was trying to
contact didn't assert its DEVSEL# line within a few cycles. When this
happens the master aborts the transaction, returns 0xFFFFFFFF for
reads, and discards write data. The master also sets the Received
Master Abort flag in its own status register.
Master aborts shouldn't cause spontaneous reboots.
If you want to experiment with user actions to see what might cause
devices to flag errors, you can use setpci to clear the error flags by
writing a 1 to the error flag you want to clear. If you simply read the
status register and write it back out, you'll clear all the error flags
on that device.
For example, my host bridge at 00:00.0 is showing a master abort: bit 13
(0x2000), in the high nibble of the status register, is set.
# setpci -s 00:00.0 STATUS <---- read the status reg
2220
# setpci -s 00:00.0 STATUS=2220 <---- write it back to the device
# setpci -s 00:00.0 STATUS <---- read back to see errors cleared
0220
You can use SEC_STATUS to do the same thing on the secondary side of
bridge devices.
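If it helps to see which bits are involved, here is a minimal sketch in
plain sh that names the error flags set in a status value and works out
the write-1-to-clear value. The bit positions are the standard PCI
status register ones and the flag names follow what lspci prints. The
function names pci_status_errors and pci_status_clear_mask are made up
for this example; they are not part of pciutils.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of pciutils.

# pci_status_errors HEX
# Print the lspci-style names of the error flags set in a status word.
pci_status_errors() {
    val=$(( 0x$1 ))
    out=""
    [ $(( val & 0x8000 )) -ne 0 ] && out="$out ParErr"    # bit 15: detected parity error
    [ $(( val & 0x4000 )) -ne 0 ] && out="$out >SERR"     # bit 14: signaled system error
    [ $(( val & 0x2000 )) -ne 0 ] && out="$out <MAbort"   # bit 13: received master abort
    [ $(( val & 0x1000 )) -ne 0 ] && out="$out <TAbort"   # bit 12: received target abort
    [ $(( val & 0x0800 )) -ne 0 ] && out="$out >TAbort"   # bit 11: signaled target abort
    [ $(( val & 0x0100 )) -ne 0 ] && out="$out <PERR"     # bit 8: master data parity error
    printf '%s\n' "${out# }"
}

# pci_status_clear_mask HEX
# Keep only the error flags that are set; writing this value back
# clears exactly those flags (they are write-1-to-clear).
pci_status_clear_mask() {
    printf '%04x\n' $(( 0x$1 & 0xf900 ))
}
```

With the 2220 from my host bridge, pci_status_errors 2220 names only
<MAbort, and pci_status_clear_mask 2220 gives 2000, which could be
passed to setpci as STATUS=2000. To clear a single flag and leave the
others latched, write just that flag's bit instead.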
>
>
> >
> > 2. What about with one capture ongoing?
>
> This didn't catch anything:
>
>
> [EMAIL PROTECTED]:~/data# while [ true ] ; do lspci -vv | grep Status | grep
> "SERR+"; sleep 1; done
>
> (did a capture in second session)
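That loop only watches for SERR+ on the primary "Status:" lines. A
rough sketch of a wider filter follows, assuming the usual "lspci -vv"
layout (device header lines unindented, detail lines indented). It
reads lspci -vv text on stdin and reports any error flag with a "+" on
either the "Status:" or the "Secondary status:" line. The name
pci_error_flags is made up for this example.

```shell
#!/bin/sh
# Sketch only: pci_error_flags is a hypothetical helper name.
# Reads "lspci -vv" text on stdin and prints one line per error flag
# that is set: device address, primary/secondary, flag.
pci_error_flags() {
    awk '
        # An unindented line starting with a hex digit begins a new device.
        /^[0-9a-fA-F]/ { dev = $1 }

        # Only the status lines hold error flags; Control and BridgeCtl
        # lines use the same names for enable bits, so skip those.
        /^[ \t]*(Status|Secondary status):/ {
            side = ($1 == "Secondary") ? "secondary" : "primary"
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[<>]?(ParErr|SERR|PERR|TAbort|MAbort)[+]$/)
                    print dev, side, $i
        }
    '
}
```

It could be dropped into the same kind of loop, for example
"lspci -vv | pci_error_flags" once a second while a capture runs.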
>
>
> >
> > 3. Can you write a script/program to capture the output of "lspci -x"
> > periodically, and then start a second capture? ("lspci -F foo -vv" can
> > then be used to parse the saved output later.)
> >
> > I'm really expecting SERR+ to happen somewhere.
> >
>
> I'll try this shortly.
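For item 3, something along these lines might do as a starting point.
It is only a sketch: the function name snapshot_pci, the LSPCI
override variable, and the file naming are all made up for this
example. Each snapshot goes to its own numbered file and is followed by
a sync, on the assumption that a spontaneous reboot would otherwise
lose the last few files.

```shell
#!/bin/sh
# Sketch only: snapshot_pci and the LSPCI variable are hypothetical.
# snapshot_pci DIR COUNT INTERVAL
# Save COUNT dumps of PCI config space into DIR, INTERVAL seconds apart.
snapshot_pci() {
    dir=$1
    count=$2
    interval=$3
    mkdir -p "$dir" || return 1
    n=0
    while [ "$n" -lt "$count" ]; do
        # Zero-padded sequence numbers keep the files in capture order.
        file=$(printf '%s/lspci-%05d.txt' "$dir" "$n")
        # LSPCI can be overridden; by default dump config space in hex.
        ${LSPCI:-lspci -x} > "$file"
        sync    # flush to disk before the machine has a chance to reboot
        n=$(( n + 1 ))
        sleep "$interval"
    done
}
```

For example, "snapshot_pci /tmp/pci 600 1" would take one snapshot a
second for ten minutes, and any saved file can be decoded afterwards
with "lspci -F /tmp/pci/lspci-00042.txt -vv".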
>
> >
> >
> >
> > And some more questions (not that they'll lead to any solutions):
> >
> > A. Do you have the symptoms with just the PVR-500 in the machine without
> > the PVR-250?
>
> Yes. I tried with the 250 removed from the system and the same problem
> occurred.
>
>
> >
> > B. Do two captures simultaneously from the PVR-500 go OK?
>
> No. A single capture from the 500 was enough to reproduce this. The
> "two captures at once" thing may be a red herring.
>
> >
> > C. Do you have the symptoms when you perform two captures, not using
> > MythTV, but just using
> > $ cat /dev/video0 > foo0.mpg &
> > $ cat /dev/video1 > foo1.mpg
> > ?
> >
>
> It turns out that
> $ cat /dev/video1 > foo1.mpg
> or
> $ cat /dev/video2 > foo1.mpg
> will cause the spontaneous reboot.
> (video1 and video2 are the pvr500)
Since it seems that the PVR-500 alone may be to blame for your
spontaneous reboots, I have an idea that I *hope* can fix your problem.
Use setpci to disable SERR reporting on the primary interface by the
PVR-500's bridge device, by clearing a bit in the primary command
register.
# setpci -s 00:0c.0 COMMAND <---- read the command reg
0107
# setpci -s 00:0c.0 COMMAND=0007 <---- write back with P_SERR disabled
# setpci -s 00:0c.0 COMMAND <---- read back to see flag cleared
0007
Now the Hint Corp bridge (now called a PLX PCI6254 (HB6), I think)
shouldn't report SERRs on the primary interface. Hopefully that will
keep the VIA chips (or Linux?) from causing a spontaneous reboot.
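Rather than typing the new value by hand, the bit can be masked off
whatever the register currently holds. A minimal sketch, assuming the
standard PCI command register layout where bit 8 (0x0100) is the SERR#
enable; the function name pci_command_without_serr is made up for this
example.

```shell
#!/bin/sh
# Sketch only: pci_command_without_serr is a hypothetical helper name.
# pci_command_without_serr HEX
# Print the command word with bit 8 (SERR# enable) cleared and every
# other bit left as it was.
pci_command_without_serr() {
    printf '%04x\n' $(( 0x$1 & ~0x0100 & 0xffff ))
}
```

With the 0107 read above this gives 0007, so it could be used as
setpci -s 00:0c.0 COMMAND=$(pci_command_without_serr "$(setpci -s 00:0c.0 COMMAND)").
A change made with setpci does not survive a reboot, so it would need
to be reapplied at each boot.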
From the lspci output you provided, parity error reporting/checking on
the primary interface is already turned off for this bridge.
Regards,
Andy
_______________________________________________
ivtv-users mailing list
[email protected]
http://ivtvdriver.org/mailman/listinfo/ivtv-users