Re: [Intel-gfx] GPU RC6 breaks PCIe to PCI bridge connected to CPU PCIe slot on SandyBridge systems
On Fri, 2012-10-19 at 18:06 +0100, Simon Farnsworth wrote:
On Friday 19 October 2012 17:10:17 Simon Farnsworth wrote:

Mauro, Linux-Media

I have an issue where an SAA7134-based TV capture card connected via a PCIe to PCI bridge chip works when the GPU is kept out of RC6 state, but sometimes skips updating lines of the capture when the GPU is in RC6. We've confirmed that a CX23418 based chip doesn't have the problem, so the question is whether the SAA7134 and the saa7134 driver are at fault, or whether it's the PCIe bus.

My money's on the saa7134 driver's irq_handler or the driver's locking scheme to protect data accessed by both irq handler and userspace file operations (aka videobuf's locking) in the driver. It could also be a system level problem with another driver's irq handler being stupid.

This manifests as a regression, as I had no problems with kernel 3.3 (which never enabled RC6 on the Intel GPU), but I do have problems with 3.5 and with current Linus git master. I'm happy to try anything,

Profile the saa7134 driver in operation:

http://www.spinics.net/lists/linux-media/msg15762.html

That will give you and driver writers a clue as to where any big delays are happening in the saa7134 driver. Odds are the processor slowing down to a lower power/lower speed state is exposing inefficiencies in the irq handling of the saa7134 driver.

I've attached lspci -vvx output (suitable for feeding to lspci -F) for when the corruption is present (lspci.faulty) and when it's not (lspci.working).

Doing a diff between the two files and checking what devices have changed registers, I noted that only 3 devices' PCI config space registers changed: 00:01.0 and 00:1c.1 (both PCIe ports/bridges) and 00:1a.0.
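The register comparison described above can be sketched roughly as follows. This is only an illustration, not the procedure actually used in this thread: it assumes the usual `lspci -vvx` layout (a slot line such as `00:01.0 PCI bridge: ...`, tab-indented verbose lines, then hex rows such as `10: 00 00 ...`), and the function names `parse_config_space` and `changed_registers` are made up for the example.

```python
import re

# Slot line, e.g. "00:01.0 PCI bridge: ..." (optionally with a 4-digit domain).
SLOT_RE = re.compile(r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) ")
# Hex dump row, e.g. "10: 00 00 00 00 ..." (offset, then config-space bytes).
HEX_RE = re.compile(r"^([0-9a-f]{2,3}): ((?:[0-9a-f]{2} ?)+)$")


def parse_config_space(dump):
    """Map each slot (e.g. '00:01.0') to {register offset: byte value}."""
    devices = {}
    current = None
    for line in dump.splitlines():
        slot = SLOT_RE.match(line)
        if slot:
            current = devices.setdefault(slot.group(1), {})
            continue
        row = HEX_RE.match(line.rstrip())
        if row and current is not None:
            base = int(row.group(1), 16)
            for i, byte in enumerate(row.group(2).split()):
                current[base + i] = int(byte, 16)
    return devices


def changed_registers(working, faulty):
    """Return {slot: sorted offsets whose byte differs between the two dumps}."""
    a = parse_config_space(working)
    b = parse_config_space(faulty)
    result = {}
    for slot in sorted(a.keys() & b.keys()):
        offsets = sorted(
            off for off in a[slot].keys() & b[slot].keys()
            if a[slot][off] != b[slot][off]
        )
        if offsets:
            result[slot] = offsets
    return result
```

With the two attachments saved locally, `changed_registers(open("lspci.working").read(), open("lspci.faulty").read())` would list the slots whose raw config bytes differ, ignoring the decoded verbose text.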
$ lspci -F lspci.working -tv
-[:00]-+-00.0 Intel Corporation 2nd Generation Core Processor Family DRAM Controller
       +-01.0-[01-02]00.0-[02]08.0 Philips Semiconductors SAA7131/SAA7133/SAA7135 Video Broadcast Decoder
       +-02.0 Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller
       +-16.0 Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1
       +-19.0 Intel Corporation 82579V Gigabit Network Connection
       +-1a.0 Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2
       +-1b.0 Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller
       +-1c.0-[03]--
       +-1c.1-[04]00.0 NEC Corporation uPD720200 USB 3.0 Host Controller
       +-1d.0 Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1
       +-1f.0 Intel Corporation H67 Express Chipset Family LPC Controller
       +-1f.2 Intel Corporation 6 Series/C200 Series Chipset Family 6 port SATA AHCI Controller
       \-1f.3 Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller

Obviously the changes to the bridge at 00:01.0 might matter, but I would need to dig up the data sheet for the

00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port [8086:0101] (rev 09) (prog-if 00 [Normal decode])

to see if it really mattered.

The speculation is that the SAA7134 is somehow more sensitive to the changes in timings that RC6 introduces than the CX23418, and that someone who understands the saa7134 driver might be able to make it less sensitive.

I heavily optimized the cx18 driver for the high throughput use case (multiple cards running multiple data streams), which meant squeezing every little bit of useless junk out of the irq handler and adding highly granular buffer queue locking between the irq handling and the userspace file operations calls.
Also the CX23418 firmware has a best effort buffer notification handshake and the cx18 driver does some extra recovery processing to handle when it is late on handling buffer notifications.

All that optimization and robustness coding took me a few months to get right. I don't see that sort of optimization of the saa7134 driver coming anytime soon.

Regards,
Andy

And timings are definitely the problem; I have a userspace provided pm_qos request asking for 0 exit latency, but I can see CPU cores entering C6. I'll take this problem to an appropriate list.

There is still a bug in the SAA7134 driver, as the card clearly wants a pm_qos request when streaming to stop the DMA latency becoming too high; this doesn't directly affect me, as my userspace always requests minimal DMA latency anyway, so consider this message as just closing down the thread for now, and as a marker for the future (if people see such corruption, the saa7134 driver needs a pm_qos request when streaming that isn't currently present).
Re: [Intel-gfx] GPU RC6 breaks PCIe to PCI bridge connected to CPU PCIe slot on SandyBridge systems
On Friday 19 October 2012 17:10:17 Simon Farnsworth wrote:

Mauro, Linux-Media

I have an issue where an SAA7134-based TV capture card connected via a PCIe to PCI bridge chip works when the GPU is kept out of RC6 state, but sometimes skips updating lines of the capture when the GPU is in RC6. We've confirmed that a CX23418 based chip doesn't have the problem, so the question is whether the SAA7134 and the saa7134 driver are at fault, or whether it's the PCIe bus.

This manifests as a regression, as I had no problems with kernel 3.3 (which never enabled RC6 on the Intel GPU), but I do have problems with 3.5 and with current Linus git master. I'm happy to try anything,

I've attached lspci -vvx output (suitable for feeding to lspci -F) for when the corruption is present (lspci.faulty) and when it's not (lspci.working).

The speculation is that the SAA7134 is somehow more sensitive to the changes in timings that RC6 introduces than the CX23418, and that someone who understands the saa7134 driver might be able to make it less sensitive.

And timings are definitely the problem; I have a userspace provided pm_qos request asking for 0 exit latency, but I can see CPU cores entering C6. I'll take this problem to an appropriate list.

There is still a bug in the SAA7134 driver, as the card clearly wants a pm_qos request when streaming to stop the DMA latency becoming too high; this doesn't directly affect me, as my userspace always requests minimal DMA latency anyway, so consider this message as just closing down the thread for now, and as a marker for the future (if people see such corruption, the saa7134 driver needs a pm_qos request when streaming that isn't currently present).

--
Simon Farnsworth
Software Engineer
ONELAN Ltd
http://www.onelan.com
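For readers who hit the same corruption, a userspace pm_qos request of the kind mentioned above can be sketched as below. The message does not say which interface was used; this sketch assumes the standard `/dev/cpu_dma_latency` PM QoS device, which takes a 32-bit latency target in microseconds and honours it only while the file descriptor stays open. The names `encode_latency` and `LatencyRequest` are illustrative only.

```python
import os
import struct

# Standard PM QoS character device (assumed here; usually needs root to open).
CPU_DMA_LATENCY = "/dev/cpu_dma_latency"


def encode_latency(usec):
    """Pack a latency target in microseconds as a native-order signed 32-bit value."""
    return struct.pack("=i", usec)


class LatencyRequest:
    """Context manager that holds a CPU DMA latency request for its lifetime."""

    def __init__(self, usec=0, path=CPU_DMA_LATENCY):
        self.usec = usec
        self.path = path
        self.fd = None

    def __enter__(self):
        self.fd = os.open(self.path, os.O_WRONLY)
        os.write(self.fd, encode_latency(self.usec))
        return self

    def __exit__(self, *exc):
        # Closing the descriptor is what withdraws the request.
        os.close(self.fd)
        self.fd = None
        return False
```

A capture application would wrap its streaming loop in `with LatencyRequest(0): ...`. As noted in the message, such a request did not stop cores entering C6 on the affected system, so this is a mitigation to try rather than a confirmed fix.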
Re: [Intel-gfx] GPU RC6 breaks PCIe to PCI bridge connected to CPU PCIe slot on SandyBridge systems
RC6 plus CPU C6 would also put the whole package into a low power state. It's possible we're missing some initialization to keep things up for other system activity like bus mastering on PCIe?

Just thinking out loud here, unfortunately I don't know of any settings that might control this. But package level changes are one other thing that would be affected by RC6 enabling.

Jesse

On Fri, 19 Oct 2012 17:10:17 +0100 Simon Farnsworth simon.farnswo...@onelan.co.uk wrote:

Mauro, Linux-Media

I have an issue where an SAA7134-based TV capture card connected via a PCIe to PCI bridge chip works when the GPU is kept out of RC6 state, but sometimes skips updating lines of the capture when the GPU is in RC6. We've confirmed that a CX23418 based chip doesn't have the problem, so the question is whether the SAA7134 and the saa7134 driver are at fault, or whether it's the PCIe bus.

This manifests as a regression, as I had no problems with kernel 3.3 (which never enabled RC6 on the Intel GPU), but I do have problems with 3.5 and with current Linus git master. I'm happy to try anything,

I've attached lspci -vvx output (suitable for feeding to lspci -F) for when the corruption is present (lspci.faulty) and when it's not (lspci.working).

The speculation is that the SAA7134 is somehow more sensitive to the changes in timings that RC6 introduces than the CX23418, and that someone who understands the saa7134 driver might be able to make it less sensitive.

Details of the most recent tests follow:

On Friday 19 October 2012 15:52:32 Simon Farnsworth wrote:
On Friday 19 October 2012 16:26:08 Daniel Vetter wrote:

Ok, this is really freaky stuff. One thing to triage: Is it just sufficient to put the gpu into rc6 to corrupt the dma transfers, or is some light X/gpu load required? In either case, rc6 being able to corrupt random dma transfers (or at least prevent them from reaching their destination) would be a fitting explanation for the leftover rc6 issues on snb ...
In an attempt to have this happen with the GPU as idle as possible, I did the following (note that I'm on a gigabit Ethernet segment, so I can burn network bandwidth while testing):

1. Start X.org with -noreset, and don't start any X clients.
2. Run xset dpms force off ; xrandr --output DP2 --off (DP2 is the connected output).
3. On the affected machine, run gst-launch v4l2src ! gdppay ! tcpclientsink host=f17simon port=65512
4. On my desktop, run gst-launch tcpserversrc host=0.0.0.0 port=65512 ! gdpdepay ! xvimagesink

I see the corruption continue to happen, even though the GPU should be idle and in RC6 state most of the time (confirmed by reading /sys/class/drm/card0/power/rc6_residency_ms and seeing it increase between reads).

When I run intel_forcewaked from intel_gpu_tools, the corruption goes away, and I can confirm by reading /sys/class/drm/card0/power/rc6_residency_ms that the GPU does not enter RC6. Killing intel_forcewaked makes the corruption reappear while streaming over the network (X11 idle).

As a follow up - Daniel requested via IRC that I try with a different capture card; I've switched to a HVR-1600 (cx18 driver instead of saa7134), and I've also tried with the X server forcibly quiesced via kill -STOP.

Quiescing the X server doesn't help; however, the HVR-1600 does not show the problem. This suggests that it's an interaction between the SAA7134 based TV card, the bridge chip, and the different PCIe timings when the GPU is in RC6.

--
Jesse Barnes, Intel Open Source Technology Center
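The RC6 check used in the test above (reading rc6_residency_ms twice and seeing whether it increases) can be sketched as below. The sysfs path is the one quoted in the message; the helper names and the five second default interval are assumptions for illustration, and `card0` may differ on other machines.

```python
import time
from pathlib import Path

# Path quoted in the thread; card0 is the Intel GPU on the affected machine.
RC6_RESIDENCY = Path("/sys/class/drm/card0/power/rc6_residency_ms")


def read_residency_ms(path=RC6_RESIDENCY):
    """Return the cumulative RC6 residency counter in milliseconds."""
    return int(Path(path).read_text().strip())


def rc6_fraction(before_ms, after_ms, elapsed_s):
    """Fraction of the interval spent in RC6 (0.0 means RC6 was never entered)."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return (after_ms - before_ms) / (elapsed_s * 1000.0)


def sample(interval_s=5.0, path=RC6_RESIDENCY):
    """Read the counter twice, interval_s apart, and return the RC6 fraction."""
    start = time.monotonic()
    before = read_residency_ms(path)
    time.sleep(interval_s)
    after = read_residency_ms(path)
    return rc6_fraction(before, after, time.monotonic() - start)
```

Calling `sample()` while streaming should return a value well above zero when the GPU is idling in RC6, and roughly zero while intel_forcewaked is running.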