Re: [Intel-gfx] GPU RC6 breaks PCIe to PCI bridge connected to CPU PCIe slot on SandyBridge systems

2012-10-20 Thread Andy Walls
On Fri, 2012-10-19 at 18:06 +0100, Simon Farnsworth wrote:
 On Friday 19 October 2012 17:10:17 Simon Farnsworth wrote:
  Mauro, Linux-Media
  
  I have an issue where an SAA7134-based TV capture card connected via a PCIe to
  PCI bridge chip works when the GPU is kept out of RC6 state, but sometimes
  skips updating lines of the capture when the GPU is in RC6. We've confirmed
  that a CX23418 based chip doesn't have the problem, so the question is whether
  the SAA7134 and the saa7134 driver are at fault, or whether it's the PCIe bus.

My money's on the saa7134 driver's irq_handler or the driver's locking
scheme to protect data accessed by both irq handler and userspace file
operations (aka videobuf's locking) in the driver.

It could also be a system level problem with another driver's irq
handler being stupid.

  This manifests as a regression, as I had no problems with kernel 3.3 (which
  never enabled RC6 on the Intel GPU), but I do have problems with 3.5 and with
  current Linus git master. I'm happy to try anything, 

Profile the saa7134 driver in operation:

http://www.spinics.net/lists/linux-media/msg15762.html

That will give you and driver writers a clue as to where any big delays
are happening in the saa7134 driver.

Odds are the processor slowing down to a lower power/lower speed state
is exposing inefficiencies in the irq handling of the saa7134 driver.


 
  I've attached lspci -vvx output (suitable for feeding to lspci -F) for
  when the corruption is present (lspci.faulty) and when it's not
  (lspci.working). 

Doing a diff between the two files and checking what devices have
changed registers I noted that only 3 devices' PCI config space
registers changed: 00:01.0 and 00:1c.1 (both PCIe ports/bridges) and
00:1a.0. 

$ lspci -F lspci.working -tv
-[:00]-+-00.0  Intel Corporation 2nd Generation Core Processor Family DRAM Controller
       +-01.0-[01-02]00.0-[02]08.0  Philips Semiconductors SAA7131/SAA7133/SAA7135 Video Broadcast Decoder
       +-02.0  Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller
       +-16.0  Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1
       +-19.0  Intel Corporation 82579V Gigabit Network Connection
       +-1a.0  Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2
       +-1b.0  Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller
       +-1c.0-[03]--
       +-1c.1-[04]00.0  NEC Corporation uPD720200 USB 3.0 Host Controller
       +-1d.0  Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1
       +-1f.0  Intel Corporation H67 Express Chipset Family LPC Controller
       +-1f.2  Intel Corporation 6 Series/C200 Series Chipset Family 6 port SATA AHCI Controller
       \-1f.3  Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller

Obviously the changes to the bridge at 00:01.0 might matter, but I would
need to dig up the data sheet for the 00:01.0 PCI bridge [0604]: Intel
Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI
Express Root Port [8086:0101] (rev 09) (prog-if 00 [Normal decode]) to
see whether they really do.
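
The comparison described above (diffing the two dumps and noting which
devices' config space registers changed) can be sketched roughly as
follows. This is only an illustration, not the method actually used in
this thread: the function names are made up, and the parser assumes the
usual "lspci -vvx" layout of one header line per device followed by hex
dump lines of the form "10: 00 40 ...".

```python
import re
import sys

# Device header, e.g. "00:01.0 PCI bridge: ..." (an optional PCI domain prefix is allowed).
DEVICE_RE = re.compile(r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s")
# Hex dump line from "lspci -x", e.g. "10: 00 40 00 00".
HEX_RE = re.compile(r"^([0-9a-f]{2,3}):((?: [0-9a-f]{2})+)\s*$")


def parse_config_space(dump):
    """Map each device address to a dict of {register offset: byte value}."""
    devices = {}
    current = None
    for line in dump.splitlines():
        m = DEVICE_RE.match(line)
        if m:
            current = devices.setdefault(m.group(1), {})
            continue
        m = HEX_RE.match(line)
        if m and current is not None:
            base = int(m.group(1), 16)
            for i, byte in enumerate(m.group(2).split()):
                current[base + i] = int(byte, 16)
    return devices


def changed_registers(working, faulty):
    """Return {device: [(offset, working_byte, faulty_byte), ...]} for differing bytes."""
    a, b = parse_config_space(working), parse_config_space(faulty)
    changes = {}
    for dev in sorted(a.keys() & b.keys()):
        diffs = [(off, a[dev][off], b[dev][off])
                 for off in sorted(a[dev].keys() & b[dev].keys())
                 if a[dev][off] != b[dev][off]]
        if diffs:
            changes[dev] = diffs
    return changes


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python cfgdiff.py lspci.working lspci.faulty
    with open(sys.argv[1]) as w, open(sys.argv[2]) as f:
        for dev, diffs in changed_registers(w.read(), f.read()).items():
            for off, old, new in diffs:
                print(f"{dev} offset 0x{off:02x}: {old:02x} -> {new:02x}")
```

It only reports which bytes differ; deciding what a changed byte means
still needs the data sheet for the device in question.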


  The speculation is that the SAA7134 is somehow more
  sensitive to the changes in timings that RC6 introduces than the CX23418, and
  that someone who understands the saa7134 driver might be able to make it less
  sensitive.

I heavily optimized the cx18 driver for the high throughput use case
(multiple cards running multiple data streams), which meant squeezing
every little bit of useless junk out of the irq handler and adding
highly granular buffer queue locking between the irq handling and the
userspace file operations calls.  Also the CX23418 firmware has a best
effort buffer notification handshake and the cx18 driver does some
extra recovery processing to handle when it is late on handling buffer
notifications.  All that optimization and robustness coding took me a few
months to get right.

I don't see that sort of optimization of the saa7134 driver coming
anytime soon.

Regards,
Andy

 And timings are definitely the problem; I have a userspace provided pm_qos
 request asking for 0 exit latency, but I can see CPU cores entering C6. I'll
 take this problem to an appropriate list.
 
 There is still a bug in the SAA7134 driver, as the card clearly wants a
 pm_qos request when streaming to stop the DMA latency becoming too high; this
 doesn't directly affect me, as my userspace always requests minimal DMA
 latency anyway, so consider this message as just closing down the thread for
 now, and as a marker for the future (if people see such corruption, the
 saa7134 driver needs a pm_qos request when streaming that isn't currently
 present).



Re: [Intel-gfx] GPU RC6 breaks PCIe to PCI bridge connected to CPU PCIe slot on SandyBridge systems

2012-10-19 Thread Simon Farnsworth
On Friday 19 October 2012 17:10:17 Simon Farnsworth wrote:
 Mauro, Linux-Media
 
 I have an issue where an SAA7134-based TV capture card connected via a PCIe to
 PCI bridge chip works when the GPU is kept out of RC6 state, but sometimes
 skips updating lines of the capture when the GPU is in RC6. We've confirmed
 that a CX23418 based chip doesn't have the problem, so the question is whether
 the SAA7134 and the saa7134 driver are at fault, or whether it's the PCIe bus.
 
 This manifests as a regression, as I had no problems with kernel 3.3 (which
 never enabled RC6 on the Intel GPU), but I do have problems with 3.5 and with
 current Linus git master. I'm happy to try anything, 
 
 I've attached lspci -vvx output (suitable for feeding to lspci -F) for
 when the corruption is present (lspci.faulty) and when it's not
 (lspci.working). The speculation is that the SAA7134 is somehow more
 sensitive to the changes in timings that RC6 introduces than the CX23418, and
 that someone who understands the saa7134 driver might be able to make it less
 sensitive.
 
And timings are definitely the problem; I have a userspace provided pm_qos
request asking for 0 exit latency, but I can see CPU cores entering C6. I'll
take this problem to an appropriate list.

There is still a bug in the SAA7134 driver, as the card clearly wants a
pm_qos request when streaming to stop the DMA latency becoming too high; this
doesn't directly affect me, as my userspace always requests minimal DMA
latency anyway, so consider this message as just closing down the thread for
now, and as a marker for the future (if people see such corruption, the
saa7134 driver needs a pm_qos request when streaming that isn't currently
present).
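
For anyone who finds this thread later: a userspace pm_qos request of
the kind mentioned above is normally made through /dev/cpu_dma_latency.
The process writes a 32-bit latency target in microseconds and the
request stays active for as long as the file descriptor is held open.
The following is a minimal sketch of that interface, not the code used
on the affected machine; the class and function names are hypothetical.

```python
import os
import struct


def encode_latency(usec):
    """Pack a latency target (microseconds) as a native-endian signed 32-bit value."""
    return struct.pack("=i", usec)


class CpuDmaLatency:
    """Hold a CPU DMA latency request for the lifetime of a with-block.

    The path parameter exists only so the sketch can be exercised
    against an ordinary file; the real device node normally needs root.
    """

    def __init__(self, usec=0, path="/dev/cpu_dma_latency"):
        self.usec = usec
        self.path = path
        self.fd = None

    def __enter__(self):
        self.fd = os.open(self.path, os.O_WRONLY)
        os.write(self.fd, encode_latency(self.usec))
        return self

    def __exit__(self, *exc):
        # Closing the descriptor drops the request again.
        os.close(self.fd)
        self.fd = None
        return False
```

A capture application would wrap its streaming loop in
"with CpuDmaLatency(0):". As described above, such a request did not
keep the cores out of C6 on this system, so treat it as a necessary
step rather than a guaranteed fix.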
-- 
Simon Farnsworth
Software Engineer
ONELAN Ltd
http://www.onelan.com



Re: [Intel-gfx] GPU RC6 breaks PCIe to PCI bridge connected to CPU PCIe slot on SandyBridge systems

2012-10-19 Thread Jesse Barnes
RC6 plus CPU C6 would also put the whole package into a low power
state.  It's possible we're missing some initialization to keep things
up for other system activity like bus mastering on PCIe?

Just thinking out loud here, unfortunately I don't know of any settings
that might control this.  But package level changes are one other
thing that would be affected by RC6 enabling.

Jesse

On Fri, 19 Oct 2012 17:10:17 +0100
Simon Farnsworth simon.farnswo...@onelan.co.uk wrote:

 Mauro, Linux-Media
 
 I have an issue where an SAA7134-based TV capture card connected via a PCIe to
 PCI bridge chip works when the GPU is kept out of RC6 state, but sometimes
 skips updating lines of the capture when the GPU is in RC6. We've confirmed
 that a CX23418 based chip doesn't have the problem, so the question is whether
 the SAA7134 and the saa7134 driver are at fault, or whether it's the PCIe bus.
 
 This manifests as a regression, as I had no problems with kernel 3.3 (which
 never enabled RC6 on the Intel GPU), but I do have problems with 3.5 and with
 current Linus git master. I'm happy to try anything, 
 
 I've attached lspci -vvx output (suitable for feeding to lspci -F) for
 when the corruption is present (lspci.faulty) and when it's not
 (lspci.working). The speculation is that the SAA7134 is somehow more
 sensitive to the changes in timings that RC6 introduces than the CX23418, and
 that someone who understands the saa7134 driver might be able to make it less
 sensitive.
 
 Details of the most recent tests follow:
 
 On Friday 19 October 2012 15:52:32 Simon Farnsworth wrote:
  On Friday 19 October 2012 16:26:08 Daniel Vetter wrote:
   Ok, this is really freaky stuff. One thing to triage: Is it just
   sufficient to put the gpu into rc6 to corrupt the dma transfers, or is
   some light X/gpu load required? In either case, rc6 being able to
   corrupt random dma transfers (or at least prevent them from reaching
   their destination) would be a fitting explanation for the leftover rc6
   issues on snb ...
   
  In an attempt to have this happen with the GPU as idle as possible, I did the
  following (note that I'm on a gigabit Ethernet segment, so I can burn network
  bandwidth while testing):
  
  1. Start X.org with -noreset, and don't start any X clients.
  2. Run xset dpms force off ; xrandr --output DP2 --off (DP2 is the
     connected output).
  3. On the affected machine, run gst-launch v4l2src ! gdppay !
     tcpclientsink host=f17simon port=65512
  4. On my desktop, run gst-launch tcpserversrc host=0.0.0.0 port=65512 !
     gdpdepay ! xvimagesink
  
  I see the corruption continue to happen, even though the GPU should be idle
  and in RC6 state most of the time (confirmed by reading
  /sys/class/drm/card0/power/rc6_residency_ms and seeing it increase between
  reads). When I run intel_forcewaked from intel_gpu_tools, the corruption goes
  away, and I can confirm by reading /sys/class/drm/card0/power/rc6_residency_ms
  that the GPU does not enter RC6. Killing intel_forcewaked makes the corruption
  reappear while streaming over the network (X11 idle).
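
The residency check described above (read the counter twice and see
whether it increased) could be scripted along these lines. This is a
sketch only: the sysfs path is the one quoted above and card0 is
specific to that machine, and the helper names and the injectable
sleep parameter are inventions for the example.

```python
import os
import time

# Path quoted in the message above; the card index may differ elsewhere.
RC6_RESIDENCY_PATH = "/sys/class/drm/card0/power/rc6_residency_ms"


def read_residency_ms(path=RC6_RESIDENCY_PATH):
    """Return the cumulative RC6 residency counter, in milliseconds."""
    with open(path) as f:
        return int(f.read().strip())


def rc6_residency_delta(interval_s=1.0, path=RC6_RESIDENCY_PATH, sleep=time.sleep):
    """Sample the counter twice and return the milliseconds of RC6 accrued in between."""
    before = read_residency_ms(path)
    sleep(interval_s)
    return read_residency_ms(path) - before


if __name__ == "__main__" and os.path.exists(RC6_RESIDENCY_PATH):
    delta = rc6_residency_delta()
    state = "entered RC6" if delta > 0 else "stayed out of RC6"
    print(f"GPU {state} ({delta} ms of residency in the last second)")
```

A delta of zero while intel_forcewaked is running, and a positive
delta once it is killed, would match the behaviour reported above.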
  
 As a follow up - Daniel requested via IRC that I try with a different capture
 card; I've switched to a HVR-1600 (cx18 driver instead of saa7134), and I've
 also tried with the X server forcibly quiesced via kill -STOP.
 
 Quiescing the X server doesn't help; however, the HVR-1600 does not show the
 problem. This suggests that it's an interaction between the SAA7134 based TV
 card, the bridge chip, and the different PCIe timings when the GPU is in RC6.


-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html