NVIDIA root-caused this at one point and concluded that the Linux
kernel was incorrectly mapping the memory as cached. Experiments with
setting bit 63 of the Base Register fixed the problem.

                        Mark.


On 11 Aug 2003, John Dennis wrote:

> I'm trying to track down a problem on ia64 (Itanium). It seems to
> manifest itself only with the Nvidia (nv) driver, but I don't think
> this is an nv driver issue; rather, I think it's a generic issue in
> SlowBCopy, which the nv driver happens to invoke.
> 
> Symptom:
> --------
> 
> The first time the nv driver is used for accelerated drawing the X
> server enters an infinite loop that consumes almost all the CPU
> cycles. Also, if you scan the pci bus at this point (e.g. with lspci)
> you will discover bogus config values on the nv card (all ones,
> e.g. ~0).
> 
> Cause:
> ------
> 
> The cause of the infinite loop in the nv driver is a sync function
> that polls a device register waiting for the engine to go idle. At
> this point all PCI reads from the nv card on the bus return values
> that are all ones (~0). These are not valid values, but I believe this
> is defined behavior for PCI bridges after a master abort. The nv sync
> function never exits its loop because the register it's polling is not
> being read correctly because of the PCI bus problems. This also explains
> why scanning the PCI bus (e.g. lspci) no longer works when the scan
> gets to the nv card: it reports 7f as an unknown header type and skips
> the card (note all values printed below are all ones).
> 
> 80:00.0 Class ffff: nVidia Corporation NV25GL [Quadro4 900 XGL] (rev ff) (prog-if ff)
>         !!! Unknown header type 7f
>  
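The hang and the lspci output described above can be sketched in a few lines of C. This is only an illustration, not the nv driver's actual sync code: `poll_until_idle`, `reg_read_fn`, the busy mask and the two fake readers are hypothetical names invented here. The sketch bounds the number of polls and treats an all-ones read as a sign that the device has dropped off the bus, which assumes that all ones is never a legitimate value for the status register being polled. It also shows why lspci prints 7f: the header layout is the low 7 bits of the header-type byte (bit 7 is the multifunction flag), so a byte read back as ff decodes to 7f.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical register accessor; a real driver would do an MMIO read here. */
typedef uint32_t (*reg_read_fn)(void *ctx);

enum poll_result { POLL_IDLE, POLL_TIMEOUT, POLL_BUS_DEAD };

/*
 * Poll until (status & busy_mask) == 0, but give up after max_polls reads
 * or as soon as a read returns all ones (assumed here to mean a failed
 * bus transaction rather than a real register value).
 */
enum poll_result
poll_until_idle(reg_read_fn read_reg, void *ctx,
                uint32_t busy_mask, unsigned max_polls)
{
    assert(read_reg != NULL);
    for (unsigned i = 0; i < max_polls; i++) {
        uint32_t status = read_reg(ctx);
        if (status == UINT32_MAX)
            return POLL_BUS_DEAD;
        if ((status & busy_mask) == 0)
            return POLL_IDLE;
    }
    return POLL_TIMEOUT;
}

/* Header layout is the low 7 bits; bit 7 is the multifunction flag. */
uint8_t
pci_header_layout(uint8_t header_type)
{
    return header_type & 0x7f;
}

/* Fake engine for illustration: busy for a few reads, then idle. */
struct fake_engine { unsigned busy_reads_left; };

uint32_t
fake_read_status(void *ctx)
{
    struct fake_engine *engine = ctx;
    if (engine->busy_reads_left > 0) {
        engine->busy_reads_left--;
        return 0x1;             /* busy bit set */
    }
    return 0x0;                 /* idle */
}

/* Fake reader for a device whose reads all come back as ones. */
uint32_t
dead_bus_read(void *ctx)
{
    (void)ctx;
    return UINT32_MAX;
}
```

With a loop shaped like this the server could at least log the failure and back out instead of spinning forever, although it does nothing to fix the underlying bus state.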
> 
> Root Cause:
> -----------
> 
> After much investigation I tracked the problem down to VGA font save
> and restore, which invokes SlowBCopy. SlowBCopy would appear to be a
> hack designed to slow down bus transactions, seemingly used only when
> accessing VGA data. There are two basic variants of SlowBCopy. On x86
> architectures there is an asm version which basically just inserts
> extra machine instructions in the loop that copies the data. For
> some non-x86 architectures the copy loop includes:
> 
>       outb(0x80, 0x00);
> 
> I learned via past correspondence that this outb is meant to be a
> no-op: supposedly I/O port 0x80 is not used for anything, and thus the
> write to this port does nothing other than introduce a delay.
> 
> Sort of fix:
> ------------
> 
> On April 7th Egbert Eich checked in a fix to SlowBcopy.c (rev 1.6)
> that introduced an extra outb delay before entering the copy loop.
> xf86SlowBcopy now looks like this:
> 
> void
> xf86SlowBcopy(unsigned char *src, unsigned char *dst, int len)
> {
> #if defined(__ia64__)
>     outb(0x80, 0x00);
> #endif
>     while(len--)
>     {
>       *dst++ = *src++;
> #if !defined(__sparc__) && !defined(__powerpc__) && !defined(__mips__)
>       outb(0x80, 0x00);
> #endif
>     }
> }
> 
> The fix Egbert introduced fixed the nv hang we were seeing on HP
> ZX2000's and the subsequent PCI bus corruption (e.g. card only returns
> ~0 on all PCI reads). I thought we now had a fix for all nv cards on
> all HP ZX systems, but my elation was premature. The exact same
> symptoms reared their heads on HP ZX 6000's even with the above fix. As
> long as SlowBCopy for VGA font save/restore is not called, things work
> fine on the ZX 6000's.
> 
> Not surprised SlowBCopy is not robust:
> --------------------------------------
> 
> For reasons I don't understand (can somebody explain this to me?)
> reads/writes to VGA data are slower than other bus transactions. This
> would appear to be why SlowBCopy was introduced originally: to slow
> down reads and writes to the VGA data and hence prevent the data from
> being corrupted and/or prevent the bus from getting into a bad state
> when bus transactions start to time out.
> 
> Now it seems to me that using extra machine instructions (asm version)
> or no-op I/O is inherently a risky solution to this problem. It would
> appear there is some interval of time one must wait for individual VGA
> bus transactions to complete. The number of extra machine instructions
> and/or no-op I/O writes to insert seems to be purely a guess and highly
> dependent on the processor and the bus it's sitting on. The fact that
> this works on one class of machines and not another does not surprise
> me at all.
> 
> So my real questions are:
> -------------------------
> 
> 1) Why are VGA transactions so slow and is there a known timing value?
> 
> 2) Is the fact that reads from the nv card return all ones (~0)
> due to a PCI master abort as a consequence of timing out on a VGA
> transaction, and is the PCI bridge never recovering from the abort? (I
> believe it's the bridge that is responsible for returning all
> ones.) And if this is true, can/should the bridge be configured not to
> stay in this state? (It seems to stay in this state until hw reset.)
> 
> OR
> 
> Is it not the bridge that is the culprit for returning all ones but
> the card (an nvidia in this instance) that is in some screwed up state
> such that it returns all ones till hw reset? 
> 
> I think this is an important distinction. If this is a PCI bridge
> configuration issue we might be able to address it in a more generic
> manner. If it's a card issue then that suggests a driver-specific
> solution.
> 
> 3) Can we come up with a scheme that introduces a known timing delay
> (e.g. usleep) such that we don't have to make arbitrary guesses as to
> how much no-op is needed in the loop on a given system?
> 
> 4) Is my general analysis correct? If not can you help explain where
> I'm missing the mark and what the actual issues are?
> 
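One way to picture question 3 is a copy loop whose per-byte delay is a pluggable callback rather than a hard-coded outb. The sketch below is an assumption-laden illustration, not a proposed patch to xf86SlowBcopy: `slow_copy`, `delay_fn`, `delay_nanosleep` and `delay_count` are hypothetical names, and the right delay value (if there is one) is exactly the unknown being asked about. It uses POSIX nanosleep, which only guarantees a minimum sleep; the real delay depends on scheduler granularity and may be far longer than requested, so sleeping after every byte could make a font save or restore very slow.

```c
#define _POSIX_C_SOURCE 199309L  /* for nanosleep() under strict -std= modes */

#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical hook: called once after each byte is copied. */
typedef void (*delay_fn)(void *ctx);

/*
 * Byte-at-a-time copy with a caller-supplied delay between accesses.
 * The volatile qualifiers keep the compiler from merging or reordering
 * the individual byte accesses.
 */
void
slow_copy(const volatile unsigned char *src, volatile unsigned char *dst,
          size_t len, delay_fn delay, void *ctx)
{
    assert(delay != NULL);
    while (len--) {
        *dst++ = *src++;
        delay(ctx);
    }
}

/* Delay of at least *ctx nanoseconds; the value must be below 1e9. */
void
delay_nanosleep(void *ctx)
{
    const long *nanoseconds = ctx;
    struct timespec ts = { .tv_sec = 0, .tv_nsec = *nanoseconds };
    nanosleep(&ts, NULL);
}

/* Delay stand-in that only counts calls, useful for checking the loop. */
void
delay_count(void *ctx)
{
    unsigned *calls = ctx;
    (*calls)++;
}
```

The point of the callback is that the copy loop no longer encodes a guess: a platform that needs port I/O, a timed sleep, or no delay at all could each supply its own hook.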
> John
> 
> -- 
> John Dennis <[EMAIL PROTECTED]>
> 
> _______________________________________________
> Devel mailing list
> [EMAIL PROTECTED]
> http://XFree86.Org/mailman/listinfo/devel
> 
