Linus,

Sorry for abusing your presence on this list, but I'm sure that (as 
you're much more familiar with these subjects) even a quick answer can 
provide more clues than the ones I've gathered so far by trial and error.

The origin of the problem is the way the Mach64 chip exposes its bus 
mastering (BM) abilities. The chip reads DMA buffers which are pointed to 
by entries in a ring buffer residing in system memory: it walks through 
each entry in the ring and executes the graphics commands contained in the 
DMA buffer that the entry points to, until it reaches an entry marked 
with a special end flag.

So far so good, but the chip provides no callback mechanism for the 
conclusion of the BM, i.e., there is no hardware interrupt to tell us when 
it has finished so that we can start a new one. Through the registers, 
however, we can tell where the engine is (or where it stopped), i.e., which 
entry/buffer pair is being processed. So, to avoid constantly checking for 
conclusion before asking it to process new entries, we devised a different 
scheme:

  - add the new entries to the ring

  - clear the end flag of the previous last entry, so that the engine 
will also process our just committed buffers

  - check whether the engine is idle (due to lack of timely buffer 
additions) and, if so, ask it to process the remaining entries (if there 
are any) from the position where it previously stopped

Although we still need to check for idle, the engine goes idle far less 
often, and that really makes a difference to the fps obtained on slow 
machines (+20%).
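
To make the scheme concrete, here is a minimal software model of it in C. 
It is only a sketch under stated assumptions, not the actual driver code: 
the entry layout, the ENTRY_END bit, the ring size and the names 
ring_commit and engine_step are all hypothetical, and the "engine" is 
simulated in software instead of being read through Mach64 registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_ENTRIES 8u          /* hypothetical ring size */
#define ENTRY_END    0x80000000u /* hypothetical "last entry" flag bit */

/* One ring entry: points at a DMA buffer holding commands. */
struct ring_entry {
    uint32_t buf_addr;   /* bus address of the DMA buffer */
    uint32_t size_flags; /* byte count, optionally OR'ed with ENTRY_END */
};

/* Software stand-in for the state we would normally read from registers. */
struct engine {
    struct ring_entry ring[RING_ENTRIES];
    uint32_t tail; /* count of entries committed so far (driver side) */
    uint32_t pos;  /* count of entries consumed so far (engine side) */
    bool     idle; /* true once the engine has consumed an END entry */
};

/* Simulated hardware: consume up to max_steps entries, stopping after END. */
unsigned engine_step(struct engine *e, unsigned max_steps)
{
    unsigned done = 0;
    while (!e->idle && done < max_steps) {
        const struct ring_entry *ent = &e->ring[e->pos % RING_ENTRIES];
        e->pos++;
        done++;
        if (ent->size_flags & ENTRY_END)
            e->idle = true;
    }
    return done;
}

/* Commit n buffers following the three steps of the scheme.
 * Returns true if the engine was idle and had to be restarted. */
bool ring_commit(struct engine *e, const uint32_t *addrs,
                 const uint32_t *sizes, unsigned n)
{
    uint32_t pending = e->tail - e->pos;
    uint32_t first = e->tail;

    /* Keep one slot free so the new entries never reuse the slot of the
     * previous last entry, whose flag is modified below. */
    assert(n > 0 && pending + n < RING_ENTRIES);

    /* Step 1: add the new entries; only the last one carries END. */
    for (unsigned i = 0; i < n; i++) {
        struct ring_entry *ent = &e->ring[(first + i) % RING_ENTRIES];
        ent->buf_addr = addrs[i];
        ent->size_flags = sizes[i] | (i == n - 1 ? ENTRY_END : 0u);
    }
    e->tail = first + n;

    /* Step 2: clear END on the previous last entry, if there is one.
     * On real hardware, how this store is ordered against the chip's
     * own read of the same entry is exactly the open question. */
    if (first != 0)
        e->ring[(first - 1) % RING_ENTRIES].size_flags &= ~ENTRY_END;

    /* Step 3: if the engine already stopped, restart it where it stopped. */
    if (e->idle && e->pos != e->tail) {
        e->idle = false;
        return true;
    }
    return false;
}
```

In this model, a commit made while the engine is still busy returns false 
and the engine simply runs on past the old end entry, whereas a commit 
made after it went idle returns true and restarts it from its last 
position.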

This works very well, but on a slow machine I experience a 
lockup every once in a while. The register and memory dumps (when 
available) show that the engine jumps to arbitrary positions in memory 
instead of continuing to read inside the piece of memory we supplied for 
the ring buffer.

My suspicion is that there is some kind of race condition when accessing 
the previous last entry, but here is where my knowledge starts to fail:

  - Are the system memory accesses by the processor and the bus serialized 
or concurrent?
  - Has this problem ever appeared in the Linux kernel before, and how was 
it solved?
  - How does the presence of an AGP aperture or MTRRs covering that memory 
affect that access?

I would really appreciate it if you could give some hints or references on 
this. If anyone else on this list besides Linus has any observations to 
make regarding this, I would also appreciate it!

Thanks in advance,

José Fonseca

_______________________________________________
Dri-devel mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/dri-devel
