On Mon, Oct 25, 2010 at 7:20 AM, Mark Marshall <[email protected]> wrote:
> I agree with the other comments here though. A video-mem to video-mem copy
> should be done in the S3, as should block fills. If we are copying
> video-mem to system-mem then I think we really need to bus master to get
> this "fast". The trick (I guess) will be to free up the PCI bus and the
> host CPU while we are reading from RAM to S3 and then from S3 to XP10. Only
> when the data is in the XP10 do we want to use the PCI bus, and then we
> still don't want to bother the CPU.

How about we drop _another_ HQ into the S3, just as a quick hack for starters. We need a real blt engine, but until we run out of space for something else, there's no harm in having a microcontroller sitting in the S3.

>
> I'd really like for someone to be working on PCI bus master, as I think
> that's the one feature that's really going to make things faster.
>

To do the PCI target, I first designed a simulation-only model that I debugged, and then I converted it to a synthesizable form (and in the process found some bugs and such). As a test harness for this, I also developed a simulation version of a master. This master is designed to be as aggressive as possible with the protocol so as to test the target. Some of these things need to be relaxed a bit, and some changes need to be made regarding how it handles transaction termination and such. And then it can be converted to a synthesizable form.

Finally, we need to make some decisions on how the thing is controlled. And I don't remember enough about it to talk further about this, although we did already discuss this at length on the list in the past. Basically, HQ drops commands into a queue, which the master processes in sequence, and the commands are things like "read X pci words from address Y and send them to address Z in graphics memory". So the master would be connected to like three fifos.

Oh, and then there's the matter of integrating the two to use the same set of pins.
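
To make the command-queue idea a bit more concrete, here is a rough software sketch of "HQ drops commands into a queue, which the master processes in sequence". It is only an illustration, not the actual design: the names ReadCommand, BusMaster, enqueue and run are made up for this example, addresses are treated as plain word indices, and only the command fifo is modeled (the other fifos are left out).

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class ReadCommand:
    """Hypothetical command: read X PCI words from Y, write them at Z."""
    count: int     # X: number of words to move
    pci_addr: int  # Y: source word index in host (PCI-side) memory
    gfx_addr: int  # Z: destination word index in graphics memory


class BusMaster:
    """Toy model of a master that drains a command fifo in order."""

    def __init__(self, host_mem, gfx_mem):
        self.host_mem = host_mem   # stands in for system memory behind PCI
        self.gfx_mem = gfx_mem     # stands in for graphics memory
        self.cmd_fifo = deque()    # commands dropped in by HQ

    def enqueue(self, cmd):
        # HQ side: append a command to the tail of the fifo.
        self.cmd_fifo.append(cmd)

    def run(self):
        # Master side: process commands strictly in arrival order.
        # Returns the total number of words moved.
        moved = 0
        while self.cmd_fifo:
            cmd = self.cmd_fifo.popleft()
            for i in range(cmd.count):
                self.gfx_mem[cmd.gfx_addr + i] = self.host_mem[cmd.pci_addr + i]
            moved += cmd.count
        return moved


def demo():
    host = list(range(100, 116))   # 16 words of fake host data
    gfx = [0] * 16                 # 16 words of empty graphics memory
    master = BusMaster(host, gfx)
    master.enqueue(ReadCommand(count=4, pci_addr=0, gfx_addr=8))
    master.enqueue(ReadCommand(count=2, pci_addr=10, gfx_addr=0))
    return master.run(), gfx
```

The point of the sketch is just the division of labor: whoever fills the queue never touches the data, and the master never needs the CPU once a command is queued.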
We play a trick in the master to allow MUXes to be as close to the pins as possible. For a handful of signals, we'll have to pull that logic out into a master-target wrapper. And then anything that is registered will have to be pulled out into the wrapper, because there will be more than one piece of logic that might require an input be registered in or an output registered out.

Let's not worry about this too much for now. For the short term, let's drop an HQ into the S3. We'll have to give it its own read and write ports on the arbiter and make its program memory available in the register space. Then following that, we can put in a real blt engine.

-- 
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
