On 5/30/05, Eric Smith <[EMAIL PROTECTED]> wrote:
> Timothy asks:
> > Do you know if we're allowed to require BIOS calls to change between
> > text and graphics modes, or do we have to expect the hardware to
> > always work right when you bang on it directly?
>
> It has to work right without the BIOS. Linux doesn't use the BIOS calls,
> and I don't think Windows does either.
>
> Patrick wrote:
> > Here are the bounds I was working in for this program:
> > Load/Store architecture
> > Each non memory access takes 5 100MHz clocks
>
> Timothy wrote:
> > What do you mean by "non memory access"? Are you talking about the
> > fetch-decode-load-compute-store cycle of instruction execution?
> > BTW, since the memory is dual-ported, we can do some overlap.
>
> It's hard to believe that it's not worthwhile to use a traditional
> RISC 3-stage pipeline. It's not hard to do.

It's impossible to do if you can't do arbitrary combinations of memory
accesses. We can access two registers at a time, and the program file
is unified with the register file. That severely restricts what we can
do in terms of pipelining.

> Where does the 100 MHz clock enter the picture? It's easy to get
> a RISC running in an S3 at 80 MHz, and a lot less easy at 100 MHz.
> Worst case, divide by two and run a single-cycle RISC at 50 MHz;
> that will be 2.5x faster for non-memory stuff than the "5 100MHz clock"
> approach.

The 100MHz figure is totally arbitrary. If we can run it faster, we
will. If it has to run slower, that's okay too. The key point here is
that the design needs to be very small, so putting in a general-purpose
CPU core is not going to do. What we are designing right now is
probably over-sized already.

> > Branch instructions would also take 3 to 5.
>
> Put in a branch delay slot. Then branches take 1 cycle, but if the
> delay slot can't be filled, they effectively take 2 cycles. Most
> of the time delay slots can be filled.

Our inability to access more than one register at a time kinda makes
this moot.

> I think way too much effort is being spent on designing the
> nanocontroller, especially given the poor performance that is
> expected to result from it. Just drop an aeMB (MicroBlaze clone) in
> there, and hook up Timothy's proposed memory access FIFOs to it.
> It seems like this should work at least as well as the nanocontroller,
> and probably better, and there's much less effort needed. The
> only drawback is that it might be slightly larger than a custom
> nanocontroller design, but I don't think that will amount to more
> than a 50% increase.

This isn't a bad idea, although I need to know more about how it
expects to access memory and where it stores its program file.
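
For reference, the timing figures quoted above can be checked with a
few lines of Python. This is only a back-of-the-envelope sketch: the
clock rates and cycle counts are the ones from the thread, while the
function names and the delay-slot fill rate are assumptions made up for
illustration, not measurements of any real design.

```python
# Rough timing comparison for the schemes discussed in this thread.
# Clock rates and cycle counts are taken from the quoted text; the
# fill_rate parameter is a hypothetical value, not measured data.

def ns_per_instruction(clock_mhz, cycles_per_instruction):
    """Time per instruction in nanoseconds at the given clock rate."""
    return cycles_per_instruction * 1000.0 / clock_mhz

def branch_cycles_with_delay_slot(fill_rate):
    """Effective cycles per branch: 1 if the delay slot is filled, else 2."""
    return 1.0 * fill_rate + 2.0 * (1.0 - fill_rate)

multi_cycle = ns_per_instruction(100, 5)   # 5 clocks at 100 MHz
single_cycle = ns_per_instruction(50, 1)   # single-cycle at 50 MHz
speedup = multi_cycle / single_cycle       # the "2.5x" in the quote

print(f"5 clocks @ 100 MHz: {multi_cycle} ns per instruction")
print(f"1 clock  @  50 MHz: {single_cycle} ns per instruction")
print(f"speedup: {speedup}x")
```

With an assumed fill rate of 0.75, branch_cycles_with_delay_slot(0.75)
gives 1.25 cycles per branch, against the 3 to 5 quoted for the
multi-cycle approach. None of this accounts for the register-access
restriction described above, which is the actual sticking point.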

_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
