One of the reasons I had thought about using behavioral Verilog (i.e.
using Verilog as a scripting language) to implement the simulator is
that there's a built-in event management system.  We can easily do
things like send signals that cause parallel processes to happen,
perform actions based on global and local timers, and even pause what
appears to be sequential code, because it's waiting on some other
event.  The problems with doing it this way are that (a) some
algorithms are needlessly harder to express in Verilog than in C++,
and (b) because the simulation would be interpreted, performance
would be far too low for a useful simulator.

One way to handle event processing would be to have a global event
queue, into which you can put objects and method pointers to be called
at an appropriate time. This is clean for some things (handing off an
operation to the next stage in a pipeline, which is implemented in a
totally different method), while it's nasty for others where the most
straightforward implementation is sequential code with pauses in it.
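
To make the global-queue idea concrete, here is a minimal sketch,
assuming C++14 and a single-threaded simulator.  All of the names
(EventQueue, post, Pipeline, stageA, stageB) are made up for
illustration and are not taken from any existing code.  It shows the
case the queue handles cleanly: one stage handing work to the next,
where each stage is a separate method.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is a sketch, not an existing API.
using Cycle = std::uint64_t;

struct Event {
    Cycle when;                    // cycle at which the callback fires
    std::uint64_t seq;             // tie-breaker: keeps posting order within a cycle
    std::function<void()> action;  // bound object + method, or any callable
};

// Orders the priority queue so the earliest event is on top.
struct LaterFirst {
    bool operator()(const Event& a, const Event& b) const {
        if (a.when != b.when) return a.when > b.when;
        return a.seq > b.seq;
    }
};

class EventQueue {
public:
    Cycle now() const { return now_; }

    // Schedule `action` to run `delay` cycles after the current time.
    void post(Cycle delay, std::function<void()> action) {
        queue_.push(Event{now_ + delay, next_seq_++, std::move(action)});
    }

    // Run events in time order until none are left.
    void run() {
        while (!queue_.empty()) {
            Event ev = queue_.top();  // copy out; top() returns a const reference
            queue_.pop();
            assert(ev.when >= now_);  // simulated time never moves backwards
            now_ = ev.when;
            ev.action();              // may post further events
        }
    }

private:
    Cycle now_ = 0;
    std::uint64_t next_seq_ = 0;
    std::priority_queue<Event, std::vector<Event>, LaterFirst> queue_;
};

// Two pipeline stages as separate methods; A hands off to B via the queue.
struct Pipeline {
    EventQueue& eq;
    std::vector<Cycle> log;  // cycle at which each stage ran

    void stageA() { log.push_back(eq.now()); eq.post(1, [this] { stageB(); }); }
    void stageB() { log.push_back(eq.now()); }
};
```

Posting stageA at cycle 3 and calling run() executes stageA at cycle
3 and stageB at cycle 4.  The awkward case is the other one: code
that wants to pause in the middle of a function has to be split into
one callback per resume point.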

Another way to handle it would be to spawn threads that represent the
parallel systems and use IPC (e.g. cond_waits) to block on information
(e.g. waiting for a memory read to come back) or send signals (e.g.
making the request in the first place).  The problem here is that we
could end up with thousands of thread contexts, and I don't know how
feasible that is.  It would scale to arbitrary numbers of CPUs, but
the IPC overhead itself could be significant.  Mixing in green
threads could be a compromise.
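
As a rough sketch of the threaded approach, assuming C++17 and
std::thread with std::condition_variable standing in for cond_waits:
the Mailbox class and read_via_memory_thread function below are
hypothetical names, and the "memory" simply returns twice the address
as placeholder data.  The point is only that the requester can block
in the middle of what looks like sequential code.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical blocking channel between two simulated components.
template <typename T>
class Mailbox {
public:
    // Send a signal/message: enqueue and wake one waiting receiver.
    void send(T value) {
        {
            std::lock_guard<std::mutex> lock(m_);
            items_.push(std::move(value));
        }
        cv_.notify_one();
    }

    // Block until a message is available, then return it.
    T receive() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !items_.empty(); });
        assert(!items_.empty());  // guaranteed by the wait predicate
        T value = std::move(items_.front());
        items_.pop();
        return value;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<T> items_;
};

// The "core" issues a read and blocks mid-function until the reply arrives.
int read_via_memory_thread(int address) {
    Mailbox<int> requests;
    Mailbox<int> replies;

    // The "memory" is its own thread, servicing exactly one request here.
    std::thread memory([&] {
        int addr = requests.receive();
        replies.send(addr * 2);  // placeholder for real memory contents
    });

    requests.send(address);        // make the request
    int data = replies.receive();  // pause until the read comes back
    memory.join();
    return data;
}
```

This says nothing about simulated time, which would still have to be
coordinated between threads, and every blocked component costs a full
thread context.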

The simulator should match up with the hardware, but only at a high
level.  The objectives are performance and accuracy, and we can get
both if we are free to implement an algorithm in ways that don't
look like the hardware, as long as the outputs (timing, energy,
etc.) are correct.

One example where I think we'd benefit from deviating from exactly
modeling the hardware is in pipeline segments that don't have external
communication.  If you have a 6-stage pipeline segment that does
nothing but compute, then it would be best to implement it as one
chunk of code that posts its results to appear 6 cycles in the future,
rather than having each stage get entered into the event system.
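
A minimal sketch of that collapsed segment, assuming C++14.  The
Scheduler class is a bare-bones stand-in for whatever event system we
end up with, and stage() is placeholder arithmetic; both are
hypothetical.  All six stages run in one call, and only the final
result is posted, timed to appear 6 cycles later.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

using Cycle = std::uint64_t;

// Minimal stand-in scheduler (hypothetical): callbacks keyed by fire cycle.
class Scheduler {
public:
    Cycle now() const { return now_; }
    std::size_t events_run() const { return events_run_; }

    // Schedule `fn` to run `delay` cycles after the current time.
    void post(Cycle delay, std::function<void()> fn) {
        pending_.emplace(now_ + delay, std::move(fn));
    }

    // Run callbacks in time order until none are left.
    void run() {
        while (!pending_.empty()) {
            auto it = pending_.begin();
            assert(it->first >= now_);  // simulated time never moves backwards
            now_ = it->first;
            std::function<void()> fn = std::move(it->second);
            pending_.erase(it);
            ++events_run_;
            fn();
        }
    }

private:
    Cycle now_ = 0;
    std::size_t events_run_ = 0;
    std::multimap<Cycle, std::function<void()>> pending_;
};

constexpr Cycle kDepth = 6;  // pipeline depth from the example above

// Placeholder per-stage work; real stages would do actual computation.
int stage(int i, int x) { return x + i; }

// Collapsed model: compute all stages now, deliver the result kDepth
// cycles in the future with a single event.
void compute_segment(Scheduler& s, int input, std::function<void(int)> sink) {
    int x = input;
    for (int i = 0; i < static_cast<int>(kDepth); ++i) x = stage(i, x);
    s.post(kDepth, [sink = std::move(sink), x] { sink(x); });
}
```

With this shape the segment costs one event per operation instead of
six.  It only stays correct while the segment really has no external
communication and nothing that can stall it partway through.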

Suggestions welcome, please.

-- 
Timothy Normand Miller, PhD
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project