> Yes, but we had this clever idea of unifying the two.  This way,
> there's no need for a special instruction for full-size immediate
> constants, for instance.  (But just because it's cute doesn't mean
> it's a good idea.)

Just use a second blockram for constant storage.  The blockram gets
loaded when the FPGA is loaded, and any of it that you're not using
for constants can be used for variables.

Perhaps the constants could be shared with the instruction blockram,
provided that you never use two constants as both operands of a
single instruction.  This is easiest if you restrict the constants
to being source operand B.

> Any given instruction can do two reads at the same time, followed by a
> write.  Include instruction fetch.  Overlap that with four other
> instructions also in the pipeline, and that's a lot of memory activity.

That's not how pipelining works.  Not counting the instruction fetch
(because that should come from a separate RAM), there are only three
data RAM accesses per cycle, two reads and a write.  When an instruction
is in the pipeline, it gets its two data fetches only while it is in
the operand-fetch stage, and it gets its write only when it is in the
writeback stage.  So even though multiple instructions are in flight, only
one is fetching register operands and only one is writing results
back on any given clock cycle.  That's why you only need a three-port
register file.

Another way to look at it is that if you do an "add r1, r2, r3"
instruction, it may be in the pipeline for three or four clocks, but
it only fetches its source operands once, and it only writes its
results back once.
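
A minimal sketch of that port-counting argument, assuming a simple
four-stage pipeline that issues one instruction per clock with no
stalls.  The stage names and the `port_usage` helper are made up for
illustration and are not taken from any actual design in this thread.

```python
# Hypothetical four-stage pipeline; the stage names are assumptions.
STAGES = ["ifetch", "operand_fetch", "execute", "writeback"]


def port_usage(num_instructions):
    """Return a list of (reads, writes) on the data RAM, one per clock.

    Instruction i enters the first stage on clock i and advances one
    stage per clock (no stalls, no hazards modeled).
    """
    total_cycles = num_instructions + len(STAGES) - 1
    usage = []
    for cycle in range(total_cycles):
        reads = writes = 0
        for instr in range(num_instructions):
            stage_index = cycle - instr
            if not 0 <= stage_index < len(STAGES):
                continue  # not in the pipeline on this clock
            stage = STAGES[stage_index]
            if stage == "operand_fetch":
                reads += 2   # both source operands, fetched once
            elif stage == "writeback":
                writes += 1  # one result, written once
        usage.append((reads, writes))
    return usage


usage = port_usage(8)
peak_reads = max(reads for reads, _ in usage)
peak_writes = max(writes for _, writes in usage)
print("peak reads per clock:", peak_reads)
print("peak writes per clock:", peak_writes)
```

However many instructions are in flight, at most one of them occupies
each stage on a given clock, so the per-clock peak stays at two reads
and one write.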

> As you start adding ports, you might as well just use random logic,
> which is one bit per CLB.

No, because with a three-port (2-read, 1-write) register file, you
get 64 bits per CLB.  If you had to expand that to five ports
(4-read, 1-write), you still get 32 bits per CLB.
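
The halving from 64 to 32 bits per CLB is what you would expect if
extra read ports are obtained by replicating the storage: every copy
takes the same write, and each read port reads its own copy.  That
construction is an assumption here, not something stated in this
thread, and the `ReplicatedRegisterFile` class below is a hypothetical
behavioral model, not HDL.

```python
class ReplicatedRegisterFile:
    """N-read, 1-write register file modeled as N identical copies of a
    1-read, 1-write memory (an assumed construction for illustration)."""

    def __init__(self, num_regs=16, read_ports=2):
        # One private copy of the register contents per read port.
        self.banks = [[0] * num_regs for _ in range(read_ports)]

    def write(self, addr, value):
        # The single write port updates every copy so they stay in sync.
        for bank in self.banks:
            bank[addr] = value

    def read(self, port, addr):
        # Each read port only ever touches its own copy.
        return self.banks[port][addr]

    def copies(self):
        # Storage cost relative to a single-read-port memory.
        return len(self.banks)


three_port = ReplicatedRegisterFile(read_ports=2)  # 2-read, 1-write
five_port = ReplicatedRegisterFile(read_ports=4)   # 4-read, 1-write

three_port.write(3, 42)
print(three_port.read(0, 3), three_port.read(1, 3))
print("relative cost:", five_port.copies() / three_port.copies())
```

Under this model, going from two read ports to four doubles the number
of copies, which matches the density dropping by half.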

> Doesn't sound very economical.  Also, you're doing this for
> performance... do we really need the performance?

It's both better performance AND more compact, because it's using
the FPGA resources the way the designers intended, rather than trying
to shoehorn in something else.  And yes, a VGA text display that can
only update at 10 Hz needs more performance.

Eric

_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
