On Dec 9, 2007 4:53 PM, Petter Urkedal <[EMAIL PROTECTED]> wrote:
> On 2007-12-09, Timothy Normand Miller wrote:
> > Petter,
> >
> > I had some more caffeine and looked over your oga1hq_io.v file.  It's
> > exactly what I had in mind and nicely organized.  Excellent job!
>
> Thanks, but I also think you made some excellent points with the code in
> your previous post:
>
> 1.  You do reads purely combinatorially.  That way the HQ memory stage
> can register it, and we avoid muxing it into the output.  I think there
> is enough time within the HQ memory stage for this, and that we should
> avoid incurring delay after it's output, since that can spill into more
> critical parts due to forwarding.  I guess we won't know before we try, but
> what does your intuition say?

My intuition from earlier (that reads are just a big MUX) left out the
fact that some of these reads also drive outputs.  Reading from a fifo
causes deq to be pulsed, so while the data being pulled into the CPU
is just coming through a big MUX, there's more to it.  You can
separate that logic if you like, although you'll end up with the same
logic in the end, just organized differently in the source code.  Do
what makes the most sense and is easiest to maintain.
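
To make the point about reads with side effects concrete, here is a
rough behavioral sketch in Python (a model, not RTL).  The port
numbers, the function name, and the choice to suppress deq when the
fifo is empty are all assumptions for illustration; none of them are
taken from oga1hq_io.v.

```python
# Behavioral model only, not RTL.  Port addresses are hypothetical.
PORT_STATUS = 0      # plain register read, no side effect
PORT_FIFO_DATA = 1   # reading this port also dequeues the fifo


def read_port(addr, status, fifo_head, fifo_empty):
    """Return (data, deq) for one combinational read.

    data is what the read MUX selects; deq is the side-effect strobe
    that has to be pulsed in the same cycle.
    """
    if addr == PORT_STATUS:
        return status, False
    if addr == PORT_FIFO_DATA:
        # The same address decode feeds both the MUX and the strobe.
        # Assumption: no dequeue is issued when the fifo is empty.
        return fifo_head, not fifo_empty
    return 0, False
```

Whether the strobe logic sits next to the MUX or in its own block, the
same address decode ends up driving both.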

> 2.  As you point out, we don't need separate outputs for each external
> unit, only two ports and individual binary lines.  That should save a
> fair amount of logic.  This also means we can collapse most of the
> non-trigger port addresses, which reduces muxing for writes.  I'd have
> to look closer at what can be shared.

Actually, I was thinking that we could collapse more of the trigger
write ports and fewer of the non-trigger ones.  This is particularly
an issue where we want to interleave access to different agents.  An
address port, with or without autoincrement, is looked at every time
we access the trigger data port.  We can't let accesses to more than
one agent clobber each other's address ports.  On the other hand, the
data port is only valid for one cycle and we never care what was
written to it the last time.  That last statement is invalidated,
however, if we want to pay attention to "full" signals, where the
protocol requires that we maintain the register contents until full is
deasserted.

We need to look at the things that are mutually exclusive.  If HQ were
inside of the Spartan, we would give it independent read and write
channels.  However, since the bridge isn't full duplex, we can only do
one thing at a time.  This may give us room for some sharing.
However, it's conceivable that we'd set up a write address, write a
few words, request a read (with its own address), and then assume we
can start another write where we left off without explicitly updating
the address.

SO... when we decide to combine things, we need to be careful to
document all the side-effects that can occur so that the programmer
knows that when they've done this read that that write address is no
longer valid, etc.  We could even make some of this more explicit by
giving them the same I/O address!
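
The scenario above (set a write address, write a few words, request a
read, resume writing) can be sketched as a small Python model.  This
is a simplified illustration assuming a single address register shared
by the write and read paths; the class and method names are
hypothetical and do not describe the actual design.

```python
class SharedAddrBridge:
    """Toy model: one address register shared by writes and reads."""

    def __init__(self):
        self.addr = 0
        self.mem = {}

    def set_write_addr(self, a):
        self.addr = a

    def write_word(self, value):
        self.mem[self.addr] = value
        self.addr += 1  # auto-increment after each written word

    def request_read(self, a):
        # Side effect: the pending write address is overwritten.
        self.addr = a


b = SharedAddrBridge()
b.set_write_addr(0x100)
b.write_word(1)
b.write_word(2)
b.request_read(0x200)
b.write_word(3)  # lands at 0x200, not at 0x102
```

This is the kind of side effect that would have to be documented: after
the read request, the programmer must set the write address again
before resuming.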

Don't take anything I'm saying as discouragement from combining
these; I'm just trying to work through the usage scenarios and the
tradeoffs.

Note:  I made that write address auto-incrementer able to increment
the whole word.  We need to restrict the counter to the lower 6 bits.
This is faster and requires less logic.  The drawback is that HQ is
responsible for updating the whole address when crossing a 64-word
boundary.  (We may change the boundary to something like 256 words
later, depending on how things go with static timing.)
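
As a sketch of what a counter restricted to the lower 6 bits means for
the programmer, here is a small Python model (not the Verilog).  The
function and constant names are made up for illustration.

```python
COUNTER_BITS = 6
COUNTER_MASK = (1 << COUNTER_BITS) - 1  # 0x3F, i.e. a 64-word block


def next_write_addr(addr):
    """Increment only the low 6 bits; the upper bits never change."""
    upper = addr & ~COUNTER_MASK
    lower = (addr + 1) & COUNTER_MASK
    return upper | lower
```

With this model, next_write_addr(0x13F) wraps back to 0x100 instead of
advancing to 0x140, so HQ has to write the full address itself when it
crosses the boundary.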

> So, if you agree, I'd try to rewrite more like your suggestion and see
> what simplification we can make.

Although I'm very happy with what you've done, I do encourage
experimentation.  Conversely, I feel that we're under a time crunch
and that if we have something that will do the job well, it may be
better for you to spend your time on something else.

> I'd say we just define these as symbolic constants in the assembler and
> declare each as either read or write and any other use to be undefined.
> If we want a port (like a self-incrementing address) to be both readable
> and writeable we'll make sure the address is the same for both
> directions.

Write addresses would increment for each word.  We wouldn't want
auto-inc for reads, because we'd have to have a full adder in there to
increment the address by the read count.  Better to add another
instruction and save the extra logic.

> > I added reset logic.  The dequeue signal has to come out a cycle
> > earlier (its assertion causes the next data to appear on the next
> > cycle).  I put in some perhaps unnecessary checks against full for
> > writes; see the comment.    I added byte enable flags for writes.  I
> > added auto-increment for write addresses.  (Hmmm... a side-effect we
> > have to document carefully!)
>
> Yes, it looks good.  I didn't think we needed reset, but of course, when
> adding incrementing address and the conditional reset of enq/deq, it
> becomes stateful.

In these FPGAs, there is a global reset that we take advantage of.
Howard understands this better than I do.  But for correctness and to
make simulation work _at all_, we need to be sure to have sane reset
values.  Otherwise, those registers start out as x's (undefined) in
simulation and we get something that doesn't work.  Plus, you need to
make sure that, in hardware, the enqueue and dequeue signals (at the
very least) are deasserted so you don't get anything spurious going
on.




-- 
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project