On 9/3/07, Patrick McNamara <[EMAIL PROTECTED]> wrote:

> As a starting point for having a single memory space for registers and
> RAM take for example the ATtiny45.  This controller has 32 general
> purpose registers, 64 I/O registers, and 256 bytes of RAM.  The memory
> map looks something like this.

One of my concerns is that since we now have gotten some outside
attention wrt our VGA implementation, we may want to just go with
basically what we have, for the sake of expediency.

>
> My assumption in all this is that the controller does not have direct
> access to the card memory space.  That card memory access would be done
> through IO ports or we would have explicit instructions for card memory
> access ala the MOVX instruction on the Intel 8051 series.  All working
> memory for the controller would be in the controller core.

This is correct.

> IIRC, a BRAM is 512x36, correct?  Since the BRAM is dual ported, we are
> allowed 2 reads and 2 writes per cycle, assuming you read on one clock
> edge and write on the other.  We could break the BRAM in two, using half
> for memory/register and the upper half as dedicated stack space.  Even
> if you only get one read and write per cycle, appropriately designing
> the pipeline could work around this.

One thing about the BRAM.  If we were to try to use it as the primary
register file, I'm not sure we could double-pump it like we need to.
Routing to/from the RAM may impose too much delay.  One of the
advantages of the current architecture is that it is effectively
triple-ported.  We can write one reg and read two at the same time.
If we use the BRAM as you describe, we serialize register access, so
any instruction that needs access to three operands takes 3 cycles.
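
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python.  It only counts port bandwidth per instruction and ignores
pipeline overlap; the function names and the single-shared-port model
of the BRAM case are my assumptions for illustration, not a description
of the actual design.

```python
def cycles_separate_ports(reads, writes, read_ports, write_ports):
    """Cycles needed when reads and writes use independent ports in parallel."""
    read_cycles = -(-reads // read_ports)      # ceiling division
    write_cycles = -(-writes // write_ports)
    return max(read_cycles, write_cycles)


def cycles_shared_port(reads, writes, ports=1):
    """Cycles needed when every access competes for the same port(s)."""
    return -(-(reads + writes) // ports)


# A typical three-operand instruction: read two registers, write one.
current_regfile = cycles_separate_ports(reads=2, writes=1,
                                        read_ports=2, write_ports=1)
serialized_bram = cycles_shared_port(reads=2, writes=1, ports=1)
```

With two read ports and one write port the three accesses fit in one
cycle; with one access per cycle they take three.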

We have a nice MIPS-like pipeline working.  Going to a different
architecture would require a completely different pipeline, and
frankly, I don't know how to lay it out so that it's efficient.  Most
well-pipelined processors use this REG-REG RISC architecture like
MIPS, at least on an abstract level, but usually quite concretely.

> Something else I was thinking about relates to using the same controller
> core for both PCI and VGA duties.  We effectively have to be able to
> context switch to do this, and we have to be able to do it quickly to

Did you mean DMA and VGA?  We'd never be doing DMA at the same time as VGA.

If you're referring to the fact that it has to handle VGA translation
at the same time as intercepting PCI transactions so it can do the
rest of the VGA stuff, then you're right.

> meet PCI timing requirements.  I don't know what our BRAM budget is
> right now, but could we effectively have two sets of memory/registers
> and stack for the core.  When we need to context switch we switch BRAMS.

Switching BRAMs involves adding multiplexing.  I might suggest instead
that we virtually break the BRAM into four sections: two stacks and
two reg files of 128 entries each.  This way, the context switch
only involves changing static address inputs to the BRAM.
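
A minimal sketch of that address split, in Python for readability.
The bit assignment (bit 8 for context, bit 7 for stack vs. reg file)
and all of the names are assumptions for illustration; nothing here is
a settled layout.

```python
# Hypothetical layout of a 512-entry BRAM split into four 128-entry regions.
# Bit 8 selects the context, bit 7 selects stack vs. register file, and
# bits 6..0 index within the region.  The bit assignment is an assumption.

REGION_BITS = 7                  # 128 entries per region
REGION_SIZE = 1 << REGION_BITS   # 128
BRAM_DEPTH = 4 * REGION_SIZE     # 512

USER, INTERRUPT = 0, 1           # context select values (assumed names)
REGS, STACK = 0, 1               # region select values (assumed names)


def bram_address(context, region, index):
    """Form a 9-bit BRAM address from two static select bits and a 7-bit index."""
    if context not in (USER, INTERRUPT):
        raise ValueError("context must be 0 or 1")
    if region not in (REGS, STACK):
        raise ValueError("region must be 0 or 1")
    if not 0 <= index < REGION_SIZE:
        raise ValueError("index out of range")
    return (context << (REGION_BITS + 1)) | (region << REGION_BITS) | index
```

A context switch then amounts to flipping the one context bit; the
index bits coming from the instruction are untouched, so no extra
multiplexer is needed on the data path.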

> Context switching does of course bring us back to the problem of the
> multiplier.  If multiplying doesn't stall the pipe waiting for the
> answer, then we really don't want to context switch (or interrupt) in
> the middle of a multiply.  This causes all sorts of problems though since

I think we can cope with this.  Petter and I both have some ideas
about handling the context switch.  As long as the timing isn't
critical (you can't read the product early, but you can read it late),
we can manage it.

> we are effectively working in a realtime environment.  If we need to go
> service a PCI transaction, we can't wait 10-20 cycles for a pending
> multiply to finish.  This means that we have to have the output be
> context (or interrupt) aware.  If the multiplier is context aware then
> the answer could be written to a separate output as necessary.

If the multiplier is pipelined, then we can just give it more than one
set of control ports: one for the user context, the other for the
interrupt context.
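
Here is a small behavioral model of that idea in Python.  It is a
sketch under assumptions: the latency, the class and attribute names,
and the choice to tag each operation with its context and keep one
result register per context are all mine, not the actual multiplier
interface.

```python
from collections import deque

USER, INTERRUPT = 0, 1  # context tags (assumed names)


class PipelinedMultiplier:
    """Toy model: each in-flight multiply carries the context that issued it."""

    def __init__(self, latency=3):
        self.latency = latency
        # One slot per pipeline stage; None means a bubble.
        self.pipe = deque([None] * latency)
        # One result register per context, so contexts never clobber each other.
        self.result = {USER: None, INTERRUPT: None}

    def tick(self, issue=None):
        """Advance one cycle.  issue is None or a (context, a, b) tuple."""
        self.pipe.append(issue)
        done = self.pipe.popleft()
        if done is not None:
            context, a, b = done
            self.result[context] = a * b
```

In this model a product lands in its own context's result register
`latency` cycles after the cycle it was issued, and it stays there
until that context issues another multiply, so it can be read late but
not early.  An interrupt that arrives mid-multiply can issue its own
operation without disturbing the user's pending one.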

>
> Which brings me to a question that has been tickling the back of my head
> for a bit.  Why aren't we using the multipliers embedded in the FPGA?  I

The XP10 doesn't have any embedded multipliers.



-- 
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project
