2007/3/26, Timothy Normand Miller <[EMAIL PROTECTED]>:
On 3/25/07, Nicolas Boulay <[EMAIL PROTECTED]> wrote:

> > I don't know what you mean.  Are you suggesting more types of
> > load/store instructions?  Perhaps a way to get at graphics memory
> > directly?  The latency for getting access to graphics memory will be
> > up to hundreds of cycles (if it's busy) or as low as tens of cycles
> > (if it's not).  Either way, it's a win to do it as an I/O op.
>
> What do you call an 'I/O op'? A specific instruction? If you use a
> memory-mapped load/store unit, the CPU will be simpler, but you will
> lose a few cycles and maybe a few complex functions, like
> read/modify/write, bus locking, etc.

All I/O will appear to be memory-mapped.  Yes, we'll lose cycles, but
we'll attach I/O systems that either are inherently high-latency
anyhow or will kinda take care of themselves.  For instance, reading
graphics memory really needs to be done where you request a large
number of reads in advance and then later take the data.  In order to
be able to maximize memory throughput, we really want to have many
requests queued up so that we can intelligently choose between them to
minimize delays (like row misses), but the disadvantage of high
throughput is also high latency.  Another for instance is PCI DMA; the
bus master will mostly take care of itself.  There'll be some
constraints, but basically, we can queue up a bunch of requests that
it will go off and process, and we just have to lazily monitor it.
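
As a rough illustration of the queuing idea above, here is a minimal Python
sketch, not the actual controller design. It assumes a single bank with
1024-word rows (ROW_BITS is a made-up value) and hypothetical names
throughout. The scheduler serves the oldest queued read that hits the
currently open row, and falls back to the oldest request otherwise.

```python
ROW_BITS = 10  # assumption: 1024 words per DRAM row, single bank


def row_of(addr):
    """Row index for a word address under the assumed geometry."""
    return addr >> ROW_BITS


def schedule(requests):
    """Return a service order that prefers hits on the open row.

    Among queued requests, pick the oldest one that targets the currently
    open row; if none does, take the oldest request and open its row.
    """
    pending = list(requests)
    open_row = None
    order = []
    while pending:
        pick = next((a for a in pending if row_of(a) == open_row), pending[0])
        pending.remove(pick)
        open_row = row_of(pick)
        order.append(pick)
    return order


def row_misses(order):
    """Count how many accesses in the given order need a new row opened."""
    misses, open_row = 0, None
    for addr in order:
        if row_of(addr) != open_row:
            misses += 1
            open_row = row_of(addr)
    return misses
```

With requests [0, 2048, 4, 2052, 8], serving in arrival order opens a row on
every access, while the reordered schedule opens only two. The second request
is served fourth, which is the latency cost of the higher throughput.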


In an SoC design, I have seen a multiport SDRAM controller used:
4 ports, each with an AMBA bus connected, so 4 accesses could be
interleaved. It was mostly DMA transfers.

Maybe each core that needs SDRAM access could have direct access to
the controller, and the controller would interleave the accesses as
best it could.
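
A minimal sketch of that interleaving, assuming plain round-robin
arbitration between the ports (the controller I saw may well have used
another policy; the function name and data layout here are hypothetical):

```python
from collections import deque


def interleave(ports):
    """Round-robin over per-port request queues, skipping empty ports.

    `ports` is a list of request lists, one per port. Returns the order in
    which requests reach the SDRAM as (port_index, request) pairs.
    """
    queues = [deque(p) for p in ports]
    order = []
    while any(queues):  # an empty deque is falsy
        for port, q in enumerate(queues):
            if q:
                order.append((port, q.popleft()))
    return order
```

A real controller would probably also weigh row hits and priorities, which
this sketch ignores.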

>
> > Part of the elegance of the MIPS design is that we don't
> > have to have them, which makes the design simpler.  And we don't want
> > to reserve another 8 bits in the instruction word for this.  And if we
>
> Do you lack memory space? Being aligned to 8-bit or 32-bit words is
> only important for a general-purpose CPU. Here you will have dedicated
> memory. With 1 or 2 more bits, you could often save one or more 32-bit
> words, with a speed increase as well.

What I was saying is that to use general-purpose registers for
condition registers, we would need 5 bits to specify the register and
another few bits (3?) to specify how to interpret the contents of the
register.  Also, we would need a 3-port register file, which I just
don't even want to consider.  If the condition registers were in their
own register space, that would just complicate the design by basically
duplicating a whole lot of routing logic.  Trust me; we don't want two
independent register sets.  (There's a good reason why there is a
separate register set for floating point in MIPS, but that's a
different animal.)
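
To make the bit budget concrete, here is a small Python sketch. The field
widths are the standard 32-bit MIPS R-type layout; the condition field names
are hypothetical, and the 3-bit mode width is only the guess given above.

```python
WORD_BITS = 32

# Standard MIPS R-type field widths.
R_TYPE = {"opcode": 6, "rs": 5, "rt": 5, "rd": 5, "shamt": 5, "funct": 6}

# Hypothetical extra fields: which register holds the condition, and how
# to interpret it (the 3-bit width is a guess, not a decided value).
COND_FIELD = {"cond_reg": 5, "cond_mode": 3}


def spare_bits(fields, word_bits=WORD_BITS):
    """Bits left in the instruction word after packing the given fields."""
    return word_bits - sum(fields.values())
```

Under these assumptions, spare_bits(R_TYPE) is 0 and adding COND_FIELD gives
-8, so the extra 8 bits would have to be taken from existing fields.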


The idea is to use a dedicated register file holding the condition bits
of each register. These condition bits are written at the same time as
the original register, so you shorten the pipeline and avoid computing
a condition for each instruction.
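
A minimal Python sketch of what I mean, assuming 32-bit registers and only
two condition bits (zero and negative) per register. The class and the
condition names are hypothetical, and MIPS details such as the hardwired
zero register are not modelled.

```python
MASK = 0xFFFFFFFF  # assumption: 32-bit registers


class RegFile:
    """Register file with a side array of per-register condition bits."""

    def __init__(self, n=32):
        self.regs = [0] * n
        # (zero, negative) for each register; all registers start at 0.
        self.flags = [(True, False)] * n

    def write(self, idx, value):
        """Write a register and update its condition bits in the same step."""
        value &= MASK
        self.regs[idx] = value
        self.flags[idx] = (value == 0, bool(value >> 31))

    def branch_taken(self, idx, cond):
        """Decide a branch from the stored bits, without reading the value."""
        zero, neg = self.flags[idx]
        return {"eq0": zero, "ne0": not zero, "lt0": neg, "ge0": not neg}[cond]
```

The branch only looks at the small flag array, so the comparison is already
done by the time the branch instruction is decoded.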