On 2007-09-06, Timothy Normand Miller wrote:
> On 9/6/07, Petter Urkedal <[EMAIL PROTECTED]> wrote:
> 
> >   * Looking at the inner loop of draw_glyph, it doesn't clearly favour
> >     register-like IO over memory-like IO.  I ended up translating the
> >     write_io statements to simple move instructions (rather than
> >     computed values), except if we unroll the loop, in which case the
> >     address writes can be combined with addition of constants 0 to 7.
> 
> It's true.  If we were to unroll the loop, we would probably AND with
> constants (1, 2, 4, 8...), and for each, there would be a branch based
> on whether or not the target register is zero.  Something like:
> 
>     andi #1, r3, r4
>     bnz l1
>     store r5,[write_data]  ; in delay slot, push foreground color
>     store r6,[write_data]  ; get background color
> l1:  ....
> 
> Actually, if the store is the trigger, that wouldn't work, so we'd use
> a reg-to-reg move instruction and then write that to the I/O port.
> You get the idea.

However, with register-like IO, the address writes in the unrolled
version reduce to

        add r7, 0, [write_addr]
        ...
        add r7, 1, [write_addr]
        ...
        ...
        add r7, 7, [write_addr]

It's not a major difference.  OTOH, the only disadvantage I can see with
register-like IO is that we lose 2 bits on the immediates.
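
To make the unrolled form concrete, here is a rough behavioural sketch in
Python rather than nanocontroller assembly.  The IoPorts class, the port
names, the bit order (bit k of the row is pixel k), and the assumption that
the data write is the trigger are all mine for illustration; none of this is
the actual draw_glyph code.

```python
class IoPorts:
    """Hypothetical model of register-like IO ports (names are illustrative)."""

    def __init__(self):
        self.addr = 0
        self.pixels = {}  # address -> color; stands in for the framebuffer

    def write_addr(self, value):
        self.addr = value

    def write_data(self, color):
        # Assumption: writing the data port is what commits the pixel.
        self.pixels[self.addr] = color


def draw_glyph_row_unrolled(io, base, row_bits, fg, bg):
    """Draw one 8-pixel glyph row with the loop fully unrolled."""
    # Each step pairs a constant address offset (0..7) with a constant
    # mask (1, 2, 4, ...).  Assumption: bit k of row_bits is pixel k.
    io.write_addr(base + 0)
    io.write_data(fg if row_bits & 0x01 else bg)
    io.write_addr(base + 1)
    io.write_data(fg if row_bits & 0x02 else bg)
    io.write_addr(base + 2)
    io.write_data(fg if row_bits & 0x04 else bg)
    io.write_addr(base + 3)
    io.write_data(fg if row_bits & 0x08 else bg)
    io.write_addr(base + 4)
    io.write_data(fg if row_bits & 0x10 else bg)
    io.write_addr(base + 5)
    io.write_data(fg if row_bits & 0x20 else bg)
    io.write_addr(base + 6)
    io.write_data(fg if row_bits & 0x40 else bg)
    io.write_addr(base + 7)
    io.write_data(fg if row_bits & 0x80 else bg)
```

Each write_addr call corresponds to one "add r7, k, [write_addr]" above, so
the address arithmetic costs nothing extra once the loop is unrolled.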

> >   * I think it's worth considering if we can do context switching
> >     without too much overhead.  From the discussion between Patrick and
> >     Tim, I understand interrupts are not strictly needed, but they may
> >     reduce PCI bus traffic.
> 
> I'm just worried about the race conditions.  In fact, I know they'll
> be a problem.  We really don't want to give the nanocontroller a
> separate pipe to memory.  So if we have pending reads, then data will
> come back out of order.

I thought we could solve that by encoding the thread number in the
request and reply.  An attempt to read data which the thread does not
own would cause a context switch.
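
Roughly what I had in mind, as a Python sketch rather than HDL.  The names
TaggedMemoryPipe and consume_reply are hypothetical, and I assume a single
shared pipe that returns replies in request order, each carrying the tag of
the thread that issued it.

```python
from collections import deque


class TaggedMemoryPipe:
    """Single shared pipe to memory; every request carries a thread tag."""

    def __init__(self, memory):
        self.memory = memory      # address -> value
        self.pending = deque()    # outstanding (thread_id, address) requests

    def request_read(self, thread_id, addr):
        self.pending.append((thread_id, addr))

    def next_reply(self):
        # Replies come back in request order, tagged with their owner.
        thread_id, addr = self.pending.popleft()
        return thread_id, self.memory[addr]


def consume_reply(pipe, current_thread):
    """Take the next reply; switch threads if it belongs to someone else.

    Returns (thread_now_running, value).
    """
    owner, value = pipe.next_reply()
    if owner != current_thread:
        # Hypothetical context switch: hand control to the owning thread.
        current_thread = owner
    return current_thread, value
```

With one read pending for thread 0 and one for thread 1, thread 0 consumes
its own reply normally, and the next consume_reply call switches to thread 1
instead of handing it data it does not own.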

Anyway, as you point out below in your post, DMA mode runs a single
process (and I assume there are command dependencies which prevent
splitting up the work), so I agree we can go for a thread- and
interrupt-free nanocontroller.

Given these conclusions, we are close to the desired nanocontroller:
First, let's `ifdef out the multiplier logic.  Then, maybe we turn IO
access into registers.  If not, a minor practical-aesthetic point is
that I'd suggest negative addresses for IO-ports because it lets us
expand the scratch memory without changing the IO base address.  Then,
we can try to synthesise it again.
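
To illustrate the negative-address point, a small Python sketch of the
decode.  The 16-bit address width and the mapping of address -1 to port 0
are assumptions for the example only, not something we have decided.

```python
ADDR_BITS = 16  # hypothetical address width


def to_signed(addr, bits=ADDR_BITS):
    """Interpret a raw address as a two's-complement signed value."""
    addr &= (1 << bits) - 1
    if addr & (1 << (bits - 1)):
        return addr - (1 << bits)
    return addr


def decode(addr, scratch_size):
    """Map an address to ("io", port) or ("scratch", offset)."""
    signed = to_signed(addr)
    if signed < 0:
        # Negative addresses select IO ports: -1 is port 0, -2 is port 1, ...
        return ("io", -signed - 1)
    if signed < scratch_size:
        return ("scratch", signed)
    raise ValueError("unmapped address")
```

Growing scratch_size from, say, 256 to 1024 words changes which non-negative
addresses are valid, but decode(0xFFFF, ...) still lands on port 0, so code
that touches the IO ports does not need to change.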
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)