On 2007-09-06, Timothy Normand Miller wrote:
> On 9/6/07, Petter Urkedal <[EMAIL PROTECTED]> wrote:
>
> > * Looking at the inner loop of draw_glyph, it doesn't clearly favour
> > register-like IO over memory-like IO. I ended up translating the
> > write_io statements to simple move instructions (rather than
> > computed values), except if we unroll the loop, in which case the
> > address writes can be combined with addition of constants 0 to 7.
>
> It's true. If we were to unroll the loop, we would probably AND with
> constants (1, 2, 4, 8...), and for each, there would be a branch based
> on whether or not the target register is zero. Something like:
>
> andi #1, r3, r4
> bnz l1
> store r5,[write_data] ; in delay slot, push foreground color
> store r6,[write_data] ; get background color
> l1: ....
>
> Actually, if the store is the trigger, that wouldn't work, so we'd use
> a reg-to-reg move instruction and then write that to the I/O port.
> You get the idea.
However, with register-like IO, the address writes in the unrolled
version reduce to

    add r7, 0, [write_addr]
    ...
    add r7, 1, [write_addr]
    ...
    ...
    add r7, 7, [write_addr]

It's not a major difference. OTOH, the only disadvantage I can see with
register-like IO is that we lose 2 bits on the immediates.
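
For illustration, a rough Python model of what the unrolled row does
under register-like IO. This is only a sketch: the function name
draw_glyph_row, the list standing in for the write_addr/write_data
ports, and the bit order (bit 0 is the first pixel) are assumptions,
not taken from the actual draw_glyph code.

```python
def draw_glyph_row(row_bits, base_addr, fg, bg):
    """Model one unrolled 8-pixel glyph row (hypothetical helper).

    Returns the (address, colour) pairs that would be pushed to the
    write_addr / write_data ports.
    """
    writes = []
    # The Python loop stands in for eight unrolled copies: each copy
    # uses a constant mask (1, 2, 4, ...) and a constant offset (0 to 7).
    for offset, mask in enumerate((1, 2, 4, 8, 16, 32, 64, 128)):
        addr = base_addr + offset                 # add r7, <offset>, [write_addr]
        colour = fg if row_bits & mask else bg    # andi + branch on zero
        writes.append((addr, colour))
    return writes
```

Each address is the base register plus a small constant, which is why
the address write and the addition fold into one instruction.
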
> > * I think it's worth considering if we can do context switching
> > without too much overhead. From the discussion between Patrick and
> > Tim, I understand interrupts are not strictly needed, but they may
> > reduce PCI bus traffic.
>
> I'm just worried about the race conditions. In fact, I know they'll
> be a problem. We really don't want to give the nanocontroller a
> separate pipe to memory. So if we have pending reads, then data will
> come back out of order.
I thought we could solve that by encoding the thread number in the
request and reply. An attempt to read data that the thread does not
own would cause a context switch.
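
A toy model of the tagging idea, to make it concrete. Everything here
is made up for illustration (the class TaggedMemoryPipe and its
request/deliver methods do not exist in the nanocontroller source),
and it assumes replies return in request order on the single pipe.

```python
from collections import deque


class TaggedMemoryPipe:
    """Toy model of one memory pipe shared by several threads."""

    def __init__(self):
        self.pending = deque()       # replies return in request order
        self.current_thread = 0
        self.context_switches = 0

    def request(self, thread, addr):
        # The thread number travels with the read request...
        self.pending.append((thread, addr))

    def deliver(self):
        # ...and comes back attached to the reply.
        thread, addr = self.pending.popleft()
        if thread != self.current_thread:
            # The running thread does not own this data, so switch
            # to the thread that does.
            self.current_thread = thread
            self.context_switches += 1
        return thread, addr
```
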
Anyway, as you point out below in your post, DMA mode runs a single
process (and I assume there are command dependencies which prevent
splitting up the work), so I agree we can go for a thread- and
interrupt-free nanocontroller.
Given these conclusions, we are close to the desired nanocontroller:
First, let's `ifdef out the multiplier logic. Then, maybe we turn IO
access into registers. If not, a minor practical-aesthetic point is
that I'd suggest negative addresses for IO ports, because that lets us
expand the scratch memory without changing the IO base address. Then,
we can try to synthesise it again.
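
A small sketch of the negative-address idea, in case it helps. The
16-bit address width, the function name decode, and the port numbering
(-1 is port 0, -2 is port 1, and so on) are assumptions for
illustration only.

```python
ADDR_BITS = 16  # assumed address width, for illustration only


def decode(addr, scratch_size):
    """Classify an address as ('io', port) or ('mem', offset)."""
    addr &= (1 << ADDR_BITS) - 1              # wrap to the address width
    if addr >> (ADDR_BITS - 1):               # top bit set: negative address
        port = (1 << ADDR_BITS) - addr - 1    # -1 -> port 0, -2 -> port 1, ...
        return ("io", port)
    if addr >= scratch_size:
        raise ValueError("address outside scratch memory")
    return ("mem", addr)
```

With this layout, decode(-1, 512) and decode(-1, 1024) name the same
port: growing the scratch memory only moves the upper bound of the
non-negative range and leaves the IO addresses where they were.
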
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)