Carsten Haitzler wrote:
> hmmmm. interesting idea. this is only possible to do sanely with dma

You could probably control this also by carefully scheduling instructions.
E.g., instead of

        compute A and B
        write A to glamo
        write B to glamo
        do C

you could

        compute A
        write A to glamo
        compute B
        write B to glamo
        do C

or

        compute A and B
        write A to glamo
        do C
        write B to glamo

This would be hell to implement for all accesses, but where certain
access patterns recur, those could be handled in small hand-tuned
functions. For bulk transfers, you need DMA.
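
To make the three orderings above concrete, here is a rough sketch in C.
Everything in it is made up for illustration: glamo_write16(),
compute_a(), compute_b(), do_c() and the fake_glamo[] array are
hypothetical stand-ins, not anything from the real driver. It also
assumes the CPU can carry on with other work while a write is still on
the bus, which is exactly the part that would need checking on real
hardware.

```c
#include <stdint.h>

/* Hypothetical stand-in for the Glamo's memory-mapped window. On real
 * hardware this would be a mapped I/O region, not a plain array. */
static volatile uint16_t fake_glamo[6];

/* Hypothetical counter so the placeholder "do C" has a visible effect. */
static unsigned c_calls;

/* Hypothetical helper: one 16-bit write to the Glamo. */
static inline void glamo_write16(volatile uint16_t *reg, uint16_t val)
{
    *reg = val;
}

/* Placeholders for "compute A", "compute B" and "do C". */
static uint16_t compute_a(uint16_t x) { return (uint16_t)(x * 3u + 1u); }
static uint16_t compute_b(uint16_t x) { return (uint16_t)(x ^ 0x5a5au); }
static void do_c(void) { c_calls++; }

/* Original order: both writes are issued back to back. */
static void update_batched(volatile uint16_t *base, uint16_t x)
{
    uint16_t a = compute_a(x);
    uint16_t b = compute_b(x);
    glamo_write16(&base[0], a);
    glamo_write16(&base[1], b);
    do_c();
}

/* First alternative: compute B between the two writes. */
static void update_interleaved(volatile uint16_t *base, uint16_t x)
{
    uint16_t a = compute_a(x);
    glamo_write16(&base[0], a);
    uint16_t b = compute_b(x);
    glamo_write16(&base[1], b);
    do_c();
}

/* Second alternative: do C between the two writes. */
static void update_split(volatile uint16_t *base, uint16_t x)
{
    uint16_t a = compute_a(x);
    uint16_t b = compute_b(x);
    glamo_write16(&base[0], a);
    do_c();
    glamo_write16(&base[1], b);
}
```

All three variants leave the same values in the target. Note that the
compiler only has to keep the volatile writes in order relative to each
other; it is free to move the plain computations around them, so a
really hand-tuned version would probably mean checking the generated
code or dropping to assembler.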

It's an interesting idea. Whether it works depends largely on when the
Glamo samples the data: at the beginning or at the end of the cycle.

Whether it's useful also depends on whether there are really a lot of
other things the system can do. Since this is likely to be slower than
a burst copy, you only win if the bottleneck of the application isn't
the amount of data that gets moved to the Glamo.

In a GUI, sheer transfer rate is probably the bottleneck. In video
playback, there may be an opportunity to interleave decompression
with frame buffer access. Should be fun to implement, though ;-)

- Werner
