Re: [Open-graphics] Multipliers in oga1hq

Farhan Mohamed Ali Sat, 01 Sep 2007 14:38:22 -0700

On Sat, September 1, 2007 1:23 pm, Timothy Normand Miller said:
> On 9/1/07, Petter Urkedal <[EMAIL PROTECTED]> wrote:
> 
>> So, let's consider integrating Farhan's version in the nanocontroller.
>> 
> 
> http://wiki.opengraphics.org/tiki-index.php?page=Subversion+Commit+Policy
> 
> Farhan would need to officially give us (Traversal specifically) rights
> to use his work.
> 
No problem. How do I do this officially? Just include the copyright and 
license statement in my files? I realize that right now I don't have a 
proper header and I just have comments all over the place, but changes 
are being made quite often, so I'm too lazy to write a decent one at the 
moment. I will do this once the spec is more or less settled.



> 
>> code?  (Does DMA require multiply at all, other than powers of 2?)
> 
> Doubtful.  But if I'm wrong, we maybe should reserve an opcode or two for
> some instruction we don't yet know about.
> 
>> I'd go with the non-blocking out-of-band approach.  That is, the 
>> programmer will count instructions before fetching the result.
> 
> I generally prefer this myself.
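
A minimal Python sketch of the non-blocking, out-of-band approach discussed above, to make the "count instructions before fetching" rule concrete. The class name, the 3-cycle latency, and the 32-bit product width are assumptions for illustration only; none of them come from the oga1hq spec.

```python
class DelayedMultiplier:
    """Toy model of a non-blocking multiplier with a fixed latency (assumed)."""

    def __init__(self, latency=3):
        self.latency = latency   # cycles until the product is visible (assumed)
        self.product = 0         # value the program can read right now
        self._pending = None     # (cycles_left, value) of the in-flight multiply

    def start(self, a, b):
        # Issue a multiply; the program keeps running without stalling.
        self._pending = (self.latency, (a * b) & 0xFFFFFFFF)  # 32-bit: assumed

    def tick(self):
        # Advance by one instruction slot.
        if self._pending is not None:
            left, value = self._pending
            left -= 1
            if left == 0:
                self.product = value
                self._pending = None
            else:
                self._pending = (left, value)

    def fetch(self):
        # No interlock: returns whatever the product register holds now.
        return self.product


mul = DelayedMultiplier(latency=3)
mul.start(6, 7)
mul.tick()
mul.tick()
early = mul.fetch()  # fetched one slot too soon: still the old product
mul.tick()
late = mul.fetch()   # fetched after three slots: the new product
```

In this model nothing stalls, so a fetch scheduled too early silently reads the stale product. That is the cost of leaving the counting to the programmer.
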
> 
>> As a slight variant, we can hard-code the multiplication result to r31
>>  and drop the fetch-product instruction.  That's just as easy to 
>> implement, and it saves one cycle, since it means the product can be 
>> directly used as an operand to the ALU.
>> 
> 
> I'm not sure we want to add additional MUXing after the REG stage.  It 
> might be better to move it into the MEM stage.  This is especially not a
> problem since we have gobs of time to schedule when the product is 
> grabbed.
> 
> Having a special instruction to initiate the multiply would save us one
> cycle (worth it?).  Otherwise, there would be two moves into the 
> scratch/io space.  But the product is only a single word fetch. Putting
> it into r31 would save a cycle, because we wouldn't have to move it into
> a register first before using it as an operand to another instruction.
> 
> My main concern is the extra multiplexing logic hurting our max clock
> rate.
> 
>> The introduction of interrupts, if needed, will not cause problems as 
>> long as interrupt handlers don't use the multiplier.  Moreover, if an 
>> interrupt handler needs to use the multiplier, this is also possible: 
>> When the interrupt handler is sure any pending multiplication is 
>> finished, it can save the result R.  Then it can do its own 
>> multiplication.  Before returning to normal code, it must perform a 
>> multiply R*1 and wait long enough for the result to be available.
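
The save-and-restore sequence described above can be sketched in Python as follows. This is only an illustration: the `Multiplier` class, its 3-cycle latency, and the `interrupt_handler` helper are hypothetical, and the model assumes the saved product R fits in a single multiplier operand.

```python
class Multiplier:
    """Toy multiplier whose product appears `latency` ticks after start (assumed)."""

    def __init__(self, latency=3):
        self.latency = latency
        self.product = 0
        self._countdown = 0
        self._next = 0

    def start(self, a, b):
        self._next = a * b
        self._countdown = self.latency

    def tick(self, n=1):
        for _ in range(n):
            if self._countdown > 0:
                self._countdown -= 1
                if self._countdown == 0:
                    self.product = self._next


def interrupt_handler(mul, a, b):
    mul.tick(mul.latency)   # wait until any pending multiply must have finished
    saved = mul.product     # save the main program's result R
    mul.start(a, b)         # the handler's own multiplication
    mul.tick(mul.latency)
    result = mul.product
    mul.start(saved, 1)     # multiply R*1 to put R back in the product register
    mul.tick(mul.latency)   # wait long enough for it to become available
    return result


mul = Multiplier()
mul.start(6, 7)             # main program starts a multiply...
mul.tick(1)                 # ...and is interrupted before it completes
isr_result = interrupt_handler(mul, 5, 5)
main_result = mul.product   # what the main program later fetches
```

The main program still sees its own product after the handler returns, at the cost of two extra multiplier latencies spent inside the handler.
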
> 
> I think we may in fact need interrupts, and I'm struggling with it. The
> problem is VGA graphics modes.  In 640x480x16 and such, framebuffer reads
> and writes are not simple accesses.  You can apply raster operators to
> writes, and you can make reads fill a blt buffer larger than your word
> size so that when you write, it causes more than a word size to get
> written out.  This way, you can bitblt faster than you can move data over
> the bus.
> 
> Now, for VGA mode, mostly what the controller does is read VGA text or 
> pixels and convert them in the background into pixels suitable for our 
> video controller.  At the same time, we want the controller to handle the
> extra smarts of VGA.  One way to do this is to support interrupts; when a
> PCI access comes in, we can intercept it and do the extra stuff.  While
> writes could be queued for us to process periodically, reads have to be
> processed as soon as possible.
> 
> Interrupts won't stall lower parts of the pipeline, but they would divert
> the instruction flow.  We need to determine how this will affect our
> static instruction scheduling.
> 
> Correct me if I'm wrong, but a subroutine call stores the return address
> into r31, right?  Of course, since that's under main program control, no
> problem!  But with interrupts, I think we should dump the return address
> into a predefined address in the scratch memory.
> 
> What about context switches?  Should we require the ISR to copy registers
> to the scratch memory?  That's a fair amount of overhead, depending on
> how many we need to clobber.  How about doubling the size of the register
> file?  The lower half for normal execution, the upper half for
> interrupts.  (Like how the Z80 did it.)   (In this case, the interrupt
> return address appears in what we might internally call r63.)  Oh, and
> don't forget the delayed branch issue and how it'll affect interrupts--one
> extra instruction from the main program will get executed, so the return
> PC must account for that, and be sure to consider the situation where the
> interrupt arrives at the same time as a branch instruction is being
> fetched in the main program.
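
The split register file suggested above can be sketched as a small Python model, assuming 32 program-visible registers per half (inferred from the r31/r63 naming in the thread). The class and method names are hypothetical, and the sketch deliberately leaves out the delayed-branch adjustment of the return PC, which is still an open question here.

```python
class BankedRegisterFile:
    """Toy model: 64 physical registers, lower half for normal code,
    upper half for interrupt handlers (layout assumed from r31/r63)."""

    REGS_PER_BANK = 32

    def __init__(self):
        self.regs = [0] * (2 * self.REGS_PER_BANK)
        self.in_interrupt = False

    def _index(self, r):
        if not 0 <= r < self.REGS_PER_BANK:
            raise ValueError("register number out of range")
        # The interrupt flag acts as the top bit of the physical index.
        return r + (self.REGS_PER_BANK if self.in_interrupt else 0)

    def read(self, r):
        return self.regs[self._index(r)]

    def write(self, r, value):
        self.regs[self._index(r)] = value

    def enter_interrupt(self, return_pc):
        # Switch banks, then store the return address in the handler's r31,
        # which is physical r63.
        self.in_interrupt = True
        self.write(31, return_pc)

    def leave_interrupt(self):
        return_pc = self.read(31)
        self.in_interrupt = False
        return return_pc


rf = BankedRegisterFile()
rf.write(31, 0x100)        # main program's subroutine return address
rf.write(5, 7)             # some main-program state
rf.enter_interrupt(0x40)   # interrupt arrives; return PC chosen by hardware
rf.write(5, 99)            # handler clobbers "r5" in its own bank
resume_pc = rf.leave_interrupt()
```

Because the handler only ever touches the upper half, the main program's registers, including its r31 return address, survive without any copying to scratch memory.
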
> 
> -- 
> Timothy Normand Miller
> http://www.cse.ohio-state.edu/~millerti
> Open Graphics Project
> _______________________________________________
> Open-graphics mailing list
> [email protected]
> http://lists.duskglow.com/mailman/listinfo/open-graphics
> List service provided by Duskglow Consulting, LLC (www.duskglow.com)
> 
> 

