Re: [Open-graphics] Multipliers in oga1hq

Patrick McNamara Tue, 04 Sep 2007 18:48:41 -0700

Timothy Normand Miller wrote:
> On 9/4/07, Patrick McNamara <[EMAIL PROTECTED]> wrote:
>   
>>> In a VGA graphics mode that requires more smarts, the read or write
>>> raises an interrupt with the nanocontroller that then makes its own
>>> modified request to the memory system.
>>>
>>>       
>> My question was centered around VGA reads since the VGA interface
>> expects to spit out data formatted in "funky" VGA formats.  I suspect we
>> are going to have to shadow VGA memory so that reads can come directly
>> from the shadow memory and writes get written to the shadow memory and
>> also to the framebuffer as 24-bit pixels.
>>     
>
> There are two issues here.  One is the funky formats.  What you call
> the "shadow" depends on your perspective.  There are two framebuffers.
>  One is what the PC thinks is the VGA stuff.  The other is the one
> scanned out by our video controller.  The job of our nanocontroller is
> to convert from one to the other over and over again in the
> background.
>


It doesn't actually have to do constant conversion for anything but text
modes, and then only when the text changes.  I'm still not sure how we
handle blinking other than by running the controller in a busy loop.
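
A minimal sketch of the "only when the text changes" idea, assuming a
shadow copy of the text buffer is kept alongside the live one.  The names
(dirty_cells, refresh, render_cell) are made up for illustration and are
not part of any existing program.  Cells are (character, attribute) pairs,
and cells with the blink attribute bit set are treated as dirty whenever
the blink phase flips, so only a periodic tick is needed rather than a
busy loop:

```python
COLS, ROWS = 80, 25
BLINK_BIT = 0x80  # attribute bit 7 selects blink when blinking is enabled


def dirty_cells(current, shadow, blink_flipped=False):
    """Return indices of cells that need to be redrawn."""
    dirty = []
    for i, cell in enumerate(current):
        _char, attr = cell
        # Redraw if the cell changed, or if it blinks and the phase flipped.
        if cell != shadow[i] or (blink_flipped and attr & BLINK_BIT):
            dirty.append(i)
    return dirty


def refresh(current, shadow, render_cell, blink_flipped=False):
    """Redraw dirty cells via render_cell(x, y, cell); update the shadow."""
    dirty = dirty_cells(current, shadow, blink_flipped)
    for i in dirty:
        render_cell(i % COLS, i // COLS, current[i])
        shadow[i] = current[i]
    return len(dirty)
```

With nothing changed and no blink flip, refresh() touches no cells at all,
which is the point.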

> The other issue is that some accesses will do things that are more
> than just the straightforward access.  For instance, we could be doing
> a bitblt (where reads and writes move more data than we're moving over
> the bus) or there could be a ROP applied to the writes.  The
> nanocontroller's job will be to intercept the PCI access and do the
> extra stuff.
>
>   
>>> Those are just stored in graphics memory.  The text mode has a
>>> standard way to store those, and we'll just have the VGA controller
>>> use them to convert text to graphics.
>>>
>>>       
>> They are effectively stored as bitmaps anyway.
>>     
>
> Exactly.  When I finish my example program, you'll see what I have in mind.
>   

Something like this: 
http://www.supersecret.org/~mcnamara/fbconvertprogram.txt



>   
>>>> That is definitely a concern.  Allowing for only one read and write per
>>>> cycle would require addition of a second fetch stage in the pipeline.
>>>>
>>>>         
>>> How would this help?  I don't know what you mean.
>>>
>>>       
>> Given that you have two source registers and a target register, with
>> dual port memory you can fetch both register contents in a single
>> pipeline stage.  This stage can also allow for a write in a tri-port
>>     
>
> The extra stage in the CPU doesn't give you an extra port on the memory.
>
>   
>> setup as we have.  Assuming we don't allow for ALU operations on
>> non-register locations (indirect addressing), then you would normally
>> follow with the ALU/MEM stage.  If you can only do one read and one
>> write per register access then you have to have two register fetch
>> stages, one for each register, prior to the ALU/MEM stage.
>>
>> tri-port:
>> instruction fetch
>> instruction decode
>> register fetch
>> ALU/memory
>> write back
>>
>> dual-port:
>> instruction fetch
>> decode
>> register fetch
>> register fetch
>> ALU/memory
>> write back
>>
>> Or something like that.  It's been 10 years since my processor design
>> class and I sold the book because I was a poor college student at the time.
>>     
>
> I don't think this will work.  You're describing a pipeline where
> there are three different stages that access the BRAM.  Since all
> three stages could have valid instructions in them all at the same
> time, that requires a 3-port RAM.
>
>   

Doh, I knew I missed something....
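
For the record, a quick sketch of why splitting the register fetch doesn't
help.  It just counts register-file accesses per cycle for the two
pipelines listed above, assuming one instruction enters per cycle with no
stalls.  The stage lists and the function name are illustrative only:

```python
def peak_accesses(stage_accesses, n_instructions=10):
    """Peak register-file accesses in any one cycle.

    stage_accesses[i] is how many register-file accesses stage i makes.
    Instruction k is in stage (cycle - k) during a given cycle.
    """
    depth = len(stage_accesses)
    peak = 0
    for cycle in range(n_instructions + depth - 1):
        total = 0
        for stage, accesses in enumerate(stage_accesses):
            instr = cycle - stage
            if 0 <= instr < n_instructions:
                total += accesses
        peak = max(peak, total)
    return peak


# fetch, decode, register fetch (2 reads), ALU/memory, write back (1 write)
SINGLE_FETCH = [0, 0, 2, 0, 1]
# fetch, decode, register fetch, register fetch, ALU/memory, write back
SPLIT_FETCH = [0, 0, 1, 1, 0, 1]
```

Both pipelines peak at three accesses per cycle once they are full, so the
extra stage only moves the accesses around.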

>>> I'm positive.  Only the 3D GPU needs DMA.  Upon starting X11, we'll
>>> have software load the DMA program.  On exit from X11, we'll have it
>>> (or the kernel or whatever) reload the VGA program.  We can only be in
>>> one graphics mode (well, one per head, but ignore that) at a time, so
>>> there's no issue with 640x480x16.  And we also won't be in text mode
>>> and graphics mode at the same time.
>>>
>>>
>>>       
>> What does the nanocontroller handle in the way of DMA?
>>     
>
> Scheduling.  Software has buffers that are master command queues
> processed by the GPU.  The nanocontroller's job is to read them,
> process commands, and execute them.
>
> Another would be memory moves.  That one is simpler in that it's just
> issuing reads on one and writes on the other (graphics memory and the
> PCI bus).
>
>   
>> What happens
>> if I initiate a DMA transaction from main memory targeting the VGA
>> memory space?  I don't actually know if that is allowed with standard
>> VGA, I will need to do some research.
>>     
>
> There is no VGA memory.  It's all a trick.  Some portion of our rather
> large graphics memory is set aside and mapped into A000 or wherever
> for VGA (text mode or graphics mode or whatever).  Some other portion
> is set aside for a translated version of the image.  The video
> controller is programmed to scan the second one.  The nano controller
> is programmed to read the first one and translate it into pixels for
> the second.
>   
From the card's perspective, yes.  From the system's perspective, VGA memory
is a 64k contiguous block mapped at physical address 0x000A0000.  Based
on the Linux kernel documentation, it is legal to program the old ISA
DMA controller to target the VGA memory space.  That means that in VGA
mode we have to be capable of accepting host-initiated DMA.  Though since
it is host-initiated and not card-initiated, I don't think we care.  To us
it just looks like a big stream of reads or writes.
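
A rough sketch of the mapping as seen from the card, assuming the 64k
window at 0x000A0000 is backed by a region set aside somewhere in graphics
memory.  VGA_REGION_OFFSET is a made-up placeholder, not an actual address
on the card:

```python
VGA_WINDOW_BASE = 0x000A0000    # host physical address of the VGA window
VGA_WINDOW_SIZE = 64 * 1024     # 64k contiguous block
VGA_REGION_OFFSET = 0x01000000  # hypothetical offset into graphics memory


def host_to_card(host_addr):
    """Translate a host address inside the VGA window to a card offset."""
    offset = host_addr - VGA_WINDOW_BASE
    if not 0 <= offset < VGA_WINDOW_SIZE:
        raise ValueError("address 0x%08X is outside the VGA window" % host_addr)
    return VGA_REGION_OFFSET + offset
```

Whether the access came from the CPU or from the ISA DMA controller makes
no difference to this translation, since both arrive as ordinary reads and
writes.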
>   
>> We do have to provide text-based output in graphics mode.  You can make a
>> BIOS interrupt call on an x86 system to print text, even in graphics
>> mode.  Obviously this is different than standard graphics mode from our
>> perspective, but what we have to do is very similar.
>>     
>
> Again, it's just a trick.  There's just memory space that is mapped
> into the system, and the nanocontroller just runs in the background,
> doing translations.
>
>   
>> Once we have a basic 3d pipeline available, we could use it to assist
>> with scaling and text.  If the VGA screen is simply a poly and the video
>> memory is the texture for that poly then we don't have to handle scaling
>> at all, just format translation.  Likewise if an 80x25 text screen is
>> simply 2000 polys and the character is the texture then text mode
>> becomes quite easy too.  When a character is changed all we need do is
>> change the texture.
>>     
>
> You're making it much more complicated than it needs to be.  None of
> this sort of thing is necessary.
>
>   
>> An interesting side effect of this is the capability to dump the text
>> console into a window after the window manager starts, or to allow a VM
>> direct access to the VGA hardware while the 3d pipe is handling normal
>> display.  These are obviously just neat little things that could be done
>> and not at all necessary.  But there are valid reasons to consider
>> supporting DMA, PCI, and VGA (or another context that we haven't thought
>> of yet) if at all possible.
>>     
>
> I can't think of a situation where we'd want to do VGA and 3D at the same 
> time.
>   

I've got one sitting on my desk right now:  VMware, or Xen, or your
favorite virtualization.  Ok, so that was kind of a contrived example
since they don't have to have it.  My point was more that just because
you and I can't come up with a reason doesn't mean there isn't one.  :)

>>> Only the interrupt needs to worry about this.  That helps a bit.  But
>>> as I say, if the multiplier is pipelined, then it's a non-issue.  If
>>> it's not pipelined, then we will indeed have to query it before using
>>> it, unless we ensure that the ISR doesn't issue a multiply too early.
>>>
>>>
>>>
>>>       
>> The first stage of the VGA pipeline is a barrel shift.  Being able to
>> use the multiplier for this would be very useful.  Otherwise it will
>> take 8 processor cycles plus branch overhead in the worst case.  Though
>> that may be faster than the multiplier can work as a barrel shifter so
>> it may be a moot point.
>>
>>     
>
> The nanocontroller has a shifter that takes an operand indicating the
> amount to shift.
>
> Anyhow, there's absolutely no reason why the VGA hardware should be
> fast.  It just has to be _correct_.  There needs to be some image on
> the screen that contains something recognizable as the pixels you
> would see on a regular VGA card.  But they don't have to fill the
> screen, be as large, or whatever.
>
>   
Actually, I'm thinking that it may be most efficient to actually
instantiate most of the VGA pipeline in hardware similar to the
multiplier and only have the nanocontroller handle the final conversion
of the color space or font to bitmap.  Or maybe not, but it is worth
thinking about whether it is more efficient to implement the hardware
for a "general purpose" processor to emulate a fixed hardware pipeline,
or to just implement the pipeline and make the processor much simpler.
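
On the barrel shift point above, a small sketch of what "using the
multiplier as a shifter" means: a left shift by n is a multiply by 2^n,
truncated to the word width.  The 32-bit width and the function names are
assumptions for illustration, and the loop version stands in for the
one-bit-per-cycle fallback:

```python
WORD_BITS = 32                 # assumed word width
MASK = (1 << WORD_BITS) - 1


def shift_left_via_multiply(value, amount):
    """Left shift done as one multiply by a power of two."""
    if not 0 <= amount < WORD_BITS:
        raise ValueError("shift amount out of range")
    return (value * (1 << amount)) & MASK


def shift_left_loop(value, amount):
    """Left shift done one bit at a time, one iteration per bit."""
    for _ in range(amount):
        value = (value << 1) & MASK
    return value
```

Both give the same result; the difference is one multiply versus up to
"amount" iterations plus branch overhead.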

_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
