Re: [Open-graphics] Multipliers in oga1hq

Timothy Normand Miller Tue, 04 Sep 2007 18:03:42 -0700

On 9/4/07, Patrick McNamara <[EMAIL PROTECTED]> wrote:
> > In a VGA graphics mode that requires more smarts, the read or write
> > raises an interrupt with the nanocontroller that then makes its own
> > modified request to the memory system.
> >
>
> My question was centered around VGA reads since the VGA interface
> expects to spit out data formatted in "funky" VGA formats.  I suspect we
> are going to have to shadow VGA memory so that reads can come directly
> from the shadow memory and writes get written to the shadow memory and
> also to the framebuffer as 24-bit pixels.


There are two issues here.  One is the funky formats.  What you call
the "shadow" depends on your perspective.  There are two framebuffers.
 One is what the PC thinks is the VGA stuff.  The other is the one
scanned out by our video controller.  The job of our nanocontroller is
to convert from one to the other over and over again in the
background.
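
As a rough illustration of that background conversion, here is a minimal
sketch for one of the "funky" formats: the 16-color planar layout, where
each of four bit planes holds one bit of every pixel's color index and
bit 7 of a byte is the leftmost pixel. The function name and the
grayscale palette are assumptions made for the example, not part of the
OGA design.

```python
# Hypothetical 16-entry palette of 24-bit RGB values (a grayscale ramp, assumed).
PALETTE = [(i * 17, i * 17, i * 17) for i in range(16)]

def planar_to_rgb(planes, palette=PALETTE):
    """Convert four equal-length bit planes into a list of RGB tuples.

    planes[p][i] is byte i of plane p; bit 7 of each byte is the leftmost pixel.
    """
    pixels = []
    for byte_index in range(len(planes[0])):
        for bit in range(7, -1, -1):
            index = 0
            for p in range(4):
                # Plane p contributes bit p of the 4-bit color index.
                index |= ((planes[p][byte_index] >> bit) & 1) << p
            pixels.append(palette[index])
    return pixels
```

The nanocontroller would run something equivalent over the whole VGA
region repeatedly, writing the result into the buffer that the video
controller scans out.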

The other issue is that some accesses will do things that are more
than just the straightforward access.  For instance, we could be doing
a bitblt (where reads and writes move more data than we're moving over
the bus) or there could be a ROP applied to the writes.  The
nanocontroller's job will be to intercept the PCI access and do the
extra stuff.
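
To make the ROP case concrete, here is a minimal sketch of what
"intercept the access and do the extra stuff" could look like for a
single byte write. The function names and the string-based op selector
are assumptions for illustration only; the real nanocontroller program
may be structured quite differently.

```python
def apply_rop(op, source, latch):
    """Combine the byte written by the host with the byte already in memory."""
    if op == "copy":
        return source
    if op == "and":
        return source & latch
    if op == "or":
        return source | latch
    if op == "xor":
        return source ^ latch
    raise ValueError(f"unknown raster op: {op}")

def intercepted_write(memory, address, value, op):
    """Model of handling one trapped byte write (hypothetical helper)."""
    latch = memory[address]  # extra read that the host never sees
    memory[address] = apply_rop(op, value & 0xFF, latch)
```

The point is only that one write on the bus turns into a read, an
operation, and a write on the memory side.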

> > Those are just stored in graphics memory.  The text mode has a
> > standard way to store those, and we'll just have the VGA controller
> > use them to convert text to graphics.
> >
> They are effectively stored as bitmaps anyway.

Exactly.  When I finish my example program, you'll see what I have in mind.

> >> That is definitely a concern.  Allowing for only one read and write per
> >> cycle would require addition of a second fetch stage in the pipeline.
> >>
> >
> > How would this help?  I don't know what you mean.
> >
> Given that you have two source registers and a target register, with
> dual port memory you can fetch both register contents in a single
> pipeline stage.  This stage can also allow for a write in a tri-port

The extra stage in the CPU doesn't give you an extra port on the memory.

> setup as we have.  Assuming we don't allow for ALU operations on
> non-register locations (indirect addressing), then you would normally
> follow with the ALU/MEM stage.  If you can only do one read and one
> write per register access then you have to have two register fetch
> stages, one for each register, prior to the ALU/MEM stage.
>
> tri-port:
> instruction fetch
> instruction decode
> register fetch
> ALU/memory
> write back
>
> dual-port:
> instruction fetch
> decode
> register fetch
> register fetch
> ALU/memory
> write back
>
> Or something like that.  It's been 10 years since my processor design
> class and I sold the book because I was a poor college student at the time.

I don't think this will work.  You're describing a pipeline where
there are three different stages that access the BRAM.  Since all
three stages could have valid instructions in them all at the same
time, that requires a 3-port RAM.
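
A quick way to check the port count is to add up the register-file
accesses across all stages, since a full pipeline has an instruction in
every stage on every cycle. The sketch below does that for the two stage
lists quoted above; the per-stage access counts are my reading of those
lists, and the model assumes no stalls.

```python
# Each entry is (stage name, register-file accesses made by that stage per cycle).
TRI_PORT = [("fetch", 0), ("decode", 0), ("reg fetch", 2),
            ("alu/mem", 0), ("write back", 1)]
SPLIT_FETCH = [("fetch", 0), ("decode", 0), ("reg fetch A", 1),
               ("reg fetch B", 1), ("alu/mem", 0), ("write back", 1)]

def ports_needed(pipeline):
    """With every stage occupied, accesses per cycle is the sum over stages."""
    return sum(accesses for _stage, accesses in pipeline)

print(ports_needed(TRI_PORT), ports_needed(SPLIT_FETCH))
```

Both layouts come out at three accesses per cycle: splitting the
register fetch moves the reads into different stages but does not reduce
how many happen at once.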

> > I'm positive.  Only the 3D GPU needs DMA.  Upon starting X11, we'll
> > have software load the DMA program.  On exit from X11, we'll have it
> > (or the kernel or whatever) reload the VGA program.  We can only be in
> > one graphics mode (well, one per head, but ignore that) at a time, so
> > there's no issue with 640x480x16.  And we also won't be in text mode
> > and graphics mode at the same time.
> >
> >
>
> What does the nanocontroller handle in the way of DMA?

Scheduling.  Software has buffers that are master command queues
processed by the GPU.  The nanocontroller's job is to read them,
process commands, and execute them.

Another would be memory moves.  That one is simpler in that it's just
issuing reads on one side and writes on the other (graphics memory and
the PCI bus).
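
A minimal sketch of both jobs, assuming a made-up command format: the
tuple layout, the command names "nop" and "move_to_card", and the
function name are all hypothetical and only stand in for whatever the
real queue format turns out to be.

```python
from collections import deque

def run_queue(queue, pci_memory, graphics_memory):
    """Drain a command queue and return how many commands were executed."""
    executed = 0
    while queue:
        command = queue.popleft()
        if command[0] == "move_to_card":
            _, src, dst, length = command
            for offset in range(length):
                # One read on the PCI side, one write on the graphics side.
                graphics_memory[dst + offset] = pci_memory[src + offset]
        elif command[0] == "nop":
            pass
        else:
            raise ValueError(f"unknown command: {command[0]}")
        executed += 1
    return executed

host = bytearray(b"abcdef")
card = bytearray(8)
run_queue(deque([("nop",), ("move_to_card", 1, 2, 3)]), host, card)
```

After the call, bytes 2 through 4 of `card` hold a copy of bytes 1
through 3 of `host`.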

> What happens
> if I initiate a DMA transaction from main memory targeting the VGA
> memory space?  I don't actually know if that is allowed with standard
> VGA, I will need to do some research.

There is no VGA memory.  It's all a trick.  Some portion of our rather
large graphics memory is set aside and mapped into A000 or wherever
for VGA (text mode or graphics mode or whatever).  Some other portion
is set aside for a translated version of the image.  The video
controller is programmed to scan the second one.  The nanocontroller
is programmed to read the first one and translate it into pixels for
the second.
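
For text mode, the translation pass could look roughly like the sketch
below. It assumes the usual text-mode layout of (character, attribute)
byte pairs with the foreground color in the low nibble of the attribute
and the background in the high nibble (blink is ignored), and 8-pixel
wide glyphs. The function name, the dict-based font, and the palette
argument are assumptions for the example.

```python
def render_text(cells, columns, font, glyph_height, palette):
    """Translate text-mode cells into a list of pixel rows.

    cells: bytes of (character, attribute) pairs.
    font:  dict mapping character code -> list of glyph_height row bytes
           (a stand-in for the font bitmaps kept in graphics memory).
    """
    rows_of_cells = len(cells) // (2 * columns)
    image = []
    for cell_row in range(rows_of_cells):
        for glyph_row in range(glyph_height):
            line = []
            for col in range(columns):
                base = 2 * (cell_row * columns + col)
                char, attr = cells[base], cells[base + 1]
                fg = palette[attr & 0x0F]
                bg = palette[(attr >> 4) & 0x0F]
                bits = font[char][glyph_row]
                for bit in range(7, -1, -1):  # bit 7 is the leftmost pixel
                    line.append(fg if (bits >> bit) & 1 else bg)
            image.append(line)
    return image
```

The first argument plays the role of the region mapped for the PC, and
the returned image plays the role of the region the video controller
scans.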

> We do have to provide text-based output in graphics mode.  You can make a
> BIOS interrupt call on an x86 system to print text, even in graphics
> mode.  Obviously this is different than standard graphics mode from our
> perspective, but what we have to do is very similar.

Again, it's just a trick.  There's just memory space that is mapped
into the system, and the nanocontroller just runs in the background,
doing translations.

> Once we have a basic 3d pipeline available, we could use it to assist
> with scaling and text.  If the VGA screen is simply a poly and the video
> memory is the texture for that poly then we don't have to handle scaling
> at all, just format translation.  Likewise if an 80x25 text screen is
> simply 2000 polys and the character is the texture then text mode
> becomes quite easy too.  When a character is changed all we need do is
> change the texture.

You're making it much more complicated than it needs to be.  None of
this sort of thing is necessary.

> An interesting side effect of this is the capability to dump the text
> console into a window after the window manager starts, or to allow a VM
> direct access to the VGA hardware while the 3d pipe is handling normal
> display.  These are obviously just neat little things that could be done
> and not at all necessary.  But there are valid reasons to consider
> supporting DMA, PCI, and VGA (or another context that we haven't thought
> of yet) if at all possible.

I can't think of a situation where we'd want to do VGA and 3D at the same time.

> > Only the interrupt needs to worry about this.  That helps a bit.  But
> > as I say, if the multiplier is pipelined, then it's a non-issue.  If
> > it's not pipelined, then we will indeed have to query it before using
> > it, unless we ensure that the ISR doesn't issue a multiply too early.
> >
> >
> >
> The first stage of the VGA pipeline is a barrel shift.  Being able to
> use the multiplier for this would be very useful.  Otherwise it will
> take 8 processor cycles plus branch overhead in the worst case.  Though
> that may be faster than the multiplier can work as a barrel shifter so
> it may be a moot point.
>

The nanocontroller has a shifter that takes an operand indicating the
amount to shift.
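
For reference, the two alternatives raised in the quoted text (using the
multiplier as a shifter, or looping over single-bit shifts) compute the
same thing, as this small sketch shows. The function names and the 8-bit
default width are assumptions for the example.

```python
def shift_left_via_multiply(value, amount, width=8):
    """Left shift done as a multiply by 2**amount, truncated to width bits."""
    return (value * (1 << amount)) & ((1 << width) - 1)

def shift_left_by_loop(value, amount, width=8):
    """Same result using one single-bit shift per iteration."""
    mask = (1 << width) - 1
    for _ in range(amount):
        value = (value << 1) & mask
    return value
```

With a shifter that takes the shift amount as an operand, neither
workaround is needed.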

Anyhow, there's absolutely no reason why the VGA hardware should be
fast.  It just has to be _correct_.  There needs to be some image on
the screen that contains something recognizable as the pixels you
would see on a regular VGA card.  But they don't have to fill the
screen, be as large, or whatever.

-- 
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
