> > I agree with the "you have to read pixels back from the frame buffer
> > and then continue rendering polygons."  For a hardware implementation
> > I might agree with the "you need to draw more polygons than your
> > hardware has room to store", but only if the hardware implementation
> > decides to perform overdraw rather than fetching the triangles on the
> > fly from AGP memory.
>
> You need to agree that current hardware does implement the
> scheme where some percentage of pixels is drawn multiple times.
> It's a straightforward hardware design that nicely opens ways
> to get performance with an affordable amount of ASIC design
> engineering power. I don't assume the current market leaders
> would have chosen that way if they expected to get more
> performance from the other approaches. In the end I am pretty
> sure that this approach provides more ways for interesting
> features and effects than the mentioned one-pass rendering
> would provide.

First off, current market leaders began their hardware designs back when the
main CPU was much, much slower.  They have an investment in this technology
and likely do not want to throw it away.  Back when these companies were
founded, such 3D rendering could not be performed on the main processor at
all.  The computational power of the main processor has since increased
dramatically.  The algorithmic approach to 3D rendering should be reexamined
with current and future hardware in mind.  What was once true may no longer
be so.

Second, if a processor-intensive algorithm were capable of better efficiency
than a bandwidth-intensive algorithm, there is a good chance these
algorithms would be moved back over to the main CPU.  If the main processor
took over 3D rendering, what would the 3D card manufacturers sell?  It would
essentially put them out of business.  Therefore you cannot gauge what is
the most efficient algorithm based on what the 3D card manufacturers decide
to push.  They will push whatever is better for their bottom line and their
own future.


> Anyways, the current memory interfaces for the framebuffer memory
> aren't the performance bottleneck at all today. It's the features that
> the applications demand, e.g. n-times texturing.

The features of most games today do cause the current memory interfaces to
be the performance bottleneck.  This is why overclocking your card's memory
offers more of a performance gain than overclocking your card's processor.
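
To put a rough number on it, here is a back-of-envelope sketch in C.  The
resolution, overdraw factor, and per-pixel byte counts below are assumptions
picked purely for illustration, not measurements of any particular card:

    /* Back-of-envelope framebuffer/texture traffic estimate.  All of
     * the figures below are illustrative assumptions, not measurements. */
    #include <stdio.h>

    int main(void)
    {
        const double pixels    = 1024.0 * 768.0;  /* screen resolution        */
        const double fps       = 60.0;            /* target frame rate        */
        const double overdraw  = 3.0;             /* average depth complexity */
        const double color_b   = 4.0;             /* 32-bit color write       */
        const double z_b       = 8.0;             /* 32-bit Z read + write    */
        const double texture_b = 8.0;             /* a couple of texel fetches */

        double per_pixel = color_b + z_b + texture_b;
        double traffic   = pixels * fps * overdraw * per_pixel;

        printf("~%.1f GB/s of memory traffic\n", traffic / 1e9);
        return 0;
    }

Even with these fairly conservative figures you land in the multi-GB/s
range, already a sizable fraction of what current cards' memory interfaces
deliver; add more texture layers or a higher resolution and the memory
interface saturates long before the core does.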


> If these one-pass algorithms were so resource-saving,
> why is there only a single hardware implementation, and
> why do the respective software solutions receive so little attention?

Why should a 3D card hardware company show interest in something that could
so easily be implemented in software?  How does that benefit its bottom
line?


> > With the rest I disagree.  The Kyro, for example, has some high-speed
> > local memory (cache) it uses to hold the pixels for a tile.  It can
> > antialias and render translucent scenes without ever blitting the
> > cache to the framebuffer more than once.  This is the advantage to
> > tile-based rendering.  Since you only need to hold a tile's worth of
> > pixels, you can use smaller high-speed cache.
>
> Pixel caches and tiled framebuffers/textures are state of the art
> for most (if not all) current engines. Only looking at the Kyro
> would give a false view of the market. The Kyro has it too, so it's
> sort of a "me too" product. But a vendor's marketing department will
> never tell you that it is this way.

No, tile buffers cannot be used by immediate mode renderers to eliminate
overdraw.  Immediate mode does not render on a per-pixel basis; it renders
on a per-polygon basis.  Current hardware engines that use immediate mode
rendering in fact do not make use of tile-based rendering.  They would need
a "tile buffer" the size of the entire framebuffer.  At that point it is no
longer a high-speed buffer; it is simply the framebuffer.  Imagine the cost
of high-speed cache in quantities large enough to hold a full framebuffer,
especially at high resolutions...
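
For contrast, a scene-capture tiler looks roughly like the sketch below.
All of the types and helper functions (bin_triangle, rasterize_bin,
write_tile_out) are placeholders of my own, not the Kyro's actual pipeline;
the point is only the shape of the algorithm: capture and bin the whole
scene first, then resolve each tile entirely in a small on-chip buffer.

    /* Simplified sketch of scene-capture tile-based rendering.  The
     * types and helpers here are illustrative placeholders only. */
    #define TILE_W 32
    #define TILE_H 32

    typedef struct { float x[3], y[3], z[3]; } Triangle;  /* plus color, uv, ... */
    typedef unsigned int Pixel;                            /* packed RGBA */

    /* Placeholder helpers; their internals don't matter for this point. */
    void bin_triangle(const Triangle *t);
    void rasterize_bin(int tx, int ty, Pixel *tile);
    void write_tile_out(int tx, int ty, const Pixel *tile);

    void render_frame(const Triangle *tris, int ntris, int screen_w, int screen_h)
    {
        int tiles_x = (screen_w + TILE_W - 1) / TILE_W;
        int tiles_y = (screen_h + TILE_H - 1) / TILE_H;
        int i, tx, ty;

        /* Pass 1: capture the whole scene, binning each triangle into
         * the tiles it touches.  Nothing is drawn yet. */
        for (i = 0; i < ntris; i++)
            bin_triangle(&tris[i]);

        /* Pass 2: resolve one tile at a time entirely in a small
         * on-chip buffer.  Hidden-surface removal, blending, and
         * antialiasing all happen here, so external memory sees each
         * pixel exactly once. */
        for (ty = 0; ty < tiles_y; ty++) {
            for (tx = 0; tx < tiles_x; tx++) {
                Pixel tile[TILE_W * TILE_H];   /* fits in fast local memory */
                rasterize_bin(tx, ty, tile);
                write_tile_out(tx, ty, tile);  /* the only framebuffer write */
            }
        }
    }

An immediate mode renderer has no such per-tile stage; it must resolve
visibility against a full-screen depth buffer in external memory, which is
exactly where the overdraw traffic comes from.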

While I would prefer to see a software implementation of scene-capture
tile-based rendering, the Kyro was a good first step.  It was the first
mainstream card to use these algorithms.  For this I applaud them.  This was
by no means a "me too product" as you claimed.


> > As far as the reading of pixels from the framebuffer, this is a highly
> > inefficient thing to do, no matter the hardware.  If you want a fast
> > application you will not attempt to read from the video card's memory.
> > These operations are always extremely slow.
>
> For this there are caches (most often generic for nearly any render unit).
> And reading is not that different from writing on current RAM designs.
> Some reading always works without any noticeable impact on performance
> (and it's done for a good bunch of applications and features),
> but if you need a lot of data from the framebuffer, then you might notice it.
> The closer the pixel-consuming circuit is to the RAM, the better it
> will work. A CPU is one of the not-so-good consumers of pixels.


Are you saying there is a high-speed cache large enough to hold the entire
framebuffer now?  Do you realize how much that would cost?  It would be
entirely pointless as well.  These pixels must still be transferred via the
AGP bus to main memory.  This is a slow process.  You will find that reading
a line of pixels takes far more time than sending a polygon to the card.
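
If you want to see the asymmetry yourself, the OpenGL calls below are
enough.  Actual timings vary by card and driver, but the readback path is
typically the slow one because it forces a pipeline flush and drags data
back across the bus:

    /* Illustration of the asymmetry between feeding geometry to the
     * card and reading pixels back.  Assumes an OpenGL context is
     * already set up; timings will vary with card and driver. */
    #include <GL/gl.h>

    void submit_one_triangle(void)
    {
        /* Geometry goes "downhill": a few dozen bytes, queued by the
         * driver, no stall while the card keeps working. */
        glBegin(GL_TRIANGLES);
            glVertex3f(-1.0f, -1.0f, 0.0f);
            glVertex3f( 1.0f, -1.0f, 0.0f);
            glVertex3f( 0.0f,  1.0f, 0.0f);
        glEnd();
    }

    void read_one_scanline(unsigned char *buf, int width, int y)
    {
        /* Readback goes "uphill": the driver must flush the pipeline,
         * wait for the card to finish, then pull the pixels back
         * across the bus before returning. */
        glReadPixels(0, y, width, 1, GL_RGBA, GL_UNSIGNED_BYTE, buf);
    }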


> Hmm, the current state of the art is called display list based rendering,
> and it's up to date and nicely optimized even though the concept is
> an older one. It takes the best of both worlds: fast rendering with
> overdraw into memory, and a higher level of primitive preprocessing.
> With only a single comparison on a preprocessed display list you
> can quickly decide whether that display list needs to be sent to the
> graphics adapter.

I have no doubt that the immediate mode rendering algorithm has extremely
optimized implementations in today's 3D cards.  However, a poorly optimized
implementation of a more efficient algorithm is preferable to a highly
optimized implementation of an inefficient one.  There is always room for
optimizing the implementation.  The algorithm itself is of utmost
importance.
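
For reference, the "single comparison per display list" you describe boils
down to something like the sketch below.  The SceneChunk structure and the
sphere_in_frustum() helper are my own illustration, not code from any
particular driver; it shows why the check is cheap, but also why it does
nothing about overdraw within a list that passes the test:

    /* Sketch of display-list culling: keep a precomputed bound with
     * each compiled list and only send the list when the bound is
     * visible.  Structure and helper are illustrative placeholders. */
    #include <GL/gl.h>

    typedef struct {
        GLuint list;                /* compiled with glNewList/glEndList */
        float  cx, cy, cz, radius;  /* precomputed bounding sphere       */
    } SceneChunk;

    int sphere_in_frustum(float cx, float cy, float cz, float r); /* placeholder */

    void draw_scene(const SceneChunk *chunks, int n)
    {
        int i;
        for (i = 0; i < n; i++) {
            /* One cheap CPU-side test per chunk decides whether the
             * whole block of preprocessed geometry is sent to the
             * adapter.  Anything inside a visible list is still drawn
             * with full overdraw. */
            if (sphere_in_frustum(chunks[i].cx, chunks[i].cy,
                                  chunks[i].cz, chunks[i].radius))
                glCallList(chunks[i].list);
        }
    }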

-Raystonn

