> > The only thing video
> > cards have today that is really better than
> > the main processor is massive
> > amounts of memory bandwidth.
>
> That is far from the truth - they have internal pipelining
> and parallelism.  Their use of silicon can be optimised to balance
> the performance of just one single algorithm.  You can never do that
> for a machine that also has to run an OS, word process and run
> spreadsheets.

Modern processors have internal pipelining and parallelism as well. Most of the processing power of today's CPUs goes completely unused. It is possible to create optimized implementations of efficient algorithms using Single-Instruction-Multiple-Data (SIMD) instructions, along the lines of the sketch below.
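To illustrate the sort of SIMD optimization I mean, here is a minimal sketch in C++ using the SSE intrinsics from <xmmintrin.h>. The function name and the alignment/length assumptions are mine, purely for illustration:

    #include <xmmintrin.h>   /* SSE intrinsics, Pentium III and up */

    /* y[i] += a * x[i] (SAXPY), four elements per iteration.
       Assumes n is a multiple of 4 and 16-byte aligned pointers;
       a real version would also handle an unaligned head and tail. */
    void saxpy4(float *y, const float *x, float a, int n)
    {
        const __m128 va = _mm_set1_ps(a);            /* a,a,a,a */
        for (int i = 0; i < n; i += 4) {
            __m128 vx = _mm_load_ps(x + i);          /* load 4 floats */
            __m128 vy = _mm_load_ps(y + i);
            vy = _mm_add_ps(vy, _mm_mul_ps(vx, va)); /* 4 mul-adds at once */
            _mm_store_ps(y + i, vy);
        }
    }

A scalar loop retires one multiply-add per iteration; this retires four, using execution units that would otherwise sit idle.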
> > Since memory bandwidth is increasing rapidly,...
>
> It is?!?  Let's look at the facts:
>
> Since 1989, CPU speed has grown by a factor of 70.  Over the same
> period the memory bus has increased by a factor of maybe 6 or so.

We have gone from approximately 200MB/s of memory bandwidth (PC66 EDO RAM) to over 3.2GB/s (dual 16-bit RDRAM channels) in the last 5 years. That is more than 16 times (3.2GB/s / 200MB/s = 16) the memory bandwidth we had just 5 years ago. Available memory bandwidth has been growing more quickly than processor clock speed lately, and I do not foresee an end to this any time soon.

> On the other hand, the graphics card can use heavily pipelined
> operations to guarantee that the memory bandwidth is 100% utilised

Overutilised, in my opinion. The amount of overdraw performed by today's video cards in modern games and applications is incredible. Immediate-mode rendering is an inefficient algorithm; video cards simply have extremely well-optimized implementations of that inefficient algorithm.

> - and can use an arbitrarily large amount of parallelism to improve
> throughput.  The main CPU can't do that because its memory access
> patterns are not regular and it has little idea where the next byte
> has to be read from until it's too late.

Modern processors have a considerable amount of parallelism built in. With prefetch and streaming SIMD instructions it is very possible to do these kinds of operations on a modern processor; a sketch follows below. It will, however, take another couple of years to be able to render at great framerates and high resolutions.
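As a concrete (and hedged) example, the toy loop below uses _mm_prefetch to tell the processor exactly where the next bytes will be read from, and _mm_stream_ps to write results straight to memory without polluting the cache. The prefetch distance and the operation itself are arbitrary choices of mine, not tuned values:

    #include <xmmintrin.h>

    /* Transform a span of floats, prefetching ahead of the reads and
       streaming the writes past the cache.  Same alignment and
       multiple-of-4 length assumptions as before. */
    void transform_span(float *dst, const float *src, int nfloats)
    {
        const __m128 gain = _mm_set1_ps(1.5f);   /* placeholder operation */
        for (int i = 0; i < nfloats; i += 4) {
            /* Hint: we will need this data a few iterations from now. */
            _mm_prefetch((const char *)(src + i + 64), _MM_HINT_T0);
            __m128 v = _mm_load_ps(src + i);
            /* Non-temporal store: bypasses the cache on the way out. */
            _mm_stream_ps(dst + i, _mm_mul_ps(v, gain));
        }
        _mm_sfence();   /* make the streaming stores globally visible */
    }

This is exactly the trick of making the CPU's memory access pattern regular: the software, not the hardware, decides where the next byte comes from.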
> You only have to look at the gap you are trying to bridge - a
> modern graphics card is *easily* 100 times faster at rendering
> sophisticated pixels (with pixel shaders, multiple textures and
> antialiasing) than the CPU.

They are limited in what they can do. In order to allow more flexibility, they have recently introduced pixel shaders, which basically turn the video card into a mini-CPU. A modern processor can perform these programmable operations more quickly and allows an order of magnitude more flexibility in what can be done.

> > A properly
> > implemented and optimized software version of a tile-based "scene-capture"
> > renderer much like that used in Kyro could perform as well as the latest
> > video cards in a year or two.  This is what I am dabbling with at the
> > moment.
>
> I await this with interest - but 'scene capture' systems tend to be
> unusable with modern graphics API's...they can't run either OpenGL
> or Direct3D efficiently for arbitrary input.  If there were to be
> some change in consumer needs that would result in 'scene capture'
> being a usable technique - then the graphics cards can easily take
> that on board and will *STILL* beat the heck out of doing it in
> the CPU.  Scene capture is also only feasible if the number of
> polygons being rendered is small and bounded - the trends are
> for modern graphics software to generate VAST numbers of polygons
> on-the-fly precisely so they don't have to be stored in slow old
> memory.

Kyro-based video cards perform quite well. They are not quite up to the level of nVidia's latest cards, but this is new technology being worked on by a relatively new company. These cards do not require nearly as much memory bandwidth as immediate-mode renderers because they perform zero overdraw: visibility is resolved first, and each pixel is shaded exactly once. They are processing-intensive rather than bandwidth-intensive, which I see as the more efficient algorithm. The sketch below shows the basic idea.
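For the curious, here is a deliberately tiny sketch of the scene-capture idea in C++. Everything in it is invented for illustration: a real tile-based renderer bins triangles per tile, interpolates depth across each triangle, and shades with textures rather than flat colours. The point is only the structure: capture the scene first, resolve visibility per pixel, then shade each pixel once.

    #include <vector>
    #include <cfloat>

    struct Tri {                  /* screen-space triangle, simplified: */
        float x[3], y[3];         /* vertex positions                   */
        float depth;              /* one flat depth value per triangle  */
        unsigned colour;          /* one flat colour per triangle       */
    };

    /* Same-sign edge-function test: is (px,py) inside the triangle? */
    static bool inside(const Tri &t, float px, float py)
    {
        bool neg = false, pos = false;
        for (int i = 0; i < 3; ++i) {
            int j = (i + 1) % 3;
            float e = (t.x[j] - t.x[i]) * (py - t.y[i])
                    - (t.y[j] - t.y[i]) * (px - t.x[i]);
            if (e < 0) neg = true;
            if (e > 0) pos = true;
        }
        return !(neg && pos);
    }

    /* Render one captured tile.  Visibility is resolved before any
       shading happens, so every framebuffer word is written exactly
       once: zero overdraw, minimal memory traffic. */
    void render_tile(unsigned *fb, int w, int h,
                     const std::vector<Tri> &scene)
    {
        for (int y = 0; y < h; ++y)
            for (int x = 0; x < w; ++x) {
                float nearest = FLT_MAX;
                unsigned c = 0;                      /* background */
                for (size_t k = 0; k < scene.size(); ++k)
                    if (inside(scene[k], x + 0.5f, y + 0.5f)
                        && scene[k].depth < nearest) {
                        nearest = scene[k].depth;    /* nearest wins */
                        c = scene[k].colour;
                    }
                fb[y * w + x] = c;                   /* single write */
            }
    }

An immediate-mode renderer would instead read, test, and rewrite the depth and colour buffers once per triangle per covered pixel, which is where all the overdraw bandwidth goes.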
> Everything that is speeding up the main CPU is also speeding up
> the graphics processor - faster silicon, faster busses and faster
> RAM all help the graphics just as much as they help the CPU.

Everything starts out in hardware and eventually moves to software. There will come a time when the basic functionality provided by video cards can easily be done by the main processor. The extra features offered by the video cards, such as pixel shaders, are really attempts to stand in for a main processor. Once the basic functionality of the video card can be performed by the main system processor, there will be no real need for extra hardware to perform these tasks. What I see now is a move by the video card companies toward software-based solutions (pixel shaders, etc.). They have recognized that there are limits to what specialized hardware can do, and they are now attempting to give programmers more flexibility. However, this is exactly the kind of functionality where the main system processor has a huge advantage. If more features are added in this manner (as software), then the specialized video card hardware will lose its edge. Intel is capable of pushing microprocessor technology more quickly than nVidia or ATI, regardless of how much nVidia wants their technology to be at the center of the chipset.

> However, increasing the number of transistors you can have on
> a chip doesn't help the CPU out very much.  Their instruction
> sets are not getting more complex in proportion to the increase
> in silicon area - and their ability to make use of more complex

What would you call MMX, SSE, SSE2, and even 3DNow!? These are additional instructions designed precisely to put those new transistors to use.

> instructions is already limited by the brain power of compiler
> writers.

Since when can you write a pixel shading routine in a standard C/C++ compiler? Assembly language can be used on the main processor just as easily as it can be used for pixel shaders via nVidia's own shader assembly language. In fact, there is a great deal more support for assembly language on the main processor.

> Most of the speedup in modern CPU's is coming from
> physically shorter distances for signals to travel and faster
> clocks - all of the extra gates typically end up increasing the
> size of the on-chip cache which has marginal benefits to graphics
> algorithms.
>
> In contrast to that, a graphics chip designer can just double
> the number of pixel processors or something and get an almost
> linear increase in performance with chip area with relatively
> little design effort and no software changes.

Modern processors have multiple parallel execution units for both integer and FPU operations, and more of these are added as time marches on. They have proven to offer a huge performance boost. Increasing processor performance is much more complex than a simple die shrink.

> If you doubt this, look at the progress over the last 5 or 6
> years.  In late 1996 the Voodoo-1 had a 50Mpixel/sec fill rate.
> In 2002 GeForce-4 has a fill rate of 4.8 Billion (antialiased)
> pixels/sec - it's 100 times faster.

Fill rate is just memory bandwidth, and it is not hard to offer more memory channels. In fact, a dual-channel DDR chipset is coming soon for the Pentium 4: in May the Pentium 4 will have access to 4.3GB/s of memory bandwidth (e.g. two 64-bit channels of DDR266 give 2 x 8 bytes x 266MT/s, roughly 4.3GB/s). Future generations will offer considerably more.

> The graphics cards are also gaining features.
> Over that same period, they added - windowing, hardware T&L,
> antialiasing, multitexture, programmability, you name it.
> Meanwhile the CPU's have added just a modest amount of MMX/3Dnow
> type functionality...almost none of which is actually *used*
> because our compilers don't know how to generate those new
> instructions in compiling generalised C/C++ code.

The Intel C/C++ compiler generates MMX, SSE, and SSE2 instructions if you tell it to do so. It requires no inline assembly, though inline assembly is always a good idea. SSE and SSE2 are used in nVidia's drivers...

> CONCLUSION.
> ~~~~~~~~~~~
> There is no sign whatever that CPU's are "catching up" with
> graphics cards - and no logical reason why they ever will.

I will have to disagree here. All indications are that the video card manufacturers are looking more and more toward 'programmable' features such as pixel shaders. If that is the direction things go, it will be relatively easy for the main processor to 'catch up': programmability is its specialty. At any rate, we will probably just have to agree to disagree here. ;)

-Raystonn