On Sat, Aug 25, 2007 at 11:26:33PM +0100, Phil Endecott wrote:
> Dear All,
>
> I have been doing some measurements of the relative speed of
> hardware-accelerated and software rectangle fills.  It seems that on my
> hardware, doing the fill in software is faster when the rectangle has
> an area of <= 16 pixels or so.
>
> Here are the full results.  The "size", below, is the edge-length of a
> square.  In each case I filled the square 1 million times.  The
> run-times (in seconds) include the program startup time, which is about
> 1 second.
>
> size   --dfb:hardware   --dfb:no-hardware
>    1        6.04              4.84
>    2        5.96              4.94
>    4        6.03              5.74
>    8        6.02              8.28
>   16        6.10             17.63
>   32        6.02             56.31
>   64        5.98
>  128       14.34
>
> So all the way up to a 64x64 square, the graphics hardware is fast
> enough that it runs faster than the processor.  For squares <= 4x4, the
> software fill loop is faster than the overhead of passing another fill
> command to the graphics hardware.
>
> One thing that I haven't tested is the FillRectangles feature.  Does
> anyone know how it would compare?

There shouldn't be any significant difference, since the gfxcard state
checking is smart and doesn't dirty the hardware state unless something
has actually changed between the FillRectangle() calls.

> I imagine that you would see similar results for the other accelerated
> operations, e.g. blits.

With blits and blending operations you would have to take into account
whether any reads have to be performed from video RAM. That is extremely
slow.

> In my application, I have quite a lot of small rectangles in some
> situations: I have a zoom feature, and when you zoom a long way out the
> screen contains many rectangles that are actually single points.  So I
> could probably make an improvement if this case were optimised.  (Not
> as much of an improvement as I was hoping when I decided to do these
> measurements unfortunately.)
>
> So, is there any way for me to disable hardware acceleration for a
> particular operation?

IDirectFBSurface::DisableAcceleration()

> I was looking at dfb_gfxcard_drawrectangle() in src/core/gfxcard.c, and
> it seems that it already has logic to fall back to the software
> implementation in various circumstances.  So this could be extended
> with a check for small dimensions?

It would be possible. Probably the best place to do it would be
dfb_gfxcard_state_check(), or maybe it could be left up to each
gfxdriver individually in case the threshold varies between GPUs.
Unfortunately there are more variables than the GPU in this equation
(CPU speed, RAM speed, bus speed, cache sizes etc.), so coming up with a
universally beneficial recipe might be impossible.

Also, nothing actually guarantees that hw and sw rendering provide
identical results. For example, some gfxdrivers enable dithering but the
sw renderer doesn't do dithering. So mixing hw and sw rendering might
look a bit odd. Then again, if the threshold size is small that may not
matter much.
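
To make the threshold idea concrete, here is a minimal standalone sketch in C. It is not DirectFB code and is not taken from src/core/gfxcard.c. The names fill_sample, largest_software_win() and prefer_software_fill() are hypothetical, and the numbers are simply the six rows of the table quoted above that have both timings. It only shows how a cut-off could be derived from one machine's measurements, which is exactly why such a value would not carry over to other hardware.

```c
#include <assert.h>
#include <stddef.h>

/* One row of the benchmark quoted above: edge length of the filled square
 * and total run time in seconds for 1 million fills (startup included). */
struct fill_sample {
    int    edge;        /* edge length of the square, in pixels */
    double hw_seconds;  /* run with --dfb:hardware */
    double sw_seconds;  /* run with --dfb:no-hardware */
};

/* The rows for sizes 64 and 128 are left out because only one timing was
 * reported for them. */
static const struct fill_sample fill_samples[] = {
    {  1, 6.04,  4.84 },
    {  2, 5.96,  4.94 },
    {  4, 6.03,  5.74 },
    {  8, 6.02,  8.28 },
    { 16, 6.10, 17.63 },
    { 32, 6.02, 56.31 },
};
static const size_t fill_sample_count =
    sizeof fill_samples / sizeof fill_samples[0];

/* Largest edge length at which the software fill was measured to be faster
 * than the hardware fill, or 0 if hardware won every row. */
int largest_software_win(const struct fill_sample *s, size_t n)
{
    int best = 0;

    assert(s != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (s[i].sw_seconds < s[i].hw_seconds && s[i].edge > best)
            best = s[i].edge;
    }
    return best;
}

/* Hypothetical policy: take the software path when the rectangle area is at
 * most max_area pixels. Empty or negative sizes never select software. */
int prefer_software_fill(int w, int h, int max_area)
{
    if (w <= 0 || h <= 0)
        return 0;
    return (long)w * (long)h <= (long)max_area;
}
```

With these measurements, largest_software_win(fill_samples, fill_sample_count) gives an edge length of 4, i.e. an area of 16 pixels, which matches the "<= 16 pixels or so" estimate in the original post. A check like prefer_software_fill(w, h, 16) would then only be valid for that one machine.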

> One thing that concerns me is
> whether the order of operations is preserved when some things are done
> by the graphics hardware and some in software.

The order is preserved but that too causes overhead. When switching
between hw and sw rendering the core code may have to wait for the GPU
to become idle. Also some hardware caches may need flushing.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/

_______________________________________________
directfb-dev mailing list
[email protected]
http://mail.directfb.org/cgi-bin/mailman/listinfo/directfb-dev
