On Mon, Mar 14, 2016 at 1:41 AM, Javier Paz <zor...@gmail.com> wrote:

> I also agree with Greg. My experience tells me that, in order to have a
> significant overhead, you must have a loop.
>
I think the point is that in the case where you have a lot of small blits,
the proportional overhead of the checks is more significant.

My (C++) codebase supported something similar, and I noticed (in my code)
that the checks were pretty cheap, unless you had a lot of blits to do.
Then, they were still cheap, but pointless. My solution? Only allow blits
between surfaces of like type (and it won't compile if you try). This works
out great--on the CPU, you don't really ever want to blit anything other
than RGBA8 anyway (even if you don't have an alpha channel, you should use
RGBA, because of SIMD and alignment). This forces you to make a copy if you
*really* want to blit between different types--but that's sort of what the
blitter would already have to do, in a slower (less instruction locality),
less elegant, one-off way.
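
To make that concrete, here is a minimal sketch of the idea. It is not my
actual code: Surface, RGBA8, RGB8, blit and to_rgba8 are hypothetical names
made up for illustration. The pixel format is a template parameter, so a
blit between unlike surfaces fails to compile, and converting is an
explicit, separate copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pixel formats, for illustration only.
struct RGBA8 { std::uint8_t r, g, b, a; };
struct RGB8  { std::uint8_t r, g, b; };

// A surface whose pixel format is part of its type.
template <typename Pixel>
class Surface {
public:
    Surface(std::size_t w, std::size_t h) : w_(w), h_(h), px_(w * h) {}
    std::size_t width() const { return w_; }
    std::size_t height() const { return h_; }
    Pixel& at(std::size_t x, std::size_t y) { return px_[y * w_ + x]; }
    const Pixel& at(std::size_t x, std::size_t y) const { return px_[y * w_ + x]; }
private:
    std::size_t w_, h_;
    std::vector<Pixel> px_;  // pixels start out zeroed
};

// Source and destination share one template parameter, so mixing formats
// is a compile error and no format check runs per blit.
template <typename Pixel>
void blit(const Surface<Pixel>& src, Surface<Pixel>& dst,
          std::size_t dx, std::size_t dy) {
    // This sketch does no clipping; the source must fit in the destination.
    assert(dx + src.width() <= dst.width());
    assert(dy + src.height() <= dst.height());
    for (std::size_t y = 0; y < src.height(); ++y)
        for (std::size_t x = 0; x < src.width(); ++x)
            dst.at(dx + x, dy + y) = src.at(x, y);
}

// The explicit copy you are forced to make when the formats differ.
inline Surface<RGBA8> to_rgba8(const Surface<RGB8>& src) {
    Surface<RGBA8> out(src.width(), src.height());
    for (std::size_t y = 0; y < src.height(); ++y)
        for (std::size_t x = 0; x < src.width(); ++x) {
            const RGB8& p = src.at(x, y);
            out.at(x, y) = RGBA8{p.r, p.g, p.b, 255};  // opaque alpha
        }
    return out;
}
```

With this, blit(rgb_surface, rgba_surface, 0, 0) is rejected by the
compiler (the two arguments deduce conflicting Pixel types), while
blit(to_rgba8(rgb_surface), rgba_surface, 0, 0) compiles and pays for the
conversion once, up front.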

So one *could* drop support for a lot of surface types (I know there was
talk about doing this anyway). But, at some point you have to remember that
your game is written in Python. If you want to squeeze out the last 1% of
your performance, why are you using a scripting language in the first
place? And sure, the underlying library oughtn't to be deliberately slow,
but I think I value flexibility over minor performance gains. Of course,
maybe we *can* get rid of some surface types that no one actually uses, and
get a bit of both . . .

Ian
