Immediate mode calls (glVertex et al.) are the slowest way to use
OpenGL. In fact, they are deprecated as of OpenGL 3.0 and will
eventually be removed.
The display list is better, as you discovered, but you are still
making a few OpenGL state changes per sprite, which is likely slowing
you down. There is also some overhead for each display list call,
which makes display lists sub-optimal for drawing just a single quad.
glPushMatrix()
glTranslate(self.positionx,self.positiony,0)
glCallList(self.displist)
glPopMatrix()
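One cheap way to cut per-sprite state changes, sketched here under
assumptions: if the texture bind happens once per sprite, sorting the
sprites into per-texture buckets lets each texture be bound once per
frame instead. The names group_by_texture, draw_grouped, bind_texture
and draw_sprite are hypothetical; only the self.texture attribute
comes from the original glSprite class.

```python
from collections import defaultdict


def group_by_texture(sprites):
    """Bucket sprites by texture ID, keeping first-seen texture order."""
    groups = defaultdict(list)
    for sprite in sprites:
        groups[sprite.texture].append(sprite)
    return groups


def draw_grouped(sprites, bind_texture, draw_sprite):
    """Draw all sprites, binding each distinct texture only once.

    bind_texture and draw_sprite are hypothetical callbacks. In a real
    program they would wrap glBindTexture(GL_TEXTURE_2D, texture) and
    the per-sprite draw code. Returns the number of binds issued.
    """
    binds = 0
    for texture, group in group_by_texture(sprites).items():
        bind_texture(texture)
        binds += 1
        for sprite in group:
            draw_sprite(sprite)
    return binds
```

Note that grouping changes the draw order, so this only works as-is
when overlapping sprites do not depend on being drawn in a particular
sequence (or when the depth buffer handles ordering).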
You really need to batch the quads up into a few vertex arrays or VBOs
to stream them to the card in one go. pyglet has a high-level Python
sprite API that automates this for you, fwiw.
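As a rough sketch of what batching can look like with numpy and
PyOpenGL (not a definitive implementation): build one vertex array and
one texture-coordinate array for all sprites that share a texture,
then draw them with a single glDrawArrays call. The helper names
build_quad_arrays and draw_batch are made up for illustration, and the
sketch assumes every sprite in the batch has the same width and
height. The corner and texcoord order mirrors the immediate-mode
routine in the original message.

```python
import numpy as np

# Corners of a unit quad, in the same order as the glVertex2f calls:
# (0, 0), (w, 0), (w, h), (0, h).
UNIT_QUAD = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=np.float32)
# Matching texture coordinates, from the glTexCoord2f calls.
QUAD_TEXCOORDS = np.array([[0, 1], [1, 1], [1, 0], [0, 0]], dtype=np.float32)


def build_quad_arrays(positions, w, h):
    """Return (vertices, texcoords), each shaped (4 * n, 2), for n sprites."""
    positions = np.asarray(positions, dtype=np.float32).reshape(-1, 1, 2)
    size = np.array([w, h], dtype=np.float32)
    # Broadcasting: (n, 1, 2) + (4, 2) -> (n, 4, 2), one quad per sprite.
    vertices = positions + UNIT_QUAD * size
    texcoords = np.tile(QUAD_TEXCOORDS, (len(positions), 1))
    return vertices.reshape(-1, 2), texcoords


def draw_batch(texture, vertices, texcoords):
    """Draw every quad in the batch with one bind and one draw call."""
    # Imported lazily so the array builder works without PyOpenGL installed.
    from OpenGL.GL import (
        GL_FLOAT, GL_QUADS, GL_TEXTURE_2D, GL_TEXTURE_COORD_ARRAY,
        GL_VERTEX_ARRAY, glBindTexture, glDisableClientState,
        glDrawArrays, glEnableClientState, glTexCoordPointer,
        glVertexPointer,
    )
    glBindTexture(GL_TEXTURE_2D, texture)
    glEnableClientState(GL_VERTEX_ARRAY)
    glEnableClientState(GL_TEXTURE_COORD_ARRAY)
    glVertexPointer(2, GL_FLOAT, 0, vertices)
    glTexCoordPointer(2, GL_FLOAT, 0, texcoords)
    glDrawArrays(GL_QUADS, 0, len(vertices))
    glDisableClientState(GL_TEXTURE_COORD_ARRAY)
    glDisableClientState(GL_VERTEX_ARRAY)
```

Because the sprite positions are baked into the vertices, no
glPushMatrix/glTranslate/glPopMatrix is needed per sprite. The arrays
do have to be rebuilt (or updated in place) whenever sprites move.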
-Casey
On Feb 26, 2009, at 11:04 AM, Zack Schilling wrote:
I know the PyOpenGL mailing list might be a better place to ask this
question, but I've had a lot of luck talking to the experienced
people here so I figured I'd try it first.
I'm trying to migrate a game I created from using the Pygame / SDL
software rendering to OpenGL. Before attempting the massive and
complex conversion involved with moving the whole game, I decided to
make a little test program while I learned OpenGL.
In this test, I set up OpenGL to work in 2D and began loading images
into texture objects and drawing textured quads as sprites. I
created a little glSprite class to handle the drawing and
translation. At first its draw routine looked like this:
glPushMatrix()
glTranslate(self.positionx,self.positiony,0)
glBindTexture(GL_TEXTURE_2D, self.texture)
glBegin(GL_QUADS)
glTexCoord2f(0, 1)
glVertex2f(0, 0)
glTexCoord2f(1, 1)
glVertex2f(w, 0)
glTexCoord2f(1, 0)
glVertex2f(w, h)
glTexCoord2f(0, 0)
glVertex2f(0, h)
glEnd()
glPopMatrix()
Note: self.texture is a texture ID of a loaded OpenGL texture
object. My sprite class keeps a dictionary cache and only loads the
sprite's image into a texture if it needs to.
I'd get maybe 200 identical sprites (same texture) onscreen and my
CPU would hit 100% load from Python execution. I looked into what
could be causing this and found out that it's probably function call
overhead. That's 14 external library function calls per sprite draw.
The next thing I tried was to create a display list at each sprite's
initialization. Then my code looked like this:
glPushMatrix()
glTranslate(self.positionx,self.positiony,0)
glCallList(self.displist)
glPopMatrix()
Well, that's nice, down to 4 calls per draw. I was able to push ~500
sprites per frame using this method before the CPU tapped out. I
need more speed than this. My game logic uses 30-40% of the CPU
alone and I'd like to push at least 1000 sprites. What can I do?
I've looked into passing sprites as a matrix with vertex arrays, but
forming a proper vertex array with numpy can sometimes be more
trouble than it's worth. Plus, I can't swap out textures easily mid-
draw, so it makes things much more complex than the simple way I'm
doing things now.
Is there any design pattern I could follow that will get me more
speed without sending me off the deep end with complexity?
Thanks,
Zack