Hi,

A few data points: it takes about 3ms to do a straight blit of a 320x480, 16-bit surface. This blit is done by the 2D engine on the G1, in parallel with the CPU and GPU, so both are free to do something else while it happens. I don't think the delay you're seeing has anything to do with this, though.
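To put that blit number in perspective, here's a quick back-of-the-envelope calculation (my own arithmetic, not a measured figure): a 320x480 surface at 16 bits per pixel is 300 KB, so a ~3ms copy implies roughly 100 MB/s of 2D-engine throughput.

```python
# Back-of-the-envelope: implied throughput of the G1's 2D engine
# for a full-screen 320x480, 16-bit blit that takes ~3 ms.

width, height = 320, 480
bytes_per_pixel = 2            # 16-bit (e.g. RGB565) surface
blit_time_s = 0.003            # ~3 ms, per the figure above

surface_bytes = width * height * bytes_per_pixel       # 307,200 bytes
throughput_mb_s = surface_bytes / blit_time_s / 1e6    # bytes/s -> MB/s

print(surface_bytes)                # 307200
print(round(throughput_mb_s, 1))    # 102.4
```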
However, the GL driver on the G1 uses deferred rendering, which means that *all the work* happens in eglSwapBuffers() (or in glFinish(), if you call it, but I strongly suggest you don't). It is also worth mentioning that the GPU renders the scene 4 times per frame, because its internal framebuffer can only hold 1/4 of the real framebuffer. If you send too many commands, it may have to do 2 passes or more, which can get pretty ugly in terms of performance. And in 32-bit mode, it needs 8 passes! This doesn't affect the fill rate much, but it definitely affects the triangle rate. The GPU's fill rate is roughly 130 Mp/s; assuming what you are rendering is entirely fill-rate limited, just do the math (I don't know what you're rendering).

There is also the question of synchronization to VSYNC -- if you were doing *nothing*, eglSwapBuffers() would take roughly 16ms, since it would wait for a new buffer to become available. Currently, the system uses what I call double-and-a-half buffering; that is, it's almost triple buffering, but not quite. In practice, you can be up to 2 buffers "ahead", and that's why you have to wait -- there are already 2 buffers in the queue waiting to be displayed. If you pay attention, the *very first* eglSwapBuffers() is instantaneous (if you drew nothing, that is).

All that being said, in the current implementation the CPU waits for eglSwapBuffers() to complete (that is, for the GPU to finish rendering the scene) before it returns; this means the CPU can sit idle while the GPU is rendering. It would be better if it instead returned immediately, so the application could start enqueuing more GL commands. This limitation will be lifted when we switch to the new GPU driver model (probably the release after "donut"). This should improve the perceived speed of eglSwapBuffers() on the G1, at the expense of higher CPU usage (which is what we want).
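"Just do the math" can be sketched like this (my own arithmetic, using the ~130 Mp/s figure above; the numbers are illustrative, not measured):

```python
# Rough fill-rate budget at 60 fps, given ~130 Mp/s on the G1's GPU.
# This estimates how many times you could touch every pixel of a
# 320x480 screen per frame before becoming purely fill-rate bound.

fill_rate_px_s = 130e6                      # ~130 Mp/s fill rate
fps = 60
screen_px = 320 * 480                       # 153,600 pixels

budget_px_frame = fill_rate_px_s / fps      # pixels you can fill per frame
overdraw_budget = budget_px_frame / screen_px

print(round(budget_px_frame))      # 2166667 -> ~2.17M pixels/frame
print(round(overdraw_budget, 1))   # 14.1 -> ~14x full-screen overdraw
```

In other words, pure fill rate leaves a fairly generous overdraw budget per frame; as noted above, it's the multi-pass rendering that mainly hurts the triangle rate, not the fill rate.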
In the meantime, you can generally improve the performance of your GL apps by using multi-threading in your application, so that the CPU has something to do while it waits for eglSwapBuffers() to complete.

Just in case this wasn't clear enough: all of the above applies only to the G1 and devices based on the same chipset.

I hope this helps,
Mathias

On May 24, 12:20 am, Anton <[email protected]> wrote:
> I have been trying to optimize an OpenGL render, and I've come
> upon an irreducible problem. It seems that on the G1 a call to
> eglSwapBuffers will always take at least 9ms. I understand that the
> buffer swapping is restricted to a maximum of 60Hz (well, the refresh
> rate of the screen, which in this case happens to be 60Hz). But I'm
> never seeing it take less than 9ms, even when I'm doing more than 8ms
> of work.
>
> Has anyone seen this? I've been reading through the code and
> can't find anywhere that there is explicit mention of a fixed delay.
> Not that I was really expecting to find something like that. What I
> do find is that the swap code eventually calls unlockAndPost and then
> lock on the underlying surface. The unlockAndPost causes the
> underlying driver to start the swap and the lock waits for that to
> finish. The driver is told to do the swap via an IOCTL call on /dev/
> graphics/fb0 that actually does a BLT.
>
> I wonder if the IOCTL round trip to the kernel and the thread
> switching time is adding up to make eglSwapBuffers so slow? Or if the
> requested BLT takes this long? If the BLT were actually taking
> something close to this long that would explain things because the
> lock call waits until the BLT is done (at least that's what it looks
> like to me).
>
> As partial confirmation of this, when I up the surface size by
> going to an RGBA_8888 surface the eglSwapBuffers call goes from taking
> 9ms to taking 15ms. Which pretty much means you can never get 60
> frames per second using an RGBA_8888 surface.
>
> Thanks,
> Anton

