Excerpts from john delahey's message of Tue Nov 24 14:13:58 +0000 2009:

>Hello Robert
>
>The reason I am asking is I am trying to understand the performance
>implication of rendering in retained mode versus sending calls directly to
>opengl. I understand that batching opengl calls for later gives the
>opportunity to optimize the rendering by grouping draw calls that share the
>same textures or the same materials (to avoid too many state changes).
>However, when there is a lot of rendering to process, would it be best to
>pass the draw calls and state changes directly to opengl?

You may be able to get some insight by setting COGL_DEBUG=disable-batching
in the environment. This probably isn't quite what you want, because it
doesn't try to optimize things by short-circuiting the logging, but it
will force an immediate GL draw call for each rectangle.

Other debug options that give insight into the Cogl journal are:
 COGL_DEBUG=journal - prints the geometry logged
 COGL_DEBUG=batching - shows what batching is being achieved
 and
 COGL_DEBUG=disable-software-transform

Note: Cogl currently only batches rectangles drawn via the
cogl_rectangle[xyz] functions. If you really want to avoid the journal
altogether you could use the cogl_polygon or cogl_vertex_buffer APIs to
have a more direct route through to the driver.

Are you trying to understand the implications of retaining
state/batching draw calls for a particular workload, or just in a
general sense?

I could certainly see the journal becoming an overhead when you know
that you are going to draw lots of rectangles each with different
materials. This is currently quite common due to us having to split
batches when different textures are used but we are aiming to improve
this with texture atlas support soon.
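
To make the batch splitting concrete, here is a minimal sketch of the idea, not Cogl's actual journal code. The JournalEntry type, the material_id field and count_batches() are all hypothetical names invented for illustration; the only assumption is that consecutive logged rectangles can share a draw call when they use the same material.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical journal entry: one logged rectangle plus an id standing in
 * for the material (texture, blend state, ...) it was drawn with. */
typedef struct {
    float x1, y1, x2, y2;
    int material_id;
} JournalEntry;

/* Count how many draw calls a flush would need if consecutive entries
 * that share a material are merged into one batch. */
static size_t count_batches(const JournalEntry *entries, size_t n)
{
    size_t batches = 0;

    assert(entries != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        /* A new batch starts at the first entry and at every material change. */
        if (i == 0 || entries[i].material_id != entries[i - 1].material_id)
            batches++;
    }
    return batches;
}
```

In this toy model a run of rectangles with one material costs a single draw call, while rectangles that alternate between two materials cost one draw call each, which is the case where the logging buys nothing.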

The other thing of course is that batching will consume more memory. My
expectation would be that most application assets typically dwarf the
amount of memory used for a journal, but it could be a consideration if
a huge number of rectangles were to get logged.

>I have used cogl_flush() in many situations to explicitly force the
>rendering. This way I can do some direct opengl calls and restore the states
>that I have changed before giving control back to Clutter/COGL. I thought
>about implementing the functions that I require in COGL but I am afraid this
>comes down to reimplementing many opengl calls and states in a retained mode
>fashion.

What kinds of draw calls are you missing? As mentioned above only
rectangles are currently batched in a retained mode fashion. The
cogl_polygon, cogl_vertex_buffer and cogl_path APIs can currently all
be considered immediate mode. If they aren't enough I'd be interested
in considering other extensions to Cogl.

I'm not sure it helps to debate semantics here, but I would generally
classify Cogl as an immediate mode API (or about as immediate mode as
OpenGL, in that we also support VBOs for retaining geometry). I tend to
consider the journal as a fairly high level piece of utility code
provided by Cogl basically to help optimize the throughput of rectangles
which are - somewhat sadly - still the building block for 99.9% of all
Clutter actors. Cogl, like OpenGL, would rather not spend time retaining
and analyzing user geometry so as to optimize it; instead it should
assume that higher layers have done that. Clutter - which is a fully
retained mode API - has vastly more contextual information with which it
can optimize the geometry given to Cogl.

>
>That is why I wonder if the option of having an immediate mode can improve
>performance and also avoid the buffering of opengl calls and states (which
>requires some internal management and data structures).

Just so you know, Cogl hasn't always batched rectangles - in fact the
journal was only added recently, before releasing Clutter 1.0, as a way
to improve performance in Clutter. Very early implementations of
cogl_rectangle were in fact thin wrappers around glRect. We have found
that batching rectangles does improve performance for Clutter for a
number of key things, notably picking and text rendering. A pick render
is typically comprised
of many rectangles with different colors, one for each reactive actor.
The journal means we can often draw a whole pick scene full of actors in
one GL draw call. For text we have a glyph cache that means we share a
texture and so again it's possible to batch the drawing of many glyphs
into a single draw call.
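
As a rough illustration of how a pick scene can collapse into one draw call, the sketch below appends every rectangle to a single vertex array and stores the actor's id as a color on each vertex. This is an assumption about one way to do it, not a description of Cogl's internals; PickRect, PickVertex, build_pick_batch() and actor_id_from_color() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; not Cogl's real structures. */
typedef struct { float x1, y1, x2, y2; uint32_t actor_id; } PickRect;
typedef struct { float x, y; uint8_t r, g, b; } PickVertex;

#define VERTS_PER_RECT 6 /* two triangles per rectangle */

/* Append every rectangle to one vertex array, encoding the actor id as an
 * RGB color on each vertex. Returns the number of vertices written. */
static size_t build_pick_batch(const PickRect *rects, size_t n,
                               PickVertex *out, size_t capacity)
{
    size_t count = 0;

    assert(n * VERTS_PER_RECT <= capacity);

    for (size_t i = 0; i < n; i++) {
        const PickRect *r = &rects[i];
        const float xs[VERTS_PER_RECT] = { r->x1, r->x2, r->x2, r->x1, r->x2, r->x1 };
        const float ys[VERTS_PER_RECT] = { r->y1, r->y1, r->y2, r->y1, r->y2, r->y2 };

        for (int v = 0; v < VERTS_PER_RECT; v++) {
            PickVertex *pv = &out[count++];
            pv->x = xs[v];
            pv->y = ys[v];
            pv->r = (uint8_t) ((r->actor_id >> 16) & 0xff);
            pv->g = (uint8_t) ((r->actor_id >> 8) & 0xff);
            pv->b = (uint8_t) (r->actor_id & 0xff);
        }
    }
    return count;
}

/* Recover the actor id from the color read back under the pointer. */
static uint32_t actor_id_from_color(uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t) r << 16) | ((uint32_t) g << 8) | (uint32_t) b;
}
```

Because the color travels with the vertices rather than as separate GL state, rectangles for different actors no longer force a state change between them, so the whole array could be submitted at once.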

The benefit of caching GL state is debatable in some cases, but as a
general rule I think it makes sense for Cogl to avoid redundant GL calls
where possible. If you consider the use of indirect GLX then being able
to trivially avoid a single round trip per frame may easily justify a bit
of extra state tracking in Cogl. Even for direct rendering the cost of
getting state from OpenGL can vary widely between drivers. Sadly we've
found Mesa to be particularly bad in this regard. One example I can
recall is that the use of glGetFloatv may result in the
miscellaneous update of unrelated "derived state". One such piece of
derived state is a combined modelview and projection matrix. Your glGet
request may not depend on that matrix but Mesa may update it anyway. So
code that effectively does:
glRotate ()
glGet GL_XYZ
glTranslate ()
glGet GL_XYZ
glTranslate ()
glGet GL_XYZ
glDrawElements()
would update the combined matrix at each glGet call even if unneeded.
(This happened because each time the modelview was changed this derived
state got marked as dirty, and glGetFloatv called mesa_update_state()
before doing anything else.) We saw this kind of pattern in the past because
the modelview matrix would be continually modified as we traversed the
scene graph painting Clutter actors, and then various cogl calls would
request uncached state directly from OpenGL at each point in the graph.

Anyhow, I certainly welcome any investigation into Cogl performance,
and would be interested to hear about your findings and any ideas for
improving Cogl.

kind regards,
- Robert

>
>Cheers
>John
>
>On Mon, Nov 23, 2009 at 10:25 AM, Robert Bragg <b...@o-hand.com> wrote:
>
>> Excerpts from john delahey's message of Wed Nov 18 04:49:46 +0000 2009:
>>
>> >Hello
>> >
>> >When using the OpenGL backend, can Clutter render in immediate mode?
>> >That is, send all OpenGL commands to the GPU instead of backing them
>> >with COGL. Are there fundamental reasons why this can't be done?
>>
>> Hi John,
>>
>> We could potentially consider adding API to disable the Cogl journal
>> which batches a lot of draw calls, or we could make the journal
>> pluggable potentially. Aside from GL draw calls though there are lots of
>> other GL calls which we defer. E.g. glEnable calls or glBindXYZ calls
>> tend to get deferred until the last moment so we avoid redundant state
>> changes.
>>
>> Can you be more specific about the problem you are facing? For example
>> there are the cogl_flush() and cogl_begin_gl/cogl_end_gl APIs that
>> may be of some help.
>>
>> Since I'm assuming you're asking this because you're trying to break out
>> into raw GL, I'll note one thing that we can't support and that is
>> interleaving of Cogl and GL calls done in such a way that you are trying
>> to affect the behaviour of Cogl via manual GL calls. We can only support
>> interleaved OpenGL calls that diligently save and restore the GL state
>> that they change, before returning to Cogl calls, and even then it's a
>> risky business and we'd much rather see proposals to improve the Cogl
>> API if possible.
>>
>> kind regards,
>> - Robert
>
-- 
Robert Bragg, Intel Open Source Technology Center