[EMAIL PROTECTED] wrote:
>
> Brett Johnson wrote:
> | ... The whole point of all
> | these posts I've been writing is that only *one* indirection needs to exist,
> | and it might as well be at the libGL layer rather than at the driver layer.
>
> You're correct that one indirection (in addition to loading the
> address of the dispatch table) suffices, but there are also reasons to
> use more, or to put them some place other than in libGL.
>
> Let's assume for a moment that there's only one dispatch table. Then
> any state change must modify the table entries for all the OpenGL
> functions it affects. For some state changes, this is a *lot* of
> table entries. If state changes are frequent compared to rendering
> commands, then the cost of modifying the table entries can be higher
> than the cost of double indirections (or inline tests) for the
> rendering commands.
Agreed.
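To make the cost concrete, here's a minimal C sketch of the single-table case (the slot names, the two stub code paths, and the slot count are all invented for illustration):

```c
typedef void (*gl_cmd)(void);

/* Hypothetical command slots; names and count are invented. */
enum { SLOT_VERTEX3F, SLOT_NORMAL3F, NUM_SLOTS };

/* Stub code paths for the untextured and textured cases. */
static void vertex3f_plain(void)    {}
static void vertex3f_textured(void) {}
static void normal3f_plain(void)    {}
static void normal3f_textured(void) {}

/* The single shared dispatch table. */
static gl_cmd dispatch[NUM_SLOTS] = { vertex3f_plain, normal3f_plain };

/* One state change must rewrite every entry whose code path depends on
 * texturing; a real driver would touch far more than two slots. */
static int enable_texturing(void)
{
    int rewritten = 0;
    dispatch[SLOT_VERTEX3F] = vertex3f_textured; rewritten++;
    dispatch[SLOT_NORMAL3F] = normal3f_textured; rewritten++;
    return rewritten;
}
```

The point is that the rewrite count scales with the number of state-dependent commands, not with the size of the state change itself.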
> In theory, you could solve this by keeping a set of dispatch tables,
> each handling an interesting combination of modes. A state-setting
> command would change just the current context's pointer to the
> dispatch table. This requires two loads (one for the table's address,
> and one for the function address) to dispatch each command; in return,
> both state changes and command dispatching are relatively fast.
Ewww...
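For concreteness, the table-per-state-combination scheme described above amounts to something like this in C (all names are invented, and a real implementation would keep the table pointer in per-thread data):

```c
typedef void (*gl_cmd)(void);

enum { SLOT_VERTEX, SLOT_END, NUM_SLOTS };

static void vertex_unlit(void) {}
static void vertex_lit(void)   {}
static void end_common(void)   {}

/* One prebuilt table per interesting state combination. */
static gl_cmd table_unlit[NUM_SLOTS] = { vertex_unlit, end_common };
static gl_cmd table_lit[NUM_SLOTS]   = { vertex_lit,   end_common };

struct context { gl_cmd *dispatch; };

/* A state change swaps one pointer instead of rewriting table entries. */
static void set_lighting(struct context *ctx, int enabled)
{
    ctx->dispatch = enabled ? table_lit : table_unlit;
}

/* Dispatch costs two loads (table pointer, then entry) plus the
 * indirect jump. */
static void call_vertex(struct context *ctx)
{
    ctx->dispatch[SLOT_VERTEX]();
}
```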
> In practice, the number of unique dispatch tables you'd need to do
> this is probably prohibitively high. But you can apply the same
> strategy hierarchically; for instance, all the commands that you
> expect to be executed infrequently can be lumped together in one or
> more subtables, and high-frequency commands could be left in the
> highest-level dispatch table. Then state-setting commands would
> change one high-level dispatch table pointer and just a few
> lower-level subtable pointers. However, the choice of the best
> hierarchy of tables is likely to be highly machine-dependent, so these
> tricks need to be done in the driver, not in libGL.
I couldn't agree more. That's why I'm pushing for this concept that
the driver would be able to manage its own top-level dispatch table
(through a well-defined API of course!). Nothing prevents the driver
from maintaining its own second-tier dispatch tables if desired.
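A rough sketch of what such a driver-private second tier could look like (the structure, the hot/cold split, and every name here are invented for illustration):

```c
typedef void (*gl_cmd)(void);

static void draw_fast(void)    {}
static void pixel_path_a(void) {}
static void pixel_path_b(void) {}

/* Infrequent commands grouped into a driver-private subtable. */
struct cold_commands { gl_cmd draw_pixels; gl_cmd eval_mesh; };

static struct cold_commands cold_a = { pixel_path_a, pixel_path_a };
static struct cold_commands cold_b = { pixel_path_b, pixel_path_b };

/* Hot commands sit in the top-level table; cold ones pay one extra load. */
struct top_table {
    gl_cmd vertex;                 /* high-frequency: direct entry */
    struct cold_commands *cold;    /* low-frequency: via subtable */
};

static struct top_table top = { draw_fast, &cold_a };

/* A state change that affects only cold paths swaps a single subtable
 * pointer; the hot entries are untouched. */
static void use_alternate_pixel_path(void)
{
    top.cold = &cold_b;
}
```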
> Other people know the low-level details of current implementations
> better than I do, but my guess would be that the best tradeoff looks
> something like this:
>
> The per-thread information needed for OpenGL rendering is a
> pointer to the current rendering context and a pointer to the
> current dispatch table.
>
> Core OpenGL commands are dispatched by loading the pointer to
> the current dispatch table from the per-thread data area,
> loading the appropriate element of the table, and jumping to
> it. This transfers control to the driver associated with the
> current rendering context. Commands in the driver fetch their
> arguments in the usual way, and load the pointer to the
> current rendering context from the per-thread data area if
> they need it.
>
> The dispatch table is maintained entirely by the driver
> associated with the current context. State changes may cause
> the thread's dispatch-table pointer to change (thus swapping
> in a new table that's potentially entirely different from the
> old one), or may cause individual entries in the current
> dispatch table to change.
Hmm... I'm not sure I agree with this part. I don't like the idea of the
driver owning and managing the actual dispatch table memory. I'd rather
see libGL manage the table memory, assign slots, etc., and have the
driver manage the appropriate table entries through a more abstract API.
I'll give some reasoning for this later.
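A sketch of the arrangement I have in mind — libGL owns the table memory and the slot numbering, the driver installs entries through an abstract call, and each thread carries a context pointer and a dispatch pointer. Everything here (the table size, the slot constant, the API name) is invented for illustration:

```c
#include <stddef.h>

typedef void (*gl_cmd)(void);

#define TABLE_SIZE 16   /* chosen by libGL; illustrative */

/* Per-thread data: current context plus current dispatch table.
 * (Real code would use TLS; a single global stands in here.) */
static struct {
    void   *context;
    gl_cmd *dispatch;
} self;

/* libGL owns the table memory... */
static gl_cmd the_table[TABLE_SIZE];

/* ...and the driver installs entries through an abstract API instead of
 * touching the memory directly, so libGL can change the layout later. */
static int libgl_set_entry(int slot, gl_cmd fn)
{
    if (slot < 0 || slot >= TABLE_SIZE || fn == NULL)
        return -1;
    the_table[slot] = fn;
    return 0;
}

/* A libGL entry point: two loads and an indirect jump, with a constant
 * slot offset. */
#define SLOT_FLUSH 3    /* registered constant offset; invented here */
static void my_glFlush(void) { self.dispatch[SLOT_FLUSH](); }

/* Toy driver implementation for testing the plumbing. */
static int  flush_calls;
static void driver_flush(void) { flush_calls++; }
```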
> The dispatch table contains entries for both core commands and
> extension commands (for the driver associated with the current
> context).
>
> The mapping between command and table index needs to be
> identical across all OpenGL implementations. This allows a
> single libGL to interpret the dispatch tables from any driver
> in the most efficient way (using constant offsets for
> dispatching). The table indices should be maintained in a
> registry, just like OpenGL core and extension enumerants, and
> allocated in small chunks so that the table memory is used
> efficiently. [Note: I've just mentioned this for
> completeness; it's not a part of the opengl-base
> specification.]
>
> In addition to entry points for core OpenGL commands and
> previously-registered extensions, libGL should include a
> number of reserved entry points for extension commands that
> were registered after the time libGL was compiled. Each of
> these entry points is associated with a table index variable.
> GetProcAddress functions by asking the driver to map a command
> name into a table index, storing that value in the index
> variable associated with the next-available reserved entry
> point, and returning the address of that entry point. (If no
> more reserved entry points are available, GetProcAddress
> returns NULL.)
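The reserved-entry-point mechanism quoted above can be sketched in C like this. The table size, the number of reserved slots, the extension name, and its registered index are all invented; a real libGL would reserve many more slots and the driver would supply the name-to-index mapping:

```c
#include <string.h>
#include <stddef.h>

typedef void (*gl_cmd)(void);

#define TABLE_SIZE   64   /* illustrative */
#define NUM_RESERVED  2   /* a real libGL would reserve many more */

static gl_cmd dispatch_table[TABLE_SIZE];

/* One table-index variable per compiled-in reserved entry point. */
static int reserved_index[NUM_RESERVED];

/* Reserved entry points: identical to core stubs except that the table
 * offset is a variable rather than a constant. */
static void reserved0(void) { dispatch_table[reserved_index[0]](); }
static void reserved1(void) { dispatch_table[reserved_index[1]](); }
static gl_cmd reserved_entry[NUM_RESERVED] = { reserved0, reserved1 };
static int next_reserved;

/* Toy stand-in for the driver's name-to-registered-index mapping: it
 * knows one extension command at (invented) index 40. */
static int driver_lookup_index(const char *name)
{
    return strcmp(name, "glFooEXT") == 0 ? 40 : -1;
}

/* GetProcAddress: map the name to a table index, bind that index to the
 * next free reserved entry point, and hand the entry point back. */
static gl_cmd my_GetProcAddress(const char *name)
{
    int idx = driver_lookup_index(name);
    if (idx < 0 || next_reserved >= NUM_RESERVED)
        return NULL;
    reserved_index[next_reserved] = idx;
    return reserved_entry[next_reserved++];
}

/* Toy driver implementation of the extension, for testing. */
static int  foo_calls;
static void driver_glFooEXT(void) { foo_calls++; }
```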
>
> The registry of dispatch-table indices for extension commands
> guarantees that an efficient dispatch process is possible for
> all contexts that support a given extension. The entire setup
> also preserves the nice property that
> glGetProcAddress("glFoo")==&glFoo, for both core and extension
> commands, for all contexts. Finally, the dispatch process is
> very nearly as efficient for new extensions as it is for core
> commands (the only difference is indexing the dispatch table
> with a variable rather than a constant).
I'd rather see a less static arrangement, where libGL dynamically manages
the dispatch tables and dynamically assigns offsets into the table for
each entrypoint (keeping the offsets the same for all drivers, of course).
This would ensure that current drivers could continue to function if
libGL were updated to use larger dispatch tables, etc.
> Note that if an application uses glGetProcAddress to get the
> address of an extension function, and then calls that function
> when the current context does not support the extension, libGL
> will jump through a nonexistent dispatch table entry.
> Personally, I say ``We gave 'em the rope; let 'em hang,'' but
> with additional overhead in the reserved entry-point functions
> we could check dispatch table length, check for null table
> entries, etc. This might be useful when debugging apps that
> fail to check the extensions string properly.
If libGL manages the table memory and slot allocation (all drivers will
then have tables of the same length), this overhead can go away. libGL
can simply plug in "do_nothing" or "debug_and_generate_an_error" as
appropriate for any slots not filled by the driver.
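That pre-filling step is simple; a sketch, with an invented table size and an error counter standing in for the real GL error state:

```c
typedef void (*gl_cmd)(void);

#define TABLE_SIZE 8    /* same length for every driver; illustrative */

static int gl_errors;   /* stand-in for the GL error state */

static void slot_do_nothing(void)  {}
static void slot_debug_error(void) { gl_errors++; }

/* libGL pre-fills every slot before the driver installs its entries, so
 * a call through an unimplemented slot can never jump through garbage. */
static void init_dispatch(gl_cmd *table, int debug)
{
    for (int i = 0; i < TABLE_SIZE; i++)
        table[i] = debug ? slot_debug_error : slot_do_nothing;
}
```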
As a final argument for having libGL manage the dispatch tables, it would
also allow libGL to come up with a higher performance dispatching mechanism
later, without breaking all of the drivers in the process.
For example:
We could decide that adding some platform dependence to libGL would be
worth it for additional performance. So, on x86, rather than using C
function pointers to do dispatching, we might instead use a table of
dynamically generated x86 "jump" instructions. We would be free to do
this if the dispatch table details were all managed by libGL. If the
table is managed by the drivers, though, its definition would be cast in
stone and nearly impossible to change.
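As a hedged sketch of the encoding step such a generated-jump dispatcher would need (the buffer would additionally have to be made executable, e.g. via mprotect; the addresses below are invented):

```c
#include <stdint.h>
#include <string.h>

/* A 32-bit x86 near jump is the opcode 0xE9 followed by a signed 32-bit
 * displacement, measured from the end of the 5-byte instruction. */
static void encode_jump(uint8_t out[5], uint32_t from, uint32_t to)
{
    int32_t rel = (int32_t)(to - (from + 5));
    out[0] = 0xE9;
    memcpy(out + 1, &rel, 4);  /* assumes a little-endian host, as on x86 */
}
```

Regenerating these stubs whenever an entry changes is exactly the kind of detail libGL could only evolve if it, not the drivers, owned the table.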
Cheers!
BTW, I'm going to be gone till the middle of next week, so if I don't reply
to a post right away, it's not 'cause I've lost interest!
begin:vcard
n:Johnson;Brett
x-mozilla-html:FALSE
org:Hewlett-Packard;Workstation Systems Lab
adr:;;;;;;
version:2.1
email;internet:[EMAIL PROTECTED]
title:WSL Turtle Master
fn:Brett Johnson
end:vcard