Ian Romanick wrote:
I'm getting back to the TSL work, and I'd like to wrap it up as much as possible by the end of the week. I sent a patch out of the current state of things a little bit ago. Performance testing has shown that the new dispatch functions and the old, non-thread safe dispatch functions perform equally. The difference between the two is within the error margin of the test. That's the good news.
The bad news is that the combination of a multithreaded app, with an "old" driver, and a "new" libGL will result in a segfault. The code sets _glapi_Dispatch to NULL in the multithreaded case as a signal to call _glapi_get_dispatch. When the code is fully converted to TLS, the call to _glapi_get_dispatch will be replaced with a 'movl %gs:_gl_DispatchTSD, %eax' or something similar.
I want to maintain compatibility across the whole matrix of old / new libGL with old / new drivers. I may be willing to sacrifice new driver + old libGL, but that's the only combination I'd be willing to give on. I think we need to do 3 things to get there:
1. Keep the old meaning of _glapi_Dispatch. This means that all the _ts_* functions and their support mechanism will have to be kept. :( However, it will only be needed for the DRI libGL. The software rendering Mesa version won't need it.
This turned out to not be as bad as I thought. There is still the extra dispatch table, but all of the function pointers in it point to the regular dispatch functions. In the multithreaded case, if an old driver does _glapi_Dispatch->Vertex3fv, it will end up calling glVertex3fv, which will see _glapi_DispatchTSD is NULL, which will call _glapi_get_dispatch to get the real dispatch and call the real function.
The _ts_* functions are still gone. Everyone is happy. :)
The one catch is that glapitemp.h (and the Python generator scripts) were modified. If NAME is not defined, glapitemp.h will not generate the dispatch functions. If DISPATCH_TABLE_NAME is defined, it will still generate the dispatch table.
2. For non-TLS "new" code, implement a new variable called _glapi_DispatchTSD that implements the new behavior. There will be a "weak" version of that variable, that is always NULL in dri_util.c. This should allow non-TLS "new" code to work with old libGL binaries.
This is done and works perfectly. The only problem *I* have with it is the use of a GCC specific feature. Of course, the whole TLS thing is pretty GCC specific, so I guess this is somewhat moot.
3. TLS "new" code can do whatever it wants. I imagine this will be a __thread variable called _glapi_tls_Dispatch or something similar.
This part isn't done yet, but should be easy enough.
This patch also still requires you to manually generate glapi_x86.S by doing:
cd src/mesa/glapi python2 ./gl_x86_asm.py > ../x86/glapi_x86.S
The other new thing in this patch is the state variable ThreadSafe is gone. Instead, the code universally uses "_glapi_DispatchTSD == NULL" to determine if it is in multi-threaded mode. Some other code in glapi.c changes as a direct result.
If nobody raises any concerns, my intention is to commit these changes *tomorrow* afternoon (23-June-2004).
The next two steps will be to start adding in some of the 'visibility("hidden")' stuff and (of course) the TLS support. I plan to mostly follow Jakub's lead with two exceptions. Instead of using 'hidden' as the macro for '__attribute__ ((visibility("hidden"))', I'm going to take a cue from the PowerPC glibc code and use attribute_hidden. I'm also going to pick a different name than GLX_USE_TLS to enable compilation of the TLS code. I haven't decide what to use, though.
Index: src/mesa/drivers/dri/common/dri_util.c =================================================================== RCS file: /cvs/mesa/Mesa/src/mesa/drivers/dri/common/dri_util.c,v retrieving revision 1.11 diff -u -d -r1.11 dri_util.c --- src/mesa/drivers/dri/common/dri_util.c 7 Jun 2004 21:23:12 -0000 1.11 +++ src/mesa/drivers/dri/common/dri_util.c 22 Jun 2004 23:32:48 -0000 @@ -55,6 +55,10 @@ #endif /** + */ +struct _glapi_table *_glapi_DispatchTSD __attribute__((weak)) = NULL; + +/** * This is used in a couple of places that call \c driCreateNewDrawable. */ static const int empty_attribute_list[1] = { None }; Index: src/mesa/glapi/gl_apitemp.py =================================================================== RCS file: /cvs/mesa/Mesa/src/mesa/glapi/gl_apitemp.py,v retrieving revision 1.2 diff -u -d -r1.2 gl_apitemp.py --- src/mesa/glapi/gl_apitemp.py 19 May 2004 23:33:08 -0000 1.2 +++ src/mesa/glapi/gl_apitemp.py 22 Jun 2004 23:32:48 -0000 @@ -112,6 +112,7 @@ */ +#if defined( NAME ) #ifndef KEYWORD1 #define KEYWORD1 #endif @@ -120,10 +121,6 @@ #define KEYWORD2 #endif -#ifndef NAME -#error NAME must be defined -#endif - #ifndef DISPATCH #error DISPATCH must be defined #endif @@ -140,6 +137,7 @@ def printInitDispatch(self): print """ +#endif /* defined( NAME ) */ /* * This is how a dispatch table can be initialized with all the functions Index: src/mesa/glapi/gl_x86_asm.py =================================================================== RCS file: /cvs/mesa/Mesa/src/mesa/glapi/gl_x86_asm.py,v retrieving revision 1.1 diff -u -d -r1.1 gl_x86_asm.py --- src/mesa/glapi/gl_x86_asm.py 18 May 2004 18:33:40 -0000 1.1 +++ src/mesa/glapi/gl_x86_asm.py 22 Jun 2004 23:32:48 -0000 @@ -77,15 +77,65 @@ print '#define GLOBL_FN(x) GLOBL x' print '#endif' print '' - print '#define GL_STUB(fn,off,stack)\t\t\t\t\\' + print '#if defined(PTHREADS)' + print '# define GL_STUB(fn,off,fn_alt)\t\t\t\\' print 'ALIGNTEXT16;\t\t\t\t\t\t\\' - print 'GLOBL_FN(GL_PREFIX(fn, fn ## @ ## stack));\t\t\\' - print 'GL_PREFIX(fn, fn ## @ ## stack):\t\t\t\\' - print '\tMOV_L(CONTENT(GLNAME(_glapi_Dispatch)), EAX) ;\t\\' + print 'GLOBL_FN(GL_PREFIX(fn, fn_alt));\t\t\t\\' + print 'GL_PREFIX(fn, fn_alt):\t\t\t\t\t\\' + print '\tMOV_L(CONTENT(GLNAME(_glapi_DispatchTSD)), EAX) ;\t\\' + print '\tTEST_L(EAX, EAX) ;\t\t\t\t\\' + print '\tJE(1f) ;\t\t\t\t\t\\' + print '\tJMP(GL_OFFSET(off)) ;\t\t\t\t\\' + print '1:\tCALL(get_dispatch) ;\t\t\t\t\\' + print '\tJMP(GL_OFFSET(off))' + print '#elif defined(THREADS)' + print '# define GL_STUB(fn,off,fn_alt)\t\t\t\\' + print 'ALIGNTEXT16;\t\t\t\t\t\t\\' + print 'GLOBL_FN(GL_PREFIX(fn, fn_alt));\t\t\t\\' + print 'GL_PREFIX(fn, fn_alt):\t\t\t\t\t\\' + print '\tMOV_L(CONTENT(GLNAME(_glapi_DispatchTSD)), EAX) ;\t\\' + print '\tTEST_L(EAX, EAX) ;\t\t\t\t\\' + print '\tJE(1f) ;\t\t\t\t\t\\' + print '\tJMP(GL_OFFSET(off)) ;\t\t\t\t\\' + print '1:\tCALL(_glapi_get_dispatch) ;\t\t\t\\' print '\tJMP(GL_OFFSET(off))' + print '#else /* Non-threaded version. */' + print '# define GL_STUB(fn,off,fn_alt)\t\t\t\\' + print 'ALIGNTEXT16;\t\t\t\t\t\t\\' + print 'GLOBL_FN(GL_PREFIX(fn, fn_alt));\t\t\t\\' + print 'GL_PREFIX(fn, fn_alt):\t\t\t\t\t\\' + print '\tMOV_L(CONTENT(GLNAME(_glapi_DispatchTSD)), EAX) ;\t\\' + print '\tJMP(GL_OFFSET(off))' + print '#endif' print '' print 'SEG_TEXT' + print '' + print '#ifdef PTHREADS' print 'EXTERN GLNAME(_glapi_Dispatch)' + print 'EXTERN GLNAME(_gl_DispatchTSD)' + print '#ifdef __PIC__' + print 'EXTERN GLNAME([EMAIL PROTECTED])' + print '#else' + print 'EXTERN GLNAME(pthread_getspecific)' + print '#endif' + print '' + print 'ALIGNTEXT16' + print 'GLNAME(get_dispatch):' + print '\tSUB_L(CONST(24), ESP)' + print '\tPUSH_L(GLNAME(_gl_DispatchTSD))' + print '#ifdef __PIC__' + print '\tCALL(GLNAME([EMAIL PROTECTED]))' + print '#else' + print '\tCALL(GLNAME(pthread_getspecific))' + print '#endif' + print '\tADD_L(CONST(28), ESP)' + print '\tRET' + print '#elif defined(THREADS)' + print 'EXTERN GLNAME(_glapi_get_dispatch)' + print '#endif' + print '' + print '\t\tALIGNTEXT16 ; GLOBL gl_dispatch_functions_start' + print 'gl_dispatch_functions_start:' print '' return @@ -95,11 +145,10 @@ return def printFunction(self, f): - if f.fn_offset == -1: return - stack = self.get_stack_size(f) - print '\tGL_STUB(%s, _gloffset_%s, %u)' % (f.name, f.real_name, stack) + alt = "[EMAIL PROTECTED]" % (f.name, stack) + print '\tGL_STUB(%s, _gloffset_%s, %s)' % (f.name, f.real_name, alt) return def show_usage(): Index: src/mesa/glapi/glapi.c =================================================================== RCS file: /cvs/mesa/Mesa/src/mesa/glapi/glapi.c,v retrieving revision 1.74 diff -u -d -r1.74 glapi.c --- src/mesa/glapi/glapi.c 27 May 2004 00:05:13 -0000 1.74 +++ src/mesa/glapi/glapi.c 22 Jun 2004 23:32:48 -0000 @@ -133,44 +133,44 @@ /***** BEGIN THREAD-SAFE DISPATCH *****/ -/* if we support thread-safety, build a special dispatch table for use - * in thread-safety mode (ThreadSafe == GL_TRUE). Each entry in the - * dispatch table will call _glthread_GetTSD() to get the actual dispatch - * table bound to the current thread, then jump through that table. - */ #if defined(THREADS) -static GLboolean ThreadSafe = GL_FALSE; /* In thread-safe mode? */ -static _glthread_TSD DispatchTSD; /* Per-thread dispatch pointer */ -static _glthread_TSD RealDispatchTSD; /* only when using override */ -static _glthread_TSD ContextTSD; /* Per-thread context pointer */ - - -#define KEYWORD1 static -#define KEYWORD2 GLAPIENTRY -#define NAME(func) _ts_##func - -#define DISPATCH(FUNC, ARGS, MESSAGE) \ - struct _glapi_table *dispatch; \ - dispatch = (struct _glapi_table *) _glthread_GetTSD(&DispatchTSD); \ - if (!dispatch) \ - dispatch = (struct _glapi_table *) __glapi_noop_table; \ - (dispatch->FUNC) ARGS +/** + * \name Multi-threaded control support variables + * + * If thread-safety is supported, there are two potential mechanisms that can + * be used. The old-style mechanism would set \c _glapi_Dispatch to a special + * thread-safe dispatch table. These dispatch routines would call + * \c _glapi_get_dispatch to get the actual dispatch pointer. In this + * setup \c _glapi_Dispatch could never be \c NULL. This dual layered + * dispatch setup performed great for single-threaded apps, but didn't + * perform well for multithreaded apps. + * + * In the new mechansim, there are two variables. The first is + * \c _glapi_DispatchTSD. In the single-threaded case, this variable points + * to the dispatch table. In the multi-threaded case, this variable is + * \c NULL, and thread-specific variable \c _gl_DispatchTSD points to the + * actual dispatch table. \c _glapi_DispatchTSD is used to signal to the + * static dispatch functions to call \c _glapi_get_dispatch to get the real + * dispatch table. + * + * Throughout the code \c _glapi_DispatchTSD == \c NULL is used to determine + * whether or not the application is multi-threaded. + */ +/[EMAIL PROTECTED]/ +_glthread_TSD _gl_DispatchTSD; /**< Per-thread dispatch pointer */ +static _glthread_TSD RealDispatchTSD; /**< only when using override */ +static _glthread_TSD ContextTSD; /**< Per-thread context pointer */ +/[EMAIL PROTECTED]/ -#define RETURN_DISPATCH(FUNC, ARGS, MESSAGE) \ - struct _glapi_table *dispatch; \ - dispatch = (struct _glapi_table *) _glthread_GetTSD(&DispatchTSD); \ - if (!dispatch) \ - dispatch = (struct _glapi_table *) __glapi_noop_table; \ - return (dispatch->FUNC) ARGS #define DISPATCH_TABLE_NAME __glapi_threadsafe_table #define UNUSED_TABLE_NAME __usused_threadsafe_functions -#define TABLE_ENTRY(name) (void *) _ts_##name +#define TABLE_ENTRY(name) (void *) gl##name -static int _ts_Unused(void) +static int glUnused(void) { return 0; } @@ -184,6 +184,7 @@ struct _glapi_table *_glapi_Dispatch = (struct _glapi_table *) __glapi_noop_table; +struct _glapi_table *_glapi_DispatchTSD = (struct _glapi_table *) __glapi_noop_table; struct _glapi_table *_glapi_RealDispatch = (struct _glapi_table *) __glapi_noop_table; /* Used when thread safety disabled */ @@ -218,7 +219,7 @@ _glapi_check_multithread(void) { #if defined(THREADS) - if (!ThreadSafe) { + if ( _glapi_DispatchTSD != NULL ) { static unsigned long knownID; static GLboolean firstCall = GL_TRUE; if (firstCall) { @@ -226,14 +227,12 @@ firstCall = GL_FALSE; } else if (knownID != _glthread_GetID()) { - ThreadSafe = GL_TRUE; + _glapi_set_dispatch(NULL); } } - if (ThreadSafe) { + else if (!_glapi_get_dispatch()) { /* make sure that this thread's dispatch pointer isn't null */ - if (!_glapi_get_dispatch()) { - _glapi_set_dispatch(NULL); - } + _glapi_set_dispatch(NULL); } #endif } @@ -250,10 +249,7 @@ { #if defined(THREADS) _glthread_SetTSD(&ContextTSD, context); - if (ThreadSafe) - _glapi_Context = NULL; - else - _glapi_Context = context; + _glapi_Context = (_glapi_DispatchTSD == NULL) ? NULL : context; #else _glapi_Context = context; #endif @@ -270,7 +266,7 @@ _glapi_get_context(void) { #if defined(THREADS) - if (ThreadSafe) { + if ( _glapi_DispatchTSD == NULL ) { return _glthread_GetTSD(&ContextTSD); } else { @@ -289,9 +285,11 @@ void _glapi_set_dispatch(struct _glapi_table *dispatch) { + struct _glapi_table * old_style_dispatch; + if (!dispatch) { /* use the no-op functions */ - dispatch = (struct _glapi_table *) __glapi_noop_table; + old_style_dispatch = (struct _glapi_table *) __glapi_noop_table; } #ifdef DEBUG else { @@ -301,19 +299,20 @@ #if defined(THREADS) if (DispatchOverride) { - _glthread_SetTSD(&RealDispatchTSD, (void *) dispatch); - if (ThreadSafe) + _glthread_SetTSD(&RealDispatchTSD, (void *) old_style_dispatch); + if ( dispatch == NULL ) _glapi_RealDispatch = (struct _glapi_table*) __glapi_threadsafe_table; else _glapi_RealDispatch = dispatch; } else { /* normal operation */ - _glthread_SetTSD(&DispatchTSD, (void *) dispatch); - if (ThreadSafe) - _glapi_Dispatch = (struct _glapi_table *) __glapi_threadsafe_table; - else - _glapi_Dispatch = dispatch; + _glthread_SetTSD(&_gl_DispatchTSD, (void *) old_style_dispatch); + _glapi_DispatchTSD = dispatch; + + _glapi_Dispatch = (dispatch == NULL) + ? (struct _glapi_table *) __glapi_threadsafe_table + : old_style_dispatch; } #else /*THREADS*/ if (DispatchOverride) { @@ -334,12 +333,12 @@ _glapi_get_dispatch(void) { #if defined(THREADS) - if (ThreadSafe) { + if ( _glapi_DispatchTSD == NULL ) { if (DispatchOverride) { return (struct _glapi_table *) _glthread_GetTSD(&RealDispatchTSD); } else { - return (struct _glapi_table *) _glthread_GetTSD(&DispatchTSD); + return (struct _glapi_table *) _glthread_GetTSD(&_gl_DispatchTSD); } } else { @@ -349,11 +348,11 @@ } else { assert(_glapi_Dispatch); - return _glapi_Dispatch; + return _glapi_DispatchTSD; } } #else - return _glapi_Dispatch; + return _glapi_DispatchTSD; #endif } @@ -391,11 +390,14 @@ _glapi_set_dispatch(real); #if defined(THREADS) - _glthread_SetTSD(&DispatchTSD, (void *) override); - if (ThreadSafe) + _glthread_SetTSD(&_gl_DispatchTSD, (void *) override); + if ( _glapi_DispatchTSD == NULL ) { _glapi_Dispatch = (struct _glapi_table *) __glapi_threadsafe_table; - else + } + else { _glapi_Dispatch = override; + _glapi_DispatchTSD = override; + } #else _glapi_Dispatch = override; #endif @@ -427,7 +429,7 @@ else { if (DispatchOverride) { #if defined(THREADS) - return (struct _glapi_table *) _glthread_GetTSD(&DispatchTSD); + return (struct _glapi_table *) _glthread_GetTSD(&_gl_DispatchTSD); #else return _glapi_Dispatch; #endif @@ -446,7 +448,9 @@ }; +#if !defined( USE_X86_ASM ) #define NEED_FUNCTION_POINTER +#endif /* The code in this file is auto-generated with Python */ #include "glprocs.h" @@ -468,7 +472,6 @@ return NULL; } - /* * Return dispatch table offset of the named static (built-in) function. * Return -1 if function not found. @@ -485,6 +488,37 @@ } +#ifdef USE_X86_ASM +extern const GLubyte gl_dispatch_functions_start[]; + +# if defined(PTHREADS) +# define X86_DISPATCH_FUNCTION_SIZE 32 +# else +# define X86_DISPATCH_FUNCTION_SIZE 16 +# endif + + +/* + * Return dispatch function address the named static (built-in) function. + * Return NULL if function not found. + */ +static const GLvoid * +get_static_proc_address(const char *funcName) +{ + const glprocs_table_t * const f = find_entry( funcName ); + + if ( f != NULL ) { + return gl_dispatch_functions_start + + (X86_DISPATCH_FUNCTION_SIZE * f->Offset); + } + else { + return NULL; + } +} + +#else + + /* * Return dispatch function address the named static (built-in) function. * Return NULL if function not found. @@ -496,6 +530,8 @@ return ( f != NULL ) ? f->Address : NULL; } +#endif /* USE_X86_ASM */ + static const char * get_static_proc_name( GLuint offset ) Index: src/mesa/glapi/glapitemp.h =================================================================== RCS file: /cvs/mesa/Mesa/src/mesa/glapi/glapitemp.h,v retrieving revision 1.59 diff -u -d -r1.59 glapitemp.h --- src/mesa/glapi/glapitemp.h 27 Jan 2004 18:52:41 -0000 1.59 +++ src/mesa/glapi/glapitemp.h 22 Jun 2004 23:32:48 -0000 @@ -26,6 +26,7 @@ */ +#if defined( NAME ) #ifndef KEYWORD1 #define KEYWORD1 #endif @@ -34,10 +35,6 @@ #define KEYWORD2 #endif -#ifndef NAME -#error NAME must be defined -#endif - #ifndef DISPATCH #error DISPATCH must be defined #endif @@ -4778,6 +4775,7 @@ } +#endif /* defined( NAME ) */ /* * This is how a dispatch table can be initialized with all the functions Index: src/mesa/glapi/glapitemp.py =================================================================== RCS file: /cvs/mesa/Mesa/src/mesa/glapi/glapitemp.py,v retrieving revision 1.5 diff -u -d -r1.5 glapitemp.py --- src/mesa/glapi/glapitemp.py 30 Nov 2002 17:18:46 -0000 1.5 +++ src/mesa/glapi/glapitemp.py 22 Jun 2004 23:32:48 -0000 @@ -67,6 +67,7 @@ */ +#if defined( NAME ) #ifndef KEYWORD1 #define KEYWORD1 #endif @@ -75,10 +76,6 @@ #define KEYWORD2 #endif -#ifndef NAME -#error NAME must be defined -#endif - #ifndef DISPATCH #error DISPATCH must be defined #endif @@ -231,6 +228,7 @@ def PrintInitDispatch(): print """ +#endif /* defined( NAME ) */ /* * This is how a dispatch table can be initialized with all the functions Index: src/mesa/glapi/glthread.h =================================================================== RCS file: /cvs/mesa/Mesa/src/mesa/glapi/glthread.h,v retrieving revision 1.14 diff -u -d -r1.14 glthread.h --- src/mesa/glapi/glthread.h 27 May 2004 00:03:53 -0000 1.14 +++ src/mesa/glapi/glthread.h 22 Jun 2004 23:32:48 -0000 @@ -109,6 +109,20 @@ #define _glthread_UNLOCK_MUTEX(name) \ (void) pthread_mutex_unlock(&(name)) +/* This is temporarilly removed because driver binaries cannot count on + * the existance of _gl_DispatchTSD in libGL. It only exists in "new" + * libGL. We may be able to ressurect this optimization at some point + * for DRI driver or for software Mesa. + */ +#if 0 +extern struct _glapi_table * _glapi_DispatchTSD; +extern _glthread_TSD _gl_DispatchTSD; + +#define GL_CALL(name) \ + (((__builtin_expect( _glapi_DispatchTSD != NULL, 1 )) \ + ? _glapi_DispatchTSD : (struct _glapi_table *) pthread_getspecific(_gl_DispatchTSD.key))-> name) +#endif + #endif /* PTHREADS */ @@ -291,8 +305,15 @@ _glthread_SetTSD(_glthread_TSD *, void *); #ifndef GL_CALL -# define GL_CALL(name) (*(_glapi_Dispatch-> name)) -#endif +# if defined(THREADS) +extern struct _glapi_table * _glapi_DispatchTSD; +# define GL_CALL(name) \ + (((__builtin_expect( _glapi_DispatchTSD != NULL, 1 )) \ + ? _glapi_DispatchTSD : _glapi_get_dispatch())-> name) +# else +# define GL_CALL(name) (*(_glapi_Dispatch-> name)) +# endif /* defined(THREADS) */ +#endif /* ndef GL_CALL */ #endif /* THREADS_H */ Index: src/mesa/main/glheader.h =================================================================== RCS file: /cvs/mesa/Mesa/src/mesa/main/glheader.h,v retrieving revision 1.45 diff -u -d -r1.45 glheader.h --- src/mesa/main/glheader.h 22 Apr 2004 00:27:32 -0000 1.45 +++ src/mesa/main/glheader.h 22 Jun 2004 23:32:48 -0000 @@ -328,6 +328,12 @@ #endif +#if !defined __GNUC__ || __GNUC__ < 3 +# define __builtin_expect(x, y) x +#endif + + + /** * Sometimes we treat GLfloats as GLints. On x86 systems, moving a float * as a int (thereby using integer registers instead of FP registers) is