Ian Romanick wrote:
Ian Romanick wrote:
Ian Romanick wrote:
Alan Hourihane wrote:

Is there someone looking to integrate the TLS patches for libGL ??

We should certainly take a look soon and comment upon the patches used.

Here is a patch that covers part of what's in the Red Hat patch. This converts the static_functions table to a list of offsets instead of a list of pointers. According to 'objdump -R' on the Mesa libGL, it cuts out about 1800 R_386_RELATIVE relocs. However, the size of the library *increases* by about 24k. That doesn't make sense to me.

Here's an updated version of that patch. There are some significant differences.


1. *All* architectures use the string offset table. To do this, gl_procs.py was modified to generate a big character array called gl_string_table in glprocs.h. The static_functions array now contains offsets into that array instead of pointers to strings. If gl_procs.py is invoked with '-m short' it will generate a (hard to read) character array. If it is invoked with no option or '-m long' it will generate a big (~16k) string. The string version of the .h file generates a warning from GCC.

2. The same glprocs.h is used even for the optimized x86 case. This is done by defining NEED_FUNCTION_POINTER only on non-x86. Actually, it should only be defined on architectures that don't have generated assembly dispatch stubs.

3. All of the _ts_ dispatch code is *gone*. The x86 assembly dispatch code and the C dispatch code reflect this. The SPARC assembly dispatch has not yet been updated, but it should follow the x86 model. This means that this code will catch fire, fall over, and sink into the swamp on SPARC. This will obviously need to be fixed before that portion of the patch is committed.
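
To illustrate item 1, here is a minimal sketch of the string-offset idea. It is written in modern Python 3 and is not the actual gl_procs.py; the helper names build_string_table and name_at are hypothetical, and the three GL function names are only sample input.

```python
def build_string_table(names):
    """Pack names into one NUL-separated blob; return (blob, offsets)."""
    blob = bytearray()
    offsets = []
    for name in names:
        offsets.append(len(blob))              # where this name starts in the blob
        blob += name.encode("ascii") + b"\0"   # C-style terminator after each name
    return bytes(blob), offsets


def name_at(blob, offset):
    """Recover the name starting at `offset` by reading up to the next NUL."""
    end = blob.index(b"\0", offset)
    return blob[offset:end].decode("ascii")


# Sample input only; the real table is generated from the API description.
names = ["glNewList", "glEndList", "glCallList"]
table, offsets = build_string_table(names)

print(offsets)
print([name_at(table, off) for off in offsets])
```

Since each entry is a plain integer rather than a pointer to a string, the table itself should not need per-entry load-time fixups, which is consistent with the drop in R_386_RELATIVE relocs reported above.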

Unless there are objections, I would like to commit the new glprocs.h and the non-x86 specific code in glapi.c to support it.

This patch is about the final form, I hope. It builds on the previous version by replacing all "_glapi_Dispatch->Foo" calls with the GL_CALL macro. In addition to several programs from progs/demos, this patch has been tested with progs/xdemos/glthreads with 20 threads. The previous patch would die in glthreads because, in the threaded case, _glapi_Dispatch is NULL. That shouldn't have been a big surprise to me since that's most of the point of that patch! Duh!


Conceptually, this is similar to the GL_CALL macro in Jakub's patch, but it does not directly call the dispatch function in the threaded case. Since we can't call the dispatch functions from within a *_dri.so, it inlines the dispatch function. As a nice side effect, dispatch.c uses GL_CALL to define DISPATCH and RETURN_DISPATCH. The GL_CALL macro currently lives in glthread.h because it might use some threading related functions. The pthreads-specific version directly calls pthread_getspecific, for example.
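
The table selection that GL_CALL performs can be modelled roughly as follows. This is a Python 3 model of the logic only, not the C macro: threading.local stands in for the _gl_DispatchTSD key, None stands in for a NULL _glapi_Dispatch, and the DispatchTable class and worker function are made up for the illustration.

```python
import threading


class DispatchTable:
    """Toy stand-in for struct _glapi_table with a single entry."""

    def __init__(self, label):
        self.label = label

    def Begin(self, mode):
        return "%s: Begin(%d)" % (self.label, mode)


_glapi_Dispatch = None              # None models NULL, i.e. thread-safe mode
_dispatch_tsd = threading.local()   # models the per-thread dispatch pointer


def get_dispatch():
    """Use the global table if it is set, else fall back to the per-thread one."""
    if _glapi_Dispatch is not None:     # expected fast path when not threaded
        return _glapi_Dispatch
    return _dispatch_tsd.table          # threaded case: per-thread lookup


def worker(label, results):
    # Each thread installs its own table, then dispatches through it.
    _dispatch_tsd.table = DispatchTable(label)
    results[label] = get_dispatch().Begin(4)


results = {}
threads = [threading.Thread(target=worker, args=("ctx%d" % i, results))
           for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results.values()))
```

Each thread ends up calling through its own table even though the global pointer is never set, which is the situation glthreads exercises.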

For functions that have a lot of GL_CALL invocations, it might be possible to make a new macro, GL_CALL_GET_DISPATCH or something, to cache the dispatch table pointer. This should make the compiled code a lot smaller and reduce the performance hit in the threaded case. Note that the performance hit in the threaded case was just as bad (maybe worse) when the _ts_ dispatch functions were used.
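
A rough sketch of why caching the pointer might help, again as a Python 3 model rather than real Mesa code. The lookup counter, the Table class, and the two draw functions are hypothetical; GL_CALL_GET_DISPATCH itself is only a proposed name and does not exist in the patch.

```python
import threading

_tsd = threading.local()   # models the per-thread dispatch pointer
lookups = 0                # counts how often the per-thread lookup runs


class Table:
    """Toy dispatch table with one entry."""

    def Vertex3f(self, x, y, z):
        return (x, y, z)


def get_dispatch():
    """Per-thread lookup; this is the cost paid in the threaded case."""
    global lookups
    lookups += 1
    return _tsd.table


def draw_uncached(points):
    # One lookup per call, as with a GL_CALL at every call site.
    return [get_dispatch().Vertex3f(*p) for p in points]


def draw_cached(points):
    # Fetch the table once and reuse it for every call in the function.
    dispatch = get_dispatch()
    return [dispatch.Vertex3f(*p) for p in points]


_tsd.table = Table()
pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

lookups = 0
draw_uncached(pts)
uncached = lookups

lookups = 0
draw_cached(pts)
cached = lookups

print(uncached, cached)
```

The cached variant assumes the dispatch table cannot change partway through the function; whether that holds for every caller would need checking before doing this in the real code.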

Like I threatened yesterday (heh), tomorrow (Wednesday) I plan to commit the following parts of this patch:

1. The new glprocs.h (generated with 'python2 gl_procs.py -m short') and the non-x86 specific code to support it.

2. The GL_CALL changes and the non-threaded version of the macro. That's the one that looks like "#define GL_CALL(name) (_glapi_Dispatch-> name)".

My hope is that we can discuss the remaining changes in the patch at Monday's #dri-devel chat. I should be able to get some performance numbers out later today.

This patch is the same as the -03 patch except it is against today's CVS. Also, you *MUST* regenerate glapi_x86.S on your own. Including that file made the patch way too big to get through the list. :) You can do this by running the following after applying the patch:


cd src/mesa/glapi
python2 gl_x86_asm.py > ../x86/glapi_x86.S

Index: src/mesa/glapi/gl_x86_asm.py
===================================================================
RCS file: /cvs/mesa/Mesa/src/mesa/glapi/gl_x86_asm.py,v
retrieving revision 1.1
diff -u -d -r1.1 gl_x86_asm.py
--- src/mesa/glapi/gl_x86_asm.py        18 May 2004 18:33:40 -0000      1.1
+++ src/mesa/glapi/gl_x86_asm.py        21 Jun 2004 21:17:25 -0000
@@ -77,15 +77,65 @@
                print '#define GLOBL_FN(x) GLOBL x'
                print '#endif'
                print ''
-               print '#define GL_STUB(fn,off,stack)\t\t\t\t\\'
+               print '#if defined(PTHREADS)'
+               print '#  define GL_STUB(fn,off,fn_alt)\t\t\t\\'
                print 'ALIGNTEXT16;\t\t\t\t\t\t\\'
-               print 'GLOBL_FN(GL_PREFIX(fn, fn ## @ ## stack));\t\t\\'
-               print 'GL_PREFIX(fn, fn ## @ ## stack):\t\t\t\\'
+               print 'GLOBL_FN(GL_PREFIX(fn, fn_alt));\t\t\t\\'
+               print 'GL_PREFIX(fn, fn_alt):\t\t\t\t\t\\'
                print '\tMOV_L(CONTENT(GLNAME(_glapi_Dispatch)), EAX) ;\t\\'
+               print '\tTEST_L(EAX, EAX) ;\t\t\t\t\\'
+               print '\tJE(1f) ;\t\t\t\t\t\\'
+               print '\tJMP(GL_OFFSET(off)) ;\t\t\t\t\\'
+               print '1:\tCALL(get_dispatch) ;\t\t\t\t\\'
                print '\tJMP(GL_OFFSET(off))'
+               print '#elif defined(THREADS)'
+               print '#  define GL_STUB(fn,off,fn_alt)\t\t\t\\'
+               print 'ALIGNTEXT16;\t\t\t\t\t\t\\'
+               print 'GLOBL_FN(GL_PREFIX(fn, fn_alt));\t\t\t\\'
+               print 'GL_PREFIX(fn, fn_alt):\t\t\t\t\t\\'
+               print '\tMOV_L(CONTENT(GLNAME(_glapi_Dispatch)), EAX) ;\t\\'
+               print '\tTEST_L(EAX, EAX) ;\t\t\t\t\\'
+               print '\tJE(1f) ;\t\t\t\t\t\\'
+               print '\tJMP(GL_OFFSET(off)) ;\t\t\t\t\\'
+               print '1:\tCALL(_glapi_get_dispatch) ;\t\t\t\\'
+               print '\tJMP(GL_OFFSET(off))'
+               print '#else /* Non-threaded version. */'
+               print '#  define GL_STUB(fn,off,fn_alt)\t\t\t\\'
+               print 'ALIGNTEXT16;\t\t\t\t\t\t\\'
+               print 'GLOBL_FN(GL_PREFIX(fn, fn_alt));\t\t\t\\'
+               print 'GL_PREFIX(fn, fn_alt):\t\t\t\t\t\\'
+               print '\tMOV_L(CONTENT(GLNAME(_glapi_Dispatch)), EAX) ;\t\\'
+               print '\tJMP(GL_OFFSET(off))'
+               print '#endif'
                print ''
                print 'SEG_TEXT'
+               print ''
+               print '#ifdef PTHREADS'
                print 'EXTERN GLNAME(_glapi_Dispatch)'
+               print 'EXTERN GLNAME(_gl_DispatchTSD)'
+               print '#ifdef __PIC__'
+               print 'EXTERN GLNAME(pthread_getspecific@PLT)'
+               print '#else'
+               print 'EXTERN GLNAME(pthread_getspecific)'
+               print '#endif'
+               print ''
+               print 'ALIGNTEXT16'
+               print 'GLNAME(get_dispatch):'
+               print '\tSUB_L(CONST(24), ESP)'
+               print '\tPUSH_L(GLNAME(_gl_DispatchTSD))'
+               print '#ifdef __PIC__'
+               print '\tCALL(GLNAME(pthread_getspecific@PLT))'
+               print '#else'
+               print '\tCALL(GLNAME(pthread_getspecific))'
+               print '#endif'
+               print '\tADD_L(CONST(28), ESP)'
+               print '\tRET'
+               print '#elif defined(THREADS)'
+               print 'EXTERN GLNAME(_glapi_get_dispatch)'
+               print '#endif'
+               print ''
+               print '\t\tALIGNTEXT16 ; GLOBL gl_dispatch_functions_start'
+               print 'gl_dispatch_functions_start:'
                print ''
                return
 
@@ -95,11 +145,10 @@
                return
 
        def printFunction(self, f):
-               if f.fn_offset == -1: return
-
                stack = self.get_stack_size(f)
 
-               print '\tGL_STUB(%s, _gloffset_%s, %u)' % (f.name, f.real_name, stack)
+               alt = "%s@%u" % (f.name, stack)
+               print '\tGL_STUB(%s, _gloffset_%s, %s)' % (f.name, f.real_name, alt)
                return
 
 def show_usage():
Index: src/mesa/glapi/glapi.c
===================================================================
RCS file: /cvs/mesa/Mesa/src/mesa/glapi/glapi.c,v
retrieving revision 1.74
diff -u -d -r1.74 glapi.c
--- src/mesa/glapi/glapi.c      27 May 2004 00:05:13 -0000      1.74
+++ src/mesa/glapi/glapi.c      21 Jun 2004 21:17:25 -0000
@@ -142,42 +142,11 @@
 #if defined(THREADS)
 
 static GLboolean ThreadSafe = GL_FALSE;  /* In thread-safe mode? */
-static _glthread_TSD DispatchTSD;        /* Per-thread dispatch pointer */
+_glthread_TSD _gl_DispatchTSD;           /* Per-thread dispatch pointer */
 static _glthread_TSD RealDispatchTSD;    /* only when using override */
 static _glthread_TSD ContextTSD;         /* Per-thread context pointer */
 
-
-#define KEYWORD1 static
-#define KEYWORD2 GLAPIENTRY
-#define NAME(func)  _ts_##func
-
-#define DISPATCH(FUNC, ARGS, MESSAGE)                                  \
-   struct _glapi_table *dispatch;                                      \
-   dispatch = (struct _glapi_table *) _glthread_GetTSD(&DispatchTSD);  \
-   if (!dispatch)                                                      \
-      dispatch = (struct _glapi_table *) __glapi_noop_table;           \
-   (dispatch->FUNC) ARGS
-
-#define RETURN_DISPATCH(FUNC, ARGS, MESSAGE)                           \
-   struct _glapi_table *dispatch;                                      \
-   dispatch = (struct _glapi_table *) _glthread_GetTSD(&DispatchTSD);  \
-   if (!dispatch)                                                      \
-      dispatch = (struct _glapi_table *) __glapi_noop_table;           \
-   return (dispatch->FUNC) ARGS
-
-#define DISPATCH_TABLE_NAME __glapi_threadsafe_table
-#define UNUSED_TABLE_NAME __usused_threadsafe_functions
-
-#define TABLE_ENTRY(name) (void *) _ts_##name
-
-static int _ts_Unused(void)
-{
-   return 0;
-}
-
-#include "glapitemp.h"
-
-#endif
+#endif /* defined(THREADS) */
 
 /***** END THREAD-SAFE DISPATCH *****/
 
@@ -303,15 +272,15 @@
    if (DispatchOverride) {
       _glthread_SetTSD(&RealDispatchTSD, (void *) dispatch);
       if (ThreadSafe)
-         _glapi_RealDispatch = (struct _glapi_table*) __glapi_threadsafe_table;
+         _glapi_RealDispatch = (struct _glapi_table*) NULL;
       else
          _glapi_RealDispatch = dispatch;
    }
    else {
       /* normal operation */
-      _glthread_SetTSD(&DispatchTSD, (void *) dispatch);
+      _glthread_SetTSD(&_gl_DispatchTSD, (void *) dispatch);
       if (ThreadSafe)
-         _glapi_Dispatch = (struct _glapi_table *) __glapi_threadsafe_table;
+         _glapi_Dispatch = (struct _glapi_table *) NULL;
       else
          _glapi_Dispatch = dispatch;
    }
@@ -339,7 +308,7 @@
          return (struct _glapi_table *) _glthread_GetTSD(&RealDispatchTSD);
       }
       else {
-         return (struct _glapi_table *) _glthread_GetTSD(&DispatchTSD);
+         return (struct _glapi_table *) _glthread_GetTSD(&_gl_DispatchTSD);
       }
    }
    else {
@@ -391,9 +360,9 @@
    _glapi_set_dispatch(real);
 
 #if defined(THREADS)
-   _glthread_SetTSD(&DispatchTSD, (void *) override);
+   _glthread_SetTSD(&_gl_DispatchTSD, (void *) override);
    if (ThreadSafe)
-      _glapi_Dispatch = (struct _glapi_table *) __glapi_threadsafe_table;
+      _glapi_Dispatch = (struct _glapi_table *) NULL;
    else
       _glapi_Dispatch = override;
 #else
@@ -427,7 +396,7 @@
    else {
       if (DispatchOverride) {
 #if defined(THREADS)
-         return (struct _glapi_table *) _glthread_GetTSD(&DispatchTSD);
+         return (struct _glapi_table *) _glthread_GetTSD(&_gl_DispatchTSD);
 #else
          return _glapi_Dispatch;
 #endif
@@ -446,7 +415,9 @@
 };
 
 
+#if !defined( USE_X86_ASM )
 #define NEED_FUNCTION_POINTER
+#endif
 
 /* The code in this file is auto-generated with Python */
 #include "glprocs.h"
@@ -485,6 +456,36 @@
 }
 
 
+#ifdef USE_X86_ASM
+extern const GLubyte gl_dispatch_functions_start[];
+
+# if defined(PTHREADS)
+#  define X86_DISPATCH_FUNCTION_SIZE  32
+# else
+#  define X86_DISPATCH_FUNCTION_SIZE  16
+# endif
+
+
+/*
+ * Return dispatch function address the named static (built-in) function.
+ * Return NULL if function not found.
+ */
+static const GLvoid *
+get_static_proc_address(const char *funcName)
+{
+   const glprocs_table_t * const f = find_entry( funcName );
+
+   if ( f != NULL ) {
+      return gl_dispatch_functions_start 
+          + (X86_DISPATCH_FUNCTION_SIZE * f->Offset);
+   }
+   else {
+      return NULL;
+   }
+}
+
+#else
+
 /*
  * Return dispatch function address the named static (built-in) function.
  * Return NULL if function not found.
@@ -496,6 +497,8 @@
    return ( f != NULL ) ? f->Address : NULL;
 }
 
+#endif /* USE_X86_ASM */
+
 
 static const char *
 get_static_proc_name( GLuint offset )
Index: src/mesa/glapi/glthread.h
===================================================================
RCS file: /cvs/mesa/Mesa/src/mesa/glapi/glthread.h,v
retrieving revision 1.14
diff -u -d -r1.14 glthread.h
--- src/mesa/glapi/glthread.h   27 May 2004 00:03:53 -0000      1.14
+++ src/mesa/glapi/glthread.h   21 Jun 2004 21:17:26 -0000
@@ -109,6 +109,12 @@
 #define _glthread_UNLOCK_MUTEX(name) \
    (void) pthread_mutex_unlock(&(name))
 
+extern _glthread_TSD _gl_DispatchTSD;
+
+#define GL_CALL(name) \
+   (((__builtin_expect( _glapi_Dispatch != NULL, 1 )) \
+       ? _glapi_Dispatch : (struct _glapi_table *) pthread_getspecific(_gl_DispatchTSD.key))-> name)
+
 #endif /* PTHREADS */
 
 
@@ -291,8 +297,14 @@
 _glthread_SetTSD(_glthread_TSD *, void *);
 
 #ifndef GL_CALL
-# define GL_CALL(name) (*(_glapi_Dispatch-> name))
-#endif
+# if defined(THREADS)
+#  define GL_CALL(name) \
+   (((__builtin_expect( _glapi_Dispatch != NULL, 1 )) \
+       ? _glapi_Dispatch : _glapi_get_dispatch())-> name)
+# else
+#  define GL_CALL(name) (*(_glapi_Dispatch-> name))
+# endif /* defined(THREADS) */
+#endif  /* ndef GL_CALL */
 
 
 #endif /* THREADS_H */
Index: src/mesa/main/glheader.h
===================================================================
RCS file: /cvs/mesa/Mesa/src/mesa/main/glheader.h,v
retrieving revision 1.45
diff -u -d -r1.45 glheader.h
--- src/mesa/main/glheader.h    22 Apr 2004 00:27:32 -0000      1.45
+++ src/mesa/main/glheader.h    21 Jun 2004 21:17:26 -0000
@@ -328,6 +328,12 @@
 #endif
 
 
+#if !defined __GNUC__ || __GNUC__ < 3
+# define __builtin_expect(x, y) x
+#endif
+
+
+
 /**
  * Sometimes we treat GLfloats as GLints.  On x86 systems, moving a float
  * as a int (thereby using integer registers instead of FP registers) is
