Re: Fast thread-local storage for OpenGL drivers

2003-02-24 Thread Gareth Hughes
Roland McGrath wrote: In glibc, we actually allocate some excess space in the thread-local storage area layout determined at startup time. This lets a dynamically loaded module use static TLS if its PT_TLS segment fits in the available surplus. (In sysdeps/generic/dl-tls.c, see

Re: Fast thread-local storage for OpenGL drivers

2003-02-24 Thread Jakub Jelinek
On Sun, Feb 23, 2003 at 06:44:10PM -0800, Gareth Hughes wrote: In fact, we put this feature there with GL in mind... Did you inform the OpenGL vendors who were interested in this issue of this fact? Have you documented it anywhere, particularly in Ulrich Drepper's ELF Handling For

RE: Fast thread-local storage for OpenGL drivers

2003-02-23 Thread Gareth Hughes
Ulrich Weigand wrote: From what I can see, you are actually getting LE model code (omitting the ebp stuff which you can avoid using -fomit-frame-pointer): movl %gs:0, %eax; movl [EMAIL PROTECTED](%eax), %edx; movl (%edx), %ecx; jmp *%ecx

Re: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Dan Kegel
Dan Kegel wrote: Gareth Hughes wrote: It is critically important for OpenGL drivers to have fast (single-instruction) access to thread local variables. ... While glibc's new thread library implementation has many benefits, particularly to application programmers (with support for the new keyword

Re: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Dan Kegel
Dan Kegel wrote: There have been a few good replies on the NPTL list; see e.g. https://listman.redhat.com/pipermail/phil-list/2003-February/000615.html in which Roland expands a bit on his first post. Here's the upshot of Roland's post: OpenGL apps *can* avoid the function call Gareth was

Re: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Dan Kegel
Gareth Hughes wrote: To be clear, what I'm proposing requires ZERO code changes to Wine and glibc. It will work today, with Wine implementations already in the field. All I'm proposing we do is make sure it continues to work in the future, irrespective of the decision made regarding the use of

RE: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Gareth Hughes
Dan Kegel wrote: If it turns out glibc's new __thread variable support really can do what you need on all platforms, do you agree that it might be better to use that? Insert the word "new" between "all" and "platforms", and maybe you'll have more of an appreciation for my point of view :-) Which

RE: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Gareth Hughes
Dan Kegel wrote: It does look like the TLS model does what you want it to, and no new methods are needed. Can you explain in more detail why your new proposal is needed, if you still think it is? To be clear, what I'm proposing requires ZERO code changes to Wine and glibc. It will work

Re: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Gareth Hughes
Roland McGrath wrote: These people clearly haven't read all of the TLS paper, or looked at the GCC implementation of __thread long enough to notice -ftls-model and __attribute__ ((tls_model)). This is what I was talking about. I've read the entire document several times, and still can't see
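As a rough illustration of the two knobs Roland mentions, the sketch below tags a `__thread` variable with `__attribute__((tls_model(...)))`; passing `-ftls-model=initial-exec` on the GCC command line is the per-translation-unit counterpart. The struct, variable, and function names are hypothetical and not taken from any driver. Whether a dynamically loaded library can actually rely on a non-dynamic model is exactly what the thread is debating, so treat this as a syntax sketch only.

```c
#include <stddef.h>

/* Hypothetical stand-in for a driver's per-thread rendering context. */
struct gl_context {
    int id;
};

/*
 * Request the initial-exec TLS model for this one variable, overriding
 * whatever model the compiler would pick by default (for example,
 * general-dynamic under -fPIC). Names here are illustrative assumptions.
 */
static __thread struct gl_context *current_ctx
    __attribute__((tls_model("initial-exec")));

/* Bind a context to the calling thread. */
void set_current_context(struct gl_context *ctx)
{
    current_ctx = ctx;
}

/* Return the context bound to the calling thread, or NULL if none. */
struct gl_context *get_current_context(void)
{
    return current_ctx;
}
```

Compiling with `gcc -S` and inspecting the access sequence is a straightforward way to check which model the compiler actually emitted for a given variable.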

Re: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Daniel Jacobowitz
On Sat, Feb 22, 2003 at 09:51:26AM -0800, Gareth Hughes wrote: Roland McGrath wrote: These people clearly haven't read all of the TLS paper, or looked at the GCC implementation of __thread long enough to notice -ftls-model and __attribute__ ((tls_model)). This is what I was talking

Re: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Alexandre Julliard
Gareth Hughes writes: To be clear, what I'm proposing requires ZERO code changes to Wine and glibc. It will work today, with Wine implementations already in the field. All I'm proposing we do is make sure it continues to work in the future, irrespective of the decision made

Re: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Dan Kegel
Gareth Hughes wrote: Dan Kegel wrote: OK, all new platforms. Sounds like a good argument for tagging along with glibc's __thread variable support, if you ask me. We still have to support all older platforms as well. Sure, but it doesn't have to be as blazingly fast on old platforms. And,

RE: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Gareth Hughes
Alexandre Julliard wrote: This is not something we can guarantee. The layout of the thread structure in Wine is defined by Microsoft, and it's very possible that they will use these fields someday for something that we need to emulate. What about the area currently used for GDI? One other

RE: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Gareth Hughes
Dan Kegel wrote: OK, all new platforms. Sounds like a good argument for tagging along with glibc's __thread variable support, if you ask me. We still have to support all older platforms as well. -- Gareth Hughes, OpenGL Developer, NVIDIA Corporation

RE: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Gareth Hughes
Daniel Jacobowitz wrote: Note the "always" in Roland's paragraph. Note the fact that he said it would require one of the dynamic access models (GD or LD), which require at least one function call to access thread local variables. As I've said, this is an unacceptable hit on performance. When

RE: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Gareth Hughes
Jakub Jelinek wrote: What actually matters is the size of PT_TLS segment of the shared library which defines those 2-3 __thread variables (I assume it is libGL.so, right?). Generally, yes. It would be good if the rest of __thread variables which aren't performance critical is provided by

RE: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Gareth Hughes
Dan Kegel wrote: Sure, but it doesn't have to be as blazingly fast on old platforms. Actually, yes it does. And, practically speaking, what you're looking for might in fact only be possible using the new glibc stuff; as Alexandre said, Wine might be forced to break any constraint you try

Re: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Jakub Jelinek
On Sat, Feb 22, 2003 at 10:32:05AM -0800, Gareth Hughes wrote: Two or three pointers. I'm pretty sure we use less than 8 pointers all up, although many of those aren't performance critical. Three of ours most definitely are, and it would be nice if moving to a couple more didn't break

Re: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Jakub Jelinek
On Sat, Feb 22, 2003 at 09:51:26AM -0800, Gareth Hughes wrote: This is what I was talking about. I've read the entire document several times, and still can't see a way that a dynamically loadable shared library can be guaranteed to use the single-instruction Local Exec access model. If I'm

Re: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Alexandre Julliard
Gareth Hughes writes: What about the area currently used for GDI? One other method could be to have space at a negative offset from %fs reserved for OpenGL, similar to what the IA32 implementation of NPTL does in Variant II of its design. There are no more guarantees with

Re: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Dan Kegel
Gareth Hughes wrote: Alexandre Julliard wrote: There are no more guarantees with the GDI area than with anything else. Microsoft is free to change that whenever they feel like it. Negative offsets could work, but they would waste a full page of memory for each thread

Re: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread David Laight
I suspect most of the old versions of Linux you're worried about will vanish from users' desktops fairly quickly as soon as Linux distributions ship versions that include the current suite of improvements. (Not that people will kill for the new glibc, but there are a whole lot of

RE: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Gareth Hughes
Alexandre Julliard wrote: There are no more guarantees with the GDI area than with anything else. Microsoft is free to change that whenever they feel like it. Negative offsets could work, but they would waste a full page of memory for each thread since the TEB has

RE: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Gareth Hughes
Here's a sample app and library that uses a couple of __thread variables in a manner similar to OpenGL. I'd appreciate it if someone could take a look and explain what I'm doing incorrectly, as I've declared the variables as LE and yet seem to be getting IE. Note that I'm away from home and

Re: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Dan Kegel
David Laight wrote: I suspect most of the old versions of Linux you're worried about will vanish from users' desktops fairly quickly as soon as Linux distributions ship versions that include the current suite of improvements. (Not that people will kill for the new glibc, but there are a whole lot

Re: Fast thread-local storage for OpenGL drivers

2003-02-22 Thread Ulrich Weigand
Gareth Hughes wrote: Here's a sample app and library that uses a couple of __thread variables in a manner similar to OpenGL. I'd appreciate it if someone could take a look and explain what I'm doing incorrectly, as I've declared the variables as LE and yet seem to be getting IE. Note that

Fast thread-local storage for OpenGL drivers

2003-02-21 Thread Gareth Hughes
It is critically important for OpenGL drivers to have fast (single-instruction) access to thread local variables. I'd be happy to provide more information to anyone who's interested, but a typical case where TLS access can severely hurt performance is at the very front-end of an OpenGL library.
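To make the "front-end" cost concrete, here is a minimal sketch of a library entry point that does nothing except fetch a per-thread dispatch table and forward the call. Every name in it (`dispatch_table`, `tls_dispatch`, `my_vertex3f`, and so on) is a hypothetical illustration, not an actual OpenGL driver interface. Because the entry point is only a TLS load followed by an indirect call, any extra work in the TLS access is paid on every API call.

```c
#include <stddef.h>

/* Hypothetical per-context table of function pointers. */
struct dispatch_table {
    void (*vertex3f)(float x, float y, float z);
};

/* Per-thread pointer to the active table; this is the hot TLS variable. */
static __thread const struct dispatch_table *tls_dispatch;

/* Counts calls that reached the back end, for demonstration only. */
static int backend_calls;

/* Hypothetical back-end implementation selected through the table. */
static void vertex3f_impl(float x, float y, float z)
{
    (void)x; (void)y; (void)z;
    backend_calls++;
}

static const struct dispatch_table default_table = { vertex3f_impl };

/* Bind the default table to the calling thread. */
void make_current(void)
{
    tls_dispatch = &default_table;
}

/* Front-end entry point: one TLS load, then an indirect call. */
void my_vertex3f(float x, float y, float z)
{
    tls_dispatch->vertex3f(x, y, z);
}

/* Report how many calls reached the back end. */
int backend_call_count(void)
{
    return backend_calls;
}
```

In a design like this, the quality of the code generated for reading `tls_dispatch` dominates the per-call overhead, which is why the access model matters so much here.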

Re: Fast thread-local storage for OpenGL drivers

2003-02-21 Thread Dan Kegel
Gareth Hughes wrote: It is critically important for OpenGL drivers to have fast (single-instruction) access to thread local variables. ... While glibc's new thread library implementation has many benefits, particularly to application programmers (with support for the new keyword '__thread', and so