Re: Automatic CDECL / STDCALL translation
> Hmm. Could you send me a disassembly of the thunk as it is actually generated?

OK, here you are, directly from Wine's debugger:

Wine-dbg>bt
Backtrace:
=>0 0x4967487e (wine_glGetString) (ebp=4eb610a0)
Wine-dbg>disas 0x4967487e
0x4967487e (wine_glGetString): movl 0x0(%esp),%eax
0x49674881 (wine_glGetString+0x3): leal 0x4(%esp),%edx
0x49674885 (.wine_glGetString [opengl_norm.c]): movl 0x0(%edx),%ecx
0x49674887 (.wine_glGetString+0x2 [opengl_norm.c]): movl %eax,0x0(%edx)
0x49674889 (.wine_glGetString+0x4 [opengl_norm.c]): movl %ecx,%eax
0x4967488b (.wine_glGetString+0x6 [opengl_norm.c]): subl $4,%edx
0x4967488e (.wine_glGetString+0x9 [opengl_norm.c]): leal 0x0(%esp),%ecx
0x49674891 (.wine_glGetString+0xc [opengl_norm.c]): cmpl %ecx,%edx
0x49674893 (.wine_glGetString+0xe [opengl_norm.c]): jnl 0x49674885 (.wine_glGetString [opengl_norm.c])
0x49674895 (.wine_glGetString+0x10 [opengl_norm.c]): call 0x4967489a (.wine_glGetString.getgot.enter [opengl_norm.c])
0x4967489a (.wine_glGetString.getgot.enter [opengl_norm.c]): popl %ebx
0x4967489b (.wine_glGetString.getgot.enter+0x1 [opengl_norm.c]): addl $0x14d32,%ebx
0x496748a1 (.wine_glGetString.getgot.enter+0x7 [opengl_norm.c]): pushl 0x6e4(%ebx)
0x496748a7 (.wine_glGetString.getgot.enter+0xd [opengl_norm.c]): call 0x4966fd98 (_init+0xa74)
0x496748ac (.wine_glGetString.getgot.enter+0x12 [opengl_norm.c]): call 0x496706f8 (_init+0x13d4)
0x496748b1 (.wine_glGetString.getgot.enter+0x17 [opengl_norm.c]): call 0x496748b6 (.wine_glGetString.getgot.leave [opengl_norm.c])
0x496748b6 (.wine_glGetString.getgot.leave [opengl_norm.c]): popl %ebx
0x496748b7 (.wine_glGetString.getgot.leave+0x1 [opengl_norm.c]): addl $0x14d16,%ebx
0x496748bd (.wine_glGetString.getgot.leave+0x7 [opengl_norm.c]): pushl 0x6e4(%ebx)
0x496748c3 (.wine_glGetString.getgot.leave+0xd [opengl_norm.c]): call 0x49670738 (_init+0x1414)
0x496748c8 (.wine_glGetString.getgot.leave+0x12 [opengl_norm.c]): addl $4,%esp
0x496748cb (.wine_glGetString.getgot.leave+0x15 [opengl_norm.c]): ret

> Also helpful would be to get a stack dump at the point when the thunk is initially called, and another one at the point the called real glGetString routine has returned ...

At the breakpoint, the stack looks like this (at least I hope it's where the stack should be :-) ):

Wine-dbg>x /32x $esp
0x49646948 (_end+0x1798b8): 0041b6e2 1f03 4eb610a0
0x49646958 (_end+0x1798c8): 4ea44bc8 4017b388
0x49646968 (_end+0x1798d8): 081e0e80 49646980 407d8073 0001
0x49646978 (_end+0x1798e8): 081e0e80 496469ac 40180cd7 0021
0x49646988 (_end+0x1798f8): 407d8d3f 496895cc 40180c96 4024f658
0x49646998 (_end+0x179908): 0034 4024d330 0b78
0x496469a8 (_end+0x179918): 0001 496469dc 4017c249 0034
0x496469b8 (_end+0x179928): 0001 4017c1ba 496895cc

0x1F03 must be the argument to the glGetString function (it's the GL_EXTENSIONS enumerant).

Now, the stack before the call to the real glGetString is hard to get because of the entering of the X11 crit. section (VERY annoying, this). So here I start again, but with '$critical_section' set to '0' in Patrik's script:

Wine-dbg>bt
Backtrace:
=>0 0x4967417c (wine_glGetString) (ebp=4eb610a0)
Wine-dbg>disas
0x4967417c (wine_glGetString): movl 0x0(%esp),%eax
0x4967417f (wine_glGetString+0x3): leal 0x4(%esp),%edx
0x49674183 (.wine_glGetString [opengl_norm.c]): movl 0x0(%edx),%ecx
0x49674185 (.wine_glGetString+0x2 [opengl_norm.c]): movl %eax,0x0(%edx)
0x49674187 (.wine_glGetString+0x4 [opengl_norm.c]): movl %ecx,%eax
0x49674189 (.wine_glGetString+0x6 [opengl_norm.c]): subl $4,%edx
0x4967418c (.wine_glGetString+0x9 [opengl_norm.c]): leal 0x0(%esp),%ecx
0x4967418f (.wine_glGetString+0xc [opengl_norm.c]): cmpl %ecx,%edx
0x49674191 (.wine_glGetString+0xe [opengl_norm.c]): jnl 0x49674183 (.wine_glGetString [opengl_norm.c])
0x49674193 (.wine_glGetString+0x10 [opengl_norm.c]): call 0x496706f8 (_init+0x13d4)
0x49674198 (.wine_glGetString+0x15 [opengl_norm.c]): addl $4,%esp
0x4967419b (.wine_glGetString+0x18 [opengl_norm.c]): ret
Wine-dbg>x /32x $esp
0x49646948 (_end+0x1798b8): 0041b6e2 1f03 4eb610a0
0x49646958 (_end+0x1798c8): 4ea44bc8 407db616
0x49646968 (_end+0x1798d8): 081e17a8 49646980 407d8073 0001
0x49646978 (_end+0x1798e8): 081e17a8 080b3c58 496469cc 496469cc
0x49646988 (_end+0x1798f8): 407d8d3f 49684c6c 4040bb20 4024d330
0x49646998 (_end+0x179908): 080b42c4 496469ac 0d48
0x496469a8 (_end+0x179918): 8e42c4de 10f40001 0001
0x496469b8 (_end+0x179928): 0220042f 081d1d6c 4017c1ba 49684c6c
(...)
Wine-dbg>ni
0x49674193 (.wine_glGetString+0x10 [opengl_norm.c]): call 0x496706f8 (_init+0x13d4)
Wine-dbg>x /32x $esp
0x49646948 (_end+0x1798b8): 1f03 0041b6e2
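
The copy loop at the top of the thunk can be checked against the two stack dumps in this message with a small C model. This is only a sketch: the helper name rotate_return_address and the array-as-stack representation are assumptions made for illustration, not code from Wine or from Patrik's script. The loop moves the return address from the bottom slot to the top one and shifts every argument down by one slot, which is the layout a CDECL callee expects once `call` has pushed a fresh return address.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of the thunk's copy loop (hypothetical helper, for illustration).
 * stack[0] is the word at (%esp) on entry, i.e. the caller's return
 * address; stack[1..nargs] are the STDCALL arguments. nargs = argsize / 4.
 */
static void rotate_return_address(uint32_t *stack, int nargs)
{
    uint32_t eax = stack[0];        /* movl (%esp), %eax        */
    int edx = nargs;                /* leal argsize(%esp), %edx */

    do {
        uint32_t ecx = stack[edx];  /* movl (%edx), %ecx        */
        stack[edx] = eax;           /* movl %eax, (%edx)        */
        eax = ecx;                  /* movl %ecx, %eax          */
        edx--;                      /* subl $4, %edx            */
    } while (edx >= 0);             /* cmpl %ecx, %edx ; jge    */
}
```

For the one-argument glGetString case, an entry stack of { 0x0041b6e2, 0x1f03 } becomes { 0x1f03, 0x0041b6e2 }, which agrees with the dump taken just before the call to the real function.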
RE: Automatic CDECL / STDCALL translation
> Register dump:
>  CS:0023 SS:002b DS:002b ES:002b FS:0257 GS:
>  EIP:496706f8 ESP:49646944 EBP:4eb610a0 EFLAGS:00010297( R- 00 I S -A-P1C)
>  EAX:0041b6e2 EBX: ECX:49646948 EDX:49646944
>  ESI:4eb610a0 EDI:
> [snip]
> Backtrace:
> =>0 0x496706f8 (_init+0x13d4) (ebp=4eb610a0)
> 0x496706f8 (_init+0x13d4): jmp *0x4f0(%ebx)
>
> Hope this helps

Hmm, EBX is 0. That shouldn't happen.
This is perhaps because the GOT must be reloaded in EBX before the call.

Perhaps it worked for me by pure luck. A previous function had set EBX to the GOT. Not very unlikely though.

Anyway, this new version works for me as well.

Index: wine/dlls/opengl32/make_opengl_norm
===
RCS file: /home/wine/wine/dlls/opengl32/make_opengl_norm,v
retrieving revision 1.2
diff -u -u -r1.2 make_opengl_norm
--- wine/dlls/opengl32/make_opengl_norm 2000/05/18 00:07:53 1.2
+++ wine/dlls/opengl32/make_opengl_norm 2000/05/20 10:30:06
@@ -1,5 +1,9 @@
 #!/usr/bin/perl -w
+my $i386 = 1;
+my $critical_section = 1;
+my $pic = 1;
+
 print "
 /* Auto-generated file... Do not edit ! */
@@ -9,10 +13,64 @@
 ";
+print '#define THUNK_STDCALL_TO_CDECL(stdcall_name, cdecl_name, argsize)' . " \\\n";
+print ' asm("\t.globl\t" #stdcall_name "\n"' . " \\\n";
+print '"\t.type\t" #stdcall_name ", @function\n"' . " \\\n";
+print '#stdcall_name ":\n"' . " \\\n";
+print '"\tmovl (%esp), %eax\n"' . " \\\n";
+print '"\tleal " #argsize "(%esp), %edx\n"' . " \\\n";
+print '"." #stdcall_name ":\n"' . " \\\n";
+print '"\tmovl (%edx), %ecx\n"' . " \\\n";
+print '"\tmovl %eax, (%edx)\n"' . " \\\n";
+print '"\tmovl %ecx, %eax\n"' . " \\\n";
+print '"\tsubl $4, %edx\n"' . " \\\n";
+print '"\tleal (%esp), %ecx\n"' . " \\\n";
+print '"\tcmpl %ecx, %edx\n"' . " \\\n";
+print '"\tjge ." #stdcall_name "\n"' . " \\\n";
+if($critical_section) {
+if($pic) {
+    print '"\tcall ." #stdcall_name ".getgot.enter\n"' . " \\\n";
+    print '"." #stdcall_name ".getgot.enter:\n"' . " \\\n";
+    print '"\tpopl %ebx\n"' . " \\\n";
+    print '"\taddl $_GLOBAL_OFFSET_TABLE_+[.-." #stdcall_name ".getgot.enter], %ebx\n"' . " \\\n";
+    print '"\tpushl X11DRV_CritSection@GOT(%ebx)\n"' . " \\\n";
+    print '"\tcall EnterCriticalSection@PLT\n"' . " \\\n";
+} else {
+    print '"\tpushl $X11DRV_CritSection\n"' . " \\\n";
+    print '"\tcall EnterCriticalSection\n"' . " \\\n";
+}
+}
+if($pic) {
+print '"\tcall ." #stdcall_name ".getgot.call\n"' . " \\\n";
+print '"." #stdcall_name ".getgot.call:\n"' . " \\\n";
+print '"\tpopl %ebx\n"' . " \\\n";
+print '"\taddl $_GLOBAL_OFFSET_TABLE_+[.-." #stdcall_name ".getgot.call], %ebx\n"' . " \\\n";
+print '"\tcall " #cdecl_name "@PLT\n"' . " \\\n";
+} else {
+print '"\tcall " #cdecl_name "\n"' . " \\\n";
+}
+if($critical_section) {
+if($pic) {
+    print '"\tcall ." #stdcall_name ".getgot.leave\n"' . " \\\n";
+    print '"." #stdcall_name ".getgot.leave:\n"' . " \\\n";
+    print '"\tpopl %ebx\n"' . " \\\n";
+    print '"\taddl $_GLOBAL_OFFSET_TABLE_+[.-." #stdcall_name ".getgot.leave], %ebx\n"' . " \\\n";
+    print '"\tpushl X11DRV_CritSection@GOT(%ebx)\n"' . " \\\n";
+    print '"\tcall LeaveCriticalSection@PLT\n"' . " \\\n";
+} else {
+    print '"\tpushl $X11DRV_CritSection\n"' . " \\\n";
+    print '"\tcall LeaveCriticalSection\n"' . " \\\n";
+}
+}
+print '"\taddl $" #argsize ", %esp\n"' . " \\\n";
+print '"\tret\n"' . " \\\n";
+print '"\t.size\t" #stdcall_name ", .-" #stdcall_name "\n"' . " \\\n";
+print ' );' . "\n";
+
 #
 # Now, the functions from the include file
 #
-open(INC, "/usr/X11R6/include/GL/gl.h") || die "Could not open GL/gl.h";
+open(INC, "/usr/include/GL/gl.h") || die "Could not open GL/gl.h";
 while ($line = <INC>) {
     if ($line =~ /GLAPI.*GLAPIENTRY/) {
         # Start of a function declaration
@@ -22,9 +80,11 @@
         if (($name !~ /(MESA|PGI|ARB|EXT)/) ||
             ($name =~ /MultiTexCoord/) ||
             ($name =~ /ActiveTextureARB/)) {
+            print "/***\n" ;
             print " *\t\t$name\n";
             print " */\n";
+            print "\n/* " if $i386;
             print "$ret WINAPI wine_$name(";
             @rargs = ();
             @names = ();
@@ -61,32 +121,41 @@
             foreach (@rargs) {
                 print ", $_";
             }
-            print ") {\n";
-            if ($ret !~ /void/) {
-                print " $ret ret;\n";
-            }
-            print " ENTER_GL();\n";
-            if ($ret !~ /void/) {
-                print " ret = ";
+            if($i386) {
+                print ") */\n";
             } else {
-                print " ";
+                print ") {\n";
             }
-
Re: Automatic CDECL / STDCALL translation
Patrik Stridvall wrote:

> Hmm, EBX is 0. That shouldn't happen.
> This is perhaps because the GOT must be reloaded in EBX before the call.

Correct. You need to have the GOT in %ebx before calling any PLT stub ... (Hmpf. I missed that as well :-/)

> Perhaps it worked for me by pure luck. A previous function had set EBX to the GOT. Not very unlikely though.

Probably because you loaded it just before, when calling EnterCriticalSection ;-)

> Anyway, this new version works for me as well.

Hmmm. There's still one problem, though: both the Linux/ELF and the Win32 ABI assume that %ebx is preserved across function calls. On the one hand, this means that you don't need to reload the GOT all the time; on the other hand, and much more importantly, it means that you really shouldn't corrupt the caller's %ebx :-/

This won't have any effect if you're just running trivial test cases, but when a non-trivial routine calls one of these stubs, it will matter ...

Bye,
Ulrich

--
Dr. Ulrich Weigand
[EMAIL PROTECTED]
Re: Automatic CDECL / STDCALL translation
[snip]

> > +print '"\tpopl %ecx\n"' . " \\\n";
> > +print '"\tsubl $" #argsize ", %esp\n"' . " \\\n";
> > +print '"\tjmp *%ecx\n"' . " \\\n";
>
> This appears to be broken; you need to *add* the argsize instead of subtracting it, and furthermore the return address lies now *above* the arguments after the stack permutation you did above ;-)
>
> What about this instead of the last three lines:
>
> print "\taddl $" #argsize ", %esp\n";
> print "\tret\n";

I did that and it does not solve the problem: it crashes at the same GL call (glGetString), but this time at address 0x and not 0x1F00 as before.

How can I help debug this further (except by looking at an x86 ASM book :-) )?

--
Lionel Ulmer - [EMAIL PROTECTED]
My Advogato Wine diary : http://www.advogato.org/person/bbrox/
RE: Automatic CDECL / STDCALL translation
> > What about this instead of the last three lines:
> >
> > print "\taddl $" #argsize ", %esp\n";
> > print "\tret\n";
>
> I did that and it does not solve the problem: it crashes at the same GL call (glGetString), but this time at address 0x and not 0x1F00 as before.
>
> How can I help debug this further (except by looking at an x86 ASM book :-) )?

Hmm. I have finally managed to test it and it works fine for me. In addition to a simple example application, I ran through _every_ OpenGL API with NULL values using winapi_test.

You do compile with -fPIC, don't you?

Anyway, here is my latest version.

Index: wine/dlls/opengl32/make_opengl_norm
===
RCS file: /home/wine/wine/dlls/opengl32/make_opengl_norm,v
retrieving revision 1.2
diff -u -u -r1.2 make_opengl_norm
--- wine/dlls/opengl32/make_opengl_norm 2000/05/18 00:07:53 1.2
+++ wine/dlls/opengl32/make_opengl_norm 2000/05/20 00:09:05
@@ -1,5 +1,9 @@
 #!/usr/bin/perl -w
+my $i386 = 1;
+my $critical_section = 1;
+my $pic = 1;
+
 print "
 /* Auto-generated file... Do not edit ! */
@@ -9,10 +13,56 @@
 ";
+print '#define THUNK_STDCALL_TO_CDECL(stdcall_name, cdecl_name, argsize)' . " \\\n";
+print ' asm("\t.globl\t" #stdcall_name "\n"' . " \\\n";
+print '"\t.type\t" #stdcall_name ", @function\n"' . " \\\n";
+print '#stdcall_name ":\n"' . " \\\n";
+print '"\tmovl (%esp), %eax\n"' . " \\\n";
+print '"\tleal " #argsize "(%esp), %edx\n"' . " \\\n";
+print '"." #stdcall_name ":\n"' . " \\\n";
+print '"\tmovl (%edx), %ecx\n"' . " \\\n";
+print '"\tmovl %eax, (%edx)\n"' . " \\\n";
+print '"\tmovl %ecx, %eax\n"' . " \\\n";
+print '"\tsubl $4, %edx\n"' . " \\\n";
+print '"\tleal (%esp), %ecx\n"' . " \\\n";
+print '"\tcmpl %ecx, %edx\n"' . " \\\n";
+print '"\tjge ." #stdcall_name "\n"' . " \\\n";
+if($critical_section) {
+if($pic) {
+    print '"\tcall ." #stdcall_name ".getgot.enter\n"' . " \\\n";
+    print '"." #stdcall_name ".getgot.enter:\n"' . " \\\n";
+    print '"\tpopl %ebx\n"' . " \\\n";
+    print '"\taddl $_GLOBAL_OFFSET_TABLE_+[.-." #stdcall_name ".getgot.enter], %ebx\n"' . " \\\n";
+    print '"\tpushl X11DRV_CritSection@GOT(%ebx)\n"' . " \\\n";
+    print '"\tcall EnterCriticalSection@PLT\n"' . " \\\n";
+} else {
+    print '"\tpushl $X11DRV_CritSection\n"' . " \\\n";
+    print '"\tcall EnterCriticalSection\n"' . " \\\n";
+}
+}
+print '"\tcall " #cdecl_name "@PLT\n"' . " \\\n";
+if($critical_section) {
+if($pic) {
+    print '"\tcall ." #stdcall_name ".getgot.leave\n"' . " \\\n";
+    print '"." #stdcall_name ".getgot.leave:\n"' . " \\\n";
+    print '"\tpopl %ebx\n"' . " \\\n";
+    print '"\taddl $_GLOBAL_OFFSET_TABLE_+[.-." #stdcall_name ".getgot.leave], %ebx\n"' . " \\\n";
+    print '"\tpushl X11DRV_CritSection@GOT(%ebx)\n"' . " \\\n";
+    print '"\tcall LeaveCriticalSection@PLT\n"' . " \\\n";
+} else {
+    print '"\tpushl $X11DRV_CritSection\n"' . " \\\n";
+    print '"\tcall LeaveCriticalSection\n"' . " \\\n";
+}
+}
+print '"\taddl $" #argsize ", %esp\n"' . " \\\n";
+print '"\tret\n"' . " \\\n";
+print '"\t.size\t" #stdcall_name ", .-" #stdcall_name "\n"' . " \\\n";
+print ' );' . "\n";
+
 #
 # Now, the functions from the include file
 #
-open(INC, "/usr/X11R6/include/GL/gl.h") || die "Could not open GL/gl.h";
+open(INC, "/usr/include/GL/gl.h") || die "Could not open GL/gl.h";
 while ($line = <INC>) {
     if ($line =~ /GLAPI.*GLAPIENTRY/) {
         # Start of a function declaration
@@ -22,9 +72,11 @@
         if (($name !~ /(MESA|PGI|ARB|EXT)/) ||
             ($name =~ /MultiTexCoord/) ||
             ($name =~ /ActiveTextureARB/)) {
+            print "/***\n" ;
             print " *\t\t$name\n";
             print " */\n";
+            print "\n/* " if $i386;
             print "$ret WINAPI wine_$name(";
             @rargs = ();
             @names = ();
@@ -61,32 +113,41 @@
             foreach (@rargs) {
                 print ", $_";
             }
-            print ") {\n";
-            if ($ret !~ /void/) {
-                print " $ret ret;\n";
-            }
-            print " ENTER_GL();\n";
-            if ($ret !~ /void/) {
-                print " ret = ";
+            if($i386) {
+                print ") */\n";
             } else {
-                print " ";
+                print ") {\n";
             }
-            print "$name(";
-
-            $farg = shift @names;
-            if ($farg) {
-                print "$farg";
-                foreach (@names) {
-                    print ", $_";
+            if(!$i386) {
+                if ($ret !~ /void/) {
+                    print " $ret ret;\n";
                 }
+                print " ENTER_GL();\n";
+                if ($ret !~ /void/) {
Re: Automatic CDECL / STDCALL translation
> I have made a small hack (attached) that tries to adapt my Solaris C patch to this problem. I haven't for various reasons been able to test it so it might (read: will) not work, but it is better than nothing I hope.

OK, some comments on the patch:

1) you seem to have generated your opengl_norm.c from a 'strange' source, as it includes the 'glBlendFuncSeparateINGR' function (which should not be part of the '_norm.c' file as it is an extension). This is the reason why I could not link in the generated libopengl32.so.

2) it crashes on the first call to an OpenGL function:

Call opengl32.413: wglCreateContext(2ef0) ret=00418e11 fs=0247
trace:opengl:wglCreateContext (2ef0)
Ret opengl32.413: wglCreateContext() retval=08100460 ret=00418e11 fs=0247
Call opengl32.423: wglMakeCurrent(2ef0,08100460) ret=00418e8b fs=0247
trace:opengl:wglMakeCurrent (2ef0,0x8100460)
Ret opengl32.423: wglMakeCurrent() retval=0001 ret=00418e8b fs=0247
Call opengl32.153: glGetString(1f00) ret=00418ef8 fs=0247

Here, I am going into the debugger with this:

Unhandled exception: page fault on read access to 0x1f00 in 32-bit code (0x1f00).
In 32 bit mode.
0x1f00:
Wine-dbg>bt
Backtrace:
=>0 0x1f00 (ebp=495ef3e0)
  1 0x496e22e9 (OPENGL32.DLL.glGetString+0x5) (ebp=4e8b4a80)

So it seems that instead of calling the glGetString function, we call the argument of this function...

--
Lionel Ulmer - [EMAIL PROTECTED]
My Advogato Wine diary : http://www.advogato.org/person/bbrox/
RE: Automatic CDECL / STDCALL translation
> > I have made a small hack (attached) that tries to adapt my Solaris C patch to this problem. I haven't for various reasons been able to test it so it might (read: will) not work, but it is better than nothing I hope.
>
> OK, some comments on the patch:
>
> 1) you seem to have generated your opengl_norm.c from a 'strange' source, as it includes the 'glBlendFuncSeparateINGR' function (which should not be part of the '_norm.c' file as it is an extension). This is the reason why I could not link in the generated libopengl32.so.

Yes. It lacks GL/glext.h as well; that is one of the reasons I couldn't test it. But you have the Perl script, so just regenerate it.

BTW, it is the latest MesaGL package from unstable Debian, so I don't understand why it lacks GL/glext.h but has glBlendFuncSeparateINGR.

> 2) it crashes on the first call to an OpenGL function:
>
> Call opengl32.413: wglCreateContext(2ef0) ret=00418e11 fs=0247
> trace:opengl:wglCreateContext (2ef0)
> Ret opengl32.413: wglCreateContext() retval=08100460 ret=00418e11 fs=0247
> Call opengl32.423: wglMakeCurrent(2ef0,08100460) ret=00418e8b fs=0247
> trace:opengl:wglMakeCurrent (2ef0,0x8100460)
> Ret opengl32.423: wglMakeCurrent() retval=0001 ret=00418e8b fs=0247
> Call opengl32.153: glGetString(1f00) ret=00418ef8 fs=0247
>
> Here, I am going into the debugger with this:
>
> Unhandled exception: page fault on read access to 0x1f00 in 32-bit code (0x1f00).

Perhaps it is because of the missing @PLT in PIC mode.

> In 32 bit mode.
> 0x1f00:
> Wine-dbg>bt
> Backtrace:
> =>0 0x1f00 (ebp=495ef3e0)
>   1 0x496e22e9 (OPENGL32.DLL.glGetString+0x5) (ebp=4e8b4a80)
>
> So it seems that instead of calling the glGetString function, we call the argument of this function...

Yes, we probably must jump through the table instead (@PLT).
Re: Automatic CDECL / STDCALL translation
Patrik Stridvall wrote:

> +print '#stdcall_name ":\n"' . " \\\n";
> +print '"\tmovl (%esp), %eax\n"' . " \\\n";
> +print '"\tleal " #argsize "(%esp), %edx\n"' . " \\\n";
> +print '"." #stdcall_name ":\n"' . " \\\n";
> +print '"\tmovl (%edx), %ecx\n"' . " \\\n";
> +print '"\tmovl %eax, (%edx)\n"' . " \\\n";
> +print '"\tmovl %ecx, %eax\n"' . " \\\n";
> +print '"\tsubl $4, %edx\n"' . " \\\n";
> +print '"\tleal (%esp), %ecx\n"' . " \\\n";
> +print '"\tcmpl %ecx, %edx\n"' . " \\\n";
> +print '"\tjge ." #stdcall_name "\n"' . " \\\n";

[snip]

> +print '"\tpopl %ecx\n"' . " \\\n";
> +print '"\tsubl $" #argsize ", %esp\n"' . " \\\n";
> +print '"\tjmp *%ecx\n"' . " \\\n";

This appears to be broken; you need to *add* the argsize instead of subtracting it, and furthermore the return address lies now *above* the arguments after the stack permutation you did above ;-)

What about this instead of the last three lines:

print "\taddl $" #argsize ", %esp\n";
print "\tret\n";

Bye,
Ulrich

--
Dr. Ulrich Weigand
[EMAIL PROTECTED]
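
The difference between the two epilogues can be made concrete with a tiny C model of %esp and the word that control is transferred to. This is a sketch under stated assumptions: struct cpu and both function names are hypothetical, and the stack contents are taken from the relay trace elsewhere in this thread (glGetString(1f00), ret=00418ef8). After the permutation and the return from the CDECL function, (%esp) holds the first argument and the caller's return address sits argsize bytes above it.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal machine state for the epilogue (hypothetical, for illustration). */
struct cpu {
    const uint32_t *stack; /* stack[0] is the lowest address modelled      */
    int esp;               /* index into stack; one unit stands for 4 bytes */
    uint32_t eip;          /* address control is transferred to             */
};

/* First version: popl %ecx ; subl $argsize, %esp ; jmp *%ecx */
static void epilogue_pop_jmp(struct cpu *c, int nargs)
{
    uint32_t ecx = c->stack[c->esp]; /* pops the first argument, not the */
    c->esp += 1;                     /* return address                   */
    c->esp -= nargs;
    c->eip = ecx;
}

/* Suggested version: addl $argsize, %esp ; ret */
static void epilogue_add_ret(struct cpu *c, int nargs)
{
    c->esp += nargs;                 /* step over the arguments           */
    c->eip = c->stack[c->esp];       /* ret pops the real return address  */
    c->esp += 1;
}
```

With the stack { 0x1f00, 0x00418ef8 } and one argument, the first version ends up jumping to 0x1f00, which is consistent with the reported page fault address, while the second returns to 0x00418ef8 with both the argument and the return address removed, as STDCALL requires.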
RE: Automatic CDECL / STDCALL translation
> Patrik Stridvall wrote:
>
> > +print '#stdcall_name ":\n"' . " \\\n";
> > +print '"\tmovl (%esp), %eax\n"' . " \\\n";
> > +print '"\tleal " #argsize "(%esp), %edx\n"' . " \\\n";
> > +print '"." #stdcall_name ":\n"' . " \\\n";
> > +print '"\tmovl (%edx), %ecx\n"' . " \\\n";
> > +print '"\tmovl %eax, (%edx)\n"' . " \\\n";
> > +print '"\tmovl %ecx, %eax\n"' . " \\\n";
> > +print '"\tsubl $4, %edx\n"' . " \\\n";
> > +print '"\tleal (%esp), %ecx\n"' . " \\\n";
> > +print '"\tcmpl %ecx, %edx\n"' . " \\\n";
> > +print '"\tjge ." #stdcall_name "\n"' . " \\\n";
>
> [snip]
>
> > +print '"\tpopl %ecx\n"' . " \\\n";
> > +print '"\tsubl $" #argsize ", %esp\n"' . " \\\n";
> > +print '"\tjmp *%ecx\n"' . " \\\n";
>
> This appears to be broken; you need to *add* the argsize instead of subtracting it, and furthermore the return address lies now *above* the arguments after the stack permutation you did above ;-)
>
> What about this instead of the last three lines:
>
> print "\taddl $" #argsize ", %esp\n";
> print "\tret\n";

Yes, that is probably correct.

As a side note, this is what I did in one of the cases in my Solaris C patch. A typical example of Murphy's law: I knew that there were two cases, but I chose the wrong one...
RE: Automatic CDECL / STDCALL translation
> > > 1) instead of generating C code for the conversion (as in opengl_norm.c), generate some ASM in-line to do it as fast as possible. The problem with this is how to get the address of the 'destination' function to put in the ASM...
> >
> > I'm not sure exactly what you mean by how to get the address of the destination function, since you have "static" thunks, but I think I have solved that problem for the Solaris C thunking that is dynamic.
>
> What I meant is that these 'thunks' would call libGL functions. So the function address I need to call is only known at link time.

Yes, but that is the problem of the linker. It is the same problem whether you write in assembler or C, and regardless of whether you make the call directly or indirectly through a function pointer.

OK, in assembler you need to have special PIC code, as in my attempt at an implementation in the most recent mail.

> For 'dynamic' thunks, the function address is known, but I do not know how to get it in the ASM code for 'to be linked in' functions.

You use the words static and dynamic differently than I do. When I say dynamic, I mean that the number of arguments or other specific data is not known at compile or link time.

> > Thread safety is a problem, however I think I have solved that. What I did was to reshuffle the stack and allocate space _before_ the arguments. Not that efficient, but it is optimized assembler and likely more efficient than what GNU C does. I have attached some code that might interest you.
>
> I will look at it. Anyway, I would like to hack something quickly (even not thread safe) just to see how many FPS I can gain, to see if I even need to bother trying to optimize this :-)

Try the patch in my most recent mail.

> As for thread safety, for most of the OpenGL apps that people will use with Wine (i.e. games :-) ), there should be no need to have it... So having a non-thread-safe but fast solution that could be compiled in at configure time could be nice (and as I can easily detect when an app is doing multi-threaded OpenGL calls, I can even warn the user with an ERR that he should recompile Wine).

My patch (the latest one) is designed for a worst case scenario where

1. There might be calls from multiple threads
2. There might be reentrancy in the same thread.

I'm not sure 2 is even possible for OpenGL functions. Are there OpenGL functions that take callbacks? And if so, are these callbacks allowed to call OpenGL functions? If not, the thunking might be done faster.
Re: Automatic CDECL / STDCALL translation
On Wed, May 17, 2000 at 10:25:02PM +0200, Lionel Ulmer wrote:

> As for thread safety, for most of the OpenGL apps that people will use with Wine (i.e. games :-) ), there should be no need to have it... So having a non-thread-safe but fast solution that could be compiled in at configure time could be nice (and as I can easily detect when an app is doing multi-threaded OpenGL calls, I can even warn the user with an ERR that he should recompile Wine).

Hmm ??
What about a simple

if (we_want_thread_safety_NOW)
    EnterCriticalSection(...);
.
.
.
if (we_want_thread_safety_NOW)
    LeaveCriticalSection(...);

and a wine.conf parameter ?

Or does this still burn too many CPU cycles ?

Andreas Mohr
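
A run-time switch along these lines could be sketched in C as follows. Everything here is an assumption made for illustration: enter_lock, leave_lock, real_gl_call and want_thread_safety are stand-ins so that the sketch compiles anywhere, not Wine or OpenGL names. The one design point worth showing is that the flag is sampled once per call, so the enter and leave stay paired even if the setting changes while the GL function is running.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for EnterCriticalSection/LeaveCriticalSection (hypothetical). */
static int lock_depth;  /* current nesting; must be back to 0 after a call */
static int lock_calls;  /* how many times the lock was actually taken      */

static void enter_lock(void) { lock_depth++; lock_calls++; }
static void leave_lock(void) { lock_depth--; }

/* Would be read once from the configuration file at startup. */
static bool want_thread_safety;

/* Stand-in for the real libGL function. */
static int real_gl_call(int x) { return x + 1; }

static int wrapped_gl_call(int x)
{
    bool locked = want_thread_safety; /* sample once so enter/leave pair up */
    int ret;

    if (locked)
        enter_lock();
    ret = real_gl_call(x);
    if (locked)
        leave_lock();
    return ret;
}
```

The cost in the unlocked case is one load and two predictable branches per call; whether that is small enough compared with the thunk itself would have to be measured.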
Re: Automatic CDECL / STDCALL translation
> I don't think a comparison with Windows is that technically relevant, since the graphics card drivers are completely different.

Well, I am using NVIDIA's OpenGL drivers... And as far as I know, the codebase is the same for the Windows and Linux drivers. So performance should be comparable between Windows and Linux.

> > 1) instead of generating C code for the conversion (as in opengl_norm.c), generate some ASM in-line to do it as fast as possible. The problem with this is how to get the address of the 'destination' function to put in the ASM...
>
> I'm not sure exactly what you mean by how to get the address of the destination function, since you have "static" thunks, but I think I have solved that problem for the Solaris C thunking that is dynamic.

What I meant is that these 'thunks' would call libGL functions. So the function address I need to call is only known at link time.

For 'dynamic' thunks, the function address is known, but I do not know how to get it in the ASM code for 'to be linked in' functions.

> Thread safety is a problem, however I think I have solved that. What I did was to reshuffle the stack and allocate space _before_ the arguments. Not that efficient, but it is optimized assembler and likely more efficient than what GNU C does. I have attached some code that might interest you.

I will look at it. Anyway, I would like to hack something quickly (even not thread safe) just to see how many FPS I can gain, to see if I even need to bother trying to optimize this :-)

As for thread safety, for most of the OpenGL apps that people will use with Wine (i.e. games :-) ), there should be no need to have it... So having a non-thread-safe but fast solution that could be compiled in at configure time could be nice (and as I can easily detect when an app is doing multi-threaded OpenGL calls, I can even warn the user with an ERR that he should recompile Wine).

--
Lionel Ulmer - [EMAIL PROTECTED]
Automatic CDECL / STDCALL translation
Hi all,

After doing some benchmarks, I found out that the OpenGL performance is not too bad compared to Windows: about 25 % slower on the Tirtanium benchmark when removing the X11 critical section protection, 50 % slower with it.

Now, I think most of the remaining FPS are lost in the CDECL - STDCALL conversion of all the OpenGL routines. I looked at the code GCC generated for the OpenGL code and it's not really efficient: it 'pops' all the arguments into registers and then pushes them again for the call to the CDECL function...

So I had two ideas to optimize this, based on what Marcus did for elf.c (as I am not an x86 ASM guru, I would have some difficulties implementing either of my proposals, but well :-) ):

1) instead of generating C code for the conversion (as in opengl_norm.c), generate some ASM in-line to do it as fast as possible. The problem with this is how to get the address of the 'destination' function to put in the ASM...

2) the other possibility would be to modify 'build' to have a new keyword for functions that are 'synonyms' between Windows and Linux, with only the calling convention that changes. OpenGL's spec file would look like this:

@ stdcall wglUseFontOutlines(long long long long long long long) wglUseFontOutlines
@ synonym glClearIndex(long )

For these functions, when GetProcAddress is called (or the equivalent for 'static' linking), code equivalent to the one in 'elf.c' would be generated. This would greatly simplify the OpenGL code (no more auto-generated opengl_norm.c and opengl_ext.c files).

I think the only real problem remaining would be how to generate this ASM code so that it is at the same time efficient and thread-safe (I thought a bit about it, and it seems non-trivial).

So what do you think about this?

--
Lionel Ulmer - [EMAIL PROTECTED]
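
For reference, the C wrappers that make_opengl_norm emits have roughly the shape below, judging from the print statements of the script quoted elsewhere in this thread. This is a portable sketch, not the generated file: WINAPI is defined empty here (on an i386 Wine build it would select STDCALL), ENTER_GL and LEAVE_GL are reduced to a counter (LEAVE_GL is assumed as the counterpart of the ENTER_GL the script prints), and fake_glGetString with its return string is a made-up stand-in for the libGL function.

```c
#include <assert.h>
#include <string.h>

#define WINAPI  /* STDCALL on i386 Wine builds; empty so the sketch is portable */

/* Stand-ins for the X11 critical section macros (hypothetical). */
static int gl_lock_depth;
#define ENTER_GL() (gl_lock_depth++)
#define LEAVE_GL() (gl_lock_depth--)

typedef unsigned int GLenum;
typedef unsigned char GLubyte;

/* Made-up replacement for libGL's CDECL glGetString. */
static const GLubyte *fake_glGetString(GLenum name)
{
    return (const GLubyte *)(name == 0x1F03 ? "GL_hypothetical_extension" : "");
}

/* Wrapper in the style the script generates: every argument is read from
 * the STDCALL frame and pushed again for the CDECL callee. */
const GLubyte * WINAPI wine_glGetString(GLenum name)
{
    const GLubyte *ret;

    ENTER_GL();
    ret = fake_glGetString(name);
    LEAVE_GL();
    return ret;
}
```

The re-pushing of each argument in such a wrapper is the overhead that the in-line ASM proposal 1) tries to remove by permuting the existing stack frame instead of building a second one.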