Please do not reply to this email: if you want to comment on the bug, go to
the URL shown below and enter your comments there.
https://bugs.freedesktop.org/show_bug.cgi?id=4197
[EMAIL PROTECTED] changed:
What |Removed |Added
----------------------------------------------------------------------------
OtherBugsDependingOnThis| |1690
------- Additional Comments From [EMAIL PROTECTED] 2005-12-18 12:36 -------
first of all, to [EMAIL PROTECTED]: please do not remove the dependency of
1690, it was an explicit request from ajax back then. second, if you do remove
it, then at least give some food for thought for discussion. mine's below,
regarding the importance of text relocations.
(In reply to comment #7)
> (In reply to comment #6)
>
> FYI, gl_x86_asm.py is located in src/mesa/glapi.
i can't find it in my copy of MesaLib-6.4.tar.bz2 (md5sum:
85a84e47a3f718f752f306b9e0954ef6), else i would have modified it as i did
before.
> > ignore the previous one, it had some bugs (blind forward port wasn't a good
> > idea ;-). also, there're much more textrels due to -fPIC being explicitly
> > omitted from the x86 DRI makefiles, i fixed that by patching
> > configs/linux-dri-x86 but i'm not sure if it's the correct way.
>
> At the very least, this patch needs some conditionals to disable it.
> I suspect that a lot of this will break on other platforms (e.g., Windows,
> cygwin, etc.).
some of the patch explicitly depends on __PIC__ already, mmx_blend* and
glapi_x86.S don't. i'll modify mmx_blend* to accommodate non-PIC capable
platforms if that code can be used on them(?), but i'm unsure about glapi_x86.S
as it already has an explicit ifndef __WIN32__ and DJGPP's Makefile.DJ doesn't
use it at all. are there any other x86 platforms that use this file but are not
capable of PIC?
> I also expect that it will kill the performance of quite a few apps.
> There's a reason that we use custom assembly for this part of the code, and
> this patch seems to defeat most of that purpose.
if you *really* cared about performance (instead of trying to find an excuse)
then you'd have looked hard at the whole GL API/ABI picture already. let's see
what a GL API call ends up in on linux/i386.
app:            call [EMAIL PROTECTED]
app.plt:        jmp [EMAIL PROTECTED]
gl.dispatch:    mov eax,[_glapi_Dispatch]
                test eax,eax
                jz .get_dispatch
                jmp [eax+0x...]
.get_dispatch:
                call _glapi_get_dispatch
                jmp [eax+0x...]
on the fast path, that's 6 insns, with no less than 3 memory accesses (potential
cache misses), 2 of which are indirect control flow changes (potential branch
misprediction). and all that just to get to the first insn of the actual GL API
(which will do its prologue/epilogue on top of doing real work, mind you). if
that's not an overkill then i don't know what is.
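
for readers who find C easier to follow than the listing, here is a rough sketch of the same control flow. it is only an illustration, not Mesa's code: the two-slot table, the names current_dispatch, get_dispatch and stub_call, and the slow_path_calls counter are all hypothetical stand-ins for _glapi_Dispatch, _glapi_get_dispatch and a generated stub. the real stub also ends in a tail jump, which plain C cannot express directly, so the sketch uses an ordinary indirect call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical API entry type; real GL entry points have many signatures. */
typedef int (*api_fn)(int, int);

/* Hypothetical two-slot table; a real dispatch table has hundreds of slots. */
enum { SLOT_ADD = 0, SLOT_MUL = 1, SLOT_COUNT = 2 };

static int impl_add(int a, int b) { return a + b; }
static int impl_mul(int a, int b) { return a * b; }

static api_fn default_table[SLOT_COUNT] = { impl_add, impl_mul };

/* Stand-in for _glapi_Dispatch: NULL until a table has been installed. */
static api_fn *current_dispatch = NULL;

/* Counts slow-path entries so the two paths can be told apart. */
static int slow_path_calls = 0;

/* Stand-in for _glapi_get_dispatch: looks the table up the slow way. */
static api_fn *get_dispatch(void)
{
    slow_path_calls++;
    return default_table;
}

/* One stub, shaped like the gl.dispatch listing. */
static int stub_call(size_t slot, int a, int b)
{
    assert(slot < SLOT_COUNT);

    api_fn *table = current_dispatch;  /* mov eax,[_glapi_Dispatch]      */
    if (table == NULL)                 /* test eax,eax ; jz .get_dispatch */
        table = get_dispatch();        /* call _glapi_get_dispatch       */

    return table[slot](a, b);          /* jmp [eax+0x...] (a call here)  */
}
```

on the fast path the stub does one load of the table pointer, one compare, and one indirect transfer through a table slot. the PLT jump in front of it is added by the linker and does not appear in C at all.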
my patch adds 5 insns to this, 3 of which are memory accesses. one memory access
is the same as the old code (same potential cache miss), the other two are
guaranteed to be cached, one is the return address on the stack, the other is
the call insn executed just before the access. in other words, the execution
overhead in absolute terms (clock cycles) is minimal. where you can have a
measurable impact is when the absolute overhead is comparable to the given GL API
execution time itself. i recall ajax mentioned that some of them are very short
(in terms of asm insns), but those are also the APIs that already suffer the 6
insns/3 memory access 'custom assembly' that you had for this purpose (again, in
addition to the API's prologue/epilogue). adding 5 more will make it worse, but
it's already bad and can't be the reason for outright rejection. if you really
wanted to fix it, then you'd find a way for inlining such short API calls, that
will not only eliminate the API call overhead (including prologue/epilogue code)
but also help the compiler optimize register/memory accesses in the *caller* -
can't get better than that for performance.
> Given the choice between removing TEXTRELs and improving (or maintaining)
> performance, I will pick performance every time.
i think you don't realize what choice you're making (and i definitely disagree
that you should be making that choice for all users and deny them a textrel-free
X/GL). textrels are a subset of runtime code generation which itself is a
privilege that plays a very important role in security, in particular,
exploiting memory corruption based bugs (stack/heap overflow, integer handling,
etc). this privilege (which is granted to all userland on all contemporary OSs,
unless you use PaX) is the one that allows exploiting these bugs via executing
remotely injected code. therefore taking away this privilege prevents a huge
class of exploits from working, in practice you'll find that about 99.9...% of
exploits rely on this privilege. eliminating them for good is good for security,
that is, if you care about it.
it so happens that there're quite a few people who do and yet they'd also like
to enjoy the usual benefits of their systems. one reason this privilege is
needed is that some code has textrels, that's why we (in the hardened gentoo
project) have been actively eliminating them for some years now. X has been a
particularly painful exercise due to its elfloader, so we quickly switched to
dlloader (as far as i know, we were the first distro to use it), we supported
x.org to move to it by default (big thanks to ajax for his work), and we would
of course like to get it all right now that X is officially moving to the
dlloader. anyway, the thing is, be careful with your choices, voting for
performance gains (rather questionable ones at that, considering the above
discussion) and textrels will expose every single app using GL to the most
widely used exploit methods. i and many others chose differently, hence these
patches and our hope is that we can find a way to make them available in X (else
they'll continue to live in gentoo's portage as they did so far).
if you're still not convinced about the role of textrels, read Ulrich Drepper's
DSO howto (http://people.redhat.com/drepper/dsohowto.pdf). if even that's not
enough then try to convince him that glibc should be compiled without -fPIC for
extra performance, and see how far you get.
> > i think @NTPOFF is resolved at link time so that whole runtime relocation
> > thing can go and the stubs can directly reference %gs:[EMAIL PROTECTED]
>
> Grep for 'wtext' in glapi_x86.S.
thanks, i missed that while looking at the code.
> The run-time patching is done to inline the
> call to _x86_get_dispatch and thereby avoid the extra overhead. Yes, there
> are applications that would see measurable performance degradation from
> leaving that as a call (rather than inline).
this is the same wrong performance argument that i discussed above. fix the
API/ABI properly if you truly care about performance.
> If I recall correctly, having the 'movl %gs:[EMAIL PROTECTED], %eax'
> inline at build-time had a significant impact on the disk size of libGL.so.
what's 'significant' here? and in any case, relocation data is used only once
during, well, relocation, the kernel's page cache will discard those pages
quickly and there's no impact otherwise. so this runtime generated code can be
eliminated.