On Feb 22, 2011, at 8:25 PM, ext Karol Lewandowski wrote:
Actually I think there is something fishy going on somewhere.
I think it might be make, bash, dynamic linker, kernel (exactly in
that order). Sadly, I'm not sure which component is to blame for
this problem.
From my understanding the problem we observe might occur when one
incorrectly sets RLIMIT_DATA to the size of sbrk(0). In this
situation the kernel correctly returns mmap() addresses that are just
after sbrk(0).
If that is really the case, that one of the programs sets RLIMIT_DATA
now, it might explain the problems you have seen. And the guilty
program should be quite easy to locate: libsb2 already intercepts
setrlimit(), so just make it log a warning whenever setrlimit() is
used (the gate function is in execgates.c).
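Something along these lines might do for the logging. This is only a
sketch: logged_setrlimit() and resource_name() are names I made up for
illustration, not the real gate function in execgates.c, and the real
code would use the libsb2 logging facilities instead of stderr.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: name the resources we care about for the log. */
static const char *resource_name(int resource)
{
    switch (resource) {
    case RLIMIT_DATA:  return "RLIMIT_DATA";
    case RLIMIT_STACK: return "RLIMIT_STACK";
    case RLIMIT_AS:    return "RLIMIT_AS";
    default:           return "other";
    }
}

/* Hypothetical stand-in for the gate: warn first, then call through. */
static int logged_setrlimit(int resource, const struct rlimit *rlp)
{
    fprintf(stderr, "warning: setrlimit(%s) soft=%llu hard=%llu\n",
            resource_name(resource),
            (unsigned long long)rlp->rlim_cur,
            (unsigned long long)rlp->rlim_max);
    return setrlimit(resource, rlp);
}
```

Any program that touches RLIMIT_DATA would then show up in the log
together with the values it tried to set.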
A quick grep on the sources of the programs you listed revealed
nothing here, which also might make sense because we haven't seen the
problem here.
We might be using something which is older than what you have, and
given that I already had to invent a workaround to libsb2 because
gmake messed up things by using setrlimit() to set RLIMIT_STACK to
infinity, I wouldn't be surprised if someone had done something as
stupid again.
I guess that doing
setrlimit(RLIMIT_DATA, "infinity-or-some-large-number") just before
exec() would provide a sensible workaround.
Or more likely, restore the limit to what it was before the exec. Same
way as RLIMIT_STACK is handled.
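Roughly like this, as a sketch only. The function names and the static
variable are hypothetical, and I am assuming the limit is recorded
once at startup, before anything had a chance to change it. Note that
the restore fails for an unprivileged process if someone lowered the
hard limit in between.

```c
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical storage for the limit as it was at startup. */
static struct rlimit saved_data_limit;

/* Call early, before any program code can change the limit. */
static int remember_data_limit(void)
{
    return getrlimit(RLIMIT_DATA, &saved_data_limit);
}

/* Put the remembered limit back; returns -1 if that is not permitted. */
static int restore_data_limit(void)
{
    return setrlimit(RLIMIT_DATA, &saved_data_limit);
}

/* Restore the limit, then exec; only returns on failure. */
static int exec_with_original_data_limit(const char *path,
                                         char *const argv[])
{
    if (restore_data_limit() != 0)
        return -1;
    return execv(path, argv);
}
```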
Here is an idea: because we run the dynamic linker ourselves, we
inherit RLIMIT_DATA (along with all other limits) from the parent
process. This *isn't* true when the kernel invokes the dynamic linker
via its own ELF loader, right? Shouldn't we reset all the limits to
some sane defaults just before exec?
I think that all limits should be inherited regardless of how the
dynamic linker (ld.so) gets started. Otherwise the "ulimit"
builtin of the shell wouldn't have any use, would it? Also, static
binaries would get different treatment than dynamic ones.
Instead, the reason might be that the memory layout of a process
differs depending on how ld.so is started. Ld.so is loaded by the
kernel in both cases, but the program itself isn't. And the order in
which the executables are mapped differs:
In the "ordinary case", the kernel first maps a dynamically linked
executable, notices that ld.so is needed, and maps that also. Ld.so
will find that the program is already in memory, so it only has to map
the required libraries.
But when ld.so is executed directly (as sb2 does), the kernel treats
it as a statically linked binary, maps it, and starts it. Ld.so is
then responsible for mapping the program to memory, followed by
mapping of all libraries. (This is done in eglibc's elf/rtld.c, in
dl_main().)
Now it might be possible that the resulting memory map in the latter
case is causing trouble when RLIMIT_DATA has been set, even if the
mappings are completely valid as such in both cases.
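One way to look at the two layouts would be to print the current
program break next to the address of a fresh anonymous mapping, once
in a normally started process and once under sb2. A minimal sketch
(print_break_and_mmap() is a made-up name; reading /proc/self/maps
would give the full picture):

```c
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Print the program break and the address of a new anonymous mapping.
 * Returns 0 on success, -1 if either call fails. */
static int print_break_and_mmap(void)
{
    void *brk_now = sbrk(0);
    void *map = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (brk_now == (void *)-1 || map == MAP_FAILED)
        return -1;

    printf("program break:  %p\n", brk_now);
    printf("anonymous mmap: %p\n", map);

    munmap(map, 4096);
    return 0;
}
```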
What do you think about this?
(I'm aware that this might be a bit tricky. ;)
It should be quite straightforward to find out if it is really caused
by insane use of setrlimit().
Just extend the existing wrapper to restore the limits, or even turn
it into a no-op for debugging purposes (if an environment variable has
been set, don't change any limits, etc). Testing it that way should
not take too long. I can't do it because I don't see the problem in
the first place.
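The debugging switch could be as small as the sketch below. The
variable name DEBUG_IGNORE_SETRLIMIT and the function name
debug_setrlimit() are invented for this example; nothing like them
exists in libsb2 as far as I know.

```c
#include <stdlib.h>
#include <sys/resource.h>

/* Hypothetical debugging wrapper: when the (made-up) environment
 * variable is set, report success but leave all limits untouched. */
static int debug_setrlimit(int resource, const struct rlimit *rlp)
{
    if (getenv("DEBUG_IGNORE_SETRLIMIT") != NULL)
        return 0;
    return setrlimit(resource, rlp);
}
```

If the problem goes away with the variable set, some setrlimit() call
is to blame.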
Theoretically, could it be that RLIMIT_DATA is originally set to
something that is too small and nobody enlarges it? That could be more
difficult to circumvent.
Honestly, I have a hard time believing that brk(2) isn't supported
anymore.
Well. It won't be too surprising if brk() suffers from software rot.
I think brk() was removed from POSIX some ten years ago and use of it
is probably fading in any case. This is just what happens to less used
legacy features... (on the other hand, this makes me feel old; I still
remember those days when brk()/sbrk() was the only way to get more
memory for the application process from the Unix kernel :-)
brk() was probably deprecated because it forces some assumptions about
the memory map layout. It might be that those assumptions can be
broken by fancy use of setrlimit() nowadays. Fortunately it should be
easy to test it out.
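A test could look something like this sketch: fork a child, drop its
soft RLIMIT_DATA to zero, and see whether sbrk() still succeeds. The
function name is made up, and I am not promising any particular
result; how the limit is enforced depends on the kernel.

```c
#define _DEFAULT_SOURCE
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if sbrk() failed in a child whose soft RLIMIT_DATA is 0,
 * 0 if sbrk() succeeded there, and -1 if the test itself failed. */
static int sbrk_fails_under_data_limit(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: lower only the soft limit, then try to grow the heap. */
        struct rlimit lim;
        if (getrlimit(RLIMIT_DATA, &lim) != 0)
            _exit(2);
        lim.rlim_cur = 0;
        if (setrlimit(RLIMIT_DATA, &lim) != 0)
            _exit(2);
        _exit(sbrk(4096) == (void *)-1 ? 1 : 0);
    }

    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status);
}
```

Running the same check natively and under sb2 should show whether the
two cases behave differently.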
Lauri
_______________________________________________
Scratchbox-devel mailing list
http://lists.scratchbox.org/cgi-bin/mailman/listinfo/scratchbox-devel