It's a painful world.
On 17 August 1998, I wrote:
> Anybody having experience with installing egcs (1.0.2-8, as the RH5.1
> standard RPM) over a vanilla RH5.0 system (implying a glibc of
> 2.0.5c-10) ? I've just done that, and experience weirdnesses like
> smashing of the frame/stack pointer during calls to things like printf
> (from c++ code, compiling cleanly). Does this sound like a glibc
> problem, is egcs broken, or do I write subversive code (I dont expect
> anybody to comment on the latter :-))
>
It took me a really long time to locate the (deeply) subversive code. As
the story carries a lesson, here it goes:
For those familiar with SYSV IPC msgrcv(2): among other things, it
accepts a pointer to a buffer which it will write into, and a length, to
prevent overrun. What I wasn't aware of was that this length parameter
is not the size of the whole buffer, but of the data part only (that is,
excluding the leading long message type). msgrcv may therefore write
sizeof(long) bytes more than the value given.
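To make the size rule concrete, here is a minimal sketch of a send/receive pair that passes only the payload size. The names int_msg, INT_MSG_PAYLOAD, putit and getit_checked are made up for illustration and are not from my original program; error handling is reduced to a return code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical message layout: the mandatory long type field, then the payload. */
struct int_msg {
    long msgtype;   /* must be > 0 when sending */
    int  data;      /* the payload proper */
};

/* msgsz counts only the payload, never the leading long. */
#define INT_MSG_PAYLOAD (sizeof(((struct int_msg *)0)->data))

/* Send one int; returns 0 on success, -1 on error (errno set by msgsnd). */
int putit(int qhandle, int value)
{
    struct int_msg buf = { 1, value };
    return msgsnd(qhandle, &buf, INT_MSG_PAYLOAD, 0);
}

/* Receive one int into *out; returns 0 on success,
   -1 on error or on a message of unexpected size. */
int getit_checked(int qhandle, int *out)
{
    struct int_msg buf;
    ssize_t res;

    assert(out != NULL);
    /* msgtyp 0: take the first message of any type; msgflg 0: block. */
    res = msgrcv(qhandle, &buf, INT_MSG_PAYLOAD, 0, 0);
    if (res != (ssize_t)INT_MSG_PAYLOAD)
        return -1;
    *out = buf.data;
    return 0;
}
```

With the size given this way, msgrcv writes at most sizeof(long) + sizeof(int) bytes, which is never more than sizeof(struct int_msg).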
This implies that the following (from memory, but it should give the
picture):
    int getit( int qhandle )
    {
        struct { long msgtype; int data; } buf;
        /* BUG: sizeof(buf) also counts the long msgtype */
        int res = msgrcv( qhandle, &buf, sizeof(buf), 0, 0 );
        if (res == sizeof(buf))
            return buf.data;
        else
            <handle the queue error somehow>
    }
goes really wrong. If you make the same error at the other end of the
queue (which I did), the queue will transmit, and thus receive and
write, 12 bytes while returning 8 as the result, which fits nicely with
sizeof(buf) (on 32-bit x86, where long and int are 4 bytes each). If the
compiler is fairly conventional, buf will be stack allocated, at an
address just _below_ the stored frame pointer (on x86: typically ebp) of
the calling function. This implies that if you have something like:
    ...
    int a,b,c;
    ...
    a = getit( qhandle );
    ...
then after the call to getit, the restored frame pointer is wrong, and
with it the locations of variables a, b, c, parameters etc.
From then on, the program will have a hard time doing as intended, but
as the stack pointer is not broken, it might carry on for quite a while
until segfaulting for some secondary reason. When run under gdb, the
debugger gets as confused as the program, since it uses the frame
pointer of the running program to examine the local variables.
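The overwrite can be modelled safely, without actually smashing a stack. The sketch below is a simulation only: fake_frame is a hypothetical stand-in for the stack layout described above (buf directly below the saved ebp), fake_msgrcv is a stand-in for the kernel side of msgrcv, and fixed-width 32-bit types are assumed in order to mimic 32-bit x86. None of these names come from the original program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit x86 layout modelled with fixed-width types (an assumption). */
struct msg32 {
    int32_t msgtype;   /* stands in for a 4-byte long */
    int32_t data;
};

/* Hypothetical model of the stack: buf sits directly below the saved ebp. */
struct fake_frame {
    struct msg32 buf;
    uint32_t saved_ebp;
};

/* Stand-in for msgrcv without MSG_NOERROR: fails if the queued payload is
   larger than msgsz, otherwise writes the type field plus the payload and
   returns the payload byte count. */
static long fake_msgrcv(void *dst, size_t msgsz,
                        const void *wire, size_t wire_payload)
{
    if (wire_payload > msgsz)
        return -1;                        /* the real call sets E2BIG here */
    memcpy(dst, wire, sizeof(int32_t) + wire_payload);
    return (long)wire_payload;
}

/* Returns 1 if the modelled frame pointer survives the receive, 0 if it is
   overwritten. The receive result is stored in *res_out. */
int frame_pointer_survives(size_t msgsz, long *res_out)
{
    /* The sender made the same mistake: 4 bytes of type, 8 bytes of "payload". */
    const int32_t wire[3] = { 1, 42, 0x0BADF00D };
    struct fake_frame frame = { { 0, 0 }, 0xBFFFF000u };

    assert(res_out != NULL);
    *res_out = fake_msgrcv(&frame, msgsz, wire, 2 * sizeof(int32_t));
    return frame.saved_ebp == 0xBFFFF000u;
}
```

In this model, passing sizeof(struct msg32) as the size yields a result of 8, which looks like success, yet the saved_ebp field has been overwritten. Passing sizeof(int32_t) makes the receive fail instead, so the error in the sender is noticed and the frame is left alone.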
Moral(s):
read the man pages
never pass a pointer without fear
don't blame the compiler first
(don't bother the list too early)
regards,
Niels HP
[EMAIL PROTECTED]