Re: ABI is broken??

2000-11-05 Thread Jeroen Ruigrok van der Werven

On [20001102 05:30], Garrett Wollman ([EMAIL PROTECTED]) wrote:
On Wed, 1 Nov 2000 14:43:55 -0800, "David O'Brien" [EMAIL PROTECTED] said:

 Any reason to not get [libc ABI changes] in -current now and make
 the bump?

Mostly because they're too small to be worth the pain.  I'm waiting
for something more significant that I can piggy-back on.

Which of course carries the implicit risk that, if nothing big shows up,
these fixes will only go in shortly before 5.0-RELEASE and thus get
less shake-down time.

I also gather it has to do with the Austin project, Garrett?

-- 
Jeroen Ruigrok van der Werven  Network- and systemadministrator
[EMAIL PROTECTED]  VIA Net.Works The Netherlands
BSD: Technical excellence at its best  http://www.via-net-works.nl
In my mind nothing makes sense...


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



Re: ABI is broken??

2000-11-05 Thread Garrett Wollman

On Sun, 5 Nov 2000 13:51:09 +0100, Jeroen Ruigrok van der Werven 
[EMAIL PROTECTED] said:

 I also gather it has to do with the Austin project Garrett?

Yes and no.  The errors have been there since the beginning of time,
but I actually noticed them while reviewing our implementation against
the new standard.  For example, the System V IPC implementation uses a
data structure bogusly copied bitwise from SVR3, which only had 16-bit
[ug]id_t's.  The patch I sent out before BSDcon contains a number of
things noted in this regard.
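
To make the 16-bit problem concrete, here is a minimal sketch.  The two
structure layouts and all function names are hypothetical and chosen for
illustration only; they are not the real SVR3 or FreeBSD declarations.
The sketch only shows what happens when a 32-bit ID is stored in a
16-bit field, and why widening the field changes the structure layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layouts for illustration only; these are NOT the real
 * SVR3 or FreeBSD declarations. */
struct old_perm {               /* SVR3-style: 16-bit IDs */
    uint16_t uid;
    uint16_t gid;
    uint16_t mode;
};

struct new_perm {               /* IDs widened to 32 bits */
    uint32_t uid;
    uint32_t gid;
    uint16_t mode;
};

/* Store a 32-bit user ID in the old structure and read it back.
 * Values above 65535 are silently reduced modulo 65536. */
uint32_t roundtrip_old_uid(uint32_t uid)
{
    struct old_perm p = { 0, 0, 0 };

    p.uid = (uint16_t)uid;
    return p.uid;
}

/* The same round trip through the widened structure is lossless. */
uint32_t roundtrip_new_uid(uint32_t uid)
{
    struct new_perm p = { 0, 0, 0 };

    p.uid = uid;
    assert(p.uid == uid);       /* a 32-bit field holds every 32-bit ID */
    return p.uid;
}

/* Widening the fields changes the structure size, which is why the fix
 * is an ABI change and not just a source change. */
int layout_changed(void)
{
    return sizeof(struct old_perm) != sizeof(struct new_perm);
}
```

With these toy layouts, user ID 65536 comes back from the old structure
as 0, so an ordinary user would be indistinguishable from root.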

The other thing that we need to do is to hide the DB 1.85 library
that's currently in libc, so that it doesn't prevent third-party
applications from making full use of DB 3.x (when installed).  This
will involve renaming all of the public identifiers in the DB library,
and converting the libc code to use the internal names.
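
The renaming idea can be sketched roughly as follows.  Every identifier
here (libc_private_dbopen, libc_lookup_backend, toy_dbopen) is made up
for illustration, and the file name is only an example argument; a real
libc would presumably use names in the reserved implementation
namespace.  The point is only that once libc's own callers use a private
name, the public name is free for another library to define.

```c
#include <assert.h>
#include <string.h>

/* libc side: the bundled DB 1.85 code, renamed to a private identifier
 * (hypothetical name). */
const char *libc_private_dbopen(const char *path)
{
    assert(path != NULL);
    return "DB 1.85 (private to libc)";
}

/* A libc-internal caller, converted to use the private name, so it no
 * longer depends on whoever defines the public one. */
const char *libc_lookup_backend(void)
{
    return libc_private_dbopen("/etc/pwd.db");
}

/* Third-party side: a newer DB library can now export the public entry
 * point (toy_dbopen stands in for it) without clashing with libc. */
const char *toy_dbopen(const char *path)
{
    assert(path != NULL);
    return "DB 3.x (third party)";
}
```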

-GAWollman

--
Garrett A. Wollman   | O Siem / We are all family / O Siem / We're all the same
[EMAIL PROTECTED]    | O Siem / The fires of freedom
Opinions not those of| Dance in the burning flame
MIT, LCS, CRS, or NSA| - Susan Aglukark and Chad Irschick





Re: ABI is broken??

2000-11-02 Thread Maxim Sobolev

John Polstra wrote:

 In article [EMAIL PROTECTED],
 Maxim Sobolev  [EMAIL PROTECTED] wrote:
  John Polstra wrote:
   Overall I would lean toward putting the hack into pthread_mutex_lock.
   Comments?
 
  Huh, why can't we just bump the libc_r version number and put the older (buggy)
  version into lib/compat as usual? This would not require any ugly hacks at all.

 The bug wasn't in libc_r -- it was in libgcc_r.  That's a static
 library, so it doesn't have a version number.  And it is statically
 linked into old executables.  Nothing we do to libgcc_r will help old
 executables, because they won't even use the new libgcc_r.

Nope, it should help, because the bug is triggered if someone tries to use old
executables with new libc_r.

-Maxim






Re: ABI is broken??

2000-11-02 Thread John Polstra

In article [EMAIL PROTECTED], Maxim Sobolev
[EMAIL PROTECTED] wrote:
 John Polstra wrote:

  The bug wasn't in libc_r -- it was in libgcc_r.  That's a
  static library, so it doesn't have a version number.  And it is
  statically linked into old executables.  Nothing we do to libgcc_r
  will help old executables, because they won't even use the new
  libgcc_r.

 Nope it should help, because the bug is triggered if someone tries
 to use old executables with new libc_r.

Yes, I think you're right after all.  But since I've already worked
around the problem in libc_r, there's no need to do anything else at
this point.

John





Re: ABI is broken??

2000-11-02 Thread John Polstra

In article [EMAIL PROTECTED],
Max Khon [EMAIL PROTECTED] wrote:

 do we still need uthread_autoinit.cc?

It still might be needed by old executables.  Anyway I don't see a
good reason to get rid of it.

John





ABI is broken??

2000-11-01 Thread Maxim Sobolev

Hi,

I'm not sure what exactly caused this behaviour (I can guess two potential
culprits: O'Brien's changes in the crt stuff and Polstra's recent changes in
libgcc_r), but it seems that some programs built on the previous -current from
27 October immediately segfault when I try to run them on a system installed
from today's sources. The segfault disappeared when I recompiled the affected
program. I'm attaching a short backtrace with this message.

-Maxim

root@notebook# galeon
GNU gdb 4.18
This GDB was configured as "i386-unknown-freebsd"...
(no debugging symbols found)...
(gdb) r
Starting program: /usr/X11R6/bin/galeon-bin
(no debugging symbols found)...
[...]
(no debugging symbols found)...
Program received signal SIGSEGV, Segmentation fault.
0x287de417 in pthread_mutex_lock () from /usr/lib/libc_r.so.4
(gdb) bt
#0  0x287de417 in pthread_mutex_lock () from /usr/lib/libc_r.so.4
#1  0x806e782 in __register_frame_info ()
#2  0x287a3137 in _init () from /usr/lib/libc_r.so.4
#3  0x2879ffe5 in _init () from /usr/lib/libc_r.so.4
#4  0x280797fd in _rtld () from /usr/libexec/ld-elf.so.1
(gdb) q






Re: ABI is broken??

2000-11-01 Thread Udo Schweigert

On Wed, Nov 01, 2000 at 19:17:26 +0200, Maxim Sobolev wrote:
 Hi,
 
 I'm not sure what exactly caused this behaviour (I can guess two potential
 victims: O'Brien's changes in crt stuff and recent Polstra's changes in
 libgcc_r), but it seems that some programs built on the previous -current from
 27 October immediately segfault when I'm trying to run them on system installed
 from today's sources. The segfault disappeared when I recompiled affected
 program. With this message I'm attaching short backtrace.
 
 -Maxim
 
 #0  0x287de417 in pthread_mutex_lock () from /usr/lib/libc_r.so.4
 

Same for me on -stable (4.2-BETA) with python-1.6. After I rebuilt the port,
the segfault disappeared. My gdb showed the same error message as the one
quoted above.

Regards.
-- 
Udo Schweigert, Siemens AG   | Voice  : +49 89 636 42170
ZT IK 3, Siemens CERT        | Fax    : +49 89 636 41166
D-81730 Muenchen / Germany   | email  : [EMAIL PROTECTED]
PGP-2/5 fingerprint          | D8 A5 DF 34 EC 87 E8 C6  E2 26 C4 D0 EE 80 36 B2





Re: ABI is broken??

2000-11-01 Thread Daniel Eischen

On Wed, 1 Nov 2000, John Polstra wrote:
 In article [EMAIL PROTECTED],
 Maxim Sobolev  [EMAIL PROTECTED] wrote:
  
  I'm not sure what exactly caused this behaviour (I can guess two potential
  victims: O'Brien's changes in crt stuff and recent Polstra's changes in
  libgcc_r), but it seems that some programs built on the previous -current from
  27 October immediately segfault when I'm trying to run them on system installed
  from today's sources. The segfault disappeared when I recompiled affected
  program. With this message I'm attaching short backtrace.
 [...]
  Program received signal SIGSEGV, Segmentation fault.
  0x287de417 in pthread_mutex_lock () from /usr/lib/libc_r.so.4
  (gdb) bt
  #0  0x287de417 in pthread_mutex_lock () from /usr/lib/libc_r.so.4
  #1  0x806e782 in __register_frame_info ()
  #2  0x287a3137 in _init () from /usr/lib/libc_r.so.4
  #3  0x2879ffe5 in _init () from /usr/lib/libc_r.so.4
  #4  0x280797fd in _rtld () from /usr/libexec/ld-elf.so.1
 
 Here are all the random facts which, when put together, explain what
 is going on.
 
 Your old application was (like all -pthread programs) linked
 with "/usr/lib/libgcc_r.a".  That library contains a function
 "__register_frame_info" which uses some of the facilities of the
 pthreads library "libc_r".
 
 The pthreads library has to be initialized before it can be used, by
 a call to _thread_init.  If some functions such as pthread_mutex_lock
 are called before the library has been initialized, a segmentation
 violation results.
 
 _thread_init is called automatically from libc_r's _init function
 when the dynamic linker loads the library.  Unfortunately, that
 isn't early enough.  libgcc_r is the first thing to be initialized,
 and it calls pthread_mutex_lock before _thread_init has been called.
 Or rather I should say that OLD versions of libgcc_r did that --
 because they were buggy.
 
 In other words, your old application was linked with a buggy version
 of libgcc_r, but it didn't become apparent until now.
 
 It didn't become apparent until now because our crtbegin.o and
 crtend.o were also buggy.  They failed to call __register_frame_info.
 This was a problem for C++ programs using exceptions, especially when
 the gcc port was used and DWARF2 exception handling was selected.
 
 Now we have fixed crtbegin.o and crtend.o, and we have fixed
 libgcc_r.a.  But it causes problems for your old application because
 the new crtbegin.o and crtend.o (linked into the new shared libraries
 such as libc_r) call __register_frame_info in your old, buggy,
 statically linked libgcc_r.a.
 
 Are you dizzy yet?

Yes ;-)

 To sum up, your old executable contains the bug but
 it wasn't triggered until the recent changes.
 
 Now, what can or should we do about this?  Arguably we should simply
 say in the release notes, "Relink your old multithreaded applications.
 They had a bug which is now fixed."  But if there are binary-only
 commercial apps which exhibit the problem, this solution is useless.
 I don't know whether there are any such apps, but I doubt it.  N.B.,
 Linux apps don't count because they were never linked with our
 libgcc_r in the first place.
 
 Or we can try to work around it, but there aren't any perfectly nice
 ways to do so.  Here are some possibilities:
 
 - Put a hack in the threads library so that whenever
   pthread_mutex_lock is called it checks to make sure that the
   threads library has been initialized, and if not, it calls
   _thread_init.  This is a poor solution because it adds overhead to
   a rather performance-critical function -- though admittedly the
   overhead is very small.  Another potential problem is that there
   could be a race condition if several threads all called
   pthread_mutex_lock at once before the threads library had been
   initialized.  I don't think the race condition would materialize,
   though, since the first call would come from libgcc_r, well before
   the application had gotten control.
 
 - Put a hack into the dynamic linker to call _thread_init very early
   if that symbol was defined.  I like this solution even less,
   because it's too hackish.  The dynamic linker isn't the place for
   special hooks like that.
 
 - Put a hack into crtbegin.o or crtend.o.  But we are using the
   standard GNU versions of these, and I really really don't want to
   change that.  In any case, it's the wrong place for the
   work-around.
 
 Overall I would lean toward putting the hack into pthread_mutex_lock.
 Comments?

If that's the lesser evil, then I guess it's OK with me.

-- 
Dan Eischen






Re: ABI is broken??

2000-11-01 Thread John Polstra

In article [EMAIL PROTECTED],
Daniel Eischen  [EMAIL PROTECTED] wrote:
  
  Overall I would lean toward putting the hack into pthread_mutex_lock.
  Comments?
 
 If that's the lesser evil, then I guess it's OK with me.

Thanks for replying so quickly.  I'll test this to make sure it
really works, and then commit it.

John
-- 
  John Polstra   [EMAIL PROTECTED]
  John D. Polstra & Co., Inc.                        Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa






Re: ABI is broken??

2000-11-01 Thread Maxim Sobolev

John Polstra wrote:

 In article [EMAIL PROTECTED],
 Maxim Sobolev  [EMAIL PROTECTED] wrote:
 
  I'm not sure what exactly caused this behaviour (I can guess two potential
  victims: O'Brien's changes in crt stuff and recent Polstra's changes in
  libgcc_r), but it seems that some programs built on the previous -current from
  27 October immediately segfault when I'm trying to run them on system installed
  from today's sources. The segfault disappeared when I recompiled affected
  program. With this message I'm attaching short backtrace.
 [...]
  Program received signal SIGSEGV, Segmentation fault.
  0x287de417 in pthread_mutex_lock () from /usr/lib/libc_r.so.4
  (gdb) bt
  #0  0x287de417 in pthread_mutex_lock () from /usr/lib/libc_r.so.4
  #1  0x806e782 in __register_frame_info ()
  #2  0x287a3137 in _init () from /usr/lib/libc_r.so.4
  #3  0x2879ffe5 in _init () from /usr/lib/libc_r.so.4
  #4  0x280797fd in _rtld () from /usr/libexec/ld-elf.so.1

 Here are all the random facts which, when put together, explain what
 is going on.

 Your old application was (like all -pthread programs) linked
 with "/usr/lib/libgcc_r.a".  That library contains a function
 "__register_frame_info" which uses some of the facilities of the
 pthreads library "libc_r".

 The pthreads library has to be initialized before it can be used, by
 a call to _thread_init.  If some functions such as pthread_mutex_lock
 are called before the library has been initialized, a segmentation
 violation results.

 _thread_init is called automatically from libc_r's _init function
 when the dynamic linker loads the library.  Unfortunately, that
 isn't early enough.  libgcc_r is the first thing to be initialized,
 and it calls pthread_mutex_lock before _thread_init has been called.
 Or rather I should say that OLD versions of libgcc_r did that --
 because they were buggy.

 In other words, your old application was linked with a buggy version
 of libgcc_r, but it didn't become apparent until now.

 It didn't become apparent until now because our crtbegin.o and
 crtend.o were also buggy.  They failed to call __register_frame_info.
 This was a problem for C++ programs using exceptions, especially when
 the gcc port was used and DWARF2 exception handling was selected.

 Now we have fixed crtbegin.o and crtend.o, and we have fixed
 libgcc_r.a.  But it causes problems for your old application because
 the new crtbegin.o and crtend.o (linked into the new shared libraries
 such as libc_r) call __register_frame_info in your old, buggy,
 statically linked libgcc_r.a.

 Are you dizzy yet?  To sum up, your old executable contains the bug but
 it wasn't triggered until the recent changes.

 Now, what can or should we do about this?  Arguably we should simply
 say in the release notes, "Relink your old multithreaded applications.
 They had a bug which is now fixed."  But if there are binary-only
 commercial apps which exhibit the problem, this solution is useless.
 I don't know whether there are any such apps, but I doubt it.  N.B.,
 Linux apps don't count because they were never linked with our
 libgcc_r in the first place.

 Or we can try to work around it, but there aren't any perfectly nice
 ways to do so.  Here are some possibilities:

 - Put a hack in the threads library so that whenever
   pthread_mutex_lock is called it checks to make sure that the
   threads library has been initialized, and if not, it calls
   _thread_init.  This is a poor solution because it adds overhead to
   a rather performance-critical function -- though admittedly the
   overhead is very small.  Another potential problem is that there
   could be a race condition if several threads all called
   pthread_mutex_lock at once before the threads library had been
   initialized.  I don't think the race condition would materialize,
   though, since the first call would come from libgcc_r, well before
   the application had gotten control.

 - Put a hack into the dynamic linker to call _thread_init very early
   if that symbol was defined.  I like this solution even less,
   because it's too hackish.  The dynamic linker isn't the place for
   special hooks like that.

 - Put a hack into crtbegin.o or crtend.o.  But we are using the
   standard GNU versions of these, and I really really don't want to
   change that.  In any case, it's the wrong place for the
   work-around.

 Overall I would lean toward putting the hack into pthread_mutex_lock.
 Comments?

Huh, why can't we just bump the libc_r version number and put the older (buggy)
version into lib/compat as usual? This would not require any ugly hacks at all.

-Maxim






Re: ABI is broken??

2000-11-01 Thread John Polstra

In article [EMAIL PROTECTED],
Maxim Sobolev  [EMAIL PROTECTED] wrote:
 John Polstra wrote:
  Overall I would lean toward putting the hack into pthread_mutex_lock.
  Comments?
 
 Huh, why can't we just bump the libc_r version number and put the older (buggy)
 version into lib/compat as usual? This would not require any ugly hacks at all.

The bug wasn't in libc_r -- it was in libgcc_r.  That's a static
library, so it doesn't have a version number.  And it is statically
linked into old executables.  Nothing we do to libgcc_r will help old
executables, because they won't even use the new libgcc_r.

John
-- 
  John Polstra   [EMAIL PROTECTED]
  John D. Polstra & Co., Inc.                        Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa






Re: ABI is broken??

2000-11-01 Thread Garrett Wollman

On Wed, 01 Nov 2000 21:09:12 +0200, Maxim Sobolev [EMAIL PROTECTED] said:

 Huh, why can't we just bump the libc_r version number and put the older (buggy)
 version into lib/compat as usual? This would not require any ugly hacks at all.

If you want to bump libc_r's version, we should do it to libc as well,
and in that case there are a large number of ABI fixes that I have
queued up which should be done at the same time.

-GAWollman






Re: ABI is broken??

2000-11-01 Thread David O'Brien

On Wed, Nov 01, 2000 at 03:19:36PM -0500, Garrett Wollman wrote:
 If you want to bump libc_r's version, we should do it to libc as well,
 and in that case there are a large number of ABI fixes that I have
 queued up which should be done at the same time.

Any reason to not get them in -current now and make the bump?
 
-- 
-- David  ([EMAIL PROTECTED])
  GNU is Not Unix / Linux Is Not UniX





Re: ABI is broken??

2000-11-01 Thread Garrett Wollman

On Wed, 1 Nov 2000 14:43:55 -0800, "David O'Brien" [EMAIL PROTECTED] said:

 Any reason to not get [libc ABI changes] in -current now and make
 the bump?

Mostly because they're too small to be worth the pain.  I'm waiting
for something more significant that I can piggy-back on.

-GAWollman

--
Garrett A. Wollman   | O Siem / We are all family / O Siem / We're all the same
[EMAIL PROTECTED]    | O Siem / The fires of freedom
Opinions not those of| Dance in the burning flame
MIT, LCS, CRS, or NSA| - Susan Aglukark and Chad Irschick





Re: ABI is broken??

2000-11-01 Thread Max Khon

hi, there!

On Wed, 1 Nov 2000, John Polstra wrote:

 Here are all the random facts which, when put together, explain what
 is going on.
 
 Your old application was (like all -pthread programs) linked
 with "/usr/lib/libgcc_r.a".  That library contains a function
 "__register_frame_info" which uses some of the facilities of the
 pthreads library "libc_r".
 
 The pthreads library has to be initialized before it can be used, by
 a call to _thread_init.  If some functions such as pthread_mutex_lock
 are called before the library has been initialized, a segmentation
 violation results.

[...] 

 Overall I would lean toward putting the hack into pthread_mutex_lock.
 Comments?

do we still need uthread_autoinit.cc?

/fjoe






Re: ABI is broken??

2000-11-01 Thread John Polstra

In article [EMAIL PROTECTED],
Maxim Sobolev  [EMAIL PROTECTED] wrote:
 
 I'm not sure what exactly caused this behaviour (I can guess two potential
 victims: O'Brien's changes in crt stuff and recent Polstra's changes in
 libgcc_r), but it seems that some programs built on the previous -current from
 27 October immediately segfault when I'm trying to run them on system installed
 from today's sources. The segfault disappeared when I recompiled affected
 program. With this message I'm attaching short backtrace.
[...]
 Program received signal SIGSEGV, Segmentation fault.
 0x287de417 in pthread_mutex_lock () from /usr/lib/libc_r.so.4
 (gdb) bt
 #0  0x287de417 in pthread_mutex_lock () from /usr/lib/libc_r.so.4
 #1  0x806e782 in __register_frame_info ()
 #2  0x287a3137 in _init () from /usr/lib/libc_r.so.4
 #3  0x2879ffe5 in _init () from /usr/lib/libc_r.so.4
 #4  0x280797fd in _rtld () from /usr/libexec/ld-elf.so.1

Here are all the random facts which, when put together, explain what
is going on.

Your old application was (like all -pthread programs) linked
with "/usr/lib/libgcc_r.a".  That library contains a function
"__register_frame_info" which uses some of the facilities of the
pthreads library "libc_r".

The pthreads library has to be initialized before it can be used, by
a call to _thread_init.  If some functions such as pthread_mutex_lock
are called before the library has been initialized, a segmentation
violation results.

_thread_init is called automatically from libc_r's _init function
when the dynamic linker loads the library.  Unfortunately, that
isn't early enough.  libgcc_r is the first thing to be initialized,
and it calls pthread_mutex_lock before _thread_init has been called.
Or rather I should say that OLD versions of libgcc_r did that --
because they were buggy.
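
The ordering problem described above can be modelled with a small
single-threaded simulation.  Nothing here is real libc_r or libgcc_r
code: the toy_* names are invented stand-ins, and where the real library
would dereference an uninitialized pointer and segfault, the toy lock
function returns -1 so the failure can be observed safely.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical stand-ins for the threads library state. */
static int  toy_backing = 0;
static int *toy_state = NULL;      /* NULL until toy_thread_init() runs */
static int  toy_early_status = 0;  /* result of the early lock attempt */

/* Stand-in for the library initializer. */
void toy_thread_init(void)
{
    toy_state = &toy_backing;
}

/* Stand-in for the lock function.  Returns -1 where a real library
 * would crash on uninitialized state. */
int toy_mutex_lock(void)
{
    if (toy_state == NULL)
        return -1;
    *toy_state += 1;
    return 0;
}

/* Stand-in for the early caller that takes a lock during start-up. */
void toy_register_frame_info(void)
{
    toy_early_status = toy_mutex_lock();
}

typedef void (*toy_init_fn)(void);

/* Run the initializers in the given order, starting from a clean
 * state, and report how the early lock attempt went. */
int toy_run_initializers(const toy_init_fn *fns, size_t count)
{
    size_t i;

    assert(fns != NULL);
    toy_state = NULL;
    toy_early_status = 0;
    for (i = 0; i < count; i++)
        fns[i]();
    return toy_early_status;
}
```

Running toy_register_frame_info before toy_thread_init yields -1 (the
stand-in for the segfault); the opposite order yields 0.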

In other words, your old application was linked with a buggy version
of libgcc_r, but it didn't become apparent until now.

It didn't become apparent until now because our crtbegin.o and
crtend.o were also buggy.  They failed to call __register_frame_info.
This was a problem for C++ programs using exceptions, especially when
the gcc port was used and DWARF2 exception handling was selected.

Now we have fixed crtbegin.o and crtend.o, and we have fixed
libgcc_r.a.  But it causes problems for your old application because
the new crtbegin.o and crtend.o (linked into the new shared libraries
such as libc_r) call __register_frame_info in your old, buggy,
statically linked libgcc_r.a.

Are you dizzy yet?  To sum up, your old executable contains the bug but
it wasn't triggered until the recent changes.

Now, what can or should we do about this?  Arguably we should simply
say in the release notes, "Relink your old multithreaded applications.
They had a bug which is now fixed."  But if there are binary-only
commercial apps which exhibit the problem, this solution is useless.
I don't know whether there are any such apps, but I doubt it.  N.B.,
Linux apps don't count because they were never linked with our
libgcc_r in the first place.

Or we can try to work around it, but there aren't any perfectly nice
ways to do so.  Here are some possibilities:

- Put a hack in the threads library so that whenever
  pthread_mutex_lock is called it checks to make sure that the
  threads library has been initialized, and if not, it calls
  _thread_init.  This is a poor solution because it adds overhead to
  a rather performance-critical function -- though admittedly the
  overhead is very small.  Another potential problem is that there
  could be a race condition if several threads all called
  pthread_mutex_lock at once before the threads library had been
  initialized.  I don't think the race condition would materialize,
  though, since the first call would come from libgcc_r, well before
  the application had gotten control.

- Put a hack into the dynamic linker to call _thread_init very early
  if that symbol was defined.  I like this solution even less,
  because it's too hackish.  The dynamic linker isn't the place for
  special hooks like that.

- Put a hack into crtbegin.o or crtend.o.  But we are using the
  standard GNU versions of these, and I really really don't want to
  change that.  In any case, it's the wrong place for the
  work-around.

Overall I would lean toward putting the hack into pthread_mutex_lock.
Comments?
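
For readers following along, the first option (checking for
initialization inside the lock function) might look roughly like this.
It is a sketch under stated assumptions, not the change that was
committed: the demo_* names and the flag are hypothetical, the mutex is
a toy that never blocks, and the guard is deliberately not thread-safe,
relying on the observation above that the first call happens before the
application gets control.

```c
#include <assert.h>
#include <stddef.h>

/* Toy mutex; a real implementation would queue and block waiters. */
struct demo_mutex {
    int locked;
};

static int demo_initialized = 0;   /* hypothetical "library is ready" flag */
static int demo_init_calls = 0;    /* counts how often init really ran */

/* Stand-in for the library initializer; safe to call more than once. */
void demo_thread_init(void)
{
    if (demo_initialized)
        return;
    demo_initialized = 1;
    demo_init_calls++;
}

/* Lock with the work-around: one extra test per call, and the library
 * is initialized on demand if an early caller got here first. */
int demo_mutex_lock(struct demo_mutex *m)
{
    assert(m != NULL);
    if (!demo_initialized)
        demo_thread_init();
    if (m->locked)
        return -1;                 /* toy behaviour: report instead of blocking */
    m->locked = 1;
    return 0;
}

/* Accessor so callers can observe that init ran exactly once. */
int demo_init_count(void)
{
    return demo_init_calls;
}
```

The cost is the single flag test on every call, which matches the "very
small overhead" trade-off discussed in the first bullet.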

John
-- 
  John Polstra   [EMAIL PROTECTED]
  John D. Polstra & Co., Inc.                        Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa


