At 09:34 PM 05/21/02 +0100, Nicholas Clark wrote:

First, thanks for helping.  I really appreciate it.

>I think it's good news. It's conclusive proof that the problem is in the
>dynamic linking. It's not in your module, the aspell library or core perl.
>So at least you know where it is.

Ok, well that's good.  But I also suspect it has something to do with the
implementation of exception handling in C++.

Let me ask a bit about PERL_DL_NONLAZY and dlopen().  It's a bit off topic,
but might help me resolve the problem.

I think I understand that PERL_DL_NONLAZY sets the flag on dlopen() that
makes it fail if any symbol can't be resolved.  I don't really understand how
dynamic linking works -- I suppose that the first time the code calls a
function that's in a .so library, you really end up calling some code that
loads the library, and then that code fixes things up so that next time you
just call the newly loaded code.  Maybe that's simplistic or wrong, but is it
a reasonable guess?
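
To make my guess about the flag concrete, here is a minimal C++ sketch of
what I assume happens: an environment variable picks between RTLD_LAZY and
RTLD_NOW, and that mode is handed to dlopen().  The helper names
pick_dlopen_mode() and try_load() are made up for illustration; this is not
Perl's actual loader code, only my reading of the dlopen() man page.

```cpp
#include <dlfcn.h>
#include <cstdlib>
#include <string>

// Hypothetical helper: choose the dlopen() mode from an environment
// variable.  Assumption: any nonzero integer value means "non-lazy".
int pick_dlopen_mode(const char* env_name) {
    const char* value = std::getenv(env_name);
    bool nonlazy = (value != nullptr) && (std::atoi(value) != 0);
    // RTLD_NOW: resolve every undefined symbol before dlopen() returns.
    // RTLD_LAZY: resolve function symbols when they are first called.
    return nonlazy ? RTLD_NOW : RTLD_LAZY;
}

// Hypothetical helper: try to load a library with the given mode.
// Returns an empty string on success, otherwise the dlerror() text.
std::string try_load(const char* path, int mode) {
    dlerror();  // clear any stale error message
    void* handle = dlopen(path, mode);
    if (handle == nullptr) {
        const char* msg = dlerror();
        return msg ? msg : "unknown dlopen() failure";
    }
    dlclose(handle);
    return "";
}
```

If that is right, then with RTLD_NOW a missing symbol should make dlopen()
itself fail, with dlerror() describing the problem, instead of the failure
showing up later on the first call into the library.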

Anyway, PERL_DL_NONLAZY sets a flag on the call to dlopen() to say every
symbol in that loaded library must be resolved.  Does that mean that if
that library calls functions that are in yet another shared library, it
will then load that library during that same dlopen() call?  Seems like that
flag would trigger a chain reaction on the first dlopen().  And I hope
dlopen() is reentrant.

Here's my real question.  Does PERL_DL_NONLAZY affect every dlopen() call?
That is, if I link with C++ code, and while the C++ code is executing it
calls a function that requires a library to be loaded, will that dlopen()
call also have the flag set to resolve all symbols?

It would be nice if I could set a break point on the dlopen(), or run
something like strace to show the dlopen calls.

Here's my thinking on how things run when it's working:

1) I run perl with my module
2) perl calls into my module which (if not static) needs to be loaded
3) my module (as a .so) is loaded via dlopen()
4) my module calls into the aspell library
5) dlopen() is called to load the aspell libraries.
6) the C++ code in aspell starts a try {} block and attempts to open a file
7) the file does not exist so the code throws a "CantReadFile" exception
8) the "CantReadFile" exception constructor is called
9) now the associated catch (CantReadFile) {} block is called

I can see this happen (using the good old print "I'm here!" technique),
and I do see the catch block run.
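
Steps 6 to 9 boil down to something like the sketch below.  It is only a
stand-in: this CantReadFile is not aspell's real class, and step_log,
open_or_throw() and run_steps() are names I invented so the steps can be
recorded instead of printed.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Records each step in order (stands in for the "I'm here!" prints).
std::vector<std::string> step_log;

// Stand-in exception class; not the real aspell definition.
struct CantReadFile {
    std::string file;
    explicit CantReadFile(const std::string& f) : file(f) {
        step_log.push_back("constructor");            // step 8
    }
};

// Step 6/7: try to open the file, throw if that fails.
void open_or_throw(const std::string& path) {
    std::ifstream in(path);
    if (!in) {
        step_log.push_back("about to throw");
        throw CantReadFile(path);                     // step 7
    }
}

// Runs the whole sequence and returns the recorded steps.
std::vector<std::string> run_steps(const std::string& path) {
    step_log.clear();
    try {
        open_or_throw(path);
        step_log.push_back("file opened");
    } catch (const CantReadFile& e) {
        step_log.push_back("catch: " + e.file);       // step 9
    }
    return step_log;
}
```

For a file that does not exist, the recorded order should be "about to
throw", "constructor", then the catch entry, which matches what I see in the
working case.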

Now, when it *doesn't* work I see (with my print statements):

<as above>
....
7) the file does not exist so the code throws an exception
8) the exception constructor is called
Aborted

But the catch block is never reached.  Note, I do have cout << "I'm Here!"
messages on every catch block for that exception class.

What happens between the exception constructor and the catch block?  I've
been told that destructors are called for all objects created in the stack
frames between the throw and the catch block.
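
As I understand that description, the order of events can be checked with a
small test like this one.  Tracker, DemoError, unwind_log and
run_unwind_demo() are hypothetical names for the sketch and have nothing to
do with the aspell sources.

```cpp
#include <string>
#include <utility>
#include <vector>

// Records the order in which things happen during unwinding.
std::vector<std::string> unwind_log;

// Local object whose destructor logs when it runs.
struct Tracker {
    std::string name;
    explicit Tracker(std::string n) : name(std::move(n)) {}
    ~Tracker() { unwind_log.push_back("~" + name); }
};

// Exception type whose constructor logs when it runs.
struct DemoError {
    DemoError() { unwind_log.push_back("DemoError constructed"); }
};

void inner() {
    Tracker t("inner");
    throw DemoError();   // unwinding starts here
}

void outer() {
    Tracker t("outer");
    inner();
}

// Runs the demo and returns the recorded order of events.
std::vector<std::string> run_unwind_demo() {
    unwind_log.clear();
    try {
        Tracker t("try-block");
        outer();
    } catch (const DemoError&) {
        unwind_log.push_back("catch reached");
    }
    return unwind_log;
}
```

The exception constructor runs first, then the destructors from the
innermost frame outward, and only then the catch block.  So anything that
goes wrong in one of those destructors would happen after my constructor
message and before my catch message.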

So, I see two possibilities:

If PERL_DL_NONLAZY sets the flag on *all* dlopen() calls, I can imagine some
destructor calling some function that's not loaded yet; dlopen() opens
that library, there's an unresolved symbol in that library, and
poof!  Seems unlikely.

The other possibility is that some destructor throws again.  I
assume that a throw during an earlier throw might cause the program to abort.
But it seems unlikely that PERL_DL_NONLAZY would change any logic in the
aspell library.
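
The throw-from-a-destructor case is easy to reproduce outside of aspell.
Here is a sketch, assuming a POSIX system so the crash can be observed from a
parent process with fork() and waitpid().  ThrowingDtor,
throw_while_unwinding() and signal_from_double_throw() are invented names.

```cpp
#include <csignal>
#include <stdexcept>
#include <sys/wait.h>
#include <unistd.h>

// Destructor that is allowed to throw (noexcept(false)) and always does.
struct ThrowingDtor {
    ~ThrowingDtor() noexcept(false) { throw std::runtime_error("second"); }
};

// Throws once, then the destructor throws again while unwinding.
void throw_while_unwinding() {
    try {
        ThrowingDtor d;
        throw std::runtime_error("first");
    } catch (...) {
        // Never reached: the second throw ends the program first.
    }
}

// Runs the double throw in a child process.  Returns the signal that
// killed the child, 0 if it exited normally, or -1 if fork/wait failed.
int signal_from_double_throw() {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        throw_while_unwinding();
        _exit(0);  // only reached if the program was not terminated
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On my understanding of the C++ rules, the second throw makes the runtime call
std::terminate(), which by default calls abort(), so the child should die
with SIGABRT and the catch block is never entered.  That would look just
like the "Aborted" I'm getting.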

[Opens another beer...]

>> I've been using this:
>> 
>> make distclean
>> ../Configure -des -Dprefix=/home/moseley/perl/ithread \
>> -Dusethreads -Doptimize='-g' -Duseshrplib -Dusedevel
>> make && make test
>> make install
>> 
>> (or just now)
>> ../Configure -des -Dprefix=/home/moseley/perl/static \
>> -Doptimize='-g' -Dusedevel
>
>You don't need to build a fully static perl to link your extension statically
>against it. Sorry if that wasn't clear and I've caused you to spend lots of
>time waiting for perl to rebuild.

No, I think I had to rebuild.  When I built with that first Configure command
above I didn't end up with a lib/5.7.3/i686-linux/CORE/libperl.a library.
So that's why I needed to rebuild perl.

Sorry for being so thick, but I'm not really clear what a "fully static
perl" is compared to, eh, the other thing....

>Does SuSE have a different glibc version to Debian?

Eh, how do I tell?  

Debian:

$ ls -lF /usr/lib/libglib* 
/usr/lib/libglib-1.2.so.0 -> libglib-1.2.so.0.0.10
/usr/lib/libglib-1.2.so.0.0.10
/usr/lib/libglib-2.0.so.0 -> libglib-2.0.so.0.0.1
/usr/lib/libglib-2.0.so.0.0.1
/usr/lib/libglib.a
/usr/lib/libglib.la
/usr/lib/libglib.so -> libglib-1.2.so.0.0.10

SuSE:

/usr/lib/libglib-1.2.so.0 -> libglib-1.2.so.0.0.6*
/usr/lib/libglib-1.2.so.0.0.6*
/usr/lib/libglib.a
/usr/lib/libglib.la*
/usr/lib/libglib.so -> libglib-1.2.so.0.0.6*


Back to gdb:

Here's what gdb says about the abort:

Program received signal SIGABRT, Aborted.
0x4007f7b1 in kill () from /lib/libc.so.6

Do we expect libc to be calling kill?

Here's how much fun it is trying to debug this:

Current language:  auto; currently c++
(gdb) l
24          }
25          abort();
26        }
27
28        void ConfigData::throw_file_exception(const char * file) const {
29          if (config_.error_number() == PERROR_CANT_READ_FILE) {
30            cout << __FILE__ ":" << __LINE__ << ": about to throw 'throw CantReadFile(file)'\n";
31            throw CantReadFile(file);
32            cout << __FILE__ ":" << __LINE__ << ": [after] about to throw 'throw CantReadFile(file)'\n";
33          }
(gdb) n
30            cout << __FILE__ ":" << __LINE__ << ": about to throw 'throw CantReadFile(file)'\n";
(gdb) n
config_data.cc:30: about to throw 'throw CantReadFile(file)'
178         : dat (nilRep.grab ()) { assign (s); }
(gdb) b CantReadFile::CantReadFile
Segmentation fault

:(




-- 
Bill Moseley
mailto:[EMAIL PROTECTED]
