At 09:34 PM 05/21/02 +0100, Nicholas Clark wrote:

First, thanks for helping. I really appreciate it.

>I think it's good news. It's conclusive proof that the problem is in the
>dynamic linking. It's not in your module, the aspell library or core perl.
>So at least you know where it is.

Ok, well that's good. But I also suspect something to do with the
implementation of exception handling in C++.

Let me ask a bit about PERL_DL_NONLAZY and dlopen(). It's a bit off topic,
but might help me resolve the problem.

I think I understand that PERL_DL_NONLAZY sets the flag on dlopen() to fail
if all symbols can't be resolved. I don't really understand how dynamic
linking works -- I suppose that the first time the code calls a function
that's in a .so library, you really end up calling some code that loads the
library, and then that code fixes things up so that next time you just call
the newly loaded code. Maybe that's simplistic or wrong, but a reasonable
guess?

Anyway, PERL_DL_NONLAZY sets a flag on the call to dlopen() to say every
symbol in that loaded library must be resolved. Does that mean that if that
library calls functions that are in yet another shared library, it will
then load that library on that same dlopen() call? Seems like that flag
would trigger a chain reaction on the first dlopen(). And I hope dlopen()
is reentrant.

Here's my real question. Does PERL_DL_NONLAZY affect every dlopen() call?
That is, if I link with c++ code, and while the c++ code is executing it
calls a function that requires a library to be loaded, will that dlopen()
call also have the flag set to resolve all symbols?

It would be nice if I could set a break point on the dlopen(), or run
something like strace to show the dlopen calls.

Here's my thinking of how things run when it's working:

1) I run perl with my module
2) perl calls into my module which (if not static) needs to be loaded
3) my module (as a .so) is loaded via dlopen()
4) my module calls into the aspell library
5) dlopen() is called to load the aspell libraries.
6) the c++ code in aspell starts a try {} block and attempts to open a file
7) the file does not exist so the code throws a "CantReadFile" exception
8) the "CantReadFile" exception constructor is called
9) now the associated catch CantReadFile {} block is called

I can see this happen (using the good old print "I'm here!" technique).
I do see the catch block run.

Now, when it *doesn't* work I see (with my print statements):

<as above>
....
7) the file does not exist so the code throws an exception
8) the exception constructor is called
Aborted

But the catch block is never reached. Note, I do have cout << "I'm Here!"
messages on every catch block for that exception class.

What happens between the exception constructor and the catch block? I've
been told that destructors are called for all objects created in the stack
frames between the throw and the catch block.

So, I see two possibilities:

If PERL_DL_NONLAZY sets the flag on *all* dlopen() calls, I can imagine some
destructor calling some function that's not loaded yet, dlopen() opens
that library, there's an unresolved symbol in that library, and poof!
Seems unlikely.

The other possible situation is that some destructor throws again. I
assume that a throw from within a throw might cause the program to abort.
But it seems unlikely that PERL_DL_NONLAZY would change any logic in the
aspell library.

[Opens another beer...]

>> I've been using this:
>>
>> make distclean
>> ../Configure -des -Dprefix=/home/moseley/perl/ithread \
>> -Dusethreads -Doptimize='-g' -Duseshrplib -Dusedevel
>> make && make test
>> make install
>>
>> (or just now)
>> ../Configure -des -Dprefix=/home/moseley/perl/static \
>> -Doptimize='-g' -Dusedevel
>
>You don't need to build a fully static perl to link your extension statically
>against it. Sorry if that wasn't clear and I've caused you to spend lots of
>time waiting for perl to rebuild.

No, I think I had to rebuild.
When I built with that first ../Configure above I didn't end up with a
lib/5.7.3/i686-linux/CORE/libperl.a library. So that's why I needed to
rebuild perl.

Sorry for being so thick, but I'm not really clear what a "fully static
perl" is compared to, eh, the other thing....

>Does SuSE have a different glibc version to Debian?

Eh, how do I tell?

Debian:
$ ls -lF /usr/lib/libglib*
/usr/lib/libglib-1.2.so.0 -> libglib-1.2.so.0.0.10
/usr/lib/libglib-1.2.so.0.0.10
/usr/lib/libglib-2.0.so.0 -> libglib-2.0.so.0.0.1
/usr/lib/libglib-2.0.so.0.0.1
/usr/lib/libglib.a
/usr/lib/libglib.la
/usr/lib/libglib.so -> libglib-1.2.so.0.0.10

SuSE:
/usr/lib/libglib-1.2.so.0 -> libglib-1.2.so.0.0.6*
/usr/lib/libglib-1.2.so.0.0.6*
/usr/lib/libglib.a
/usr/lib/libglib.la*
/usr/lib/libglib.so -> libglib-1.2.so.0.0.6*

Back to gdb:

Here's what gdb says about the abort:

Program received signal SIGABRT, Aborted.
0x4007f7b1 in kill () from /lib/libc.so.6

Do we expect libc to be calling kill?

Here's how much fun it is trying to debug this:

Current language: auto; currently c++
(gdb) l
24        }
25        abort();
26      }
27
28      void ConfigData::throw_file_exception(const char * file) const {
29        if (config_.error_number() == PERROR_CANT_READ_FILE) {
30          cout << __FILE__ ":" << __LINE__ << ": about to throw 'throw CantReadFile(file)'\n";
31          throw CantReadFile(file);
32          cout << __FILE__ ":" << __LINE__ << ": [after] about to throw 'throw CantReadFile(file)'\n";
33        }
(gdb) n
30          cout << __FILE__ ":" << __LINE__ << ": about to throw 'throw CantReadFile(file)'\n";
(gdb) n
config_data.cc:30: about to throw 'throw CantReadFile(file)'
178         : dat (nilRep.grab ()) { assign (s); }
(gdb) b CantReadFile::CantReadFile
Segmentation fault

:(

-- 
Bill Moseley
mailto:[EMAIL PROTECTED]
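
P.S. On the question of what happens between the exception constructor
and the catch block, here is a minimal sketch of the order of events in
the working case: exception constructor, then destructors of local
objects as the stack unwinds, then the catch block. Everything in it is
made up for illustration: CantReadFile here is a hypothetical stand-in
and not aspell's real class, and Guard, open_config() and run_demo() are
invented names. As far as standard C++ goes, if a destructor throws
during unwinding, or if no matching handler is found at all,
std::terminate() is called, and by default that calls abort(). That
would look like the "Aborted" above, but this sketch does not show
which of those, if either, is what is happening here.

```cpp
#include <string>
#include <vector>

// Records the order in which things happen during a throw.
static std::vector<std::string> events;

// Hypothetical stand-in for aspell's exception class (assumption).
struct CantReadFile {
    std::string file;
    explicit CantReadFile(const std::string& f) : file(f) {
        events.push_back("exception constructor");
    }
};

// A local object whose destructor runs while the stack unwinds.
struct Guard {
    ~Guard() { events.push_back("destructor of local object"); }
};

// Plays the role of the code that fails to open a file (invented name).
void open_config(const std::string& file) {
    Guard g;                   // destroyed during unwinding
    throw CantReadFile(file);  // exception object is built first
}

// Runs the throw/catch sequence and returns the recorded order.
std::vector<std::string> run_demo() {
    events.clear();
    try {
        open_config("/no/such/file");
    } catch (const CantReadFile&) {
        events.push_back("catch block");
    }
    return events;
}
```

Calling run_demo() from main() and printing the returned strings gives
the sequence. In the failing case I would expect to see the first entry
and then the abort.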