Note: Our use of tntnet is on Linux.

In response to your reply regarding the RTLD_GLOBAL issue:

You wrote: "If the driver is loaded with RTLD_LOCAL, the exception type is not determined correctly."

Normally, and for a very good reason, dynamically loaded shared libraries share only their code, while their data structures are kept in strict isolation across each loaded library instance, unless the library coder explicitly arranges otherwise through a well-defined API or some other mechanism. The RTLD_GLOBAL flag overrides, and as a consequence breaks, this normal and expected behaviour.

If we have to override the fundamental behaviour of dynamic libraries in order to make one essential component (C++ exception handling) work correctly, then perhaps this points to a design issue (a flaw or a limitation), either in the exception handling of the underlying application code or in the C++ exception handling mechanism itself. Whichever it is, a solution must be found that allows threaded library code to operate as expected, which is in isolation from each other across threads. The very best way to make sure threaded code is stable is to isolate, isolate and isolate some more!

The bottom line is that if RTLD_GLOBAL is kept in place, certain applications will eventually fail despite the programmer's best efforts, and this will happen even where namespaces are used. For example, two threads using the same namespaced library will obviously see the same names and will share the same structures regardless of the namespaces. Two different versions of the same library will also use the same names unless one version is extensively modified to make the names distinct, *but even then* two threads using the same library version will of course still see the same names!
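
To make the two flags under discussion concrete, here is a minimal sketch of how a loader might choose between them at dlopen() time. It is not the tntnet or cxxtools loading code: load_driver, LoadResult and the export_symbols parameter are hypothetical names used only for illustration. Per the dlopen() documentation, RTLD_GLOBAL makes the library's symbols available for resolving references in libraries loaded afterwards, while RTLD_LOCAL does not.

```cpp
#include <cassert>
#include <dlfcn.h>
#include <string>

// Outcome of one load attempt: the handle (null on failure) and the
// dlerror() text describing the failure, if any.
struct LoadResult {
    void* handle;
    std::string error;
};

// Hypothetical helper, not part of tntnet/cxxtools.
// export_symbols == true  -> RTLD_GLOBAL: symbols join the global scope
// export_symbols == false -> RTLD_LOCAL:  symbols stay private to this handle
LoadResult load_driver(const char* path, bool export_symbols) {
    assert(path != nullptr);  // this sketch only loads named files

    const int scope = export_symbols ? RTLD_GLOBAL : RTLD_LOCAL;

    dlerror();  // clear any stale error state before the call
    void* handle = dlopen(path, RTLD_NOW | scope);

    // dlerror() is only meaningful when dlopen() returned null.
    const char* message = handle ? nullptr : dlerror();
    return LoadResult{handle, message ? message : ""};
}
```

A caller would pass the driver's path and false to keep the driver's symbols out of the global scope, and should eventually dlclose() any non-null handle. On older glibc versions the program has to be linked with -ldl.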

I describe the problem of tntnet threads sharing loaded library data structures in more detail later on; see "Thread instances share globally defined library data structures" near the end of this message.

I think the best solution is to first try to resolve the problem of exceptions not being caught without resorting to RTLD_GLOBAL. For whatever reason, we do not have a problem with uncaught exceptions in our code, so I suggest that we compare our respective code to see what differences make one scenario work and the other fail (or perhaps we're just not doing the same things?).

In addition, we've uncovered another serious problem, not previously reported, that may be related:

Thread instances share globally defined library data structures

Despite changing RTLD_GLOBAL to RTLD_LOCAL, we're still experiencing a similar (and very serious) problem with the tntnet threading mechanism: globally defined library structures that ought to be isolated across simultaneous thread instances are in fact being shared across threads. For example, if I include a library header inside an ecpp file, and the header references an "extern" data structure that is defined inside a dynamically loaded library, then that data structure is shared across all threads. The result is read/write conflicts, and therefore instability, unless the structure is made read-only.

What we had expected was for the shared library's data structure to be duplicated for each thread and therefore kept in isolation (this is the usual and expected shared library behaviour). Instead, we see that the library is somehow being loaded once, in a way that makes it global to the threads' scope, as opposed to being loaded multiple times inside the local scope of each thread instance.
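
The difference between the behaviour we see and the behaviour we expected can be sketched in a few lines, assuming a C++11 compiler. This is only an illustration: RequestStats, g_stats, t_stats and run_workers are hypothetical names, not taken from tntnet or from our libraries. In standard C++, an object with static storage duration exists once per process and is seen by every thread, so writers need a lock; an object declared thread_local exists once per thread.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for an "extern" structure defined in a library.
struct RequestStats {
    int hits = 0;
};

RequestStats g_stats;               // one instance per process, shared by all threads
std::mutex g_stats_mutex;           // required because g_stats is shared
thread_local RequestStats t_stats;  // one independent instance per thread

// Starts nthreads workers; each bumps both counters `iterations` times.
// Returns the final value of every worker's own thread_local counter.
std::vector<int> run_workers(int nthreads, int iterations) {
    assert(nthreads > 0 && iterations >= 0);

    std::vector<int> per_thread(nthreads, 0);
    std::vector<std::thread> workers;

    for (int i = 0; i < nthreads; ++i) {
        workers.emplace_back([&per_thread, i, iterations] {
            for (int n = 0; n < iterations; ++n) {
                {
                    // Shared state: every access has to be serialised.
                    std::lock_guard<std::mutex> lock(g_stats_mutex);
                    ++g_stats.hits;
                }
                ++t_stats.hits;  // private to this thread, no lock needed
            }
            per_thread[i] = t_stats.hits;  // each worker writes its own slot
        });
    }
    for (std::thread& w : workers) {
        w.join();
    }
    return per_thread;
}
```

After run_workers(4, 1000), g_stats.hits holds the combined total of all workers, while each entry of the returned vector holds only that worker's own count. Whether thread_local is a practical option for an existing library is a separate question that this sketch does not settle.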

To get around this problem, we have to design our shared libs with the expectation that tntnet will load them as one shared instance used by multiple threads, and in some cases this makes our code many times more complex than we thought was necessary.

While I normally prefer to deal with only one issue at a time, I bring this problem up because it may be happening for the same (or a similar) line of reasoning that caused RTLD_GLOBAL to be chosen as a viable solution.

In considering the above, I think it makes a lot of sense for each of us to understand fully why shared library data structures are being seen as GLOBAL inside tntnet/cxxtools, and to determine whether these structures can be kept 100% isolated (unless the application coder decides otherwise), so that the normal isolation of library data structures can be put back in place.

We really do have to resolve the GLOBAL problem fully. It is far too easy to introduce globally shared structures inside a threaded library despite knowing that it is unsafe to do so. It is also very limiting when you cannot safely do what is normally expected, and when you have to share structures anyway, your code will in some cases have to be made much more complex than ought to be required in order to deal with threading on those structures.

I hope we can work together to resolve this issue in a way that will make both of us very happy :)

Thanks for your attention to this matter.
