Note: Our use of tntnet is on Linux.

In response to your reply re the RTLD_GLOBAL issue:

You wrote: "If the driver is loaded with RTLD_LOCAL, the exception type 
is not determined correctly."

Normally (and for a very good reason) dynamically shared libs have only 
their code shared while their data structures are kept in strict 
isolation across each loaded library instance unless explicitly made to 
do otherwise by the library coder through a well defined API or some 
other mechanism. The use of the RTLD_GLOBAL flag overrides, and as a 
consequence breaks, the normal and expected behaviour of dynamically 
loaded libraries.
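
To make the two flags concrete, here is a minimal sketch of loading a 
library with either visibility, assuming a POSIX system that provides 
dlopen() in <dlfcn.h>. The helper name open_library is hypothetical and 
is not how tntnet itself loads components. With RTLD_LOCAL (the default 
on Linux when neither flag is given) the library's symbols are not made 
available to resolve references in libraries loaded afterwards; with 
RTLD_GLOBAL they are.

```cpp
#include <dlfcn.h>   // dlopen, dlerror (POSIX)
#include <string>

// Hypothetical helper, not part of tntnet: open a shared library with
// either local or global symbol visibility.
// Returns the handle from dlopen(), or nullptr on failure; in that case
// 'error' receives the text reported by dlerror().
void* open_library(const std::string& path, bool global, std::string& error)
{
    const int flags = RTLD_NOW | (global ? RTLD_GLOBAL : RTLD_LOCAL);
    dlerror();                               // clear any stale error state
    void* handle = dlopen(path.c_str(), flags);
    if (handle == nullptr) {
        const char* msg = dlerror();
        error = msg ? msg : "unknown dlopen error";
    }
    return handle;
}
```

On older glibc versions this may need to be linked with -ldl. Switching 
the 'global' argument is the only difference between the two cases, 
which makes it easy to compare how exceptions behave under each flag.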

If we have to explicitly override the fundamental behaviour of how 
dynamic libraries operate in order to make one essential component work 
correctly (i.e., the C++ exception handling), then perhaps this shows 
that there's a design issue (flaw or limitation), either with the 
exception handling of the underlying application code, or with the 
C++ exception handling mechanism itself. Whatever the case may be, a 
solution must be found that allows threaded library code to operate as 
expected, which is in isolation from each other across threads. The 
very best way to make sure threaded code is stable is to isolate, 
isolate and isolate some more!

The bottom line is that if RTLD_GLOBAL is kept in place, then certain 
applications will eventually fail despite the programmer's best 
efforts, and this will happen even in cases where namespaces are being 
used. For example, two threads using the same namespaced library will 
see the same names and will share the same structures whether or not 
namespaces are used. Two different versions of the same library will 
also use the same names unless one version of the code is extensively 
modified to make the names separate, *but even then* two threads using 
the same library version will of course still see the same names! I 
describe the problem of tntnet threads sharing loaded library data 
structures in more detail later on; see "Thread instances share 
globally defined library data structures" near the end of this message.

I think the best solution is to first attempt to resolve the problem of 
exceptions not being caught without the use of RTLD_GLOBAL. For whatever 
reason, we do not have a problem with exceptions not being caught in our 
code, so I suggest that we compare our respective code to see what 
differences make one scenario work while the other does not (or perhaps 
we're just not doing the same things?).

In addition, we've uncovered another serious problem, not previously 
reported, that may be related:

Thread instances share globally defined library data structures.

Despite changing RTLD_GLOBAL to RTLD_LOCAL, we're still experiencing a 
similar (and very serious) problem with the tntnet threading mechanism, 
where globally defined library structures that ought to be isolated 
across simultaneous thread instances are in fact being shared across 
threads. For example, if I include a library header inside an ecpp file, 
and the header references an "extern" data structure that's defined 
inside a dynamically loaded library, then the extern data structure will 
be shared across all threads, resulting in read/write conflicts and 
therefore instability unless the structure is made read-only. What we 
had expected was for the shared library's data structure to be 
duplicated across each thread, and therefore kept in isolation (this is 
the usual and expected shared library behaviour). Instead we see that 
the library is somehow being loaded once in a way that makes it global 
to the threads' scope, as opposed to being loaded multiple times inside 
the local scope of each thread instance. To get around this problem, we 
have to design our shared libs with the expectation that they will be 
loaded up by tntnet as one shared instance used by multiple threads, and 
this can in some cases make our code many times more complex than what 
we thought was necessary.
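
As an illustration of the per-thread isolation described above, here is 
a minimal sketch, assuming a compiler with C++11 thread_local support 
(older GCC versions offer __thread for simple types). The names 
per_thread_counter and run_isolated are hypothetical and are not part 
of tntnet or our libraries. A plain global variable has one instance 
per process and is seen by every thread, whereas a thread_local 
variable gets a separate instance in each thread.

```cpp
#include <thread>
#include <vector>

// Hypothetical stand-in for a library's data; not a tntnet name.
thread_local int per_thread_counter = 0;   // one instance per thread

// Run 'threads' workers that each bump their own thread-local counter
// 'increments' times, and return what each worker observed.
std::vector<int> run_isolated(int threads, int increments)
{
    std::vector<int> seen(threads, -1);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&seen, t, increments] {
            for (int i = 0; i < increments; ++i)
                ++per_thread_counter;      // no lock needed: private copy
            seen[t] = per_thread_counter;  // each worker writes its own slot
        });
    }
    for (auto& th : pool)
        th.join();
    return seen;
}
```

Each worker should observe only its own increments, and the calling 
thread's copy stays untouched. This only suits data that really can be 
private to a thread; anything the threads must exchange still needs 
explicit sharing.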

While I normally prefer to deal with only one issue at a time, I bring 
this problem up because it may stem from the same (or a similar) line 
of reasoning that led to RTLD_GLOBAL being chosen as a viable solution.

In considering the above, I think it makes a lot of sense for each of us 
to understand fully why shared library data structures are being seen as 
GLOBAL inside tntnet/cxxtools, and to determine if these structures can 
be kept 100% isolated (unless the application coder decides otherwise) 
so that the normal isolation of library data structures can be put back 
in place. We really do have to resolve the GLOBAL problem fully. It's 
far too easy to introduce globally shared structures inside a threaded 
library despite knowing that it is unsafe to do so. It's also very 
limiting when you cannot safely do this as is normally expected, and 
when you have to do it, in some cases your code will have to be made 
much more complex than ought to be required (to deal with threading on 
shared structures).
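
To show the kind of extra code that sharing forces on a library, here 
is a minimal sketch of a structure that is safe to use from several 
threads at once, assuming C++11 std::mutex and std::thread. The class 
SharedRegistry and the function count_hits are hypothetical examples, 
not code from tntnet or from our libraries.

```cpp
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical shared structure; the names are illustrative only.
class SharedRegistry {
public:
    // Add 'delta' to the entry for 'key' while holding the lock.
    void add(const std::string& key, int delta)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        values_[key] += delta;
    }

    // Return the current value for 'key', or 0 if it was never set.
    int get(const std::string& key) const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = values_.find(key);
        return it == values_.end() ? 0 : it->second;
    }

private:
    mutable std::mutex mutex_;             // guards values_
    std::map<std::string, int> values_;
};

// Have 'threads' workers each call add("hits", 1) 'increments' times.
int count_hits(SharedRegistry& registry, int threads, int increments)
{
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&registry, increments] {
            for (int i = 0; i < increments; ++i)
                registry.add("hits", 1);
        });
    }
    for (auto& th : pool)
        th.join();
    return registry.get("hits");
}
```

Every access has to go through the lock, which is exactly the added 
complexity (and contention) that isolated per-thread data would avoid.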

I hope we can work together to resolve this issue in a way that will 
make both of us very happy :)

Thanks for your attention to this matter.
