Re: [C++-sig] [Boost.Python v3] Conversions and Registries

Niall Douglas Tue, 20 Sep 2011 11:43:38 -0700

On 20 Sep 2011 at 12:38, Jim Bosch wrote:

> I'd also considered having a different set of template conversions that 
> are checked first for performance reasons, but I'd actually viewed the 
> override preference argument from the opposite direction - once a 
> template converter traits class has been fully specialized, you can't 
> specialize it again differently in another module (well, maybe symbol 
> visibility labels can get you out of that bind in practice).  So it 
> seemed a registry-based override would be the only way to override a 
> template-based conversion, and hence the registry-based conversions 
> would have to go first.


Ah, sorry, I didn't explain myself well at all. I've been doing a lot 
of work surrounding ISO C and C++ standards recently, so my head is 
kinda trapped in future C and C++. When I was speaking of ODR, I was 
kinda assuming that we have C++ modules available for newer compilers 
in the post C++-1x TR (see
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2316.pdf) and we can emulate 
much of module support using -fvisibility=hidden on GCC. On MSVC, of 
course, you get module proto-support for free anyway due to how their 
DLLs work.

You're absolutely correct that right now, outside of the Windows 
platform, ODR is a process wide problem in most compilers on their 
default settings. That's a PITA, so everyone is agreed that we ought 
to do something about it. The big problem is how far we ought to go, 
hence N2316 not making it into C++-1x and being pushed into TR.

What I can say is that that TR will very likely be highly compatible 
with the Windows DLL system (and its GCC visibility near-equivalent) 
due to backwards compatibility. I would suggest that you code as if 
both are true as a reasonable proxy for future C++ module support. 
Then you're covered ten years down the line from now.
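
A minimal sketch of what "coding as if both are true" can look like in a header, assuming a build that passes -fvisibility=hidden on GCC. The macro names SKETCH_EXPORT and SKETCH_HIDDEN and the two functions are hypothetical and exist only for illustration; a real library would normally also switch between dllexport and dllimport depending on whether it is being built or consumed.

```cpp
#include <cassert>

// Hypothetical export macros: one spelling per platform, same meaning.
#if defined(_WIN32)
#  define SKETCH_EXPORT __declspec(dllexport)
#  define SKETCH_HIDDEN
#elif defined(__GNUC__) && __GNUC__ >= 4
#  define SKETCH_EXPORT __attribute__((visibility("default")))
#  define SKETCH_HIDDEN __attribute__((visibility("hidden")))
#else
#  define SKETCH_EXPORT
#  define SKETCH_HIDDEN
#endif

namespace sketch {

// Internal detail: not part of the shared object's public symbol table
// on GCC, so it cannot clash with a same-named symbol in another module.
SKETCH_HIDDEN int helper(int x) { return x * 2; }

// Deliberately exported entry point.
SKETCH_EXPORT int public_entry(int x) { return helper(x) + 1; }

}  // namespace sketch
```

With everything hidden by default, only the symbols explicitly marked for export can collide across modules, which is roughly the behaviour Windows DLLs already give you.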

> But overall I think your proposal to just try the templates first is 
> cleaner, because having multiple specializations of the same traits 
> class in different modules would be a problem either way; allowing users 
> to override the compile-time conversions with registry-based conversions 
> is at best a poor workaround.

I know this is a little off-topic, but Boost could really do with a 
generic runtime type registry implementation. There are lots of use 
cases outside BPL and if we had one, highly extensible, properly 
written system it could be applied to lots of use cases.

For example, Java-style automagical metaprogrammed C++ type 
reflection into SQL is perfectly possible. At the time I wrote it, it 
was the only example of it anywhere I could find (maybe things have 
since changed). It makes talking to databases super-easy at the cost 
of making the compiler work very hard.

There are lots more use cases too e.g. talking with .NET, or 
Objective C.

> > Just make sure what you do works with precompiled headers :)
> >
> > P.S.: This is trickier than it sounds.
> 
> Yuck.  Precompiled headers are something I've never dealt with before, 
> but I suppose I had better learn.

Getting them working can make the difference between a several hour 
recompile and ten minutes. They're painful though due to compiler 
bugs.

> > The same mechanism usefully also takes care of multiple python
> > interpreters too.
> 
> I have to admit I'm only barely following you here - threads are another 
> thing I don't deal with often.  It sounds like you have a totally 
> different option from the ones I was anticipating.  Could you explain in 
> more detail how this would work?

Sure. You have the problem when working with Python of handling the 
GIL which is strongly related to what the "current" interpreter is. 
These are TLS items in python, so each thread has its own current 
setting.

Therefore, what one really ought to have in BPL is something like:

// normal C++ code
...
// I want to call python code in interpreter X
{
  // Replaces the "current" interpreter with X
  boost::python::hold_interpreter interpreter_holder(X);
  // Acquires the GIL for that interpreter
  boost::python::hold_GIL gil_holder(interpreter_holder);
  call_some_BPL_or_python_function();
} // On scope exit gil_holder and then interpreter_holder are destroyed,
  // thus releasing the GIL and resetting the "current" interpreter to
  // whatever it was before
// Back to normal C++ code

This obviously refers to the embedded case, but it ought to be 
similar when BPL calls into C++: the "current" interpreter should be 
available per thread as a BPL object instance wrapping the python TLS 
config. Then a call into C++ can safely call into other interpreters.

What's useful here of course is that you can keep a per-thread list 
of interpreter nestings. This means you can see exactly which module 
entered which interpreter and in which order, and therefore what to 
search and what to unwind when necessary.
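
A minimal sketch of the per-thread nesting list, assuming nothing about the real Python C API. hold_interpreter and hold_GIL are the proposed names from the snippet above, not existing Boost.Python classes, and the interpreter struct with its gil_depth counter is a hypothetical stand-in for real interpreter state and a real lock.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical stand-in for an interpreter; a real version would wrap
// the Python thread/interpreter state instead.
struct interpreter {
    explicit interpreter(std::string n) : name(std::move(n)) {}
    std::string name;
    int gil_depth = 0;  // pretend GIL: counts nested acquisitions
};

// Per-thread list of entered interpreters, outermost first.
inline std::vector<interpreter*>& interpreter_stack() {
    thread_local std::vector<interpreter*> stack;
    return stack;
}

// The interpreter that is "current" for this thread, or nullptr if none.
inline interpreter* current_interpreter() {
    auto& s = interpreter_stack();
    return s.empty() ? nullptr : s.back();
}

// Makes `target` the current interpreter for the holder's lifetime.
class hold_interpreter {
public:
    explicit hold_interpreter(interpreter& target) : target_(&target) {
        interpreter_stack().push_back(target_);
    }
    ~hold_interpreter() {
        assert(interpreter_stack().back() == target_);  // strict nesting
        interpreter_stack().pop_back();
    }
    hold_interpreter(const hold_interpreter&) = delete;
    hold_interpreter& operator=(const hold_interpreter&) = delete;
    interpreter& get() const { return *target_; }
private:
    interpreter* target_;
};

// Acquires the pretend GIL of the held interpreter; releases it on exit.
class hold_GIL {
public:
    explicit hold_GIL(hold_interpreter& h) : interp_(&h.get()) { ++interp_->gil_depth; }
    ~hold_GIL() { --interp_->gil_depth; }
    hold_GIL(const hold_GIL&) = delete;
    hold_GIL& operator=(const hold_GIL&) = delete;
private:
    interpreter* interp_;
};

}  // namespace sketch
```

Because the stack is thread_local, each thread sees only its own nesting, and walking interpreter_stack() from back to front gives the order in which to search or unwind.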

> An interesting idea - avoiding trying all possible conversions at runtime 
> seems a very worthy goal, though I could also see this inflating the 
> size of the modules.  Can you point me at anything existing for an example?

The closest that I have publicly available is the SQL type reflection 
machinery in TnFOX. Have a look at the following:

https://github.com/ned14/tnfox/blob/master/include/TnFXSQLDB.h
https://github.com/ned14/tnfox/blob/master/include/TnFXSQLDB_ipc.h
https://github.com/ned14/tnfox/blob/master/include/TnFXSQLDB_sqlite3.h

Note that this is an entirely *static* type registry, so it exists 
exclusively in the compiler.
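
To make "static registry" concrete, here is a minimal sketch that is not taken from TnFOX: the names sql_type and column_types and the chosen SQL type strings are assumptions made purely for illustration. The "registry" is nothing more than a set of template specializations, so every lookup is resolved at compile time.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

namespace sketch {

// Primary template is intentionally left undefined: asking for a type
// that was never registered is a compile error, not a runtime miss.
template <typename T> struct sql_type;

// Each specialization is one registry entry.
template <> struct sql_type<std::int32_t> {
    static constexpr const char* name() { return "INTEGER"; }
};
template <> struct sql_type<double> {
    static constexpr const char* name() { return "REAL"; }
};
template <> struct sql_type<std::string> {
    static constexpr const char* name() { return "TEXT"; }
};

// Builds a comma-separated list of SQL column types for the given C++ types.
template <typename... Columns>
std::string column_types() {
    static_assert(sizeof...(Columns) > 0, "need at least one column");
    const char* names[] = { sql_type<Columns>::name()... };
    std::string out;
    for (const char* n : names) {
        if (!out.empty()) out += ", ";
        out += n;
    }
    return out;
}

}  // namespace sketch
```

For example, sketch::column_types<std::int32_t, std::string, double>() yields "INTEGER, TEXT, REAL" with no table consulted at runtime.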

It does happily extend into a dynamic registry however. I can supply 
the source which extends the TnFOX static registry with a dynamic 
runtime, but I'd need you to agree to an NDA and a promise not to 
distribute it.

> If I understand your argument, it's not the global registry that causes 
> ODR violations - it's the fact that you're trying to mimic having local 
> registries by forcing distinct BPLs for each module, and that makes BPL 
> symbols ambiguous.  If you had a pair of modules that were happy using 
> each other's converters, they would do the standard thing and share one 
> BPL and one registry and you wouldn't have any ODR problems.

ODR is a C++ (and C) spec issue and has nothing to do with BPL per 
se. Rather, because real world code routinely violates ODR, it 
becomes a problem for anything which operates a type registry.

BTW code can't help violating it. Libraries have absolutely no 
control over what they must coexist with in a given process.

> In other words, it's not the fact that DLL D and DLL E register 
> different conversions for class foo that causes the ODR problems; that 
> just makes modules interact unfortunately (but in a deterministic and 
> debuggable way).  It's the workaround (loading multiple BPLs) that 
> causes the actual ODR problems.

RTLD_GLOBAL operates okay for most C++ programs because that's the 
default. Indeed, until very recently, GCC couldn't throw exceptions 
properly unless RTLD_GLOBAL was set.

Unfortunately, Python sets RTLD_LOCAL for the process because up 
until I patched GCC to add -fvisibility, there was no easy way to 
separate Python extension modules from one another. They routinely 
defined functions with identical symbols and therefore one got all 
sorts of unpleasant conflicts.

One therefore gets a big problem when using anything C++ with a type 
registry within Python. One typically has to resort to unpleasant 
hacking of dlopen settings.

> So it sounds like we agree that we should only ever have one BPL loaded.  We 
> just need to implement the registry so it can know which module DLL 
> instance a particular registry lookup is coming from, whether that's 
> using special module-instance IDs or compiling the registries into the 
> module DLLs or something else.
> 
> Is that right?

Ah, but it gets worse! You can't guarantee that BPL won't be loaded 
multiply anyway. For example, one might have dependencies on two 
separate versions of BPL, or some sublibrary might link a copy of BPL 
in statically.

In fact, you can't even guarantee that there aren't multiple pythons 
running! One (nasty) way of implementing parallel python is to 
instantiate multiple pythons each with their own GIL and run them in 
separate threads. Of course, forking yourself is far saner.

In the end though, BPL is a *library*. You have absolutely no control 
over what you're combined with, but you can try your best for most 
reasonable scenarios.

I know this sounds tricky, but what you need is a design which copes 
with having one BPL loaded or many and/or one python loaded or many 
and/or one interpreter running or many. If you follow the system 
described above where each thread keeps a list of which BPL and 
python interpreter is "current", you now know which type registries 
to search in any given scenario.
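
A minimal single-binary sketch of that lookup rule, under the assumption that each loaded BPL or interpreter owns one registry object. The registry, registry_scope and find_converter names are hypothetical, and the converter is reduced to a string label to keep the example short.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <vector>

namespace sketch {

// One registry per loaded library instance or interpreter (hypothetical).
struct registry {
    std::map<std::type_index, std::string> converters;  // type -> converter label
};

// Per-thread list of the registries that are currently "entered".
inline std::vector<const registry*>& active_registries() {
    thread_local std::vector<const registry*> stack;
    return stack;
}

// Enters a registry for the lifetime of the scope object.
class registry_scope {
public:
    explicit registry_scope(const registry& r) { active_registries().push_back(&r); }
    ~registry_scope() {
        assert(!active_registries().empty());
        active_registries().pop_back();
    }
    registry_scope(const registry_scope&) = delete;
    registry_scope& operator=(const registry_scope&) = delete;
};

// Searches from the most recently entered registry outwards, so the
// innermost context wins and outer contexts act as fallbacks.
template <typename T>
const std::string* find_converter() {
    const auto& stack = active_registries();
    for (auto it = stack.rbegin(); it != stack.rend(); ++it) {
        auto hit = (*it)->converters.find(std::type_index(typeid(T)));
        if (hit != (*it)->converters.end()) return &hit->second;
    }
    return nullptr;  // no active registry knows this type
}

}  // namespace sketch
```

Two modules can then register different converters for the same type without clashing, since which one is found depends only on what the calling thread has entered.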

You can see most of an existing implementation of what I described 
above at:

https://github.com/ned14/tnfox/blob/master/Python/FXPython.h
https://github.com/ned14/tnfox/blob/master/Python/FXPython.cxx


And oh, BTW, here is a very useful piece of C++ metaprogramming for 
BPL:

https://github.com/ned14/tnfox/blob/master/Python/FXCodeToPythonCode.h

This lets you handle a limitation in present BPL where you want to 
supply one of a list of python functions as a C callback function 
e.g. a comparison function for sorting. The metaprogramming generates 
an N-member jump table and you supply a policy which thunks the C 
callback into Python. You can then install or deinstall python 
functions to the "slot" as it were and pass the appropriate C wrapper 
to the C callback function taking code.

In other words, the metaprogramming generates a unique C function 
address for each unique Python function (for the possibilities 
supplied). This is extremely useful.
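
A minimal sketch of the jump-table technique, written independently of FXCodeToPythonCode.h and requiring C++14 for std::index_sequence. A std::function stands in for the Python callable, and the names slot_count, thunk, install and uninstall are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <utility>

namespace sketch {

using c_compare = int (*)(const void*, const void*);              // what qsort expects
using compare_fn = std::function<int(const void*, const void*)>;  // stand-in for a Python callable

constexpr std::size_t slot_count = 4;  // N: how many callbacks can be live at once

inline std::array<compare_fn, slot_count>& slots() {
    static std::array<compare_fn, slot_count> s;
    return s;
}

// One distinct function per slot index, hence one distinct address per slot.
template <std::size_t I>
int thunk(const void* a, const void* b) {
    assert(slots()[I] && "slot is empty");
    return slots()[I](a, b);
}

template <std::size_t... Is>
std::array<c_compare, sizeof...(Is)> make_table(std::index_sequence<Is...>) {
    return {{ &thunk<Is>... }};
}

// The jump table: entry i is the C function bound to slot i.
inline const std::array<c_compare, slot_count>& table() {
    static const std::array<c_compare, slot_count> t =
        make_table(std::make_index_sequence<slot_count>{});
    return t;
}

// Puts f into the first free slot and returns the matching C function,
// or nullptr when every slot is taken.
inline c_compare install(compare_fn f) {
    for (std::size_t i = 0; i < slot_count; ++i) {
        if (!slots()[i]) { slots()[i] = std::move(f); return table()[i]; }
    }
    return nullptr;
}

// Frees the slot that belongs to the given C function.
inline void uninstall(c_compare fn) {
    for (std::size_t i = 0; i < slot_count; ++i) {
        if (table()[i] == fn) slots()[i] = nullptr;
    }
}

}  // namespace sketch
```

The pointer returned by install can be handed straight to an API such as std::qsort that accepts only a plain C function pointer with no user-data argument.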

Hope these help. If you have any questions, please do ask. I always 
felt it a shame I never had the time to port TnFOX extensions to BPL 
back into Boost, kinda wasted me writing it all as no one uses TnFOX 
:(

Niall

-- 
Technology & Consulting Services - ned Productions Limited.
http://www.nedproductions.biz/. VAT reg: IE 9708311Q. Company no: 
472909.



_______________________________________________
Cplusplus-sig mailing list
Cplusplus-sig@python.org
http://mail.python.org/mailman/listinfo/cplusplus-sig
