On 1 September 2013 18:11, Stefan Behnel <stefan...@behnel.de> wrote:
> Nick Coghlan, 01.09.2013 03:28:
>> On 1 Sep 2013 05:18, "Stefan Behnel" wrote:
>>> I can't really remember a case where I could afford the
>>> runtime overhead of implementing a wrapper in Python and going through
>>> something like ctypes or cffi. I mean, testing C libraries with Python
>>> tools would be one, but then, you wouldn't want to write an extension
>>> module for that and instead want to call it directly from the test code as
>>> directly as possible.
>>>
>>> I'm certainly aware that that use case exists, though, and also the case
>>> of just wanting to get things done as quickly and easily as possible.
>>
>> Keep in mind I first came to Python as a tool for test automation of custom
>> C++ hardware APIs that could be written to be SWIG friendly.
>
> Interesting again. Would you still do it that way? I recently had a
> discussion with Holger Krekel of py.test fame about testing C code with
> Cython, and we quickly agreed that wrapping the code in an extension module
> was both too cumbersome and too inflexible for testing purposes.
> Specifically, none of Cython's top selling points fits here: not speed,
> not clarity, not API design. It's most likely different for SWIG, which
> involves less (not no, just less) manual work and gives you API-wise more
> or less exactly what you put in. However, cffi is almost certainly the
> better way to do it, because it gives you all sorts of flexibility for your
> test code without having to think about the wrapper design all the time.
>
> The situation is also different for C++ where you have fewer options for
> wrapping it. I can imagine SWIG still being the tool of choice on that
> front when it comes to bare and direct testing of large code bases.

To directly wrap C++, I'd still use SWIG. It makes a huge difference
when you can tweak the C++ side of the API to be SWIG friendly rather
than having to live with whatever a third party C++ library provides.
Having classes in C++ map directly to classes in Python is the main
benefit of doing it this way over using a C wrapper and cffi.

However, for an existing C API, or a custom API where I didn't need
the direct object mapping that C++ can provide, using cffi would be a
more attractive option than SWIG these days (the stuff I was doing
with SWIG was back around 2003 or so).

I think this is getting a little off topic for the list, though :)

>> I now work for an OS vendor where the 3 common languages for system
>> utilities are C, C++ and Python.
>>
>> For those use cases, dropping a bunch of standard Python objects in a
>> module dict is often going to be a quick and easy solution that avoids a
>> lot of nasty pointer lifecycle issues at the C level.
>
> That's yet another use case, BTW. When you control the whole application,
> then safety doesn't really matter at these points and keeping a bunch of
> stuff in a dict will usually work just fine. I'm mainly used to writing
> libraries for (sometimes tons of) other people, in which case the
> requirements are so diverse on user side that safety is a top thing to care
> about. Anything you can keep inside of C code should stay there.
> (Especially when dealing with libxml2&friends in lxml which continuously
> present their 'interesting' usability characteristics.)

I don't think it's a coincidence that it was the etree interface with
expat that highlighted the deficiencies of the current extension
module hooks when it comes to working properly with
test.support.import_fresh_module :)

>> * PEP 3121 with a size of "0". As above, but avoids the module state APIs
>> in order to support reloading. All module state (including type
>> cross-references) is stored in hidden state (e.g. an instance of a custom
>> type not exposed to Python, with a reference stored on each custom type
>> object defined in the module, and any module level "functions" actually
>> being methods of a hidden object).
>
> Thanks for elaborating. I had completely failed to make the mental link
> that you could simply stick bound methods as functions into the module
> dict, i.e. that they don't even have to be methods of the module itself.
> That's something that Cython could already use in older CPythons, even as a
> preparation for any future import protocol changes. The object that they
> are methods of would then eventually become the module instance.
>
> You'd still suffer a slight performance hit from going from a static global
> C variable to a pointer indirection - for everything: string constants,
> cached Python objects, all user defined global C variables would have to go
> there as Cython cannot know if they are module instance specific state or
> not (they usually will be, I guess). But that has to be done anyway if the
> goal is to get rid of static state to enable sub-interpreters. I can't wait
> to see lxml run threaded in mod_wsgi... ;-)

To be honest, I didn't realise that such a trick might already be
possible until I was writing down this list of alternatives.
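
For illustration, here is a rough pure-Python analogy of that trick. It
is only a sketch: the real thing would live in C or Cython, and
_HiddenState, make_module and bump are made-up names, not part of any
existing API.

```python
import types


class _HiddenState:
    """Hypothetical per-module-instance state; never exposed in the module dict."""

    def __init__(self, name):
        self.name = name
        self.counter = 0

    def bump(self):
        # Looks like a module-level function to callers, but it is a bound
        # method, so it finds its own state without touching any globals.
        self.counter += 1
        return self.counter


def make_module(name):
    """Build a fresh module whose "functions" are bound methods of hidden state."""
    state = _HiddenState(name)
    module = types.ModuleType(name)
    module.bump = state.bump  # the bound method keeps `state` alive
    return module


# Two independent copies of the "same" module, each with its own state.
first = make_module("demo")
second = make_module("demo")
first.bump()
first.bump()
second.bump()
print(first.bump.__self__.counter, second.bump.__self__.counter)  # 2 1
```

Each copy carries its own state object, so nothing is shared through
static globals.
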
If you manage to turn it into a real solution for lxml (or Cython in
general), it would be great to hear more about how you turned the
general idea into something real :)

That means the powers any new extension initialisation API will offer
will be limited to:

* letting the module know its own name (and other details)
* letting the module explicitly block reloading
* letting the module support loading multiple copies at once by taking
  the initial import out of sys.modules (but keeping a separate
  reference to it alive)

<snip>

>>> As soon as you have more than one extension type in your module, and they
>>> interact with each other, they will almost certainly have to do type checks
>>> against each other to make sure users haven't passed them rubbish before
>>> they access any C struct fields of the object. Doing a type check means
>>> that at least one type has a pointer to the other, meaning that it holds
>>> global module state.
>>
>> Sure, but you can use the CPython API rather than writing normal C code. We
>> do this fairly often in CPython when we're dealing with things stored in
>> modules that can be manipulated from Python.
>>
>> It incurs CPython's dynamic dispatch overhead, but sometimes that's worth
>> it to avoid needing to deal with C level lifecycle issues.
>
> Not so much of a problem in Cython, because all you usually have to do to
> get fast C level access to something is to change a "def" into a "cdef"
> somewhere, or add a decorator, or an assignment to a known extension type
> variable. Once the module global state is 'virtualised', this will also be
> a safe thing to do in the face of multiple module instances, and still be
> much faster than going through Python calls.

It's kinda cool how designing a next generation API can sometimes
reveal hidden possibilities of an *existing* API :)

We had another one of those recently on distutils-sig, when I realised
the much-maligned .pth files are actually a decent solution to
sharing distributions between virtual environments. I'm so used to
disliking their global side effects when used with the system Python
that it took me a long time to recognise the validity of using them to
make implicit path additions in a more controlled virtual environment
:)

In terms of where we go from here - do you mind if I use your pre-PEP
as the initial basis for a PEP of my own some time in the next week or
two (listing you as co-author)? Improving extension module
initialisation has been the driver for most of the PEP 451 feedback
I've been giving to Eric over on import-sig, so I have some definite
ideas on how I think that API should look :)

Cheers,
Nick.

-- 
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia