On Fri, Aug 19, 2016 at 11:19 AM, Jason Newton <nev...@gmail.com> wrote:
>
> On Fri, Aug 19, 2016 at 5:36 AM, Robert Bradshaw <rober...@gmail.com> wrote:
>>
>> On Thu, Aug 18, 2016 at 12:05 PM, Jason Newton <nev...@gmail.com> wrote:
>> > Accidentally posted to an already-opened tab for the cython-users ML yesterday, moving to here. Following up from a github opened issue here:
>> >
>> > https://github.com/cython/cython/issues/1440
>> >
>> > I was hoping we could give a way to drop straight into C/C++ inside of Cython pyx files.
>> >
>> > Why?
>> >
>> > -It [helps] avoid needing to declare every class/function in cython, a somewhat daunting/impossible task that I think everyone hates. Have you libraries like Eigen or others that use complex template based techniques? How about those with tons of [member] functions to boot or getting C++ inheritance involved.
>>
>> I agree that this is a pain, and better tooling should be developed to (mostly?) eliminate this (e.g. see the recent thread at https://groups.google.com/forum/#!topic/cython-users/c8ChI6jERzY ).
>>
>> Of course having symbols magically appear (e.g. due some #Include, who knows which) has its downsides too, which is why import * is often discouraged in Python too.
>>
>> > -It works around having the Cython compiler know about all of C++'s nuances - as an advanced C++ developer these are painful and it is a 2nd class citizen to Cython's simpler C-support - that's no good. Just right now I was bitten by yet another template argument bug and it's clear C++ template arguments have been kind of dicy since support appeared.
>>
>> Yes, C++ is an extraordinarily complicated language, and exposing all of that would add significant amounts of complexity to Cython itself, and perhaps more importantly increase the barrier of entry to reading Cython code.
>> One of the guiding principles is that we try to stay as close to Python as possible (if you know Python, you can probably read Cython, and with a minimal amount of C knowledge start writing it) and much of C++ simply isn't Pythonic.
>
> Maybe it's off topic but I debate the guiding principle of Cython - I was not able to comprehend Cython before reading tutorials and this is not my first time looking at it, I had a couple runins over the last 5 years with it on projects such as h5py easily got lost on what was going on between all the wrapper-wrappers (pure py code wrapping py-invokable code) and re-declarations.

In my experience Cython has generally been fairly easy to pick up for people who already know Python. And Python is often easy to pick up for people who already know C/C++. Of course, many wrappings also take non-trivial knowledge of the wrapped library itself, but typically at the same level as would be required to grok code written against that same library directly from C/C++.

Yes, there's bad code out there in any language (no offense meant towards h5py--I haven't looked at that project myself). Much of it is due to cargo-cult perpetuations or archaic (or simply flat-out-wrong) contortions due to historical limitations (e.g. creating a Python module to wrap an _extension module, avoiding all C++ features with extensive C wrappers, ...). (You're familiar with C++, so likely no stranger to this effect.)

> These projects complied with Cython's current philosophy to the degradation of clarity, context, and overall idea of how code was hooked up. Perhaps Cython should take the lessons learned from it's inception, time, and the results of the state of the c-python userbase to guide us into a new philosophy.

I fail to see how "staying close to Python" caused "degradation of clarity, context, etc." If anything, the lessons learned over time have validated this philosophy. More on this later.

>> Though, as you discovered, there are some basic things like non-type template arguments that we would like to have. If there are other specific C++ constructs that are commonly used but impossible to express in Cython it'd be useful to know.
>
> I haven't had the capability to use Cython sufficiently to learn more of them because it currently can't solve my problems. From prior SWIG et al experiences, my outlook is that it is treacherous path to walk and unless you brought in llvm/clang into the project for parsing/AST, I'd hold onto that outlook.

I agree that any effort to parse C++ without building on an actual compiler is fraught with danger. That's not the case with generating C++ code, which is the direction we're going. In particular, our goal is to understand C++ enough to invoke it, which allows us to be much less pedantic. A tool that automatically extracts definitions (or even wrappings, even partially) from C++ headers is another topic, and should almost certainly lean on an existing compiler.

>> > -It would allow single source files - I think this is important for runtime compiled quasi-JIT/AOC fragments, like OpenCL/PyOpenCL/PyCUDA provide
>>
>> Not quite following what you're saying here.
>
> Maybe PyInline was a better example off the bat for you, but something a little more flexible but also with less work is needed. Compare with PyOpenCL: https://documen.tician.de/pyopencl/ - check out some examples. There is a c runtime api between the contexts hooking things up (this is the OpenCL runtime part) - it's a pretty similar story to PyCuda (and by the same author, execpt for that project has to jump out to nvcc and cache kernel compilation like the inline function implementation does). There's no limit to the number of functions you can declare though and the OpenCL side is kept simple - things are generally pretty typesafe/do what you would expect on dispatch. PyInline for comparison looks like it might lean on the Python c-api for it's work more and maybe limited in the number of functions per snippet it can declare. I don't expect to be able to work with Numpy ndarray data easily with it.

The fact that you're essentially defining a kernel/ufunc with a well defined API in another language makes this a somewhat more natural and tractable case than freely sliding back and forth between C and Python data structures, function calls, etc. multiple times in a function's body.
(Not to say that these aren't quite sophisticated projects--they tackle more interesting difficulties elsewhere.)

>> > The idea is that Cython glue makes the playing field for extracting data easy, but that once it's extracted to a cdef variable for instance, cython doesn't need to know what happens. Maybe in a way sort of like the GCC asm extension. Hopefully simpler variable passing though.
>>
>> Cython uses "mangled" names (e.g. with a __pyx prefix) to avoid any possible conflicts. Specifying what/how to mangle could get as ugly as GCC's asm variable passing. And embedded variable declarations, let alone control flow statements (especially return, break, ...) could get really messy. It obscures analysis Cython can do on the code, such as whether variables are used or what values they may take. Little code snippets are not always local either, e.g. do they often need to refer to variables (or headers) referenced elsewhere. And they must all be mutually compatible.
>
> Like gcc's asm, let's let adults do what they want and let them worry about the consequences of flow control/stray includes. I'm not even sure how most of this would be an issue (switch/break/if) if you are properly nesting pyxd output. The only thing I think is an issue here is mangled names. I haven't yet figured out why (cdef) variable names must be mangled. Can you explain? Maybe we add an option to allow it to be unmangled in their declaration? C++ has extern "C" for example.

Name mangling is done for the standard reasons--to avoid possible conflicts with all other symbols that may be defined. E.g. we don't want things to suddenly break if I happen to create a variable called "PyNone." Or "__pyx_something_we_defined_implicitly." And of course we want to mangle globals, function names, etc. lest they conflict with some otherwise irrelevant symbol defined in some (possibly recursively) included header somewhere.
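
To make that kind of clash concrete, here is a minimal C++ sketch. It is not actual Cython output: the macro `count`, the function `increment_user_variable`, and the exact generated name `__pyx_v_count` are assumptions picked for illustration, loosely modeled on the `__pyx` prefix mentioned in the quoted text.

```cpp
// Illustrative sketch only: hypothetical names, not actual Cython output.
#include <cassert>

// Pretend a header included somewhere (perhaps indirectly) defines this.
#define count 0

// Emitting the user's variable name verbatim would produce
//     static long count = 41;   // preprocesses to "static long 0 = 41;"
// which does not compile. A prefixed name sidesteps the macro entirely.
static long __pyx_v_count = 41;

// Stand-in for a generated function body that bumps the user's variable.
long increment_user_variable() {
    assert(count == 0);      // the header's macro is untouched
    __pyx_v_count += 1;
    return __pyx_v_count;
}
```

An injected C++ snippet that referred to the user's variable as plain `count` would hit the macro instead, which is the sort of failure that can only be diagnosed by reading the generated code.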

Again, you could just say "Don't name things like that."

This exposes some more guiding principles. (1) If it's valid Python, it should be valid Cython and (2) we always try to produce valid C code--if you haven't lied to us (too much) about your external declarations, a successful Cython compilation results in a valid C/C++ output. Also (3) you shouldn't have to read or understand the generated C and the Python/C API to use, let alone debug, Cython (though you're welcome to if you want, much as Java developers occasionally read bytecode; understanding the implementation can sometimes help when chasing performance, in any language).

There's an obvious tension between giving users all the rope they want vs. providing an API that is possibly more restrictive, but inherently correct by construction. I'll concede that Cython necessarily has pointers, so I'll grant that there's plenty of room for foot-shooting (and better interfacing with modern C++ would be a good help there), but the kind of errors one runs into by injecting arbitrary code snippets take things to a whole new level (and specifically violate (3) when developing and debugging).

The escape hatch is to wrap the C++ in an actual C++ file and invoke the wrapping. Typically this is the minority of one's code--if it's the "whole library" then you probably have an API that's only understandable to someone well versed in C++ anyways. You've given a single example (non-type template arguments) that we would like to support that's blocking you.

> Why is allowing arbitrary code inside not a good idea? We're not talking something necessarily like eval here and the reputation it got,

Actually, I think it's a whole lot like eval. It's taking an opaque (string) chunk of data and executing it as code. But potentially worse as it's in a different language and evaluated in a transformed (even if the names were unmangled) context.
If we were to go this direction, I might go with a function call (like weave, maybe even follow it) rather than a new statement, as the latter is difficult to extend with the myriad of optional configuration parameters, etc. that would beg to follow.

> we're talking C/C++ glue and for the scale of Cython's development effort, proper C++ support is insurmountable simply because the language is that hard to work with on the compiler side. You can't just go and take something that is readily in the process of evolving (as newer standards actively come out), that very slowly achieves full compiler support in it's own domain, and pretend/aspire you can work like that though high level constructs in your language - this project and none before it have not accomplished anywhere near that and simply don't have the man-years/decades to contribute. I truly think it is irresponsible to think a) we should do nothing or b) delude ourselfs that we can make high level language level features to get around this.

As Ian said, we can declare the interface, not necessarily the implementation, and when things get so bad that you need to express them in C++, express them in direct C++ in a separate file and invoke the nicer interface from Cython.

C++ is evolving, but most (not all) of that is happening at the level of implementation details, and many of the API changes are in terms of syntactic sugar/new features that look a lot like old features. Cython only needs to concern itself with that which commonly affects a library's API.

As I said before, if there are specific missing features that block wrapping much of C++ code, it's worth looking at how they could fit into the language. We aim for a clean API, but also strive to be pragmatic. There is much room for improvement.

>> That's aside from the jarring nature of interleaving Python and C++ code. Would semicolons be required? How would parsers (including IDEs) handle the C++ snippets?
>> Or would they be placed in opaque strings/comments (which I'd rather avoid)?
>
> Opaque strings. It's a good and time tested solution to the issue. I'm very happy with it in the contexts I use it in.
>
>> Feels like you want PyInline or weave.inline, but at compile time.
>
> You must realize that almost any other python driven way to compile c-code in the spirit these projects do is deprecated/dead. Cython has absorbed all the reputation and users that didn't go to pure-c/boost.python - pybind11 is the new kid on the block there so I'm not including it (I'm of the opinion that SWIG users stayed unchanged). Community belief/QA/designers/google all think of Cython first. Weave has effectively closed up it's doors and I'm not even sure it had the power to do what I wanted anyway because Cython provides a language that eases the data-extraction/typecasting part of inlining C/C++.

You seem to be repeatedly bringing up the points

* Many (most?) of these string-based approaches are essentially dead, often pointing people to Cython instead, but
* Cython should adopt the string-embedding approach of these earlier projects.

You ask at the beginning of the email whether time has vindicated our philosophy. I think, based on the mindshare vs. these other attempts at integrating with C, in large part it has. It has served us and our users well; we will strive to stay close to Python.

Tight interleaving of multiple languages is cute for making a polyglot script, but I do not think it leads to legible code. An "eval_cpp" operator would be a lot like the builtin eval--it'd be really tempting to do the "quick and easy" hack of dropping in some executable string instead of thinking how to structure things such that that could be avoided, but putting in this effort leads to more comprehensible code. It's hard to say "no" to features, but I think such an introduction would fundamentally change Cython and how it's written for the worse.
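
For what it's worth, the "separate C++ file" escape hatch mentioned earlier might look roughly like the sketch below. Everything in it is hypothetical: `FixedVector` stands in for a library type with a non-type template argument, and `Vec3d`, `make_vec3d`, `vec3d_sum` and `wrapper_self_check` are made-up wrapper names, not part of any real library.

```cpp
// Hypothetical wrapper-header sketch; every name here is illustrative.
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>

// A library type using a non-type template argument (the size N), the
// kind of construct that is awkward to declare to Cython directly.
template <typename T, std::size_t N>
struct FixedVector {
    std::array<T, N> data{};
    T sum() const { return std::accumulate(data.begin(), data.end(), T{}); }
};

// Thin wrapper: one concrete instantiation plus plain, non-template
// functions. Only these would need to be declared on the Cython side.
using Vec3d = FixedVector<double, 3>;

inline Vec3d make_vec3d(double x, double y, double z) {
    Vec3d v;
    v.data[0] = x;
    v.data[1] = y;
    v.data[2] = z;
    return v;
}

inline double vec3d_sum(const Vec3d& v) { return v.sum(); }

// Small self-check showing the intended use from the calling side.
inline void wrapper_self_check() {
    assert(vec3d_sum(make_vec3d(1.0, 2.0, 3.5)) == 6.5);
}
```

Under these assumptions the template machinery stays in C++, and a `cdef extern from` block would only have to mention an opaque `Vec3d` type and the two free functions.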

- Robert