On Sun, Feb 19, 2012 at 2:56 AM, David Cournapeau <courn...@gmail.com> wrote:

> Hi Mark,
>
> thank you for joining this discussion.
>
> On Sun, Feb 19, 2012 at 7:18 AM, Mark Wiebe <mwwi...@gmail.com> wrote:
> > The suggestion of transitioning the NumPy core code from C to C++ has
> > sparked a vigorous debate, and I thought I'd start a new thread to give my
> > perspective on some of the issues raised, and describe how such a transition
> > could occur.
> >
> > First, I'd like to reiterate the gcc rationale for their choice to switch:
> > http://gcc.gnu.org/wiki/gcc-in-cxx#Rationale
> >
> > In particular, these points deserve emphasis:
> >
> > The C subset of C++ is just as efficient as C.
> > C++ supports cleaner code in several significant cases.
> > C++ makes it easier to write cleaner interfaces by making it harder to break
> > interface boundaries.
> > C++ never requires uglier code.
>
> I think those arguments will not be very useful: they are subjective,
> and unlikely to convince people who prefer C to C++.

They are arguments from a team which implements both a C and a C++ compiler.
In the spectrum of possible authorities on the matter, they rate about as
high as I can imagine.

> > There are concerns about ABI/API interoperability and interactions with C++
> > exceptions. I've dealt with these types of issues on enough platforms to
> > know that while they're important, they're a lot easier to handle than the
> > issues with Fortran, BLAS, and LAPACK in SciPy. My experience has been that
> > providing a C API from a C++ library is no harder than providing a C API
> > from a C library.
>
> This needs more details. I have some experience in both areas as well,
> and mine is quite different. Reiterating a few examples that worry me:
> - how can you ensure that exceptions happening in C++ will never
> cross different .so/.dll ?

This is a necessary part of providing a C API, and is included as a
requirement of doing that. All C++ libraries which expose a C API deal with
this.
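
As a rough sketch of what that requirement tends to look like in practice,
each C-callable entry point wraps its body in a try/catch and translates any
exception into an error code. Every name below (example_status,
example_alloc_buffer, example_free_buffer, make_buffer) is hypothetical and
chosen only for illustration; none of it is NumPy API.

```cpp
// Hypothetical names throughout; this is an illustration, not NumPy code.
#include <cstddef>
#include <exception>
#include <new>
#include <stdexcept>
#include <vector>

// Error codes visible to C callers.
enum example_status {
    EXAMPLE_OK = 0,
    EXAMPLE_NO_MEMORY = 1,
    EXAMPLE_ERROR = 2
};

namespace {

// Internal C++ code is free to throw.
std::vector<double>* make_buffer(std::size_t n) {
    if (n == 0) {
        throw std::invalid_argument("buffer size must be positive");
    }
    return new std::vector<double>(n, 0.0);  // may throw std::bad_alloc
}

}  // namespace

// C-callable entry point: every exception is caught here and turned into
// a status code, so nothing propagates across the library boundary.
extern "C" int example_alloc_buffer(std::size_t n, void** out) {
    if (out == nullptr) {
        return EXAMPLE_ERROR;
    }
    *out = nullptr;
    try {
        *out = make_buffer(n);
        return EXAMPLE_OK;
    } catch (const std::bad_alloc&) {
        return EXAMPLE_NO_MEMORY;
    } catch (const std::exception&) {
        return EXAMPLE_ERROR;
    } catch (...) {
        return EXAMPLE_ERROR;
    }
}

// Matching release function, so allocation and deallocation both happen
// inside the same library.
extern "C" void example_free_buffer(void* buf) {
    delete static_cast<std::vector<double>*>(buf);
}
```

A C caller would only ever see the integer status and an opaque pointer. The
catch-all clause is a design choice in this sketch, so that exception types
not derived from std::exception are contained as well.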

> How can one make sure C++ extensions built
> by different compilers can work ?

This is no different from the situation in C. Already in C on Windows, one
can't build NumPy with a different version of Visual C++ than the one used
to build CPython.

> Is not using exceptions like it is
> done in zeromq acceptable ? (would be nice to find out more about the
> decisions made by the zeromq team about their usage of C++).

I prefer to use exceptions in C++, but some major projects have decided to
disable them. LLVM/Clang is the most notable example. My experience working
with high-performance graphics code has been that appropriate use of
exceptions (i.e. not doing something like using them for control flow) does
not pose a problem.

> I cannot
> find a recent example, but I have seen errors similar to
> this(http://software.intel.com/en-us/forums/showthread.php?t=42940)
> quite a few times.

This kind of thing would happen when using 'new' to allocate memory, and
with the compiler setting enabled to raise bad_alloc on such allocation
failures (the default for most compilers nowadays). If exception handling is
disabled in the compiler, new will return NULL instead. Unless the compiler
has a bizarre issue, catching either std::exception or std::bad_alloc
specifically within NumPy should be sufficient to deal with it. Also note
that the possibility of something like this will only arise once more
advanced C++ features are being adopted.

> - how can you expose in C some heavily-using C++ features ?

If the advantages of those C++ features depend on the C++ language, you have
to map them to a limited subset of the feature in C. For example, if a
feature is based on a C++ template, you can instantiate specific instances
of the template for all the types you want to support from C.

> I would
> expect you would like to use templates for iterators in numpy - you
> can you make them available to 3rd party extensions without requiring
> C++.
>

Yes, something like the nditer is a good example. From C, it would have to
retain an API in the current style, but C++ users could gain an
easier-to-use variant.

> > It's worth comparing the possibility of C++ versus the possibility of other
> > languages, and the ones that have been suggested for consideration are D,
> > Cython, Rust, Fortran 2003, Go, RPython, C# and Java. The target language
> > has to interact naturally with the CPython API. It needs to provide direct
> > access to all the various sizes of signed int, unsigned int, and float. It
> > needs to have mature compiler support wherever we want to deploy NumPy.
> > Taken together, these requirements eliminate a majority of these
> > possibilities. From these criteria, the only languages which seem to have a
> > clear possibility for the implementation of Numpy are C, C++, and D. For D,
> > I suspect the tooling is not mature enough, but I'm not 100% certain of
> > that.
>
> While I agree that no other language is realistic, staying in C has
> the nice advantage that we can more easily use one of them if they
> mature (rust/D - go, rpython, C#/java can be dismissed for fundamental
> technical reasons right away). This is not a very strong argument
> against using C++, obviously.

To provide a counterpoint to this argument, switching to C++ could actually
make a transition to another language easier. C++ classes and templates map
to equivalent features in D quite naturally, to provide a specific example.

> > 1) Immediately after branching for 1.7, we minimally patch all the .c files
> > so that they can build with a C++ compiler and with a C compiler at the same
> > time. Then we rename all .c -> .cpp, and update the build systems for C++.
> > 2) During the 1.8 development cycle, we heavily restrict C++ feature usage.
> > But, where a feature implementation would be arguably easier and less
> > error-prone with C++, we allow it.
> > This is a period for learning about C++
> > and how it can benefit NumPy.
> > 3) After the 1.8 release, the community will have developed more experience
> > with C++, and will be in a better position to discuss a way forward.
>
> A step that would be useful sooner rather than later is one where
> numpy has been split into smaller extensions (instead of
> multiarray/ufunc, essentially). This would help avoiding recompilation
> of lots of code for any small change. It is already quite painful with
> C, but with C++, it will be unbearable. This can be done in C, and
> would be useful whether the decision to move to C++ is accepted or
> not.

I'm pretty confident that the current code will compile in C++ in nearly
identical time to C. Having a properly working incremental build system
would be a nice step to take numpy builds out of the dark ages, though. Your
tireless efforts to make this happen are appreciated!

-Mark

> cheers,
>
> David
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion