On Mon, Mar 3, 2014 at 7:20 PM, Julian Taylor <jtaylor.deb...@googlemail.com> wrote:
> hi,
>
> as the numpy gsoc topic page is a little short on options I was thinking
> about adding two topics for interested students. But as I have no
> experience with gsoc or mentoring and the ideas are not very fleshed out
> yet I'd like to ask if it might make sense at all:
>
> 1. configurable algorithm precision
[...]
> with np.precmode(default="fast"):
>     np.abs(complex_array)
>
> or fast everything except sum and hypot
>
> with np.precmode(default="fast", sum="kahan", hypot="standard"):
>     np.sum(d)
[...]

Not a big fan of this one -- it seems like the biggest bulk of the
effort would be in figuring out a non-horrible API for exposing these
things and getting consensus around it, which is not a good fit to the
SoC structure.

I'm pretty nervous about the datetime proposal that's currently on the
wiki, for similar reasons -- I'm not sure it's actually doable in the
SoC context.

> 2. vector math library integration

This is a great suggestion -- clear scope, clear benefit.

Two more ideas:

3. Using Cython in the numpy core

The numpy core contains tons of complicated C code implementing
elaborate operations like indexing, casting, ufunc dispatch, etc. It
would be really nice if we could use Cython to write some of these
things. However, there is a practical problem: Cython assumes that
each .pyx file generates a single compiled module with its own
Cython-defined API. Numpy, however, contains a large number of .c
files which are all compiled together into a single module, with its
own home-brewed system for defining the public API. And we can't
rewrite the whole thing. So for this to be viable, we would need some
way to compile a bunch of .c *and .pyx* files together into a single
module, and allow the .c and .pyx files to call each other. This might
involve changes to Cython, some sort of clever post-processing or glue
code to get existing cython-generated source code to play nicely with
the rest of numpy, or something else.

So this project would have the following goals, depending on how
practical this turns out to be: (1) produce a hacky proof-of-concept
system for doing the above, (2) turn the hacky proof-of-concept into
something actually viable for use in real life (possibly this would
require getting changes upstream into Cython, etc.), (3) use this
system to actually port some interesting numpy code into cython.

4. Pythonic dtypes

The current dtype system is klugey.
It basically defines its own class system, in parallel to Python's,
and unsurprisingly, this new class system is not as good. In
particular, it has limitations around the storage of instance-specific
data which rule out a large variety of interesting user-defined
dtypes, and cause us to need some truly nasty hacks to support the
built-in dtypes we do have. And it makes defining a new dtype much
more complicated than defining a new Python class.

This project would be to implement a new dtype system for numpy, in
which np.dtype becomes a near-empty base class, different dtypes
(e.g., float64, float32) are simply different subclasses of np.dtype,
and dtype objects are simply instances of these classes. Further
enhancements would be to make it possible to define new dtypes in pure
Python by subclassing np.dtype and implementing special methods for
the various dtype operations, and to make it possible for ufunc loops
to see the dtype objects.

This project would provide the key enabling piece for a wide variety
of interesting new features: missing value support, better handling of
strings and categorical data, unit handling, automatic
differentiation, and probably a bunch more I'm forgetting right now.

If we get someone who's up to handling the dtype thing then I can
mentor or co-mentor.

What do y'all think?

(I don't think I have access to update that wiki page -- or maybe I'm
just not clever enough to figure out how -- so it would be helpful if
someone who can, could?)

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
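
For illustration only, the class layout described under item 4 could
be sketched in plain Python roughly as follows. Nothing here is
existing numpy API: DType, Float64, Float32, Categorical, and the
coerce() method are hypothetical names made up for this sketch, and
the real design would need many more operations than this.

```python
import struct


class DType:
    """Hypothetical near-empty base class; stands in for the proposed np.dtype."""

    itemsize = None  # bytes per element, set by subclasses

    def coerce(self, value):
        """Convert a Python object to this dtype's stored representation."""
        raise NotImplementedError

    def __repr__(self):
        return f"{type(self).__name__}()"


class Float64(DType):
    itemsize = 8

    def coerce(self, value):
        return float(value)


class Float32(DType):
    itemsize = 4

    def coerce(self, value):
        # Round-trip through a 4-byte float to mimic single-precision storage.
        return struct.unpack("f", struct.pack("f", float(value)))[0]


class Categorical(DType):
    """A user-defined dtype carrying instance-specific data (its categories)."""

    itemsize = 1  # assumption: at most 256 categories, stored as one-byte codes

    def __init__(self, categories):
        self.categories = tuple(categories)
        self._codes = {c: i for i, c in enumerate(self.categories)}

    def coerce(self, value):
        return self._codes[value]  # raises KeyError for unknown categories

    def __repr__(self):
        return f"Categorical({list(self.categories)!r})"


f32 = Float32()
colors = Categorical(["red", "green", "blue"])
print(isinstance(f32, DType), isinstance(colors, DType))  # True True
print(f32.coerce(0.1) == 0.1)  # False: 0.1 is not exactly representable in float32
print(colors.coerce("blue"))   # 2
```

The point of the sketch is that Categorical keeps its per-instance data
(the category list) in ordinary Python attributes, which is the kind
of thing the current dtype system makes difficult.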