On Mon, Mar 3, 2014 at 7:20 PM, Julian Taylor
<jtaylor.deb...@googlemail.com> wrote:
> hi,
>
> as the numpy gsoc topic page is a little short on options I was thinking
> about adding two topics for interested students. But as I have no
> experience with gsoc or mentoring and the ideas are not very fleshed out
> yet I'd like to ask if it might make sense at all:
>
> 1. configurable algorithm precision
[...]
> with np.precmode(default="fast"):
>   np.abs(complex_array)
>
> or fast everything except sum and hypot
>
> with np.precmode(default="fast", sum="kahan", hypot="standard"):
>   np.sum(d)
[...]

Not a big fan of this one -- it seems like the bulk of the effort
would be in figuring out a non-horrible API for exposing these
things and getting consensus around it, which is not a good fit for
the SoC structure.
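
To make the shape of the quoted API concrete, here is a toy sketch in
plain Python. Every name in it (precmode, mode_sum, kahan_sum,
naive_sum) is a hypothetical stand-in; nothing like this exists in
numpy, and it glosses over exactly the design questions (scoping,
threading, which operations are configurable) that make the real thing
hard to get consensus on.

```python
import contextlib
import threading

# Hypothetical stand-in for the proposed np.precmode; not a numpy API.
_state = threading.local()


def _current_modes():
    # Fall back to a "standard" default when no override is active.
    return getattr(_state, "modes", {"default": "standard"})


@contextlib.contextmanager
def precmode(**modes):
    """Temporarily override the algorithm choice for named operations."""
    previous = _current_modes()
    _state.modes = {**previous, **modes}
    try:
        yield
    finally:
        _state.modes = previous


def mode_for(op):
    """Return the mode for one operation, or the active default."""
    modes = _current_modes()
    return modes.get(op, modes["default"])


def naive_sum(values):
    # Plain left-to-right accumulation, written out explicitly because
    # the built-in sum() may itself use compensated summation for floats.
    total = 0.0
    for v in values:
        total += v
    return total


def kahan_sum(values):
    # Kahan compensated summation: carry the rounding error forward.
    total = 0.0
    compensation = 0.0
    for v in values:
        y = v - compensation
        t = total + y
        compensation = (t - total) - y
        total = t
    return total


def mode_sum(values):
    """Dispatch on the currently active mode for "sum"."""
    if mode_for("sum") == "kahan":
        return kahan_sum(values)
    return naive_sum(values)
```

With this, "with precmode(default="fast", sum="kahan"): mode_sum(d)"
takes the compensated path while everything else sees "fast", and the
previous settings come back when the block exits.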

I'm pretty nervous about the datetime proposal that's currently on the
wiki, for similar reasons -- I'm not sure it's actually doable in the
SoC context.

> 2. vector math library integration

This is a great suggestion -- clear scope, clear benefit.

Two more ideas:

3. Using Cython in the numpy core

The numpy core contains tons of complicated C code implementing
elaborate operations like indexing, casting, ufunc dispatch, etc. It
would be really nice if we could use Cython to write some of these
things. However, there is a practical problem: Cython assumes that
each .pyx file generates a single compiled module with its own
Cython-defined API. Numpy, however, contains a large number of .c
files which are all compiled together into a single module, with its
own home-brewed system for defining the public API. And we can't
rewrite the whole thing. So for this to be viable, we would need some
way to compile a bunch of .c *and .pyx* files together into a single
module, and allow the .c and .pyx files to call each other. This might
involve changes to Cython, some sort of clever post-processing or glue
code to get existing Cython-generated source code to play nicely with
the rest of numpy, or something else.

So this project would have the following goals, depending on how
practical this turns out to be: (1) produce a hacky proof-of-concept
system for doing the above, (2) turn the hacky proof-of-concept into
something actually viable for use in real life (possibly this would
require getting changes upstream into Cython, etc.), (3) use this
system to actually port some interesting numpy code into Cython.

4. Pythonic dtypes

The current dtype system is klugey. It basically defines its own class
system, in parallel to Python's, and unsurprisingly, this new class
system is not as good. In particular, it has limitations around the
storage of instance-specific data which rule out a large variety of
interesting user-defined dtypes, and which force us into some truly
nasty hacks to support the built-in dtypes we do have. It also makes
defining a new dtype much more complicated than defining a new Python
class.

This project would be to implement a new dtype system for numpy, in
which np.dtype becomes a near-empty base class, different dtypes
(e.g., float64, float32) are simply different subclasses of np.dtype,
and dtype objects are simply instances of these classes. Further
enhancements would be to make it possible to define new dtypes in pure
Python by subclassing np.dtype and implementing special methods for
the various dtype operations, and to make it possible for ufunc loops
to see the dtype objects.
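
As a rough illustration of that class layout, here is a pure-Python
sketch. It does not use numpy at all: DType, Float64, Categorical,
store, and load are all hypothetical names, and the pack/unpack methods
stand in for whatever special methods a real design would settle on.
The point is only that each dtype is an ordinary subclass, and that a
dtype instance can carry its own data (here, the category labels).

```python
import struct


class DType:
    """Near-empty base class (hypothetical; not numpy.dtype)."""

    itemsize = None  # bytes per element, set by subclasses

    def pack(self, value):
        raise NotImplementedError

    def unpack(self, raw):
        raise NotImplementedError


class Float64(DType):
    itemsize = 8

    def pack(self, value):
        return struct.pack("<d", value)

    def unpack(self, raw):
        return struct.unpack("<d", raw)[0]


class Categorical(DType):
    """A dtype whose instances hold instance-specific data."""

    itemsize = 1

    def __init__(self, categories):
        self.categories = tuple(categories)

    def pack(self, value):
        # Store the index of the label as one unsigned byte.
        return struct.pack("<B", self.categories.index(value))

    def unpack(self, raw):
        return self.categories[struct.unpack("<B", raw)[0]]


def store(dtype, values):
    """Serialize values into a flat buffer using the dtype object."""
    return b"".join(dtype.pack(v) for v in values)


def load(dtype, buf):
    """Read values back out of a flat buffer using the dtype object."""
    n = dtype.itemsize
    return [dtype.unpack(buf[i:i + n]) for i in range(0, len(buf), n)]
```

Here store(Categorical(["red", "green", "blue"]), ["blue", "red"])
writes one byte per element, and load() with the same dtype instance
recovers the labels. The generic store/load code sees the dtype object
itself, which is the same idea as letting ufunc loops see it.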

This project would provide the key enabling piece for a wide variety
of interesting new features: missing value support, better handling of
strings and categorical data, unit handling, automatic
differentiation, and probably a bunch more I'm forgetting right now.

If we get someone who's up to handling the dtype thing then I can
mentor or co-mentor.

What do y'all think?

(I don't think I have access to update that wiki page -- or maybe I'm
just not clever enough to figure out how -- so it would be helpful if
someone who can, could?)

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
