Re: [Numpy-discussion] replacing the mechanism for dispatching ufuncs
What is the status of this proposal?

On Wed, Jun 22, 2011 at 6:56 PM, Mark Wiebe mwwi...@gmail.com wrote:
On Wed, Jun 22, 2011 at 4:57 PM, Darren Dale dsdal...@gmail.com wrote:
On Wed, Jun 22, 2011 at 1:31 PM, Mark Wiebe mwwi...@gmail.com wrote:
On Wed, Jun 22, 2011 at 7:34 AM, Lluís xscr...@gmx.net wrote:
Darren Dale writes:
On Tue, Jun 21, 2011 at 1:57 PM, Mark Wiebe mwwi...@gmail.com wrote:
On Tue, Jun 21, 2011 at 12:36 PM, Charles R Harris charlesr.har...@gmail.com wrote:

How does the ufunc get called so it doesn't get caught in an endless loop? [...]

The function being called needs to ensure this, either by extracting a raw ndarray from instances of its class, or adding a 'subok = False' parameter to the kwargs.

I didn't understand, could you please expand on that or show an example?

As I understood the initial description and examples, the ufunc overload will keep being used as long as its arguments are of classes that declare ufunc overrides (i.e., classes with the _numpy_ufunc_ attribute). Hence Mark's comment that you have to either transform the arguments into raw ndarrays (either by creating new ones or passing a view) or use the subok=False kwarg to break a possible overloading loop. The sequence of events is something like this:

1. You call np.sin(x)
2. The np.sin ufunc looks at x, sees the _numpy_ufunc_ attribute, and calls x._numpy_ufunc_(np.sin, x)
3. _numpy_ufunc_ uses np.sin.__name__ (which is "sin") to get the correct my_sin function to call
4A. If my_sin called np.sin(x), we would go back to 1. and get an infinite loop
4B. If x is a subclass of ndarray, my_sin can call np.sin(x, subok=False), as this disables the subclass overloading mechanism.
4C. If x is not a subclass of ndarray, x needs to produce an ndarray, for instance it might have an x.arr property. Then it can call np.sin(x.arr)

Ok, that seems straightforward and, for what it's worth, it looks like it would meet my needs.
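The dispatch sequence sketched in steps 1-4 above can be mocked in pure Python. This is only an illustrative sketch: the UFunc class, the Quantity class, and the subok handling below are stand-ins for the proposal, not NumPy's actual machinery.

```python
import math

class UFunc:
    """Stand-in for a ufunc that honors a _numpy_ufunc_ override."""
    def __init__(self, name, default):
        self.__name__ = name
        self._default = default  # the raw implementation

    def __call__(self, x, subok=True):
        # Step 2: if the argument declares an override and subok is not
        # disabled, hand control to the object's _numpy_ufunc_ hook.
        if subok and hasattr(x, '_numpy_ufunc_'):
            return x._numpy_ufunc_(self, x)
        return self._default(x)

np_sin = UFunc('sin', math.sin)

class Quantity(float):
    """Stand-in for a class opting into ufunc overloading."""
    _overrides = {}

    def _numpy_ufunc_(self, ufunc, x):
        # Step 3: look up the override by the ufunc's name.
        return self._overrides[ufunc.__name__](ufunc, x)

def my_sin(ufunc, q):
    # Step 4B: call back into the ufunc with subok=False, so we do not
    # re-enter the override (the infinite loop of step 4A).
    return Quantity(ufunc(float(q), subok=False))

Quantity._overrides['sin'] = my_sin

result = np_sin(Quantity(0.0))
print(type(result).__name__, result)  # Quantity 0.0
```

The subok=False call in my_sin is the loop-breaker the thread discusses: without it, np_sin would see the _numpy_ufunc_ attribute again and recurse forever.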
However, I wonder if the _numpy_ufunc_ mechanism is the best approach. This is a bit sketchy, but why not do something like:

    class Proxy:
        def __init__(self, ufunc, *args):
            self._ufunc = ufunc
            self._args = args
        def __call__(self, func):
            self._ufunc._registry[tuple(type(arg) for arg in self._args)] = func
            return func

    class UfuncObject:
        ...
        def __call__(self, *args, **kwargs):
            func = self._registry.get(tuple(type(arg) for arg in args), None)
            if func is None:
                raise TypeError
            return func(*args, **kwargs)
        def register(self, *args):
            return Proxy(self, *args)

    @np.sin.register(Quantity)
    def sin(pq):
        if pq.units != degrees:
            pq = pq.rescale(degrees)
        temp = np.sin(pq.view(np.ndarray))
        return Quantity(temp, copy=False)

This way, classes don't have to implement special methods to support ufunc registration, special attributes to claim primacy in ufunc registration lookup, special versions of the functions for each numpy ufunc, *and* the logic to determine whether the combination of arguments is supported. By that I mean: if I call np.sum with a quantity and a masked array, and Quantity wins the __array_priority__ competition, then I also need to check that my special sum function(s) know how to operate on that combination of inputs. With the decorator approach, I just need to implement the special versions of the ufuncs, and the decorators handle the logic of knowing what combinations of arguments are supported. It might be worth considering using ABCs for registration and having UfuncObject use isinstance to determine the appropriate special function to call.

The thing I'm not sure about with this idea is how to do something general to modify all ufuncs for a particular class in one swoop. Having to individually override everything would be a hassle, I think.
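The isinstance/ABC variant suggested at the end of that message can be sketched in pure Python: instead of an exact type-tuple lookup, the registry is scanned with isinstance so ABCs and subclasses match. All names here are illustrative stand-ins, not an actual NumPy API.

```python
import math
import numbers

class UfuncObject:
    """Generic-function stand-in with isinstance-based dispatch."""
    def __init__(self, default):
        self._default = default
        self._registry = []  # list of (type_tuple, func)

    def register(self, *types):
        def decorator(func):
            self._registry.append((types, func))
            return func
        return decorator

    def __call__(self, *args):
        # Scan for the first registered signature matching via isinstance,
        # so ABC registrations like numbers.Integral work.
        for types, func in self._registry:
            if len(types) == len(args) and all(
                    isinstance(a, t) for a, t in zip(args, types)):
                return func(*args)
        return self._default(*args)

sin = UfuncObject(math.sin)

@sin.register(numbers.Integral)  # matches int through the numbers ABC
def sin_int(n):
    return ('integral', math.sin(n))

print(sin(0))    # ('integral', 0.0)
print(sin(0.0))  # 0.0, falls back to the default
```

A linear scan keeps the sketch short; a real implementation would need a resolution order for overlapping registrations, which is exactly the kind of detail the thread leaves open.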
I also don't think this approach saves very much effort compared to the _numpy_ufunc_ call approach: checking the types and raising NotImplemented if they aren't what's wanted is pretty much the only difference, and that check could still be handled by a decorator like you're showing.

-Mark

Darren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] X11 system info
On Wed, Jul 20, 2011 at 4:58 AM, Pauli Virtanen p...@iki.fi wrote:
Tue, 19 Jul 2011 21:55:28 +0200, Ralf Gommers wrote:
On Sun, Jul 17, 2011 at 11:48 PM, Darren Dale dsdal...@gmail.com wrote:

In numpy.distutils.system_info:

    default_x11_lib_dirs = libpaths(['/usr/X11R6/lib', '/usr/X11/lib', '/usr/lib'], platform_bits)
    default_x11_include_dirs = ['/usr/X11R6/include', '/usr/X11/include', '/usr/include']

These defaults won't work on the forthcoming Ubuntu 11.10, which installs X into /usr/lib/X11 and /usr/include/X11.

Did you check that some compilation fails because of this? If not, how did you find the information that the location is changed?

I discovered the problem when I tried to build the entire Enthought Tool Suite from source on a Kubuntu 11.10 pre-release system. Even after changing the paths to point at the right location, there are other problems, as seen from this traceback for building Enable:

/usr/lib/pymodules/python2.7/numpy/distutils/system_info.py:525: UserWarning: Specified path /usr/local/include/python2.7 is invalid.
  warnings.warn('Specified path %s is invalid.' % d)
/usr/lib/pymodules/python2.7/numpy/distutils/system_info.py:525: UserWarning: Specified path /usr/include/suitesparse/python2.7 is invalid.
  warnings.warn('Specified path %s is invalid.' % d)
/usr/lib/pymodules/python2.7/numpy/distutils/system_info.py:525: UserWarning: Specified path is invalid.
  warnings.warn('Specified path %s is invalid.' % d)
/usr/lib/pymodules/python2.7/numpy/distutils/system_info.py:525: UserWarning: Specified path /usr/lib/X1164 is invalid.
  warnings.warn('Specified path %s is invalid.' % d)
Traceback (most recent call last):
  File "setup.py", line 56, in <module>
    config = configuration().todict()
  File "setup.py", line 48, in configuration
    config.add_subpackage('kiva')
  File "/usr/lib/pymodules/python2.7/numpy/distutils/misc_util.py", line 972, in add_subpackage
    caller_level = 2)
  File "/usr/lib/pymodules/python2.7/numpy/distutils/misc_util.py", line 941, in get_subpackage
    caller_level = caller_level + 1)
  File "/usr/lib/pymodules/python2.7/numpy/distutils/misc_util.py", line 878, in _get_configuration_from_setup_py
    config = setup_module.configuration(*args)
  File "kiva/setup.py", line 27, in configuration
    config.add_subpackage('agg')
  File "/usr/lib/pymodules/python2.7/numpy/distutils/misc_util.py", line 972, in add_subpackage
    caller_level = 2)
  File "/usr/lib/pymodules/python2.7/numpy/distutils/misc_util.py", line 941, in get_subpackage
    caller_level = caller_level + 1)
  File "/usr/lib/pymodules/python2.7/numpy/distutils/misc_util.py", line 878, in _get_configuration_from_setup_py
    config = setup_module.configuration(*args)
  File "kiva/agg/setup.py", line 235, in configuration
    x11_info = get_info('x11', notfound_action=2)
  File "/usr/lib/pymodules/python2.7/numpy/distutils/system_info.py", line 308, in get_info
    return cl().get_info(notfound_action)
  File "/usr/lib/pymodules/python2.7/numpy/distutils/system_info.py", line 459, in get_info
    raise self.notfounderror(self.notfounderror.__doc__)
numpy.distutils.system_info.X11NotFoundError: X11 libraries not found.
[Numpy-discussion] X11 system info
In numpy.distutils.system_info:

    default_x11_lib_dirs = libpaths(['/usr/X11R6/lib', '/usr/X11/lib', '/usr/lib'], platform_bits)
    default_x11_include_dirs = ['/usr/X11R6/include', '/usr/X11/include', '/usr/include']

These defaults won't work on the forthcoming Ubuntu 11.10, which installs X into /usr/lib/X11 and /usr/include/X11.

Darren
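Until the defaults are fixed, a site.cfg override is one workaround. This is a sketch, not a tested configuration: the [x11] section name matches numpy.distutils's x11_info class, but the paths below are just the Ubuntu 11.10 locations mentioned above and should be checked against your system.

```ini
; site.cfg, placed next to numpy's setup.py (or in ~/.numpy-site.cfg)
[x11]
library_dirs = /usr/lib/X11
include_dirs = /usr/include/X11
```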
Re: [Numpy-discussion] replacing the mechanism for dispatching ufuncs
On Tue, Jun 21, 2011 at 1:57 PM, Mark Wiebe mwwi...@gmail.com wrote:
On Tue, Jun 21, 2011 at 12:36 PM, Charles R Harris charlesr.har...@gmail.com wrote:

How does the ufunc get called so it doesn't get caught in an endless loop? [...]

The function being called needs to ensure this, either by extracting a raw ndarray from instances of its class, or adding a 'subok = False' parameter to the kwargs.

I didn't understand, could you please expand on that or show an example?

Darren
Re: [Numpy-discussion] replacing the mechanism for dispatching ufuncs
On Wed, Jun 22, 2011 at 1:31 PM, Mark Wiebe mwwi...@gmail.com wrote:
On Wed, Jun 22, 2011 at 7:34 AM, Lluís xscr...@gmx.net wrote:
Darren Dale writes:
On Tue, Jun 21, 2011 at 1:57 PM, Mark Wiebe mwwi...@gmail.com wrote:
On Tue, Jun 21, 2011 at 12:36 PM, Charles R Harris charlesr.har...@gmail.com wrote:

How does the ufunc get called so it doesn't get caught in an endless loop? [...]

The function being called needs to ensure this, either by extracting a raw ndarray from instances of its class, or adding a 'subok = False' parameter to the kwargs.

I didn't understand, could you please expand on that or show an example?

As I understood the initial description and examples, the ufunc overload will keep being used as long as its arguments are of classes that declare ufunc overrides (i.e., classes with the _numpy_ufunc_ attribute). Hence Mark's comment that you have to either transform the arguments into raw ndarrays (either by creating new ones or passing a view) or use the subok=False kwarg to break a possible overloading loop. The sequence of events is something like this:

1. You call np.sin(x)
2. The np.sin ufunc looks at x, sees the _numpy_ufunc_ attribute, and calls x._numpy_ufunc_(np.sin, x)
3. _numpy_ufunc_ uses np.sin.__name__ (which is "sin") to get the correct my_sin function to call
4A. If my_sin called np.sin(x), we would go back to 1. and get an infinite loop
4B. If x is a subclass of ndarray, my_sin can call np.sin(x, subok=False), as this disables the subclass overloading mechanism.
4C. If x is not a subclass of ndarray, x needs to produce an ndarray, for instance it might have an x.arr property. Then it can call np.sin(x.arr)

Ok, that seems straightforward and, for what it's worth, it looks like it would meet my needs. However, I wonder if the _numpy_ufunc_ mechanism is the best approach.
This is a bit sketchy, but why not do something like:

    class Proxy:
        def __init__(self, ufunc, *args):
            self._ufunc = ufunc
            self._args = args
        def __call__(self, func):
            self._ufunc._registry[tuple(type(arg) for arg in self._args)] = func
            return func

    class UfuncObject:
        ...
        def __call__(self, *args, **kwargs):
            func = self._registry.get(tuple(type(arg) for arg in args), None)
            if func is None:
                raise TypeError
            return func(*args, **kwargs)
        def register(self, *args):
            return Proxy(self, *args)

    @np.sin.register(Quantity)
    def sin(pq):
        if pq.units != degrees:
            pq = pq.rescale(degrees)
        temp = np.sin(pq.view(np.ndarray))
        return Quantity(temp, copy=False)

This way, classes don't have to implement special methods to support ufunc registration, special attributes to claim primacy in ufunc registration lookup, special versions of the functions for each numpy ufunc, *and* the logic to determine whether the combination of arguments is supported. By that I mean: if I call np.sum with a quantity and a masked array, and Quantity wins the __array_priority__ competition, then I also need to check that my special sum function(s) know how to operate on that combination of inputs. With the decorator approach, I just need to implement the special versions of the ufuncs, and the decorators handle the logic of knowing what combinations of arguments are supported. It might be worth considering using ABCs for registration and having UfuncObject use isinstance to determine the appropriate special function to call.

Darren
Re: [Numpy-discussion] replacing the mechanism for dispatching ufuncs
On Tue, Jun 21, 2011 at 2:28 PM, Charles R Harris charlesr.har...@gmail.com wrote:
On Tue, Jun 21, 2011 at 11:57 AM, Mark Wiebe mwwi...@gmail.com wrote:
On Tue, Jun 21, 2011 at 12:36 PM, Charles R Harris charlesr.har...@gmail.com wrote:
On Mon, Jun 20, 2011 at 12:32 PM, Mark Wiebe mwwi...@gmail.com wrote:

NumPy has a mechanism built in to allow subclasses to adjust or override aspects of the ufunc behavior. While this goal is important, this mechanism only allows for very limited customization, making for instance the masked arrays unable to work with the native ufuncs in a full and proper way. I would like to deprecate the current mechanism, in particular __array_prepare__ and __array_wrap__, and introduce a new method I will describe below. If you've ever used these mechanisms, please review this design to see if it meets your needs.

The current approach is at a dead end, so something better needs to be done.

Any class type which would like to override its behavior in ufuncs would define a method called _numpy_ufunc_, and optionally an attribute __array_priority__ as can already be done. The class which wins the priority battle gets its _numpy_ufunc_ function called as follows:

    return arr._numpy_ufunc_(current_ufunc, *args, **kwargs)

To support this overloading, the ufunc would get a new support method, result_type, and there would be a new global function, broadcast_empty_like. The method ufunc.result_type behaves like the global np.result_type, but produces the output type or a tuple of output types specific to the ufunc, which may follow a different convention than regular arithmetic type promotion. This allows a class to create an output array of the correct type to pass to the ufunc if it needs to be different than the default. The function broadcast_empty_like is just like empty_like, but takes a list or tuple of arrays which are to be broadcast together for producing the output, instead of just one.
How does the ufunc get called so it doesn't get caught in an endless loop? I like the proposed method if it can also be used for classes that don't subclass ndarray. Masked array, for instance, should probably not subclass ndarray.

The function being called needs to ensure this, either by extracting a raw ndarray from instances of its class, or adding a 'subok = False' parameter to the kwargs. Supporting objects that aren't ndarray subclasses is one of the purposes for this approach, and neither of my two example cases subclassed ndarray.

Sounds good. Many of the current uses of __array_wrap__ that I am aware of are in the wrappers in the linalg module and don't go through the ufunc machinery. How would that be handled?

I contributed the __array_prepare__ method a while back so classes could raise errors before the array data is modified in place. Specifically, I was concerned about units support in my quantities package (http://pypi.python.org/pypi/quantities). But I agree that this approach needs to be reconsidered. It would be nice for subclasses to have an opportunity to intercept and process the values passed to a ufunc on their way in. For example, it would be nice if when I did np.cos(1.5 degrees), my subclass could intercept the value and pass a new one on to the ufunc machinery that is expressed in radians. I thought PJ Eby's generic functions PEP would be a really good way to handle ufuncs, but the PEP has stagnated.

Darren
[Numpy-discussion] python3 setup.py install fails with git maint/1.6.x
I just checked out the 1.6 branch and attempted to install with python3:

RefactoringTool: Line 695: You should use a for loop here
Running from numpy source directory.
Traceback (most recent call last):
  File "setup.py", line 196, in <module>
    setup_package()
  File "setup.py", line 170, in setup_package
    write_version_py()
  File "setup.py", line 117, in write_version_py
    from numpy.version import git_revision as GIT_REVISION
ImportError: cannot import name git_revision

Next, I built and installed with python2, with no problems. Then I attempted to install with python3 again, at which point git_revision was importable, presumably because it was provided during the python2 build.

Darren
Re: [Numpy-discussion] python3 setup.py install fails with git maint/1.6.x
On Mon, Apr 4, 2011 at 3:31 PM, Ralf Gommers ralf.gomm...@googlemail.com wrote:
On Mon, Apr 4, 2011 at 9:15 PM, Darren Dale dsdal...@gmail.com wrote:

I just checked out the 1.6 branch and attempted to install with python3:

I hope you mean the 1.6.0b1 tarball, not the current branch head? This problem is (or should have been) fixed.

It was the branch HEAD... Just tried again with python3.2 and 1.6.0b2, and it installs fine.

The line it fails on is only reached when a numpy/version.py exists, which is the case for source releases or if you did not clean your local git repo before building.

... but I think this was the case. I just deleted numpy/version.py and build/, and now everything is ok.

Darren
[Numpy-discussion] nearing a milestone
Numpy is nearing a milestone: http://sourceforge.net/projects/numpy/files/NumPy/stats/timeline?dates=2007-09-25+to+2011-04-01
Re: [Numpy-discussion] nearing a milestone
On Fri, Apr 1, 2011 at 4:08 PM, Benjamin Root ben.r...@ou.edu wrote:

Whoosh!

Ben Root

P.S. -- In case it needs to be said, that is 1e6 downloads from sourceforge only. NumPy is now on github...

The releases are still distributed through sourceforge. Maybe the SciPy2011 Program Committee could prepare a mounted gold record to mark the occasion. I could probably dig up and donate an old copy of the Footloose soundtrack.

Darren
[Numpy-discussion] When was the ddof kwarg added to std()?
Does anyone know when the ddof kwarg was added to std()? Has it always been there?

Thanks,
Darren
Re: [Numpy-discussion] When was the ddof kwarg added to std()?
On Wed, Mar 16, 2011 at 9:10 AM, Scott Sinclair scott.sinclair...@gmail.com wrote:
On 16 March 2011 14:52, Darren Dale dsdal...@gmail.com wrote:

Does anyone know when the ddof kwarg was added to std()? Has it always been there?

Does 'git log --grep=ddof' help?

Yes: March 7, 2008. Thanks
Re: [Numpy-discussion] core library structure
On Fri, Feb 4, 2011 at 2:23 PM, Lluís xscr...@gmx.net wrote:
Darren Dale writes:

With generic functions, you wouldn't have to remember to use the ufunc provided by masked array for one type, or the default numpy for another type.

Sorry, but I don't see how generic functions should be a better approach compared to redefining methods on masked_array [1]. In both cases you have to define them one-by-one.

[1] assuming 'np.foo' and 'ma.foo' (which would now be obsolete) simply call 'instance.foo', which at the ndarray level is the 'foo' ufunc object.

That's a bad assumption. np.ndarray.__add__ actually calls the np.add ufunc, not the other way around. For example, np.arcsin is a ufunc that operates on ndarray, yet there is no ndarray.arcsin method.

Darren
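The direction of delegation described above (the method is a thin wrapper over the function, not vice versa) can be shown with a small pure-Python analogy. Vec and add here are illustrative stand-ins, not NumPy code; add plays the role of the np.add ufunc.

```python
def add(a, b):
    # The generic function holds the real implementation,
    # analogous to the np.add ufunc.
    return Vec(x + y for x, y in zip(a.data, b.data))

class Vec:
    def __init__(self, data):
        self.data = list(data)

    def __add__(self, other):
        # The method merely delegates to the generic function,
        # just as ndarray.__add__ delegates to np.add.
        return add(self, other)

v = Vec([1, 2]) + Vec([3, 4])
print(v.data)  # [4, 6]
```

Because the function is primary, operations like arcsin can exist without any corresponding method on the array class, which is the point made above.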
Re: [Numpy-discussion] core library structure
On Thu, Feb 3, 2011 at 2:07 PM, Mark Wiebe mwwi...@gmail.com wrote:

Moving this to a new thread.

On Thu, Feb 3, 2011 at 10:50 AM, Charles R Harris charlesr.har...@gmail.com wrote:
On Thu, Feb 3, 2011 at 11:07 AM, Mark Wiebe mwwi...@gmail.com wrote: [...]

Yeah, I understand it's the result of organic growth and merging from many different sources. The core library should probably become layered in a manner roughly as follows, with each layer depending only on the previous APIs. This is what Travis was getting at, I believe, with the generator array idea affecting mainly the Storage and Iteration APIs, generalizing them so that new storage and iteration methods can be plugged in.

- Data Type API: data type numbers, casting, byte-swapping, etc.
- Array Storage API: array creation/manipulation/deletion.
- Array Iteration API: array iterators, copying routines.
- Array Calculation API: typedefs for various types of calculation functions, common calculation idioms, ufunc creation API, etc.

Then, the ufunc module would use the Array Calculation API to implement all the ufuncs and other routines like inner, dot, trace, diag, tensordot, einsum, etc.

I like the lower two levels if, as I assume, they are basically aimed at allocating and deallocating blocks of memory (or equivalent) and doing basic manipulations such as dealing with endianness and casting. Where do you see array methods making an appearance?

That's correct. Currently, for example, the cast functions take array objects as parameters, something that would no longer be the case. The array-methods-vs-functions question only shows up in the Python exposure, I believe. The above structure only affects the C library, and how it's exposed to Python could remain as it is now.

The original Numeric only had three (IIRC) rather basic methods and everything else was function based, an approach which is probably easier to maintain.
The extensive use of methods came from numarray and might be something that could be added at a higher level, so that the current ndarrays would be objects combining low-level arrays and ufuncs.

Concerning ufuncs: I wonder if we could consider generic functions as a replacement for the current __array_prepare__/__array_wrap__ mechanism. For example, if I have an ndarray, a masked array, and a quantity, and I want to multiply the three together, it would be great to be able to do so with two calls to a single mul ufunc. Also, a generic function approach might provide a better mechanism to allow changes to the arrays on their way into the ufunc:

    import numpy as np
    import quantities as pq

    a = [1, 2, 3] * pq.deg  # yields a subclass of ndarray
    np.sin(a)

This is not currently possible with __array_prepare__/__array_wrap__, because __array_prepare__ is called too late by the ufunc to process the input array and rescale it to radians. I suggested on this list that it might be possible to do so with the addition of a *third* method, call it __input_prepare__... at which point Chuck rightly complained that things were getting way out of hand. Imagine if we could do something like:

    @np.sin.register(pq.Quantity)
    def my_sin(q):
        return np.sin.default(q.rescale('radians'))

    np.sin(a)

With generic functions, you wouldn't have to remember to use the ufunc provided by masked array for one type, or the default numpy for another type. This is something I have been meaning to revisit on the list for a while (along with the possibility of merging quantities into numpy), but keep forgetting to do so.

Darren
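The degrees-to-radians idea above can be mocked as a runnable pure-Python sketch. GenericUFunc, Degrees, and the .default attribute are illustrative names only; this is not quantities' or NumPy's API, just the shape of the proposal.

```python
import math

class GenericUFunc:
    """Stand-in for a ufunc supporting per-type registration."""
    def __init__(self, default):
        self.default = default      # the unwrapped implementation
        self._registry = {}

    def register(self, typ):
        def decorator(func):
            self._registry[typ] = func
            return func
        return decorator

    def __call__(self, x):
        # Dispatch on the argument's type, falling back to the default.
        func = self._registry.get(type(x), self.default)
        return func(x)

sin = GenericUFunc(math.sin)

class Degrees(float):
    """Stand-in for a unit-carrying quantity."""

@sin.register(Degrees)
def sin_degrees(d):
    # Convert on the way in, then defer to the default implementation,
    # which is exactly what __array_prepare__ arrives too late to do.
    return sin.default(math.radians(d))

print(sin(Degrees(90.0)))  # 1.0
print(sin(0.0))            # 0.0
```

The key property is that the registered function sees the input *before* the default implementation runs, so it can rescale units on the way in.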
[Numpy-discussion] problem with numpy/cython on python3, ok with python2
I just installed numpy for both python2 and 3 from an up-to-date checkout of the 1.5.x branch. I am attempting to cythonize the following code with cython-0.13:

---
cimport numpy as np
import numpy as np

def test():
    cdef np.ndarray[np.float64_t, ndim=1] ret
    ret_arr = np.zeros((20,), dtype=np.float64)
    ret = ret_arr
---

I have this setup.py file:

---
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
import numpy

setup(
    cmdclass = {'build_ext': build_ext},
    ext_modules = [
        Extension(
            'test_open',
            ['test_open.pyx'],
            include_dirs=[numpy.get_include()]
        )
    ]
)
---

When I run "python setup.py build_ext --inplace", everything is fine. When I run "python3 setup.py build_ext --inplace", I get an error:

running build_ext
cythoning test_open.pyx to test_open.c

Error converting Pyrex file to C:
...
    # For use in situations where ndarray can't replace PyArrayObject*,
    # like PyArrayObject**.
    pass

ctypedef class numpy.ndarray [object PyArrayObject]:
    cdef __cythonbufferdefaults__ = {"mode": "strided"}
                                                 ^
/home/darren/.local/lib/python3.1/site-packages/Cython/Includes/numpy.pxd:173:49: "mode" is not a buffer option

Error converting Pyrex file to C:
...
cimport numpy as np
import numpy as np

def test():
    cdef np.ndarray[np.float64_t, ndim=1] ret
         ^
/home/darren/temp/test/test_open.pyx:6:8: 'ndarray' is not a type identifier

building 'test_open' extension
gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/home/darren/.local/lib/python3.1/site-packages/numpy/core/include -I/usr/include/python3.1 -c test_open.c -o build/temp.linux-x86_64-3.1/test_open.o
test_open.c:1: error: #error Do not use this file, it is the result of a failed Cython compilation.
error: command 'gcc' failed with exit status 1

Is this a bug, or is there a problem with my example?

Thanks,
Darren
Re: [Numpy-discussion] seeking advice on a fast string-array conversion
Sorry, I accidentally hit send long before I was finished writing. But to answer your question, they contain many 2048-element multi-channel analyzer spectra.

Darren

On Tue, Nov 16, 2010 at 9:26 AM, william ratcliff william.ratcl...@gmail.com wrote:

Actually, I do use spec when I have synchrotron experiments. But why are your files so large?

On Nov 16, 2010 9:20 AM, Darren Dale dsdal...@gmail.com wrote:

I am wrapping up a small package to parse a particular ascii-encoded file format generated by a program we use heavily here at the lab. (In the unlikely event that you work at a synchrotron, and use Certified Scientific's spec program, and are actually interested, the code is currently available at https://github.com/darrendale/praxes/tree/specformat/praxes/io/spec/ .)

I have been benchmarking the project against another python package developed by a colleague, which is an extension module written in pure C. My python/cython project takes about twice as long to parse and index a file (~0.8 seconds for 100MB), which is acceptable. However, actually converting ascii strings to numpy arrays, which is done using numpy.fromstring, takes a factor of 10 longer than the extension module. So I am wondering about the performance of np.fromstring:

    import time
    import numpy as np

    s = b'1 ' * 2048 * 1200
    d = time.time()
    x = np.fromstring(s)
    print time.time() - d
Re: [Numpy-discussion] seeking advice on a fast string-array conversion
On Tue, Nov 16, 2010 at 10:31 AM, Darren Dale dsdal...@gmail.com wrote:
On Tue, Nov 16, 2010 at 9:55 AM, Pauli Virtanen p...@iki.fi wrote:
Tue, 16 Nov 2010 09:41:04 -0500, Darren Dale wrote:

[clip] That loop takes 0.33 seconds to execute, which is a good start. I need some help converting this example to return an actual numpy array. Could anyone please offer a suggestion?

Easiest way is probably to use ndarray buffers and resize them when needed. For example: https://github.com/pv/scipy-work/blob/enh/interpnd-smooth/scipy/spatial/qhull.pyx#L980

Thank you Pauli. That makes it *incredibly* simple:

    import time

    cimport numpy as np
    import numpy as np

    cdef extern from 'stdlib.h':
        double atof(char*)

    def test():
        py_string = '100'
        cdef char* c_string = py_string
        cdef int i, j
        cdef double val
        i = 0
        j = 2048*1200
        cdef np.ndarray[np.float64_t, ndim=1] ret
        ret_arr = np.empty((2048*1200,), dtype=np.float64)
        ret = ret_arr
        d = time.time()
        while i < j:
            c_string = py_string
            ret[i] = atof(c_string)
            i += 1
        ret_arr.shape = (1200, 2048)
        print ret_arr, ret_arr.shape, time.time()-d

The loop now takes only 0.11 seconds to execute. Thanks again.

One follow-up issue: I can't cythonize this code for python-3. I've installed numpy with the most recent changes to the 1.5.x maintenance branch, then re-installed cython-0.13, but when I run "python3 setup.py build_ext --inplace" with this setup script:

    from distutils.core import setup
    from distutils.extension import Extension
    from Cython.Distutils import build_ext
    import numpy

    setup(
        cmdclass = {'build_ext': build_ext},
        ext_modules = [
            Extension(
                'test_open',
                ['test_open.pyx'],
                include_dirs=[numpy.get_include()]
            )
        ]
    )

I get the following error. Any suggestions what I need to fix, or should I report it to the cython list?

$ python3 setup.py build_ext --inplace
running build_ext
cythoning test_open.pyx to test_open.c

Error converting Pyrex file to C:
...
    # For use in situations where ndarray can't replace PyArrayObject*,
    # like PyArrayObject**.
    pass

ctypedef class numpy.ndarray [object PyArrayObject]:
    cdef __cythonbufferdefaults__ = {"mode": "strided"}
                                                 ^
/Users/darren/.local/lib/python3.1/site-packages/Cython/Includes/numpy.pxd:173:49: "mode" is not a buffer option

Error converting Pyrex file to C:
...
    cdef char* c_string = py_string
    cdef int i, j
    cdef double val
    i = 0
    j = 2048*1200
    cdef np.ndarray[np.float64_t, ndim=1] ret
         ^
/Users/darren/temp/test/test_open.pyx:16:8: 'ndarray' is not a type identifier

building 'test_open' extension
/usr/bin/gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/darren/.local/lib/python3.1/site-packages/numpy/core/include -I/opt/local/Library/Frameworks/Python.framework/Versions/3.1/include/python3.1 -c test_open.c -o build/temp.macosx-10.6-x86_64-3.1/test_open.o
test_open.c:1:2: error: #error Do not use this file, it is the result of a failed Cython compilation.
error: command '/usr/bin/gcc-4.2' failed with exit status 1
Re: [Numpy-discussion] seeking advice on a fast string-array conversion
Apologies, I accidentally hit send...

On Tue, Nov 16, 2010 at 9:20 AM, Darren Dale dsdal...@gmail.com wrote:

I am wrapping up a small package to parse a particular ascii-encoded file format generated by a program we use heavily here at the lab. (In the unlikely event that you work at a synchrotron, and use Certified Scientific's spec program, and are actually interested, the code is currently available at https://github.com/darrendale/praxes/tree/specformat/praxes/io/spec/ .)

I have been benchmarking the project against another python package developed by a colleague, which is an extension module written in pure C. My python/cython project takes about twice as long to parse and index a file (~0.8 seconds for 100MB), which is acceptable. However, actually converting ascii strings to numpy arrays, which is done using numpy.fromstring, takes a factor of 10 longer than the extension module. So I am wondering about the performance of np.fromstring:

    import time
    import numpy as np

    s = b'1 ' * 2048 * 1200
    d = time.time()
    x = np.fromstring(s, dtype='d', sep=b' ')
    print time.time() - d

That takes about 1.3 seconds on my machine. A similar metric for the extension module is to load 1200 of these 2048-element arrays from the file:

    d = time.time()
    x = [s.mca(i+1) for i in xrange(1200)]
    print time.time()-d

That takes about 0.127 seconds on my machine. This discrepancy is unacceptable for my use case, so I need to develop an alternative to fromstring. Here is a bit of testing with cython (note i must be initialized to 0 before the loop):

    import time

    cdef extern from 'stdlib.h':
        double atof(char*)

    py_string = '100'
    cdef char* c_string = py_string
    cdef int i, j
    i = 0
    j = 2048*1200
    d = time.time()
    while i < j:
        c_string = py_string
        val = atof(c_string)
        i += 1
    print val, time.time()-d

That loop takes 0.33 seconds to execute, which is a good start. I need some help converting this example to return an actual numpy array. Could anyone please offer a suggestion?
Thanks, Darren
[Numpy-discussion] seeking advice on a fast string-array conversion
I am wrapping up a small package to parse a particular ascii-encoded file format generated by a program we use heavily here at the lab. (In the unlikely event that you work at a synchrotron, and use Certified Scientific's spec program, and are actually interested, the code is currently available at https://github.com/darrendale/praxes/tree/specformat/praxes/io/spec/ .) I have been benchmarking the project against another python package developed by a colleague, which is an extension module written in pure C. My python/cython project takes about twice as long to parse and index a file (~0.8 seconds for 100MB), which is acceptable. However, actually converting ascii strings to numpy arrays, which is done using numpy.fromstring, takes a factor of 10 longer than the extension module. So I am wondering about the performance of np.fromstring:

import time
import numpy as np
s = b'1 ' * 2048 * 1200
d = time.time()
x = np.fromstring(s)
print time.time() - d
Re: [Numpy-discussion] seeking advice on a fast string-array conversion
On Tue, Nov 16, 2010 at 9:55 AM, Pauli Virtanen p...@iki.fi wrote: Tue, 16 Nov 2010 09:41:04 -0500, Darren Dale wrote: [clip] That loop takes 0.33 seconds to execute, which is a good start. I need some help converting this example to return an actual numpy array. Could anyone please offer a suggestion? Easiest way is probably to use ndarray buffers and resize them when needed. For example: https://github.com/pv/scipy-work/blob/enh/interpnd-smooth/scipy/spatial/qhull.pyx#L980 Thank you Pauli. That makes it *incredibly* simple:

import time
cimport numpy as np
import numpy as np

cdef extern from 'stdlib.h':
    double atof(char*)

def test():
    py_string = '100'
    cdef char* c_string = py_string
    cdef int i, j
    cdef double val
    i = 0
    j = 2048*1200
    cdef np.ndarray[np.float64_t, ndim=1] ret
    ret_arr = np.empty((2048*1200,), dtype=np.float64)
    ret = ret_arr
    d = time.time()
    while i < j:
        c_string = py_string
        ret[i] = atof(c_string)
        i += 1
    ret_arr.shape = (1200, 2048)
    print ret_arr, ret_arr.shape, time.time() - d

The loop now takes only 0.11 seconds to execute. Thanks again.
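For readers without Cython, the preallocate-and-fill pattern behind the speedup can be sketched in plain Python/NumPy. The function name and shapes below are illustrative, not from the thread, and Python-level float() stands in for C atof(), so this shows the pattern rather than the speed:

```python
import numpy as np

def parse_floats(tokens, rows, cols):
    """Fill a preallocated 1-D float64 buffer, then reshape.

    Mirrors the Cython approach above: allocate the result once
    with np.empty and write into it by index, instead of building
    intermediate Python lists.
    """
    ret = np.empty(rows * cols, dtype=np.float64)
    for i, tok in enumerate(tokens):
        ret[i] = float(tok)  # atof() in the Cython version
    return ret.reshape(rows, cols)

tokens = ['100'] * 6
arr = parse_floats(tokens, 2, 3)
print(arr.shape)  # (2, 3)
```

The single up-front np.empty allocation is what the qhull.pyx code Pauli pointed to also does; the Cython version then wins by replacing the per-token Python call overhead with a direct C call.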
Re: [Numpy-discussion] seeking advice on a fast string-array conversion
On Tue, Nov 16, 2010 at 11:46 AM, Christopher Barker chris.bar...@noaa.gov wrote: On 11/16/10 7:31 AM, Darren Dale wrote: On Tue, Nov 16, 2010 at 9:55 AM, Pauli Virtanen p...@iki.fi wrote: Tue, 16 Nov 2010 09:41:04 -0500, Darren Dale wrote: [clip] That loop takes 0.33 seconds to execute, which is a good start. I need some help converting this example to return an actual numpy array. Could anyone please offer a suggestion? Darren, It's interesting that you found fromstring() so slow -- I've put some time into trying to get fromfile() and fromstring() to be a bit more robust and featureful, but found it to be some really painful code to work on -- but it didn't dawn on me that it would be slow too! I saw all the layers of function calls, but I still thought that would be minimal compared to the actual string parsing. I guess not. Shows that you never know where your bottlenecks are without profiling. Slow is relative, of course, but since the whole point of fromfile/fromstring is performance (otherwise, we'd just parse with python), it would be nice to get them as fast as possible. I had been thinking that the way to make a good fromfile was Cython, so you've inspired me to think about it some more. Would you be interested in extending what you're doing to a more general purpose tool? Anyway, a comment or two:

cdef extern from 'stdlib.h':
    double atof(char*)

One thing I found with the current numpy code is that the use of the ato* functions is a source of a lot of bugs (all of them?). The core problem is error handling -- you have to do a lot of pointer checking to see if a call was successful, and with the fromfile code, that error handling is not done in all the layers of calls. In my case, I am making an assumption about the integrity of the file. Anyone know what the advantage of ato* is over scanf()/fscanf()? Also, why are you doing string parsing rather than parsing the files directly, wouldn't that be a bit faster? Rank inexperience, I guess.
I don't understand what you have in mind. scanf/fscanf don't actually convert strings to numbers, do they? I've got some C extension code for simple parsing of text files into arrays of floats or doubles (using fscanf). I'd be curious how the performance compares to what you've got. Let me know if you're interested. I'm curious, yes. Darren
Re: [Numpy-discussion] bzr mirror
On Fri, Nov 12, 2010 at 9:42 PM, Ralf Gommers ralf.gomm...@googlemail.com wrote: Hi, While cleaning up the numpy wiki start page I came across a bzr mirror that still pointed to svn, https://launchpad.net/numpy, originally registered by Jarrod. It would be good to either point that to git or delete it. I couldn't see how to report or do anything about that on Launchpad, but that's maybe just me - I can never find anything there. For now I've removed the link to it on Trac, if the mirror gets updated please put it back.
Re: [Numpy-discussion] bzr mirror
On Sat, Nov 13, 2010 at 7:27 AM, Darren Dale dsdal...@gmail.com wrote: On Fri, Nov 12, 2010 at 9:42 PM, Ralf Gommers ralf.gomm...@googlemail.com wrote: Hi, While cleaning up the numpy wiki start page I came across a bzr mirror that still pointed to svn, https://launchpad.net/numpy, originally registered by Jarrod. It would be good to either point that to git or delete it. I couldn't see how to report or do anything about that on Launchpad, but that's maybe just me - I can never find anything there. For now I've removed the link to it on Trac, if the mirror gets updated please put it back. Comment 8 at https://bugs.launchpad.net/launchpad-registry/+bug/38349 : Ask to deactivate the project at https://answers.launchpad.net/launchpad/+addquestion If the project has no data that is useful to the community, it will be deactivated. If the project has code or bugs, the community may still use the project even if the maintainers are not interested in it. Launchpad admins will not deactivate projects that the community can use. Consider transferring maintainership to another user. Note the continued use of "deactivate" throughout the answer to repeated inquiries of how to delete a project. From https://help.launchpad.net/PrivacyPolicy : Launchpad retains all data submitted by users permanently. Except in the circumstances listed below, Launchpad will only delete data if required to do so by law or if data (including files, PPA submissions, bug reports, bug comments, bug attachments, and translations) is inappropriate. Canonical reserves the right to determine whether data is inappropriate. Spam, malicious code, and defamation are considered inappropriate. Where data is deleted, it will be removed from the Launchpad database but it may continue to exist in backup archives which are maintained by Canonical.
Re: [Numpy-discussion] whitespace in git repo
On Thu, Oct 28, 2010 at 12:11 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Thu, Oct 28, 2010 at 9:23 AM, Darren Dale dsdal...@gmail.com wrote: Hi Chuck, On Wed, Oct 27, 2010 at 1:30 PM, Charles R Harris charlesr.har...@gmail.com wrote: I'd like to do something here, but I'm waiting for a consensus and for someone to test things out, maybe with a test repo, to make sure things operate correctly. The documentation isn't that clear... I am getting ready to test on windows and mac. In the process of upgrading git on windows to 1.7.3.1, the following dialog appeared:

Configuring line ending conversions
How should Git treat line endings in text files?

[x] Checkout Windows-style, commit Unix-style line endings
    Git will convert LF to CRLF when checking out text files. When committing text files, CRLF will be converted to LF. For cross-platform projects, this is the recommended setting on Windows (core.autocrlf is set to true)

[ ] Checkout as-is, commit Unix-style line endings
    Git will not perform any conversion when checking out text files. When committing text files, CRLF will be converted to LF. For cross-platform projects this is the recommended setting on Unix (core.autocrlf is set to input).

[ ] Checkout as-is, commit as-is
    Git will not perform any conversions when checking out or committing text files. Choosing this option is not recommended for cross-platform projects (core.autocrlf is set to false)

This might warrant a very brief mention in the docs, for helping people set up their environment. It's too bad core.autocrlf cannot be set on a per-project basis in a file that gets committed to the repository. Yes, this would be good information to have in the notes. As far as I can tell, it can only be set in ~/.gitconfig or numpy/.git/config.
Which is why I suggested adding .gitattributes, which can be committed to the repository, and the line * text=auto ensures that EOLs in text files are committed as LF, so we don't have to worry about somebody's config settings having unwanted impact on the repository. Might be worth trying in a numpy/.gitconfig just to see what happens. Documentation isn't always complete. Now that I understand the situation a little better, I don't think we would want such a .gitconfig in the repository itself. Most windows users would probably opt for autocrlf=true, but that is definitely not the case for mac and linux users. I've been testing the changes in the pull request this morning on linux, mac and windows, all using git-1.7.3.1. I made a testing branch from whitespace-cleanup and added two files created on windows: temp.txt and tmp.txt. One of them was added to .gitattributes to preserve the crlf in the repo. windows: with autocrlf=true, all files in the working directory are crlf. With autocrlf=false, files marked in .gitattributes for crlf do have crlf, the other files are lf. Check. mac: tested with autocrlf=input. files marked in .gitattributes for crlf have crlf, others are lf. Check. linux (kubuntu 10.10): tested with autocrlf=input and false. All files in the working directory have lf, even those marked for crlf. This is confusing. I copied temp.txt from windows, verified that it still had crlf endings, and copied it into the working directory. Git warns that crlf will be converted to lf, but attempting a commit yields nothing to do. I had to do git add temp.txt, at which point git status tells me the working directory is clean and there is nothing to commit. I'm not too worried about this, it's a situation that is unlikely to ever occur in practice. I think I have convinced myself that the pull request is satisfactory.
Devs should bear in mind, though, that there is a small risk when committing changes to binary files that git will corrupt such a file by incorrectly identifying and converting crlf to lf. Git should warn when line conversions are going to take place, so they can be disabled for a binary file in .gitattributes:

mybinaryfile.dat -text

That is all, Darren
Re: [Numpy-discussion] whitespace in git repo
Hi Chuck, On Wed, Oct 27, 2010 at 1:30 PM, Charles R Harris charlesr.har...@gmail.com wrote: I'd like to do something here, but I'm waiting for a consensus and for someone to test things out, maybe with a test repo, to make sure things operate correctly. The documentation isn't that clear... I am getting ready to test on windows and mac. In the process of upgrading git on windows to 1.7.3.1, the following dialog appeared:

Configuring line ending conversions
How should Git treat line endings in text files?

[x] Checkout Windows-style, commit Unix-style line endings
    Git will convert LF to CRLF when checking out text files. When committing text files, CRLF will be converted to LF. For cross-platform projects, this is the recommended setting on Windows (core.autocrlf is set to true)

[ ] Checkout as-is, commit Unix-style line endings
    Git will not perform any conversion when checking out text files. When committing text files, CRLF will be converted to LF. For cross-platform projects this is the recommended setting on Unix (core.autocrlf is set to input).

[ ] Checkout as-is, commit as-is
    Git will not perform any conversions when checking out or committing text files. Choosing this option is not recommended for cross-platform projects (core.autocrlf is set to false)

This might warrant a very brief mention in the docs, for helping people set up their environment. It's too bad core.autocrlf cannot be set on a per-project basis in a file that gets committed to the repository. As far as I can tell, it can only be set in ~/.gitconfig or numpy/.git/config. Which is why I suggested adding .gitattributes, which can be committed to the repository, and the line * text=auto ensures that EOLs in text files are committed as LF, so we don't have to worry about somebody's config settings having unwanted impact on the repository.
And now the bad news: I have not been able to verify that Git respects the autocrlf setting or the eol setting in .gitattributes on my windows 7 computer: I made a new clone and the line endings are LF in the working directory, both on master and in my whitespace-cleanup branch (even the nsi.in file!). (git config -l confirms that core.autocrlf is true.) To check my sanity, I tried writing files using wordpad and notepad to confirm that they are at least using CRLF, and they are *not*, according to both python's open() and grep \r\n. If it were after noon where I live, I would be looking for a bottle of whiskey. But it's not, so I'll just beat my head against my desk until I've forgotten about this whole episode.
Re: [Numpy-discussion] whitespace in git repo
On Thu, Oct 28, 2010 at 12:23 PM, josef.p...@gmail.com wrote: On Thu, Oct 28, 2010 at 12:11 PM, Charles R Harris On Thu, Oct 28, 2010 at 9:23 AM, Darren Dale dsdal...@gmail.com wrote: And now the bad news: I have not been able to verify that Git respects the autocrlf setting or the eol setting in .gitattributes on my windows 7 computer: I made a new clone and the line endings are LF in the working directory, both on master and in my whitespace-cleanup branch (even the nsi.in file!). (git config -l confirms that core.autocrlf is true.) To check my sanity, I tried writing files using wordpad and notepad to confirm that they are at least using CRLF, and they are *not*, according to both python's open() and grep \r\n. If it were after noon where I live, I would be looking for a [...] maybe just something obvious: Did you read the files in python as binary 'rb'? No, I did not. You are right, this shows \r\n. Why is it necessary to open them as binary? IIUC (OIDUC), one should use 'rU' to unify line endings.
Re: [Numpy-discussion] whitespace in git repo
On Thu, Oct 28, 2010 at 3:23 PM, josef.p...@gmail.com wrote: On Thu, Oct 28, 2010 at 2:40 PM, Darren Dale dsdal...@gmail.com wrote: On Thu, Oct 28, 2010 at 12:23 PM, josef.p...@gmail.com wrote: On Thu, Oct 28, 2010 at 12:11 PM, Charles R Harris On Thu, Oct 28, 2010 at 9:23 AM, Darren Dale dsdal...@gmail.com wrote: And now the bad news: I have not been able to verify that Git respects the autocrlf setting or the eol setting in .gitattributes on my windows 7 computer: I made a new clone and the line endings are LF in the working directory, both on master and in my whitespace-cleanup branch (even the nsi.in file!). (git config -l confirms that core.autocrlf is true.) To check my sanity, I tried writing files using wordpad and notepad to confirm that they are at least using CRLF, and they are *not*, according to both python's open() and grep \r\n. If it were after noon where I live, I would be looking for a maybe just something obvious: Did you read the files in python as binary 'rb' ? No, I did not. You are right, this shows \r\n. Why is it necessary to open them as binary? IIUC (OIDUC), one should use 'rU' to unify line endings. The python default for open(filename).read() or open(filename, 'r').read() is to standardize line endings to \n. Although, on a mac: In [1]: open('tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in').readlines()[0] Out[1]: ';\r\n'
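The behaviour being debated above is easy to demonstrate in Python 3, where text mode applies universal-newline translation by default and 'rb' exposes the raw bytes (a minimal sketch using a throwaway temporary file, not a file from the repo):

```python
import os
import tempfile

# Write a file with explicit CRLF line endings.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(b'line one\r\nline two\r\n')

# Binary mode shows the raw CRLF bytes.
with open(path, 'rb') as f:
    raw = f.readline()

# Text mode (universal newlines) translates CRLF to '\n'.
with open(path, 'r') as f:
    text = f.readline()

print(repr(raw))   # b'line one\r\n'
print(repr(text))  # 'line one\n'
os.remove(path)
```

In Python 2, plain 'r' did not translate line endings (hence Darren's ';\r\n' on the mac); 'rU' was the universal-newlines mode, which became the Python 3 text-mode default.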
Re: [Numpy-discussion] whitespace in git repo
On Wed, Oct 27, 2010 at 8:36 AM, Friedrich Romstedt friedrichromst...@gmail.com wrote: Hi Darren, 2010/10/19 Darren Dale dsdal...@gmail.com: I have the following set in my ~/.gitconfig file:

[apply]
    whitespace = fix
[core]
    autocrlf = input

which is attempting to correct some changes in:

branding/icons/numpylogo.svg
branding/icons/numpylogoicon.svg
tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in

Here is an excerpt from git-config: core.autocrlf Setting this variable to true is almost the same as setting the text attribute to auto on all files except that text files are not guaranteed to be normalized: files that contain CRLF in the repository will not be touched. Use this setting if you want to have CRLF line endings in your working directory even though the repository does not have normalized line endings. This variable can be set to input, in which case no output conversion is performed. From git-apply: ``fix`` outputs warnings for a few such errors, and applies the patch after fixing them (strip is a synonym --- the tool used to consider only trailing whitespace characters as errors, and the fix involved stripping them, but modern gits do more). So I think your autocrlf=input makes the .nsi.in file checked out as LF since it's in LF in the repo, and no output conversion is performed due to core.autocrlf=input in your .gitconfig. So the svg changes must come from the 'fix' value for the whitespace action. I don't think it is a good idea to let whitespace be fixed by git and not by your editor :-) Or do you disagree? What are considered whitespace errors is controlled by core.whitespace configuration. By default, trailing whitespaces (including lines that solely consist of whitespaces) and a space character that is immediately followed by a tab character inside the initial indent of the line are considered whitespace errors. No mention of EOL conversions there. But yes, I guess we disagree.
I prefer to have git automatically strip any trailing whitespace that I might have accidentally introduced. This whitespace newline thing is really painful, I suggest you set in your .gitconfig:

[core]
    autocrlf = true

I don't think so: Use this setting if you want to have CRLF line endings in your working directory even though the repository does not have normalized line endings. I don't want CRLF in my working directory. Did you read http://help.github.com/dealing-with-lineendings/ ? and in our numpy .gitattributes: * text=auto That is already included in the pull request. while the text=auto is stronger and a superset of autocrlf=true. I came across this when testing whether text=auto marks any files as changed, and it didn't, so everything IS already LF in the repo. Can you check this please? Check what? I was near to leaving a comment like asap on github, but since this is so horribly complicated and error-prone ... I'm starting to consider canceling the pull request. Darren
Re: [Numpy-discussion] whitespace in git repo
On Wed, Oct 27, 2010 at 11:31 AM, Friedrich Romstedt friedrichromst...@gmail.com wrote: Hi Darren, 2010/10/27 Darren Dale dsdal...@gmail.com: So the svg changes must come from the 'fix' value for the whitespace action. I don't think it is a good idea to let whitespace be fixed by git and not by your editor :-) Or do you disagree? What are considered whitespace errors is controlled by core.whitespace configuration. By default, trailing whitespaces (including lines that solely consist of whitespaces) and a space character that is immediately followed by a tab character inside the initial indent of the line are considered whitespace errors. No mention of EOL conversions there. But yes, I guess we disagree. I prefer to have git automatically strip any trailing whitespace that I might have accidentally introduced. I agree. But I just guess that the changes of the svgs in your pull request might be not due to eols but due to whitespace fixes. No, it was not. I explicitly checked the svg files before and after, using open('foo.svg').readlines()[0], and saw that the files were CRLF before the commit on my branch, and LF after. I think so because in my numpy (current master branch) I cannot see any CRLF there in the repo. Checked with ``* text=auto``, which also affects non-normalised files in the repo. But it might be that the conversion is done silently, although I don't know how to do it like that. So no changes showing up implies no non-LF eols. This whitespace newline thing is really painful, I suggest you set in your .gitconfig: [core] autocrlf = true I don't think so: Use this setting if you want to have CRLF line endings in your working directory even though the repository does not have normalized line endings. I don't want CRLF in my working directory. Did you read http://help.github.com/dealing-with-lineendings/ ? Aha, this is a misunderstanding. Somehow I thought you're working on Windows. Is there then a specific reason not to use CRLF?
I mean, you can check it in with LF anyway. The page you mentioned is very brief and like a recipe, not my taste. I want to know what's going on in detail. and in our numpy .gitattributes: * text=auto That is already included in the pull request. Yes, I know. I meant to leave the line with the eol=crlf alone. All based on the assumption that you're working with crlf anyway, so might be wrong. while the text=auto is stronger and a superset of autocrlf=true. I came across this when testing whether text=auto marks any files as changed, and it didn't, so everything IS already LF in the repo. Can you check this please? Check what? My conclusions above. We both know that in this subject all conclusions are pretty error-prone ... I was near to leaving a comment like asap on github, but since this is so horribly complicated and error-prone ... I'm starting to consider canceling the pull request. At least we should check if it's really what we intend. I understand now better why you wanted to force the .nsi.in file to be crlf. From your previous posts, i.e. that it would be the default for Win users anyway, I see now that I should have asked. To my understanding the strategy should be two things: 1) Force LF in the repo. This is independent from the .nsi.in thing, but missing currently in the official branches. We can do that at the same time. 2) Force the .nsi.in file to be crlf in the check-out (and only there) at all times. There is one higher level in $GITDIR, but I think we can ignore that. To (1): The default Win user would check in *newly created* files currently in CRLF, at least this is what I did with a not-so-recent git some time ago (other repos). When I switched to Mac, all my files were marked changed. afaik git does not do normalisation if you do not tell it to do so. While git normally leaves file contents alone, it can be configured to normalize line endings to LF in the repository and, optionally, to convert them to CRLF when files are checked out.
(http://www.kernel.org/pub/software/scm/git/docs/gitattributes.html) I still do not understand why my files showed up changed. They're still crlf, I just copied them, and vim tells [dos]. Please also confirm or show that I'm wrong with my observation of LFCR in your .nsi.in file instead of CRLF (it's swapped). I thought this was already settled. OK, on my whitespace-cleanup branch, I modify .gitattributes to comment out the line about the nsi.in file. I check out the nsi.in file from HEAD, and:

In [1]: open('tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in').readlines()[0]
Out[1]: ';\n'

Then I do git checkout HEAD^ tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in :

In [1]: open('tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in').readlines()[0]
Out[1]: ';\r\n'

CRLF, not LFCR. I checked
Re: [Numpy-discussion] ANN: NumPy 1.5.1 release candidate 1
On Sun, Oct 17, 2010 at 7:35 AM, Ralf Gommers ralf.gomm...@googlemail.com wrote: Hi, I am pleased to announce the availability of the first release candidate of NumPy 1.5.1. This is a bug-fix release with no new features compared to 1.5.0. [...] Please report any other issues on the Numpy-discussion mailing list. Just installed on kubuntu-10.10, python-2.7 and python-3.1.2. Tests look fine for py2.7, but I see datetime errors with py3k:

======================================================================
ERROR: test_creation (test_datetime.TestDateTime)
----------------------------------------------------------------------
Traceback (most recent call last):
  File /home/darren/.local/lib/python3.1/site-packages/numpy/core/tests/test_datetime.py, line 10, in test_creation
    dt1 = np.dtype('M8[750%s]'%unit)
TypeError: data type not understood

======================================================================
ERROR: test_creation_overflow (test_datetime.TestDateTime)
----------------------------------------------------------------------
Traceback (most recent call last):
  File /home/darren/.local/lib/python3.1/site-packages/numpy/core/tests/test_datetime.py, line 62, in test_creation_overflow
    timesteps = np.array([date], dtype='datetime64[s]')[0].astype(np.int64)
TypeError: data type not understood

======================================================================
ERROR: test_divisor_conversion_as (test_datetime.TestDateTime)
----------------------------------------------------------------------
Traceback (most recent call last):
  File /home/darren/.local/lib/python3.1/site-packages/numpy/core/tests/test_datetime.py, line 58, in test_divisor_conversion_as
    self.assertRaises(ValueError, lambda : np.dtype('M8[as/10]'))
  File /usr/lib/python3.1/unittest.py, line 589, in assertRaises
    callableObj(*args, **kwargs)
  File /home/darren/.local/lib/python3.1/site-packages/numpy/core/tests/test_datetime.py, line 58, in lambda
    self.assertRaises(ValueError, lambda : np.dtype('M8[as/10]'))
TypeError: data type not understood

======================================================================
ERROR: test_divisor_conversion_bday (test_datetime.TestDateTime)
----------------------------------------------------------------------
Traceback (most recent call last):
  File /home/darren/.local/lib/python3.1/site-packages/numpy/core/tests/test_datetime.py, line 32, in test_divisor_conversion_bday
    assert np.dtype('M8[B/12]') == np.dtype('M8[2h]')
TypeError: data type not understood

======================================================================
ERROR: test_divisor_conversion_day (test_datetime.TestDateTime)
----------------------------------------------------------------------
Traceback (most recent call last):
  File /home/darren/.local/lib/python3.1/site-packages/numpy/core/tests/test_datetime.py, line 37, in test_divisor_conversion_day
    assert np.dtype('M8[D/12]') == np.dtype('M8[2h]')
TypeError: data type not understood

======================================================================
ERROR: test_divisor_conversion_fs (test_datetime.TestDateTime)
----------------------------------------------------------------------
Traceback (most recent call last):
  File /home/darren/.local/lib/python3.1/site-packages/numpy/core/tests/test_datetime.py, line 54, in test_divisor_conversion_fs
    assert np.dtype('M8[fs/100]') == np.dtype('M8[10as]')
TypeError: data type not understood

======================================================================
ERROR: test_divisor_conversion_hour (test_datetime.TestDateTime)
----------------------------------------------------------------------
Traceback (most recent call last):
  File /home/darren/.local/lib/python3.1/site-packages/numpy/core/tests/test_datetime.py, line 42, in test_divisor_conversion_hour
    assert np.dtype('m8[h/30]') == np.dtype('m8[2m]')
TypeError: data type not understood

======================================================================
ERROR: test_divisor_conversion_minute (test_datetime.TestDateTime)
----------------------------------------------------------------------
Traceback (most recent call last):
  File /home/darren/.local/lib/python3.1/site-packages/numpy/core/tests/test_datetime.py, line 46, in test_divisor_conversion_minute
    assert np.dtype('m8[m/30]') == np.dtype('m8[2s]')
TypeError: data type not understood

======================================================================
ERROR: test_divisor_conversion_month (test_datetime.TestDateTime)
----------------------------------------------------------------------
Traceback (most recent call last):
  File /home/darren/.local/lib/python3.1/site-packages/numpy/core/tests/test_datetime.py, line 21, in test_divisor_conversion_month
    assert np.dtype('M8[M/2]') == np.dtype('M8[2W]')
TypeError: data type not understood

======================================================================
ERROR: test_divisor_conversion_second
Re: [Numpy-discussion] ANN: NumPy 1.5.1 release candidate 1
On Sun, Oct 24, 2010 at 11:29 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Sun, Oct 24, 2010 at 9:22 AM, Darren Dale dsdal...@gmail.com wrote: On Sun, Oct 17, 2010 at 7:35 AM, Ralf Gommers ralf.gomm...@googlemail.com wrote: Hi, I am pleased to announce the availability of the first release candidate of NumPy 1.5.1. This is a bug-fix release with no new features compared to 1.5.0. [...] Please report any other issues on the Numpy-discussion mailing list. Just installed on kubuntu-10.10, python-2.7 and python-3.1.2. Tests look fine for py2.7, but I see datetime errors with py3k: [...] You may have left over tests in the installation directory. Can you try deleting it and installing again? You're right. Tests are passing. Darren
Re: [Numpy-discussion] whitespace in git repo
On Thu, Oct 21, 2010 at 9:26 AM, David Cournapeau courn...@gmail.com wrote: On Thu, Oct 21, 2010 at 8:47 PM, Friedrich Romstedt friedrichromst...@gmail.com wrote: 2010/10/21 David Cournapeau courn...@gmail.com: On Thu, Oct 21, 2010 at 12:56 AM, Friedrich Romstedt friedrichromst...@gmail.com wrote: 2010/10/20 Darren Dale dsdal...@gmail.com: On Wed, Oct 20, 2010 at 6:12 AM, Friedrich Romstedt friedrichromst...@gmail.com wrote: Due to Darren's config file the .nsi.in file made it with CRLF into the repo. Uh, no. You mean I'm wrong? Yes, the file has always used CRLF, and needs to stay that way. I see, misunderstanding, for me I used made it in the sense succeeded in :-) So to be clear, I meant that I understood your config file. Btw, it has \n\r, so it's LFCR and not CRLF as it should be on Windows (ref: de.wikipedia). I checked both my understanding of CR/LF as well as used $grep -PU '$\n\r' again. See also http://de.wikipedia.org/wiki/Zeilenumbruch (german, the en version doesn't have the table). So either: 1) You encoded for whatever reason the file with CR and LF swapped Nobody encoded the file in a special manner. It just happens to be a file used on windows, by a windows program, and as such should stay in CR/LF format. I am not sure why you say LF and CR are swapped, I don't see it myself, and vim tells me it is in DOS (e.g. CR/LF) format. 2) It doesn't matter what the order is It does matter. Although text editors are generally smart about line endings, other windows software is not. I filed a new pull request, http://github.com/numpy/numpy/pull/7 . This should enforce LF on all text files, with the current exception of the nsi.in file, which is CRLF. The svgs have been converted to LF. Additional, confusing reading can be found at http://help.github.com/dealing-with-lineendings/ , http://www.kernel.org/pub/software/scm/git/docs/git-config.html, and http://www.kernel.org/pub/software/scm/git/docs/gitattributes.html .
Darren
Re: [Numpy-discussion] whitespace in git repo
On Thu, Oct 21, 2010 at 4:48 PM, Friedrich Romstedt friedrichromst...@gmail.com wrote: 2010/10/21 Darren Dale dsdal...@gmail.com: I filed a new pull request, http://github.com/numpy/numpy/pull/7 . This should enforce LF on all text files, with the current exception of the nsi.in file, which is CRLF. The svgs have been converted to LF. Additional, confusing reading can be found at http://help.github.com/dealing-with-lineendings/ , http://www.kernel.org/pub/software/scm/git/docs/git-config.html, and http://www.kernel.org/pub/software/scm/git/docs/gitattributes.html . Hm, I like you pull request more than my own branch, but I think your conclusions might be incorrect. ``* text=auto`` forces git to normalise *all* text files, including the .nsi.in file, to LF *in the repo only*. But it says nothing about how to set eol in the working dir. ``[...].nsi.in eol=crlf`` forces git to check-out the .nsi.in file with CRLF. I see. Thank you for clarifying. It probably is not necessary then to have the exception for the nsi.in file, since git will create files with CRLF eols in the working directory on windows by default. The eols in the working directory can be controlled by the core.eol setting, which defaults to native. But unless David C gives his blessing, I will leave the pull request as is. Pretty confusing. Darren
Re: [Numpy-discussion] whitespace in git repo
On Wed, Oct 20, 2010 at 6:12 AM, Friedrich Romstedt friedrichromst...@gmail.com wrote: Due to Darren's config file the .nsi.in file made it with CRLF into the repo. Uh, no.
Re: [Numpy-discussion] whitespace in git repo
On Wed, Oct 20, 2010 at 11:56 AM, Friedrich Romstedt friedrichromst...@gmail.com wrote: 2010/10/20 Darren Dale dsdal...@gmail.com: On Wed, Oct 20, 2010 at 6:12 AM, Friedrich Romstedt friedrichromst...@gmail.com wrote: Due to Darren's config file the .nsi.in file made it with CRLF into the repo. Uh, no. You mean I'm wrong? Due to my config file... nothing. I simply noticed the already-existing CRLF line endings in the repository.
[Numpy-discussion] whitespace in git repo
We have been discussing whitespace and line endings at the following pull request: http://github.com/numpy/numpy/pull/4 . Chuck suggested we discuss it here on the list. I have the following set in my ~/.gitconfig file:

    [apply]
        whitespace = fix
    [core]
        autocrlf = input

which is attempting to correct some changes in:

    branding/icons/numpylogo.svg
    branding/icons/numpylogoicon.svg
    tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in

David C. suggested that the nsi.in file should not be changed. I suggested adding a .gitattributes file along with the existing .gitignore file in the numpy repo. This would enforce windows line endings for the nsi.in file:

    tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in eol=crlf

alternatively this would disable any attempt to convert line endings:

    tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in -text

I think the former is preferable. But it seems like a good idea to include some git config files in the repo to ensure trailing whitespace is stripped and line endings are appropriate to the numpy project, regardless of what people may have in their ~/.gitconfig file. Comments? Darren
Re: [Numpy-discussion] test errors in the trunk
On Sat, Jul 31, 2010 at 7:22 AM, Ralf Gommers ralf.gomm...@googlemail.com wrote: On Sat, Jul 31, 2010 at 4:55 AM, Robert Kern robert.k...@gmail.com wrote: On Fri, Jul 30, 2010 at 13:22, Darren Dale dsdal...@gmail.com wrote: I just upgraded my svn checkout and did a fresh install. When I try to run the test suite, I get a ton of errors: np.test() Running unit tests for numpy NumPy version 2.0.0.dev8550 NumPy is installed in /Users/darren/.local/lib/python2.6/site-packages/numpy Python version 2.6.5 (r265:79063, Jul 19 2010, 09:08:11) [GCC 4.2.1 (Apple Inc. build 5659)] nose version 0.11.3 Reloading numpy.lib Reloading numpy.lib.info Reloading numpy.lib.numpy Reloading numpy Reloading numpy.numpy Reloading numpy.show == [...] File /Users/darren/.local/lib/python2.6/site-packages/numpy/lib/__init__.py, line 23, in module __all__ += type_check.__all__ NameError: name 'type_check' is not defined I checked numpy/lib/__init__.py, and it does a bunch of imports like from type_check import * but not import type_check, which are needed to append to __all__. Not quite. The code does work, as-is, in most situations thanks to a detail of Python's import system. When a submodule is imported in a package, whether through a direct import package.submodule or from submodule import *, Python will take the created module object and assign it into the package.__init__'s namespace with the appropriate name. So while the code doesn't look correct, it usually is correct. The problem is test_getlimits.py: import numpy.lib try: reload(numpy.lib) except NameError: # Py3K import imp imp.reload(numpy.lib) These are causing reloads of the hierarchy under numpy.lib and are presumably interfering with the normal import process (for some reason). Does anyone know why we reload(numpy.lib) here? The log history is unhelpful. It goes back to when this code was in scipy. I suspect that we can just remove it. If no one remembers, can we remove this before the 1.5.0 beta (i.e. 
tomorrow) so it gets tested enough before the final release? Tested on OS X with python 2.6.5 and 3.1, no problems after removing it. I just committed the change in svn 8568. Darren
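Robert's explanation, that a submodule imported anywhere gets bound into its parent package's namespace as a side effect, is easy to demonstrate with a throwaway package (the names demo_pkg and type_check below are invented for illustration, mirroring the numpy/lib/__init__.py pattern under discussion):

```python
import os
import sys
import tempfile

# Build a hypothetical package whose __init__ does only a star-import,
# like numpy/lib/__init__.py does.
tmp = tempfile.mkdtemp()
pkg = os.path.join(tmp, "demo_pkg")
os.mkdir(pkg)
with open(os.path.join(pkg, "type_check.py"), "w") as f:
    f.write("__all__ = ['isreal']\n"
            "def isreal(x):\n"
            "    return True\n")
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    # Note: no explicit "import type_check" anywhere. The name is still
    # usable afterwards because the import machinery assigns the loaded
    # submodule object into the package's namespace.
    f.write("from demo_pkg.type_check import *\n"
            "__all__ = []\n"
            "__all__ += type_check.__all__\n")

sys.path.insert(0, tmp)
import demo_pkg

print(demo_pkg.__all__)  # ['isreal']
```

So the `__all__ += type_check.__all__` lines in numpy.lib were not wrong as written; they only failed once reload() disturbed the normal import sequence.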
[Numpy-discussion] test errors in the trunk
I just upgraded my svn checkout and did a fresh install. When I try to run the test suite, I get a ton of errors: np.test() Running unit tests for numpy NumPy version 2.0.0.dev8550 NumPy is installed in /Users/darren/.local/lib/python2.6/site-packages/numpy Python version 2.6.5 (r265:79063, Jul 19 2010, 09:08:11) [GCC 4.2.1 (Apple Inc. build 5659)] nose version 0.11.3 Reloading numpy.lib Reloading numpy.lib.info Reloading numpy.lib.numpy Reloading numpy Reloading numpy.numpy Reloading numpy.show == [...] File /Users/darren/.local/lib/python2.6/site-packages/numpy/lib/__init__.py, line 23, in module __all__ += type_check.__all__ NameError: name 'type_check' is not defined I checked numpy/lib/__init__.py, and it does a bunch of imports like from type_check import * but not import type_check, which are needed to append to __all__. Darren ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] question about creating numpy arrays
[sorry, my last got cut off] On Thu, May 20, 2010 at 11:37 AM, Darren Dale dsdal...@gmail.com wrote: On Thu, May 20, 2010 at 10:44 AM, Benjamin Root ben.r...@ou.edu wrote: I gave two counterexamples of why. The examples you gave aren't counterexamples. See below... I'm not interested in arguing over semantics. I've discovered an issue with how numpy deals with lists of objects that derive from ndarray, and am concerned about the implications for classes that extend ndarray. On Wed, May 19, 2010 at 7:06 PM, Darren Dale dsdal...@gmail.com wrote: On Wed, May 19, 2010 at 4:19 PM, josef.p...@gmail.com wrote: On Wed, May 19, 2010 at 4:08 PM, Darren Dale dsdal...@gmail.com wrote: I have a question about creation of numpy arrays from a list of objects, which bears on the Quantities project and also on masked arrays: import quantities as pq import numpy as np a, b = 2*pq.m,1*pq.s np.array([a, b]) array([ 12., 1.]) Why doesn't that create an object array? Similarly: Consider the use case of a person creating a 1-D numpy array: np.array([12.0, 1.0]) array([ 12., 1.]) How is python supposed to tell the difference between np.array([a, b]) and np.array([12.0, 1.0]) ? It can't, and there are plenty of times when one wants to explicitly initialize a small numpy array with a few discrete variables. m = np.ma.array([1], mask=[True]) m masked_array(data = [--], mask = [ True], fill_value = 99) np.array([m]) array([[1]]) Again, this is expected behavior. Numpy saw an array of an array, therefore, it produced a 2-D array. Consider the following: np.array([[12, 4, 1], [32, 51, 9]]) I, as a user, expect numpy to create a 2-D array (2 rows, 3 columns) from that array of arrays. This has broader implications than just creating arrays, for example: np.sum([m, m]) 2 np.sum([a, b]) 13.0 If you wanted sums from each object, there are some better (i.e., more clear) ways to go about it. 
If you have a predetermined number of numpy-compatible objects, say a, b, c, then you can explicitly call the sum for each one:

    a_sum = np.sum(a)
    b_sum = np.sum(b)
    c_sum = np.sum(c)

Which I think communicates the programmer's intention better than (for a numpy array, x, composed of a, b, c):

    object_sums = np.sum(x)  # <-- as a numpy user, I would expect a scalar out of this, not an array

If you have an arbitrary number of objects (which is what I suspect you have), then one could easily produce an array of sums (for a list, x, of numpy-compatible objects) like so:

    object_sums = [np.sum(anObject) for anObject in x]

Performance-wise, it should be no more or less efficient than having numpy somehow produce an array of sums from a single call to sum. Readability-wise, it makes more sense because when you are treating objects separately, a *list* of them is more intuitive than a numpy.array, which is more-or-less treated as a single mathematical entity. I hope that addresses your concerns.

I appreciate the response, but you are arguing that it is not a problem, and I'm certain that it is. It may not be numpy's problem, I can accept that. But it is definitely a problem for quantities. I'm trying to determine just how big a problem it is. I had hoped that one day quantities might become a part of numpy or scipy, but this appears to be a fundamental issue and it makes me doubt that inclusion would be appropriate. Thank you for the suggestion about calling the sum method instead of numpy's function. That is a reasonable workaround. Darren
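To make the collapsing behavior concrete: a masked array placed in a list and handed to np.array is coerced to a plain ndarray, silently dropping the mask, whereas an explicitly constructed object array (or the plain list comprehension suggested above) keeps each object intact. A small sketch using only numpy.ma, since quantities may not be installed:

```python
import numpy as np

m = np.ma.array([1, 2], mask=[True, False])

# np.array coerces the list to a plain 2-D integer array; the mask is lost,
# so the masked element participates in the sum:
coerced = np.array([m, m])
print(coerced.sum())  # 6, even though two of the four elements were masked

# Assigning into an explicit object array preserves the masked arrays:
obj = np.empty(2, dtype=object)
obj[0] = m
obj[1] = m
per_object_sums = [np.sum(x) for x in obj]  # each sum respects its mask
print(per_object_sums)
```

The list comprehension at the end is the workaround discussed in the thread; the object-array construction simply shows that numpy can hold the subclass instances when told to, it just never infers dtype=object from them.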
[Numpy-discussion] question about creating numpy arrays
I have a question about creation of numpy arrays from a list of objects, which bears on the Quantities project and also on masked arrays:

    >>> import quantities as pq
    >>> import numpy as np
    >>> a, b = 2*pq.m, 1*pq.s
    >>> np.array([a, b])
    array([ 12., 1.])

Why doesn't that create an object array? Similarly:

    >>> m = np.ma.array([1], mask=[True])
    >>> m
    masked_array(data = [--], mask = [ True], fill_value = 99)
    >>> np.array([m])
    array([[1]])

This has broader implications than just creating arrays, for example:

    >>> np.sum([m, m])
    2
    >>> np.sum([a, b])
    13.0

Any thoughts? Thanks, Darren
Re: [Numpy-discussion] question about creating numpy arrays
On Wed, May 19, 2010 at 4:19 PM, josef.p...@gmail.com wrote: On Wed, May 19, 2010 at 4:08 PM, Darren Dale dsdal...@gmail.com wrote: I have a question about creation of numpy arrays from a list of objects, which bears on the Quantities project and also on masked arrays: import quantities as pq import numpy as np a, b = 2*pq.m,1*pq.s np.array([a, b]) array([ 12., 1.]) Why doesn't that create an object array? Similarly: m = np.ma.array([1], mask=[True]) m masked_array(data = [--], mask = [ True], fill_value = 99) np.array([m]) array([[1]]) This has broader implications than just creating arrays, for example: np.sum([m, m]) 2 np.sum([a, b]) 13.0 Any thoughts? These are array_like of floats, so why should it create anything else than an array of floats. I gave two counterexamples of why.
Re: [Numpy-discussion] Bug in numpy.fix(): broken for scalar arguments
On Sat, Apr 17, 2010 at 4:16 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Sat, Apr 17, 2010 at 2:01 PM, Eric Firing efir...@hawaii.edu wrote: np.fix() no longer works for scalar arguments:

    In [1]: import numpy as np
    In [2]: np.version.version
    Out[2]: '2.0.0.dev8334'
    In [3]: np.fix(3.14)
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    /home/efiring/<ipython console> in <module>()
    /usr/local/lib/python2.6/dist-packages/numpy/lib/ufunclike.pyc in fix(x, y)
         46     if y is None:
         47         y = y1
    ---> 48     y[...] = nx.where(x >= 0, y1, y2)
         49     return y
         50
    TypeError: 'numpy.float64' object does not support item assignment

Looks like r8293. Darren? Thanks, I'm looking into it.
Re: [Numpy-discussion] Bug in numpy.fix(): broken for scalar arguments
On Sun, Apr 18, 2010 at 9:08 AM, Darren Dale dsdal...@gmail.com wrote: On Sat, Apr 17, 2010 at 4:16 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Sat, Apr 17, 2010 at 2:01 PM, Eric Firing efir...@hawaii.edu wrote: np.fix() no longer works for scalar arguments: In [1]:import numpy as np In [2]:np.version.version Out[2]:'2.0.0.dev8334' In [3]:np.fix(3.14) --- TypeError Traceback (most recent call last) /home/efiring/ipython console in module() /usr/local/lib/python2.6/dist-packages/numpy/lib/ufunclike.pyc in fix(x, y) 46 if y is None: 47 y = y1 --- 48 y[...] = nx.where(x = 0, y1, y2) 49 return y 50 TypeError: 'numpy.float64' object does not support item assignment Looks like r8293. Darren? Thanks, I'm looking into it. The old np.fix behavior is different from np.floor and np.ceil. np.fix(3.14) would return array(3.0), while np.floor(3.14) would return 3.0. Shall I fix it to conform with the old but inconsistent behavior of fix? Darren ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Bug in numpy.fix(): broken for scalar arguments
On Sun, Apr 18, 2010 at 9:28 AM, Darren Dale dsdal...@gmail.com wrote: On Sun, Apr 18, 2010 at 9:08 AM, Darren Dale dsdal...@gmail.com wrote: On Sat, Apr 17, 2010 at 4:16 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Sat, Apr 17, 2010 at 2:01 PM, Eric Firing efir...@hawaii.edu wrote: np.fix() no longer works for scalar arguments: In [1]:import numpy as np In [2]:np.version.version Out[2]:'2.0.0.dev8334' In [3]:np.fix(3.14) --- TypeError Traceback (most recent call last) /home/efiring/ipython console in module() /usr/local/lib/python2.6/dist-packages/numpy/lib/ufunclike.pyc in fix(x, y) 46 if y is None: 47 y = y1 --- 48 y[...] = nx.where(x = 0, y1, y2) 49 return y 50 TypeError: 'numpy.float64' object does not support item assignment Looks like r8293. Darren? Thanks, I'm looking into it. The old np.fix behavior is different from np.floor and np.ceil. np.fix(3.14) would return array(3.0), while np.floor(3.14) would return 3.0. Shall I fix it to conform with the old but inconsistent behavior of fix? I think this is the underlying issue: np.floor(np.array(3.14)) returns 3.0, not array(3.14). The current implementation of fix had already taken care to ensure that it was working with an array for the input. What is numpy's policy here? np.fix returned a len-0 ndarray even for scalar input, floor and ceil return scalars even for len-0 ndarrays. This inconsistency makes it difficult to make even small modifications to the numpy codebase. r8351 includes a one-line change that addresses Eric's report and is commensurate with the previous behavior of fix. Darren ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
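The scalar failure comes down to the `y[...] = ...` assignment requiring an array output. A scalar-tolerant truncate-toward-zero can be sketched by converting the input up front (this is an illustration of the behavior under discussion, not the actual one-line r8351 change):

```python
import numpy as np

def fix_sketch(x):
    # Convert first so 0-d and scalar inputs are handled uniformly;
    # np.where then selects floor for non-negative values and ceil for
    # negative ones, i.e. truncation toward zero.
    x = np.asanyarray(x)
    return np.where(x >= 0, np.floor(x), np.ceil(x))

print(fix_sketch(3.14))          # 3.0
print(fix_sketch(-3.14))         # -3.0
print(fix_sketch([1.7, -1.7]))
```

Note this sketch sidesteps rather than answers the policy question raised above: it returns a 0-d array for scalar input, matching the old np.fix rather than the scalar-returning behavior of np.floor and np.ceil.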
Re: [Numpy-discussion] numpy.trapz() doesn't respect subclass
On Sat, Mar 27, 2010 at 10:23 PM, josef.p...@gmail.com wrote: subclasses of ndarray, like masked_arrays and quantities, and classes that delegate to array calculations, like pandas, can redefine anything. So there is not much that can be relied on if any subclass is allowed to be used inside a function e.g. quantities redefines sin, cos,... http://packages.python.org/quantities/user/issues.html#umath-functions Those functions were only intended to be used in the short term, until the ufuncs that ship with numpy included a mechanism that allowed quantity arrays to propagate the units. It would be nice to have a mechanism (like we have discussed briefly just recently on this list) where there is a single entry point to a given function like add, but subclasses can tweak the execution. We discussed the possibility of simplifying the wrapping scheme with a method like __handle_gfunc__. (I don't think this necessarily has to be limited to ufuncs.) I think a second method like __prepare_input__ is also necessary. Imagine something like:

    class GenericFunction:

        @property
        def executable(self):
            return self._executable

        def __init__(self, executable):
            self._executable = executable

        def __call__(self, *args, **kwargs):
            # find the input with highest priority, and then:
            args, kwargs = input.__prepare_input__(self, *args, **kwargs)
            return input.__handle_gfunc__(self, *args, **kwargs)

    # this is the core function to be passed to the generic class:
    def _add(a, b, out=None):
        # the generic, ndarray implementation.
        ...

    # here is the publicly exposed interface:
    add = GenericFunction(_add)

    # now my subclasses
    class MyArray(ndarray):
        # My class tweaks the execution of the function in __handle_gfunc__
        def __prepare_input__(self, gfunc, *args, **kwargs):
            return mod_input[gfunc](*args, **kwargs)

        def __handle_gfunc__(self, gfunc, *args, **kwargs):
            res = gfunc.executable(*args, **kwargs)  # you could have called a different core func there
            return mod_output[gfunc](res, *args, **kwargs)

    class MyNextArray(MyArray):
        def __prepare_input__(self, gfunc, *args, **kwargs):
            # let the superclass do its thing:
            args, kwargs = MyArray.__prepare_input__(self, gfunc, *args, **kwargs)
            # now I can tweak it further:
            return mod_input_further[gfunc](*args, **kwargs)

        def __handle_gfunc__(self, gfunc, *args, **kwargs):
            # let's defer to the superclass to handle calling the core function:
            res = MyArray.__handle_gfunc__(self, gfunc, *args, **kwargs)
            # and now we have one more crack at the result before passing it back:
            return mod_output_further[gfunc](res, *args, **kwargs)

If a gfunc is not recognized, the subclass might raise a NotImplementedError or it might just pass the original args, kwargs on through. I didn't write that part out because the example was already running long. But the point is that a single entry point could be used for any subclass, without having to worry about how to support every subclass. Darren
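The dispatch pattern sketched above can be distilled into something runnable, with plain Python objects standing in for ndarray subclasses (GenericFunc, Base, and Child are invented names here, and the attribute check is a crude stand-in for numpy's __array_priority__ search):

```python
class GenericFunc:
    # Single entry point: defer input preparation and execution to the
    # first argument that implements the hooks.
    def __init__(self, executable):
        self.executable = executable

    def __call__(self, *args, **kwargs):
        target = next(a for a in args if hasattr(a, "__handle_gfunc__"))
        args, kwargs = target.__prepare_input__(self, *args, **kwargs)
        return target.__handle_gfunc__(self, *args, **kwargs)

# the core implementation, analogous to _add above:
def _add(a, b):
    return a.value + b.value

add = GenericFunc(_add)

class Base:
    def __init__(self, value):
        self.value = value

    def __prepare_input__(self, gfunc, *args, **kwargs):
        return args, kwargs  # pass through unchanged

    def __handle_gfunc__(self, gfunc, *args, **kwargs):
        return gfunc.executable(*args, **kwargs)

class Child(Base):
    def __handle_gfunc__(self, gfunc, *args, **kwargs):
        # defer to the superclass for the core call, then post-process,
        # like MyNextArray's "one more crack at the result" above
        res = Base.__handle_gfunc__(self, gfunc, *args, **kwargs)
        return res * 10

print(add(Base(1), Base(2)))    # 3
print(add(Child(1), Child(2)))  # 30
```

The point of the toy is the same as in the email: add is one public callable for every class, and each subclass customizes only its own two hooks.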
Re: [Numpy-discussion] ufunc improvements [Was: Warnings in numpy.ma.test()]
I'd like to use this thread to discuss possible improvements to generalize numpys functions. Sorry for double posting, but we will have a hard time keeping track of discussion about how to improve functions to deal with subclasses if they are spread across threads talking about warnings in masked arrays or masked arrays not dealing well with trapz. There is an additional bit at the end that was not discussed elsewhere. On Thu, Mar 18, 2010 at 8:14 AM, Darren Dale dsdal...@gmail.com wrote: On Wed, Mar 17, 2010 at 10:16 PM, Charles R Harris charlesr.har...@gmail.com wrote: Just *one* function to rule them all and on the subtype dump it. No __array_wrap__, __input_prepare__, or __array_prepare__, just something like __handle_ufunc__. So it is similar but perhaps more radical. I'm proposing having the ufunc upper layer do nothing but decide which argument type will do all the rest of the work, casting, calling the low level ufunc base, providing buffers, wrapping, etc. Instead of pasting bits and pieces into the existing framework I would like to lay out a line of attack that ends up separating ufuncs into smaller pieces that provide low level routines that work on strided memory while leaving policy implementation to the subtype. There would need to be some default type (ndarray) when the functions are called on nested lists and scalars and I'm not sure of the best way to handle that. I'm just sort of thinking out loud, don't take it too seriously. Thanks for the clarification. I think I see how this could work: if ufuncs were callable instances of classes, __call__ would find the input with highest priority and pass itself and the input to that object's __handle_ufunc__. Now it is up to __handle_ufunc__ to determine whether and how to modify the input, call some method on the ufunc (like execute) to perform the buffer operation, then __handle_ufunc__ performs the cast, deals with metadata and returns the result. I skipped a step: initializing the output buffer. 
Would that be rolled into the ufunc execution, or should it be possible for __handle_ufunc__ to access the initialized buffer before execution occurs (__array_prepare__)? I think it is important to be able to perform the cast and calculate metadata before ufunc execution. If an error occurs, an exception can be raised before the ufunc operates on the arrays, which can modify the data in place. We discussed the possibility of simplifying the wrapping scheme with a method like __handle_gfunc__. (I don't think this necessarily has to be limited to ufuncs.) I think a second method like __prepare_input__ is also necessary. Imagine something like:

    class GenericFunction:

        @property
        def executable(self):
            return self._executable

        def __init__(self, executable):
            self._executable = executable

        def __call__(self, *args, **kwargs):
            # find the input with highest priority, and then:
            args, kwargs = input.__prepare_input__(self, *args, **kwargs)
            return input.__handle_gfunc__(self, *args, **kwargs)

    # this is the core function to be passed to the generic class:
    def _add(a, b, out=None):
        # the generic, ndarray implementation.
        ...

    # here is the publicly exposed interface:
    add = GenericFunction(_add)

    # now my subclasses
    class MyArray(ndarray):
        # My class tweaks the execution of the function in __handle_gfunc__
        def __prepare_input__(self, gfunc, *args, **kwargs):
            return mod_input[gfunc](*args, **kwargs)

        def __handle_gfunc__(self, gfunc, *args, **kwargs):
            res = gfunc.executable(*args, **kwargs)  # you could have called a different core func there
            return mod_output[gfunc](res, *args, **kwargs)

    class MyNextArray(MyArray):
        def __prepare_input__(self, gfunc, *args, **kwargs):
            # let the superclass do its thing:
            args, kwargs = MyArray.__prepare_input__(self, gfunc, *args, **kwargs)
            # now I can tweak it further:
            return mod_input_further[gfunc](*args, **kwargs)

        def __handle_gfunc__(self, gfunc, *args, **kwargs):
            # let's defer to the superclass to handle calling the core function:
            res = MyArray.__handle_gfunc__(self, gfunc, *args, **kwargs)
            # and now we have one more crack at the result before passing it back:
            return mod_output_further[gfunc](res, *args, **kwargs)

If a gfunc is not recognized, the subclass might raise a NotImplementedError or it might just pass the original args, kwargs on through. I didn't write that part out because the example was already running long. But the point is that a single entry point could be used for any subclass, without having to worry about how to support every subclass. It may still be necessary to be mindful to use asanyarray in the core functions, but if a subclass alters the behavior of some operation such that an operation needs to happen on an ndarray view of the data, __prepare_input__ provides an opportunity to prepare such views. For example, in our
[Numpy-discussion] should ndarray implement __round__ for py3k?
A simple test in python 3:

    >>> import numpy as np
    >>> round(np.arange(10))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: type numpy.ndarray doesn't define __round__ method

Here is some additional context: http://bugs.python.org/issue7261 Darren
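Until ndarray itself grows the method, a subclass can supply the hook that Python 3's round() dispatches to. A minimal sketch (Rounded is a hypothetical class for illustration, not a proposal for the actual numpy API):

```python
import numpy as np

class Rounded(np.ndarray):
    # Python 3 dispatches round(x) to type(x).__round__; delegating to
    # np.round makes the builtin work element-wise for this subclass.
    def __round__(self, ndigits=0):
        return np.round(self, ndigits)

a = np.array([0.4, 1.6, 2.7]).view(Rounded)
r = round(a)  # no TypeError: __round__ is defined
print(r)
```

The same delegation, implemented on ndarray itself, is essentially what the linked Python issue asks numpy to consider.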
Re: [Numpy-discussion] Warnings in numpy.ma.test()
On Wed, Mar 17, 2010 at 10:16 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Mar 17, 2010 at 7:39 PM, Darren Dale dsdal...@gmail.com wrote: On Wed, Mar 17, 2010 at 8:22 PM, Charles R Harris What bothers me here is the opposing desire to separate ufuncs from their ndarray dependency, having them operate on buffer objects instead. As I see it ufuncs would be split into layers, with a lower layer operating on buffer objects, and an upper layer tying them together with ndarrays where the business logic -- kinds, casting, etc -- resides. It is in that upper layer that what you are proposing would reside. Mind, I'm not sure that having matrices and masked arrays subclassing ndarray was the way to go, but given that they do one possible solution is to dump the whole mess onto the subtype with the highest priority. That subtype would then be responsible for casts and all the other stuff needed for the call and wrapping the result. There could be library routines to help with that. It seems to me that that would be the most general way to go. In that sense ndarrays themselves would just be another subtype with especially low priority. I'm sorry, I didn't understand your point. What you described sounds identical to how things are currently done. What distinction are you making, aside from operating on the buffer object? How would adding a method to modify the input to a ufunc complicate the situation? Just *one* function to rule them all and on the subtype dump it. No __array_wrap__, __input_prepare__, or __array_prepare__, just something like __handle_ufunc__. So it is similar but perhaps more radical. I'm proposing having the ufunc upper layer do nothing but decide which argument type will do all the rest of the work, casting, calling the low level ufunc base, providing buffers, wrapping, etc. 
Instead of pasting bits and pieces into the existing framework I would like to lay out a line of attack that ends up separating ufuncs into smaller pieces that provide low level routines that work on strided memory while leaving policy implementation to the subtype. There would need to be some default type (ndarray) when the functions are called on nested lists and scalars and I'm not sure of the best way to handle that. I'm just sort of thinking out loud, don't take it too seriously. This is a seemingly simplified approach. I was taken with it last night but then I remembered that it will make subclassing difficult. A simple example can illustrate the problem. We have MaskedArray, which needs to customize some functions that operate on arrays or buffers, so we pass the function and the arguments to __handle_ufunc__ and it takes care of the whole shebang. But now I develop a MaskedQuantity that takes masked arrays and gives them the ability to handle units, and so it needs to customize those functions further. Maybe MaskedQuantity can modify the input passed to its __handle_ufunc__ and then pass everything on to super().__handle_ufunc__, such that MaskedQuantity does not have to reimplement MaskedArray's customizations to that particular function, but that is not enough flexibility for the general case. If my subclass needs to call the low-level ufunc base, it can't rely on the superclass.__handle_ufunc__ because it *also* calls the ufunc base, so my subclass has to reimplement all of the superclass function customizations. The current scheme (__input_prepare__, ...) is better able to handle subclassing, although I agree that it could be improved. If the subclasses were responsible for calling the ufunc base, alternative bases could be provided (like the c routines for masked arrays). 
That still seems to require the high-level function to provide three or four entry points: 1) modify the input, 2) initialize the output (chance to deal with metadata), 3) call the function base, 4) finalize the output (deal with metadata that requires the ufunc results). Perhaps 2 and 4 would not both be needed, I'm not sure. Darren
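For comparison, entry point 4 (finalize the output) is what the existing __array_wrap__ hook already provides. A miniature of how a subclass uses it to carry metadata through a ufunc call (Tagged is an invented example class, not part of numpy):

```python
import numpy as np

class Tagged(np.ndarray):
    # Carries a single metadata attribute, "tag", through operations.
    def __new__(cls, data, tag=None):
        obj = np.asarray(data).view(cls)
        obj.tag = tag
        return obj

    def __array_finalize__(self, obj):
        # runs for views and new instances; propagate the metadata
        self.tag = getattr(obj, "tag", None)

    def __array_wrap__(self, out, *args, **kwargs):
        # the "finalize the output" step: reattach metadata to the
        # ufunc result before it is returned to the caller
        out = out.view(Tagged)
        out.tag = self.tag
        return out

t = Tagged([0.0, 1.0], tag="angle")
r = np.sin(t)
print(type(r).__name__, r.tag)
```

What the thread is wrestling with is that this scheme offers no hook at entry points 1 and 3, which is exactly where __input_prepare__ and a subclass-controlled call of the ufunc base would slot in.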
[Numpy-discussion] ufunc improvements [Was: Warnings in numpy.ma.test()]
On Wed, Mar 17, 2010 at 10:16 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Mar 17, 2010 at 7:39 PM, Darren Dale dsdal...@gmail.com wrote: On Wed, Mar 17, 2010 at 8:22 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Mar 17, 2010 at 5:26 PM, Darren Dale dsdal...@gmail.com wrote: On Wed, Mar 17, 2010 at 5:43 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Mar 17, 2010 at 3:13 PM, Darren Dale dsdal...@gmail.com wrote: On Wed, Mar 17, 2010 at 4:48 PM, Pierre GM pgmdevl...@gmail.com wrote: On Mar 17, 2010, at 8:19 AM, Darren Dale wrote: I started thinking about a third method called __input_prepare__ that would be called on the way into the ufunc, which would allow you to intercept the input and pass a somehow modified copy back to the ufunc. The total flow would be: 1) Call myufunc(x, y[, z]) 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns x', y' (or simply passes through x,y by default) 3) myufunc creates the output array z (if not specified) and calls ?.__array_prepare__(z, (myufunc, x, y, ...)) 4) myufunc finally gets around to performing the calculation 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and returns the result to the caller Is this general enough for your use case? I haven't tried to think about how to change some global state at one point and change it back at another, that seems like a bad idea and difficult to support. Sounds like a good plan. If we could find a way to merge the first two (__input_prepare__ and __array_prepare__), that'd be ideal. I think it is better to keep them separate, so we don't have one method that is trying to do too much. It would be easier to explain in the documentation. I may not have much time to look into this until after Monday. Is there a deadline we need to consider? I don't think this should go into 2.0, I think it needs more thought. Now that you mention it, I agree that it would be too rushed to try to get it in for 2.0. 
Concerning a later release, is there anything in particular that you think needs to be clarified or reconsidered? And 2.0 already has significant code churn. Is there any reason beyond a big hassle not to set/restore the error state around all the ufunc calls in ma? Beyond that, the PEP that you pointed to looks interesting. Maybe some sort of decorator around ufunc calls could also be made to work. I think the PEP is interesting, but it is languishing. There were some questions and criticisms on the mailing list that I do not think were satisfactorily addressed, and as far as I know the author of the PEP has not pursued the matter further. There was some interest on the python-dev mailing list in the numpy community's use case, but I think we need to consider what can be done now to meet the needs of ndarray subclasses. I don't see PEP 3124 happening in the near future. What I am proposing is a simple extension to our existing framework to let subclasses hook into ufuncs and customize their behavior based on the context of the operation (using the __array_priority__ of the inputs and/or outputs, and the identity of the ufunc). The steps I listed allow customization at the critical steps: prepare the input, prepare the output, populate the output (currently no proposal for customization here), and finalize the output. The only additional step proposed is to prepare the input. What bothers me here is the opposing desire to separate ufuncs from their ndarray dependency, having them operate on buffer objects instead. As I see it ufuncs would be split into layers, with a lower layer operating on buffer objects, and an upper layer tying them together with ndarrays where the business logic -- kinds, casting, etc -- resides. It is in that upper layer that what you are proposing would reside. 
Mind, I'm not sure that having matrices and masked arrays subclassing ndarray was the way to go, but given that they do one possible solution is to dump the whole mess onto the subtype with the highest priority. That subtype would then be responsible for casts and all the other stuff needed for the call and wrapping the result. There could be library routines to help with that. It seems to me that that would be the most general way to go. In that sense ndarrays themselves would just be another subtype with especially low priority. I'm sorry, I didn't understand your point. What you described sounds identical to how things are currently done. What distinction are you making, aside from operating on the buffer object? How would adding a method to modify the input to a ufunc complicate the situation? Just *one* function to rule them all and on the subtype dump
Re: [Numpy-discussion] Warnings in numpy.ma.test()
On Thu, Mar 18, 2010 at 5:12 PM, Eric Firing efir...@hawaii.edu wrote: Ryan May wrote: On Thu, Mar 18, 2010 at 2:46 PM, Christopher Barker chris.bar...@noaa.gov wrote: Gael Varoquaux wrote: On Thu, Mar 18, 2010 at 12:12:10PM -0700, Christopher Barker wrote: sure -- that's kind of my point -- if EVERY numpy array were (potentially) masked, then folks would write code to deal with them appropriately. That's pretty much saying: I have a complicated problem and I want every one else to have to deal with the full complexity of it, even if they have a simple problem. Well -- I did say it was a fantasy... But I disagree -- having invalid data is a very common case. What we have now is a situation where we have two parallel systems, masked arrays and regular arrays. Each time someone does something new with masked arrays, they often find another missing feature, and have to solve that. Also, the fact that masked arrays are tacked on means that performance suffers. Case in point, I just found a bug in np.gradient where it forces the output to be an ndarray. (http://projects.scipy.org/numpy/ticket/1435). Easy fix that doesn't actually require any special casing for masked arrays, just making sure to use the proper function to create a new array of the same subclass as the input. However, now for any place that I can't patch I have to use a custom function until a fixed numpy is released. Maybe universal support for masked arrays (and masking invalid points) is a pipe dream, but every function in numpy should IMO deal properly with subclasses of ndarray. 1) This can't be done in general because subclasses can change things to the point where there is little one can count on. The matrix subclass, for example, redefines multiplication and iteration, making it difficult to write functions that will work for ndarrays or matrices. 
I'm more optimistic that it can be done in general, if we provide a mechanism where the subclass with highest priority can customize the execution of the function (ufunc or not). In principle, the subclass could even override the buffer operation, like in the case of matrices. It still can put a lot of responsibility on the authors of the subclass, but what is gained is a framework where np.add (for example) could yield the appropriate result for any subclass, as opposed to the current situation of needing to know which add function can be used for a particular type of input. All speculative, of course. I'll start throwing some examples together when I get a chance. Darren ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
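The np.gradient fix mentioned above (ticket 1435) comes down to allocating the output with a subclass-preserving constructor instead of forcing a plain ndarray; a minimal sketch (gradient_alloc is a hypothetical name, and the gradient math itself is elided):

```python
import numpy as np

def gradient_alloc(f):
    # The essence of the fix: allocate the output with empty_like, which
    # preserves the input's subclass (subok=True by default), instead of
    # np.zeros, which always produces a base ndarray.
    out = np.empty_like(f, dtype=float)
    # ... the gradient computation would fill `out` here ...
    return out
```

With this allocation a MaskedArray input yields a MaskedArray output, with no masked-array special casing in the function body.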
Re: [Numpy-discussion] Warnings in numpy.ma.test()
On Wed, Mar 17, 2010 at 2:07 AM, Pierre GM pgmdevl...@gmail.com wrote: All, As you're probably aware, the current test suite for numpy.ma raises some nagging warnings such as "invalid value in ...". These warnings are only issued when a standard numpy ufunc (eg., np.sqrt) is called on a MaskedArray, instead of its numpy.ma (eg., np.ma.sqrt) equivalent. The reason is that the masked versions of the ufuncs temporarily set the numpy error status to 'ignore' before the operation takes place, and reset the status to its original value. I thought I could use the new __array_prepare__ method to intercept the call of a standard ufunc. After actual testing, that can't work. __array_prepare__ only helps to prepare the *output* of the operation, not to change the input on the fly, just for this operation. Actually, you can modify the input in place, but it's usually not what you want. That is correct, __array_prepare__ is called just after the output array is created, but before the ufunc actually gets down to business. I have the same limitation in quantities you are now seeing with masked array, in my case I want the opportunity to rescale different but compatible quantities for the operation (without changing the original arrays in place, of course). Then, I tried to use __array_prepare__ to store the current error status in the input, force it to ignore divide/invalid errors and send the input to the ufunc. Doesn't work either: np.seterr in __array_prepare__ does change the error status, but as far as I understand, the ufunc is still called with the original error status. That means that if something goes wrong, your error status can stay stuck. Not a good idea either. I'm running out of ideas at this point. For the test suite, I'd suggest disabling the warnings in test_fix_invalid and test_basic_arithmetic. 
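For illustration, the set/restore dance that the numpy.ma ufunc wrappers perform looks roughly like this (a sketch, not the actual numpy.ma code; masked_sqrt is a hypothetical name):

```python
import numpy as np

def masked_sqrt(a):
    # Sketch of what a masked ufunc wrapper does around the raw call:
    # silence divide/invalid warnings for the duration of the operation,
    # then restore the caller's error state in a finally block so it
    # cannot stay stuck if something goes wrong.
    saved = np.seterr(divide='ignore', invalid='ignore')
    try:
        data = np.sqrt(np.asarray(a))
    finally:
        np.seterr(**saved)
    # mask out the entries where the result is undefined
    return np.ma.masked_invalid(data)
```

The try/finally is the important part: it is what the __array_prepare__ approach described above cannot provide, because there is no matching hook on the way out of the ufunc in which to restore the state.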
An additional issue is that if one of the error statuses is set to 'raise', the numpy ufunc will raise the exception (as expected), while its numpy.ma version will not. I'll also put a warning in the docs to that effect. Please send me your comments before I commit any changes. I started thinking about a third method called __input_prepare__ that would be called on the way into the ufunc, which would allow you to intercept the input and pass a somehow modified copy back to the ufunc. The total flow would be:
1) Call myufunc(x, y[, z])
2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns x', y' (or simply passes through x, y by default)
3) myufunc creates the output array z (if not specified) and calls ?.__array_prepare__(z, (myufunc, x, y, ...))
4) myufunc finally gets around to performing the calculation
5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and returns the result to the caller
Is this general enough for your use case? I haven't tried to think about how to change some global state at one point and change it back at another, that seems like a bad idea and difficult to support. Darren
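The five steps above can be sketched in pure Python (the wiring is illustrative only; call_with_hooks is a hypothetical name, and the real logic would live inside the ufunc machinery itself):

```python
import numpy as np

def call_with_hooks(ufunc, x, y):
    # 1) caller invokes the ufunc with two inputs
    inputs = (x, y)
    # The input with the highest __array_priority__ controls the hooks.
    leader = max(inputs, key=lambda a: getattr(a, '__array_priority__', 0.0))
    # 2) proposed __input_prepare__ hook: may return modified inputs
    if hasattr(leader, '__input_prepare__'):
        inputs = leader.__input_prepare__(ufunc, *inputs)
    # 3) create the output array and let the subclass prepare it
    out = np.empty(np.broadcast(*inputs).shape)
    if hasattr(leader, '__array_prepare__'):
        out = leader.__array_prepare__(out, (ufunc, inputs, 0))
    # 4) perform the calculation on plain ndarrays
    out[...] = ufunc(*(np.asarray(a) for a in inputs))
    # 5) let the subclass wrap the result before returning it
    if hasattr(leader, '__array_wrap__'):
        out = leader.__array_wrap__(out, (ufunc, inputs, 0))
    return out
```

A subclass wanting to restore error state would still need cooperating hooks at steps 2 and 5, which is exactly the global-state difficulty noted above.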
Re: [Numpy-discussion] Warnings in numpy.ma.test()
On Wed, Mar 17, 2010 at 10:11 AM, Ryan May rma...@gmail.com wrote: On Wed, Mar 17, 2010 at 7:19 AM, Darren Dale dsdal...@gmail.com wrote: Is this general enough for your use case? I haven't tried to think about how to change some global state at one point and change it back at another, that seems like a bad idea and difficult to support. Sounds like the textbook use case for the python 2.5/2.6 context manager. Pity we can't use it yet... (and I'm not sure it'd be easy to wrap around the calls here.) I don't think context managers would work. They would be implemented in one of the subclass's special methods and would thus go out of scope before the ufunc got around to performing the calculation that required the change in state. Darren
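For reference, numpy's own np.errstate is exactly such a context manager; the difficulty described above is that a subclass's special method returns before the ufunc's inner loop runs, so any with-block opened inside it has already exited by then:

```python
import numpy as np

# np.errstate installs an error state on entry and restores the
# previous one on exit.
with np.errstate(invalid='ignore'):
    r = np.sqrt(np.array([-1.0]))  # no 'invalid value' warning here
# Outside the block the caller's error state is back in force. A hook
# like __array_prepare__ cannot hold such a block open across the
# ufunc's actual computation, which is Darren's point above.
```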
Re: [Numpy-discussion] Warnings in numpy.ma.test()
On Wed, Mar 17, 2010 at 10:45 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Mar 17, 2010 at 6:19 AM, Darren Dale dsdal...@gmail.com wrote: On Wed, Mar 17, 2010 at 2:07 AM, Pierre GM pgmdevl...@gmail.com wrote: All, As you're probably aware, the current test suite for numpy.ma raises some nagging warnings such as "invalid value in ...". These warnings are only issued when a standard numpy ufunc (eg., np.sqrt) is called on a MaskedArray, instead of its numpy.ma (eg., np.ma.sqrt) equivalent. The reason is that the masked versions of the ufuncs temporarily set the numpy error status to 'ignore' before the operation takes place, and reset the status to its original value. I thought I could use the new __array_prepare__ method to intercept the call of a standard ufunc. After actual testing, that can't work. __array_prepare__ only helps to prepare the *output* of the operation, not to change the input on the fly, just for this operation. Actually, you can modify the input in place, but it's usually not what you want. That is correct, __array_prepare__ is called just after the output array is created, but before the ufunc actually gets down to business. I have the same limitation in quantities you are now seeing with masked array, in my case I want the opportunity to rescale different but compatible quantities for the operation (without changing the original arrays in place, of course). Then, I tried to use __array_prepare__ to store the current error status in the input, force it to ignore divide/invalid errors and send the input to the ufunc. Doesn't work either: np.seterr in __array_prepare__ does change the error status, but as far as I understand, the ufunc is still called with the original error status. That means that if something goes wrong, your error status can stay stuck. Not a good idea either. I'm running out of ideas at this point. For the test suite, I'd suggest disabling the warnings in test_fix_invalid and test_basic_arithmetic. 
An additional issue is that if one of the error statuses is set to 'raise', the numpy ufunc will raise the exception (as expected), while its numpy.ma version will not. I'll also put a warning in the docs to that effect. Please send me your comments before I commit any changes. I started thinking about a third method called __input_prepare__ that would be called on the way into the ufunc, which would allow you to intercept the input and pass a somehow modified copy back to the ufunc. The total flow would be: 1) Call myufunc(x, y[, z]) 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns x', y' (or simply passes through x,y by default) 3) myufunc creates the output array z (if not specified) and calls ?.__array_prepare__(z, (myufunc, x, y, ...)) 4) myufunc finally gets around to performing the calculation 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and returns the result to the caller Is this general enough for your use case? I haven't tried to think about how to change some global state at one point and change it back at another, that seems like a bad idea and difficult to support. I'm not a masked array user and not familiar with the specific problems here, but as an outsider it's beginning to look like one little fix after another. Yeah, I was concerned that criticism would come up. Is there some larger framework that would help here? I think there is: http://www.python.org/dev/peps/pep-3124/ Changes to the ufuncs themselves? Perhaps, if ufuncs were instances of a class that implemented __call__, it would be easier to include context management. Maybe this approach could be coupled with input_prepare, array_prepare and array_wrap to provide everything we need. There was some code for masked ufuncs at the C level posted a while back that I thought was interesting, would it help to have masked versions of the ufuncs? I think we need a solution that avoids implementing an entirely new set of ufuncs for specific subclasses. 
So on and so forth. It just looks like a larger design issue needs to be addressed here. I'm interested to hear other people's perspectives or suggestions. Darren
Re: [Numpy-discussion] Warnings in numpy.ma.test()
On Wed, Mar 17, 2010 at 4:48 PM, Pierre GM pgmdevl...@gmail.com wrote: On Mar 17, 2010, at 8:19 AM, Darren Dale wrote: I started thinking about a third method called __input_prepare__ that would be called on the way into the ufunc, which would allow you to intercept the input and pass a somehow modified copy back to the ufunc. The total flow would be: 1) Call myufunc(x, y[, z]) 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns x', y' (or simply passes through x,y by default) 3) myufunc creates the output array z (if not specified) and calls ?.__array_prepare__(z, (myufunc, x, y, ...)) 4) myufunc finally gets around to performing the calculation 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and returns the result to the caller Is this general enough for your use case? I haven't tried to think about how to change some global state at one point and change it back at another, that seems like a bad idea and difficult to support. Sounds like a good plan. If we could find a way to merge the first two (__input_prepare__ and __array_prepare__), that'd be ideal. I think it is better to keep them separate, so we don't have one method that is trying to do too much. It would be easier to explain in the documentation. I may not have much time to look into this until after Monday. Is there a deadline we need to consider? Darren
Re: [Numpy-discussion] Warnings in numpy.ma.test()
On Wed, Mar 17, 2010 at 5:43 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Mar 17, 2010 at 3:13 PM, Darren Dale dsdal...@gmail.com wrote: On Wed, Mar 17, 2010 at 4:48 PM, Pierre GM pgmdevl...@gmail.com wrote: On Mar 17, 2010, at 8:19 AM, Darren Dale wrote: I started thinking about a third method called __input_prepare__ that would be called on the way into the ufunc, which would allow you to intercept the input and pass a somehow modified copy back to the ufunc. The total flow would be: 1) Call myufunc(x, y[, z]) 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns x', y' (or simply passes through x,y by default) 3) myufunc creates the output array z (if not specified) and calls ?.__array_prepare__(z, (myufunc, x, y, ...)) 4) myufunc finally gets around to performing the calculation 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and returns the result to the caller Is this general enough for your use case? I haven't tried to think about how to change some global state at one point and change it back at another, that seems like a bad idea and difficult to support. Sounds like a good plan. If we could find a way to merge the first two (__input_prepare__ and __array_prepare__), that'd be ideal. I think it is better to keep them separate, so we don't have one method that is trying to do too much. It would be easier to explain in the documentation. I may not have much time to look into this until after Monday. Is there a deadline we need to consider? I don't think this should go into 2.0, I think it needs more thought. Now that you mention it, I agree that it would be too rushed to try to get it in for 2.0. Concerning a later release, is there anything in particular that you think needs to be clarified or reconsidered? And 2.0 already has significant code churn. Is there any reason beyond a big hassle not to set/restore the error state around all the ufunc calls in ma? Beyond that, the PEP that you pointed to looks interesting. 
Maybe some sort of decorator around ufunc calls could also be made to work. I think the PEP is interesting, but it is languishing. There were some questions and criticisms on the mailing list that I do not think were satisfactorily addressed, and as far as I know the author of the PEP has not pursued the matter further. There was some interest on the python-dev mailing list in the numpy community's use case, but I think we need to consider what can be done now to meet the needs of ndarray subclasses. I don't see PEP 3124 happening in the near future. What I am proposing is a simple extension to our existing framework to let subclasses hook into ufuncs and customize their behavior based on the context of the operation (using the __array_priority__ of the inputs and/or outputs, and the identity of the ufunc). The steps I listed allow customization at the critical steps: prepare the input, prepare the output, populate the output (currently no proposal for customization here), and finalize the output. The only additional step proposed is to prepare the input. In the long run, we could consider if ufuncs should be instances of a class, perhaps implemented in Cython. This way the ufunc will be able to pass itself to the special array methods as part of the context tuple, as is currently done. Maybe an alternative approach would be for ufuncs to provide methods where subclasses could register routines for the various steps I specified based on the types of the inputs, similar to the PEP. This way, the ufunc would determine the context based on the input (rather than the current way of the ufunc determining part of the context based on the input by inspecting __array_priority__ and then the input with highest priority determining the context based on the identity of the ufunc and the rest of the input.) 
This new (half-baked) approach could be backward-compatible with the old one: if the combination of inputs isn't found in the registry, it would fall back on the existing __input_prepare__/__array_prepare__/__array_wrap__ mechanisms (which in principle could then be deprecated, and at that point __array_priority__ might no longer be necessary). I don't see anything to indicate that we would regret implementing a special __input_prepare__ method down the road. Darren
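A rough sketch of the registry idea, under stated assumptions (register, dispatch, and the class Q are hypothetical names; the fallback below stands in for the existing mechanism):

```python
import numpy as np

# Hypothetical registry mapping (ufunc, input types) to an override.
_registry = {}

def register(ufunc, types, func):
    _registry[(ufunc, tuple(types))] = func

def dispatch(ufunc, *args):
    key = (ufunc, tuple(type(a) for a in args))
    override = _registry.get(key)
    if override is not None:
        return override(*args)
    # Combination of inputs not found in the registry: fall back on the
    # existing mechanism (here simply the plain ufunc).
    return ufunc(*args)

class Q(np.ndarray):
    """Stand-in for a quantities-like ndarray subclass."""

# The subclass registers its own handler for np.add on (Q, Q) inputs.
register(np.add, (Q, Q),
         lambda x, y: np.add(x.view(np.ndarray),
                             y.view(np.ndarray)).view(Q))
```

Viewing the inputs as plain ndarrays inside the handler is what breaks the dispatch loop, mirroring the subok=False trick discussed earlier in the thread.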
Re: [Numpy-discussion] Warnings in numpy.ma.test()
On Wed, Mar 17, 2010 at 8:22 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Mar 17, 2010 at 5:26 PM, Darren Dale dsdal...@gmail.com wrote: On Wed, Mar 17, 2010 at 5:43 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Mar 17, 2010 at 3:13 PM, Darren Dale dsdal...@gmail.com wrote: On Wed, Mar 17, 2010 at 4:48 PM, Pierre GM pgmdevl...@gmail.com wrote: On Mar 17, 2010, at 8:19 AM, Darren Dale wrote: I started thinking about a third method called __input_prepare__ that would be called on the way into the ufunc, which would allow you to intercept the input and pass a somehow modified copy back to the ufunc. The total flow would be: 1) Call myufunc(x, y[, z]) 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns x', y' (or simply passes through x,y by default) 3) myufunc creates the output array z (if not specified) and calls ?.__array_prepare__(z, (myufunc, x, y, ...)) 4) myufunc finally gets around to performing the calculation 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and returns the result to the caller Is this general enough for your use case? I haven't tried to think about how to change some global state at one point and change it back at another, that seems like a bad idea and difficult to support. Sounds like a good plan. If we could find a way to merge the first two (__input_prepare__ and __array_prepare__), that'd be ideal. I think it is better to keep them separate, so we don't have one method that is trying to do too much. It would be easier to explain in the documentation. I may not have much time to look into this until after Monday. Is there a deadline we need to consider? I don't think this should go into 2.0, I think it needs more thought. Now that you mention it, I agree that it would be too rushed to try to get it in for 2.0. Concerning a later release, is there anything in particular that you think needs to be clarified or reconsidered? And 2.0 already has significant code churn. 
Is there any reason beyond a big hassle not to set/restore the error state around all the ufunc calls in ma? Beyond that, the PEP that you pointed to looks interesting. Maybe some sort of decorator around ufunc calls could also be made to work. I think the PEP is interesting, but it is languishing. There were some questions and criticisms on the mailing list that I do not think were satisfactorily addressed, and as far as I know the author of the PEP has not pursued the matter further. There was some interest on the python-dev mailing list in the numpy community's use case, but I think we need to consider what can be done now to meet the needs of ndarray subclasses. I don't see PEP 3124 happening in the near future. What I am proposing is a simple extension to our existing framework to let subclasses hook into ufuncs and customize their behavior based on the context of the operation (using the __array_priority__ of the inputs and/or outputs, and the identity of the ufunc). The steps I listed allow customization at the critical steps: prepare the input, prepare the output, populate the output (currently no proposal for customization here), and finalize the output. The only additional step proposed is to prepare the input. What bothers me here is the opposing desire to separate ufuncs from their ndarray dependency, having them operate on buffer objects instead. As I see it ufuncs would be split into layers, with a lower layer operating on buffer objects, and an upper layer tying them together with ndarrays where the business logic -- kinds, casting, etc -- resides. It is in that upper layer that what you are proposing would reside. Mind, I'm not sure that having matrices and masked arrays subclassing ndarray was the way to go, but given that they do one possible solution is to dump the whole mess onto the subtype with the highest priority. That subtype would then be responsible for casts and all the other stuff needed for the call and wrapping the result. 
There could be library routines to help with that. It seems to me that that would be the most general way to go. In that sense ndarrays themselves would just be another subtype with especially low priority. I'm sorry, I didn't understand your point. What you described sounds identical to how things are currently done. What distinction are you making, aside from operating on the buffer object? How would adding a method to modify the input to a ufunc complicate the situation? Darren
[Numpy-discussion] subclassing ndarray in python3
Now that the trunk has some support for python3, I am working on making Quantities work with python3 as well. I'm running into some problems related to subclassing ndarray that can be illustrated with a simple script, reproduced below. It looks like there is a problem with the reflected operations, I see problems with __rmul__ and __radd__, but not with __mul__ and __add__:

import numpy as np

class A(np.ndarray):
    def __new__(cls, *args, **kwargs):
        return np.ndarray.__new__(cls, *args, **kwargs)

class B(A):
    def __mul__(self, other):
        return self.view(A).__mul__(other)
    def __rmul__(self, other):
        return self.view(A).__rmul__(other)
    def __add__(self, other):
        return self.view(A).__add__(other)
    def __radd__(self, other):
        return self.view(A).__radd__(other)

a = A((10,))
b = B((10,))

print('A __mul__:')
print(a.__mul__(2))                   # ok
print(a.view(np.ndarray).__mul__(2))  # ok
print(a*2)                            # ok

print('A __rmul__:')
print(a.__rmul__(2))                   # yields NotImplemented
print(a.view(np.ndarray).__rmul__(2))  # yields NotImplemented
print(2*a)                             # ok !!??

print('B __mul__:')
print(b.__mul__(2))                   # ok
print(b.view(A).__mul__(2))           # ok
print(b.view(np.ndarray).__mul__(2))  # ok
print(b*2)                            # ok

print('B __add__:')
print(b.__add__(2))                   # ok
print(b.view(A).__add__(2))           # ok
print(b.view(np.ndarray).__add__(2))  # ok
print(b+2)                            # ok

print('B __rmul__:')
print(b.__rmul__(2))                   # yields NotImplemented
print(b.view(A).__rmul__(2))           # yields NotImplemented
print(b.view(np.ndarray).__rmul__(2))  # yields NotImplemented
print(2*b)  # yields: TypeError: unsupported operand type(s) for *: 'int' and 'B'

print('B __radd__:')
print(b.__radd__(2))                   # yields NotImplemented
print(b.view(A).__radd__(2))           # yields NotImplemented
print(b.view(np.ndarray).__radd__(2))  # yields NotImplemented
print(2+b)  # yields: TypeError: unsupported operand type(s) for +: 'int' and 'B'
Re: [Numpy-discussion] subclassing ndarray in python3
Hi Pauli, On Thu, Mar 11, 2010 at 3:38 PM, Pauli Virtanen p...@iki.fi wrote: Thanks for testing. I wish the test suite was more complete (hint! hint! :) I'll be happy to contribute, but lately I get a few 15-30 minute blocks a week for this kind of work (hence the short attempt to work on Quantities this morning), and it's not likely to let up for about 3 weeks. Yes, probably explicitly defining __rmul__ for ndarray could be the right solution. Please file a bug report on this. Done: http://projects.scipy.org/numpy/ticket/1426 Cheers, and *thank you* for all you have already done to support python-3, Darren
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
On Fri, Feb 12, 2010 at 12:16 AM, David Cournapeau da...@silveregg.co.jp wrote: Charles R Harris wrote: I don't see any struct definitions there, it looks clean. Any struct defined outside numpy/core/include is fine to change at will as far as ABI is concerned anyway, so no need to check anything :) Thanks for the clarification. I just double checked the svn diff (r7308), and I did not touch anything in numpy/core/include. Darren
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
2010/2/11 Stéfan van der Walt ste...@sun.ac.za: On 11 February 2010 09:52, Charles R Harris charlesr.har...@gmail.com wrote: Simple, eh. The version should be 2.0. I'm going with the element of least surprise: no one will be surprised when 1.5 is released with ABI changes I'll buy you a doughnut if that turns out to be correct. Darren
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
On Thu, Feb 11, 2010 at 11:22 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Thu, Feb 11, 2010 at 8:12 PM, David Cournapeau da...@silveregg.co.jp wrote: Charles R Harris wrote: On Thu, Feb 11, 2010 at 7:00 PM, David Cournapeau da...@silveregg.co.jp wrote: josef.p...@gmail.com wrote: scipy is relatively easy to compile, I was thinking also of h5py, pytables and pymc (b/c of pytables), none of them are importing with numpy 1.4.0 because of the cython issue. As I said, all of them will have to be regenerated with cython 0.12.1. There is no other solution, Wait, won't the structures be the same size? If they are then the cython check won't fail. Yes, but the structures are bigger (even after removing the datetime stuff, I had the cython warning when I did some tests). That's curious. It sounds like it isn't ABI compatible yet. Any idea of what was added? It would be helpful if the cython message gave a bit more information... Could it be related to __array_prepare__?
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
On Thu, Feb 11, 2010 at 11:57 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Thu, Feb 11, 2010 at 9:39 PM, Darren Dale dsdal...@gmail.com wrote: On Thu, Feb 11, 2010 at 11:22 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Thu, Feb 11, 2010 at 8:12 PM, David Cournapeau da...@silveregg.co.jp wrote: Charles R Harris wrote: On Thu, Feb 11, 2010 at 7:00 PM, David Cournapeau da...@silveregg.co.jp wrote: josef.p...@gmail.com wrote: scipy is relatively easy to compile, I was thinking also of h5py, pytables and pymc (b/c of pytables), none of them are importing with numpy 1.4.0 because of the cython issue. As I said, all of them will have to be regenerated with cython 0.12.1. There is no other solution, Wait, won't the structures be the same size? If they are then the cython check won't fail. Yes, but the structures are bigger (even after removing the datetime stuff, I had the cython warning when I did some tests). That's curious. It sounds like it isn't ABI compatible yet. Any idea of what was added? It would be helpful if the cython message gave a bit more information... Could it be related to __array_prepare__? Didn't __array_prepare__ go into 1.3? Did you add anything to a structure? No, it was included in 1.4: http://svn.scipy.org/svn/numpy/trunk/doc/release/1.4.0-notes.rst No, I don't think so. I added __array_prepare__ to array_methods[] in this file: http://svn.scipy.org/svn/numpy/trunk/numpy/core/src/multiarray/methods.c Darren
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
On Wed, Feb 10, 2010 at 3:31 PM, Travis Oliphant oliph...@enthought.com wrote: On Feb 8, 2010, at 4:08 PM, Darren Dale wrote: I definitely should have counted to 100 before sending that. It wasn't helpful and I apologize. I actually found this quite funny. I need to apologize if my previous email sounded like I was trying to silence other opinions, somehow. As Robert alluded to in a rather well-written email that touched on resolving disagreements, it can be hard to communicate that you are listening to opposing views despite the fact that your opinion has not changed. For what it's worth, I feel I have had ample opportunity to make my concerns known, and at this point will leave it to others to do right by the numpy user community. We have a SciPy steering committee that should be reviewed again this year at the SciPy conference. As Robert said, we prefer not to have to use it to decide questions. I think it has been trotted out as a place holder for a NumPy steering committee which has never really existed as far as I know. NumPy decisions in the past have been made by me and other people who are writing the code. I think we have tried pretty hard to listen to all points of view before doing anything. Just a comment: I would like to point out that there is (necessarily) some arbitrary threshold to who is being recognized as people who are actively writing the code. Over the last year, I have posted fixes for multiple bugs and extended the ufunc wrapping mechanisms (__array_prepare__) which were included in numpy-1.4.0, and have also been developing the quantities package, which is intimately tied up with numpy's development. I don't think that makes me a major contributor like you or Chuck etc., but I am heavily invested in numpy's development and an active contributor. Maybe it would be worth considering an approach where the numpy user community occasionally nominates a few people to serve on some kind of steering committee along with the developers. 
Although if there is interest in or criticism of this idea, I don't think this is the right thread to discuss it. Darren
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
On Mon, Feb 8, 2010 at 5:05 PM, Jarrod Millman mill...@berkeley.edu wrote: On Mon, Feb 8, 2010 at 1:57 PM, Charles R Harris charlesr.har...@gmail.com wrote: Should the release containing the datetime/hasobject changes be called a) 1.5.0 b) 2.0.0 My vote goes to b. You don't matter. Nor do I.
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
On Mon, Feb 8, 2010 at 5:05 PM, Darren Dale dsdal...@gmail.com wrote: On Mon, Feb 8, 2010 at 5:05 PM, Jarrod Millman mill...@berkeley.edu wrote: On Mon, Feb 8, 2010 at 1:57 PM, Charles R Harris charlesr.har...@gmail.com wrote: Should the release containing the datetime/hasobject changes be called a) 1.5.0 b) 2.0.0 My vote goes to b. You don't matter. Nor do I. I definitely should have counted to 100 before sending that. It wasn't helpful and I apologize. Darren
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
On Mon, Feb 8, 2010 at 7:25 PM, Robert Kern robert.k...@gmail.com wrote: Here's the problem that I don't think many people appreciate: logical arguments suck just as much as personal experience in answering these questions. You can make perfectly structured arguments until you are blue in the face, but without real data to premise them on, they are no better than the gut feelings. They can often be significantly worse if the strength of the logic gets confused with the strength of the premise. If I recall correctly, the convention of not breaking ABI compatibility in minor releases was established in response to the last ABI compatibility break. Am I wrong? Darren
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
On Mon, Feb 8, 2010 at 7:52 PM, Robert Kern robert.k...@gmail.com wrote: On Mon, Feb 8, 2010 at 18:43, Darren Dale dsdal...@gmail.com wrote: On Mon, Feb 8, 2010 at 7:25 PM, Robert Kern robert.k...@gmail.com wrote: Here's the problem that I don't think many people appreciate: logical arguments suck just as much as personal experience in answering these questions. You can make perfectly structured arguments until you are blue in the face, but without real data to premise them on, they are no better than the gut feelings. They can often be significantly worse if the strength of the logic gets confused with the strength of the premise. If I recall correctly, the convention of not breaking ABI compatibility in minor releases was established in response to the last ABI compatibility break. Am I wrong? I'm not sure how this relates to the material quoted of me, but no, you're not wrong. Just trying to provide historical context to support the strength of the premise. Darren
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
On Mon, Feb 8, 2010 at 10:10 PM, Robert Kern robert.k...@gmail.com wrote: On Mon, Feb 8, 2010 at 20:50, Darren Dale dsdal...@gmail.com wrote: On Mon, Feb 8, 2010 at 7:52 PM, Robert Kern robert.k...@gmail.com wrote: On Mon, Feb 8, 2010 at 18:43, Darren Dale dsdal...@gmail.com wrote: [...] If I recall correctly, the convention of not breaking ABI compatibility in minor releases was established in response to the last ABI compatibility break. Am I wrong? I'm not sure how this relates to the material quoted of me, but no, you're not wrong. Just trying to provide historical context to support the strength of the premise. The existence of the policy is not under question (anymore; I settled that with old email a while ago). The question is whether to change the policy. So I have gathered. I question whether the concerns that led to that decision in the first place are somehow less important now. Darren
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
On Mon, Feb 8, 2010 at 10:35 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Feb 8, 2010 at 8:27 PM, Darren Dale dsdal...@gmail.com wrote: On Mon, Feb 8, 2010 at 10:24 PM, Robert Kern robert.k...@gmail.com wrote: [...] The existence of the policy is not under question (anymore; I settled that with old email a while ago). The question is whether to change the policy. So I have gathered. I question whether the concerns that led to that decision in the first place are somehow less important now. And we're back to gut feeling territory again. That's unfair. I can't win based on gut, you know how skinny I am. We haven't reached the extreme of the two physicists at SLAC who stepped outside to settle a point with fisticuffs. But with any luck we will get there ;) Really? That also happened here at CHESS a long time ago, only they didn't go outside to fight over who got to use the conference room.
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
On Sat, Feb 6, 2010 at 10:16 PM, Travis Oliphant oliph...@enthought.com wrote: I will just work on trunk and assume that the next release will be ABI incompatible. At this point I would rather call the next version 1.5 than 2.0, though. When the date-time work is completed, then we could release an ABI-compatible-with-1.5 version 2.0. There may be repercussions if numpy starts deviating from its own conventions for what versions may introduce ABI incompatibilities. I attended a workshop recently where a number of scientists approached me and expressed interest in switching from IDL to python. Two of these were senior scientists leading large research groups and collaborations, both of whom had looked at python several years ago and decided they did not like the wild west nature (direct quote) of the scientific python community. I assured them that both the projects and community were maturing. At the time, I did not have to explain the situation concerning numpy-1.4.0, which, if it causes problems when they try to set up an environment to assess python, could put them off python for another 3 years, maybe even for good. It would be a lot easier to justify the disruption if one could say numpy-2.0 added support for some important features, so this disruption was unfortunate but necessary. Such disruptions are specified by major version changes, which as you can see are rare. In fact, there are no further major version changes envisioned at this time. That kind of statement might reassure a lot of people, including package maintainers etc. Regards, Darren P.S. I promise this will be my last post on the subject.
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
I'm breaking my promise, after people wrote me offlist encouraging me to keep pushing my point of view. On Sun, Feb 7, 2010 at 8:23 PM, David Cournapeau da...@silveregg.co.jp wrote: Jarrod Millman wrote: Just to be clear, I would prefer to see the ABI-breaking release be called 2.0. I don't see why we have to get the release out in three weeks, though. I think it would be better to use this opportunity to take some time to make sure we get it right. As a compromise, what about the following: - remove ABI-incompatible changes for 1.4.x - release a 1.5.0 marked as experimental, with everything that Travis wants to put in. It would be a preview for python 3k as well, so it conveys the idea that it is experimental pretty well. Why can't this be called 2.0beta, with a __version__ like 1.9.96? I don't understand the reluctance to follow numpy's own established conventions. - the 1.6.x branch would be a polished 1.5.x. This could be called 2.0.x instead of 1.6.x. The advantage is that 1.5.0 ... or 2.0beta ... can be pushed relatively early, but we would still keep 1.4.0 as the stable release, against which every other binary installer should be built (scipy, mpl). Darren
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
On Sat, Feb 6, 2010 at 8:29 AM, josef.p...@gmail.com wrote: On Sat, Feb 6, 2010 at 8:07 AM, Francesc Alted fal...@pytables.org wrote: A Saturday 06 February 2010 13:17:22 David Cournapeau escrigué: On Sat, Feb 6, 2010 at 4:07 PM, Travis Oliphant oliph...@enthought.com wrote: I think this plan is the least disruptive and satisfies the concerns of all parties in the discussion. The other plans that have been proposed do not address my concerns of keeping the date-time changes. In that regard, your proposal is very similar to what was suggested at the beginning - the difference is only whether breaking at 1.4.x or 1.5.x. I'm thinking: why should we be so conservative in raising version numbers? Why not relabel 1.4.0 to 2.0 and mark 1.4.0 as a broken release? Then, we can continue by putting everything except ABI-breaking features in 1.4.1. With this, NumPy 2.0 will remain available for people wanting to be more on-the-bleeding-edge. Something similar to what has happened with Python 3.0, which has not prevented the 2.x series from evolving. How does this sound? I think breaking with 1.5 sounds good because it starts the second part of the 1.x series. 2.0 could be for the big overhaul that David has in mind, unless it will not be necessary anymore. I don't understand why there is any debate about what to call a release that breaks ABI compatibility. Robert Kern already reminded the list of the Report from SciPy dated 2008-08-23: * The releases will be numbered major.minor.bugfix * There will be no ABI changes in minor releases * There will be no API changes in bugfix releases If numpy-2.0 suddenly shows up at sourceforge, people will either already be aware of the above convention, or if not they at least will be more likely to wonder what precipitated the jump and be more likely to read the release notes. Darren
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
On Sat, Feb 6, 2010 at 8:39 AM, David Cournapeau courn...@gmail.com wrote: On Sat, Feb 6, 2010 at 10:36 PM, Darren Dale dsdal...@gmail.com wrote: I don't understand why there is any debate about what to call a release that breaks ABI compatibility. Because it means datetime support will come late (in 2.0), and Travis wanted to get it early in. Why does something called 2.0 have to come late? Why can't whatever near-term numpy release that breaks ABI compatibility and includes datetime be called 2.0? Darren
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
On Fri, Feb 5, 2010 at 10:25 PM, Travis Oliphant oliph...@enthought.com wrote: On Feb 5, 2010, at 2:32 PM, Christopher Barker wrote: Hi folks, It sounds like a consensus has been reached to put out a 1.4.1 that is ABI compatible with 1.3.* This is not true. Consensus has not been reached. How many have registered opposition to the above proposal? I think 1.3.9 should be released and 1.4.1 should be ABI incompatible. And then another planned break in numpy ABI compatibility in the foreseeable future, for the other items that have been discussed in this thread? I am still inclined to agree with David and Chuck in this instance. Regards, Darren
Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?
On Thu, Feb 4, 2010 at 3:21 AM, Francesc Alted fal...@pytables.org wrote: A Thursday 04 February 2010 08:46:01 Charles R Harris escrigué: Perhaps one way to articulate my perspective is the following: There are currently 2 groups of NumPy users: 1) those who have re-compiled all of their code for 1.4.0 2) those who haven't I think David has a better grip on that. There really are a lot of people who depend on binaries, and those binaries in turn depend on numpy. I would even say those folks are a majority; they are those who download the Mac and Windows versions of numpy. Yes, I think this is precisely the problem: people who are used to fetching binaries and want to use the new NumPy will be forced to upgrade all the other binary packages that depend on it. And these binary packagers (including me) are being forced to regenerate their binaries as soon as possible if they don't want their users to despair. I'm not saying that regenerating binaries is not possible, but it would require a minimum of advance notice. I'd be more comfortable if ABI-breaking releases were announced at least 6 months in advance. Then, a user is not likely to change an *already* working environment until all the binary packages he depends on (scipy, matplotlib, pytables, h5py, numexpr, sympy...) have *all* been updated to deal with the new-ABI numpy, and that could really be a long time. With this (and ironically), an attempt to quickly introduce a new feature (in this case datetime, but it could have been whatever) in a release to allow wider testing and adoption will almost certainly result in a release that takes much longer to spread widely, and what is worse, generates a large frustration among users. Also, there was some discussion about wanting to make some other changes in numpy that would break ABI once, but allow new dtypes in the future without additional ABI breakage. 
Since ABI breakage is so disruptive, could we try to coordinate so a number of them can happen all at once, with plenty of warning to the community? Then this change, datetime, and hasobject can all be handled at the same time, and it could/should be released as numpy-2.0. Then when numpy for py-3.0 is ready, which will presumably require ABI breakage, it could be called numpy-3.0. Darren
[Numpy-discussion] numpy.test(): invalid value encountered in {isinf, divide, power, ...}
I haven't been following development on the trunk closely, so I apologize if this is a known issue. I didn't see anything relevant when I searched the list. I just updated my checkout of the trunk, cleaned out the old installation and build/, and reinstalled. When I run the test suite (without specifying the verbosity), I get a slew of warnings like: Warning: invalid value encountered in isinf Warning: invalid value encountered in isfinite I checked on both OS X 10.6 and gentoo linux, with similar results. The test suite reports ok at the end with 5 known failures and 4 skipped tests. Darren
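For anyone hunting these warnings down, numpy's errstate machinery can promote floating-point anomalies to exceptions so the offending operation produces a traceback. A minimal sketch (the array and the divide case here are illustrative, not taken from the failing tests):

```python
import numpy as np

# np.errstate temporarily controls how floating-point anomalies are
# reported; promoting a category to "raise" turns warnings like the
# ones seen in the test suite into exceptions that can be traced.
with np.errstate(divide="raise"):
    try:
        np.array([1.0]) / 0.0
        raised = False
    except FloatingPointError:
        raised = True

print(raised)  # True
```

Outside the context manager the default behavior (a warning, or silence, depending on the numpy version) is restored.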
Re: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
On Wed, Dec 30, 2009 at 9:26 AM, Ravi lists_r...@lavabit.com wrote: On Wednesday 30 December 2009 06:15:45 René Dudfield wrote: I agree with many things in that post. Except your conclusion on multiple versions of packages in isolation. Package isolation is like processes, and package sharing is like threads - and threads are evil! I don't think this is an appropriate analogy, and hyperbolic statements like threads are evil! are unlikely to persuade a scientific audience. You have stated this several times, but is there any evidence that this is the desire of the majority of users? In the scientific community, interactive experimentation is critical and users are typically not seasoned systems administrators. For such users, almost all packages installed after installing python itself are packages they use. In particular, all I want to do is to use apt/yum to get the packages (or ask my sysadmin, who rightfully has no interest in learning the intricacies of python package installation, to do so) and continue with my work. Packages-in-isolation is for people whose job is to run server farms, not interactive experimenters. I agree. Leave my python site-packages directory alone I say... especially don't let setuptools infect it :) There are already mechanisms in place for this. python setup.py install --user or easy_install --prefix=/usr/local for example. Darren
Re: [Numpy-discussion] [SciPy-dev] Announcing toydist, improving distribution and packaging situation
On Wed, Dec 30, 2009 at 11:16 AM, David Cournapeau courn...@gmail.com wrote: On Wed, Dec 30, 2009 at 11:26 PM, Darren Dale dsdal...@gmail.com wrote: Hi David, On Mon, Dec 28, 2009 at 9:03 AM, David Cournapeau courn...@gmail.com wrote:

Executable: grin
    module: grin
    function: grin_main

Executable: grind
    module: grin
    function: grind_main

Have you thought at all about operations that are currently performed by post-installation scripts? For example, it might be desirable for the ipython or MayaVi windows installers to create a folder in the Start menu that contains links to the executable and the documentation. This is probably a secondary issue at this point in toydist's development, but I think it is an important feature in the long run. Also, have you considered support for package extras (package variants in Ports, allowing you to specify features that pull in additional dependencies like traits[qt4])? Enthought makes good use of them in ETS, and I think they would be worth keeping. Does this example cover what you have in mind? I am not so familiar with this feature of setuptools:

Name: hello
Version: 1.0

Library:
    BuildRequires: paver, sphinx, numpy
    if os(windows)
        BuildRequires: pywin32
    Packages: hello
    Extension: hello._bar
        sources: src/hellomodule.c
    if os(linux)
        Extension: hello._linux_backend
            sources: src/linbackend.c

Note that instead of os(os_name), you can use flag(flag_name), where flags are boolean variables which can be user defined: http://github.com/cournape/toydist/blob/master/examples/simples/conditional/toysetup.info http://github.com/cournape/toydist/blob/master/examples/var_example/toysetup.info I should defer to the description of extras in the setuptools documentation. It is only a few paragraphs long: http://peak.telecommunity.com/DevCenter/setuptools#declaring-extras-optional-features-with-their-own-dependencies Darren
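For comparison, the setuptools side being discussed: an extra names an optional feature with its own dependencies, declared in setup.py via extras_require and selected with square brackets in a requirement string. A minimal sketch, with the declaration shown as plain data so it runs standalone ("mypkg" and "PyQt4" are made-up names):

```python
from pkg_resources import Requirement

# What a setup.py would pass as extras_require: installing "mypkg[qt4]"
# pulls in PyQt4 in addition to the base dependencies (names made up).
extras_require = {"qt4": ["PyQt4"]}

# A requirement string selects an extra with square brackets -- the
# "traits[qt4]" form mentioned in the email above.
req = Requirement.parse("traits[qt4]>=3.0")
print(req.project_name, list(req.extras))
```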
Re: [Numpy-discussion] Py3 merge
On Sat, Dec 5, 2009 at 10:54 PM, David Cournapeau courn...@gmail.com wrote: On Sun, Dec 6, 2009 at 9:41 AM, Pauli Virtanen p...@iki.fi wrote: Hi, I'd like to commit my Py3 Numpy branch to SVN trunk soon: http://github.com/pv/numpy-work/commits/py3k Awesome - I think we should merge this ASAP. In particular, I would like to start fixing platform-specific issues. Concerning nose, will there be any version which works on both py2 and py3 ? There is a development branch for python-3 here: svn checkout http://python-nose.googlecode.com/svn/branches/py3k Darren
Re: [Numpy-discussion] REMINDER: trunk is about to be frozen for 1.4.0
On Tue, Nov 17, 2009 at 8:55 PM, David Cournapeau courn...@gmail.com wrote: already done in r7743 :) Did you report it as a bug on trac, so that I can close it as well? Oh, thanks! No, I forgot to report it on trac, I'll try to remember that in the future.
Re: [Numpy-discussion] REMINDER: trunk is about to be frozen for 1.4.0
Please consider applying this patch before freezing, or you can't do python setup.py develop with Distribute (at least not with Enthought's Enable):

Index: numpy/distutils/command/build_ext.py
===================================================================
--- numpy/distutils/command/build_ext.py    (revision 7734)
+++ numpy/distutils/command/build_ext.py    (working copy)
@@ -61,6 +61,7 @@
         if self.distribution.have_run.get('build_clib'):
             log.warn('build_clib already run, it is too late to ' \
                      'ensure in-place build of build_clib')
+            build_clib = self.distribution.get_command_obj('build_clib')
         else:
             build_clib = self.distribution.get_command_obj('build_clib')
             build_clib.inplace = 1

On Mon, Nov 16, 2009 at 4:29 AM, David Cournapeau da...@ar.media.kyoto-u.ac.jp wrote: Hi, A quick reminder: the trunk will be closed for 1.4.0 changes within a few hours. After that time, the trunk should only contain things which will be in 1.5.0, and the 1.4.0 changes will be in the 1.4.0 branch, which should contain only bug fixes. cheers, David
[Numpy-discussion] numpy distutils and distribute
Please excuse the cross-post. I have installed distribute-0.6.8 and numpy-svn into my ~/.local/lib/python2.6/site-packages (using python setup.py install --user). I am now trying to install Enthought's Enable from a fresh svn checkout on ubuntu karmic:

$ python setup.py develop --user
[...]
building library agg24_src sources
building library kiva_src sources
building extension enthought.kiva.agg._agg sources
building extension enthought.kiva.agg._plat_support sources
building data_files sources
build_src: building npy-pkg config files
running build_clib
customize UnixCCompiler
customize UnixCCompiler using build_clib
running build_ext
build_clib already run, it is too late to ensure in-place build of build_clib
Traceback (most recent call last):
  File setup.py, line 327, in <module>
    **config
  File /home/darren/.local/lib/python2.6/site-packages/numpy/distutils/core.py, line 186, in setup
    return old_setup(**new_attr)
  File /usr/lib/python2.6/distutils/core.py, line 152, in setup
    dist.run_commands()
  File /usr/lib/python2.6/distutils/dist.py, line 975, in run_commands
    self.run_command(cmd)
  File /usr/lib/python2.6/distutils/dist.py, line 995, in run_command
    cmd_obj.run()
  File /home/darren/.local/lib/python2.6/site-packages/numpy/distutils/command/build_ext.py, line 74, in run
    self.library_dirs.append(build_clib.build_clib)
UnboundLocalError: local variable 'build_clib' referenced before assignment

I am able to run python setup.py install --user. Incidentally, python setup.py develop --user worked for TraitsGui, EnthoughtBase, TraitsBackendQt4. I have been (sort of) following the discussion on distutils-sig. Thank you Robert, David, Pauli, for all your effort.
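The UnboundLocalError above is the classic one-branch assignment bug: the name is bound only in the else branch, so the later use fails whenever the if branch ran. A minimal reproduction (function and values here are illustrative, not the actual numpy.distutils code):

```python
# The name is assigned on only one path through the function, so
# reaching the return statement via the other path raises
# UnboundLocalError -- exactly the failure mode in the traceback above.
def run(have_run):
    if have_run:
        pass  # oops: build_clib never assigned on this path
    else:
        build_clib = "build/temp"
    return build_clib

try:
    run(True)
    failed = False
except UnboundLocalError:
    failed = True

print(failed)  # True
```

The posted patch fixes it by also binding the name in the if branch.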
Re: [Numpy-discussion] numpy distutils and distribute
On Sat, Nov 14, 2009 at 10:42 AM, Gökhan Sever gokhanse...@gmail.com wrote: On Sat, Nov 14, 2009 at 9:29 AM, Darren Dale dsdal...@gmail.com wrote: [...]
UnboundLocalError: local variable 'build_clib' referenced before assignment

Darren, I had a similar installation error. Could you try the solution that was given in this thread? http://www.mail-archive.com/numpy-discussion@scipy.org/msg19798.html

Thanks! Here is the diff, could someone with knowledge of numpy's distutils have a look and consider committing it?

Index: numpy/distutils/command/build_ext.py
===================================================================
--- numpy/distutils/command/build_ext.py    (revision 7734)
+++ numpy/distutils/command/build_ext.py    (working copy)
@@ -61,6 +61,7 @@
         if self.distribution.have_run.get('build_clib'):
             log.warn('build_clib already run, it is too late to ' \
                      'ensure in-place build of build_clib')
+            build_clib = self.distribution.get_command_obj('build_clib')
         else:
             build_clib = self.distribution.get_command_obj('build_clib')
             build_clib.inplace = 1
Re: [Numpy-discussion] 1.4.0: Setting a firm release date for 1st december.
On Mon, Nov 2, 2009 at 3:29 AM, David Cournapeau courn...@gmail.com wrote: Hi, I think it is about time to release 1.4.0. Instead of proposing a release date, I am setting a firm date for 1st December, and 16th november to freeze the trunk. If someone wants a different date, you have to speak now. There are a few issues I would like to clear up: - Documentation for datetime, in particular for the public C API - Snow Leopard issues, if any Otherwise, I think there has been quite a lot of new features. If people want to add new functionalities or features, please do it soon, I wanted to get __input_prepare__ in for the 1.4 release, but I don't think I can get it in and tested by November 16. Darren
Re: [Numpy-discussion] numpy and C99
On Fri, Oct 23, 2009 at 9:29 AM, Pauli Virtanen pav...@iki.fi wrote: Fri, 23 Oct 2009 09:21:17 -0400, Darren Dale wrote: Can we use features of C99 in numpy? For example, can we use // style comments, and C99 for statements for (int i=0, ...) ? It would be much easier if we could, but so far we have strived for C89 compliance. So I guess the answer is no. Out of curiosity (I am relatively new to C), what is holding numpy back from embracing C99? Why adhere to a 20-year-old standard? Darren
Re: [Numpy-discussion] Another suggestion for making numpy's functions generic
On Tue, Oct 20, 2009 at 5:24 AM, Sebastian Walter sebastian.wal...@gmail.com wrote: I'm not very familiar with the underlying C-API of numpy, so this has to be taken with a grain of salt. The reason why I'm curious about the genericity is that it would be awesome to have: 1) ufuncs like sin, cos, exp... to work on arrays of any object (this works already) 2) funcs like dot, eig, etc, to work on arrays of objects (works for dot already, but not for eig) 3) ufuncs and funcs to work on any objects I think if you want to work on any object, you need something like the PEP I mentioned earlier. What I am proposing is to use the existing mechanism in numpy, check __array_priority__ to determine which input's __input_prepare__ to call. Examples that would be nice to work are, among others: * arrays of polynomials, i.e. arrays of objects * polynomials with tensor coefficients, objects with underlying array structure I thought that the most elegant way to implement that would be to have all numpy functions try to call either 1) the class function with the same name as the numpy function 2) or, if the class function is not implemented, the member function with the same name as the numpy function 3) if none exists, raise an exception E.g. 1) if isinstance(x, Foo), then numpy.sin(x) would call Foo.sin(x) if it doesn't know how to handle Foo How does numpy.sin know if it knows how to handle Foo? numpy.sin will happily process the data of subclasses of ndarray, but if you give it a quantity with units of degrees it is going to return garbage and not care. 2) similarly, for arrays of objects of type Foo: x = np.array([Foo(1), Foo(2)]) Then numpy.sin(x) should try to return np.array([Foo.sin(xi) for xi in x]) or, in case Foo.sin is not implemented as a class function, return np.array([xi.sin() for xi in x]) I'm not going to comment on this, except to say that it is outside the scope of my proposal. 
Therefore, I somehow expected something like that: Quantity would derive from numpy.ndarray. Calling Quantity.__new__(cls) creates the member functions __add__, __imul__, sin, exp, ..., where each function has a preprocessing part and a postprocessing part. After the preprocessing, call the original ufuncs on the base class object, e.g. __add__ It is more complicated than that. Ufuncs don't call array methods, it's the other way around. ndarray.__add__ calls numpy.add. If you have a custom operation to perform on numpy arrays, you write a ufunc, not a subclass. What you are proposing is a very significant change to numpy. Darren
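The "write a ufunc, not a subclass" route mentioned above can be sketched with np.frompyfunc, which wraps an arbitrary Python callable as a real ufunc that broadcasts element-wise over object arrays, the usual way to operate on arrays of arbitrary objects such as polynomials (the double function here is illustrative):

```python
import numpy as np

# np.frompyfunc(func, nin, nout) builds a ufunc from a Python callable.
# The resulting ufunc works element-wise on object arrays, applying the
# callable to each element regardless of its type.
def double(x):
    return x + x

udouble = np.frompyfunc(double, 1, 1)  # 1 input, 1 output

arr = np.array([1, "ab"], dtype=object)
out = udouble(arr)

print(out)  # [2 'abab']
```

Because the wrapped function is plain Python, this gives genericity at the cost of a per-element Python call, unlike the compiled loops of built-in ufuncs.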
Re: [Numpy-discussion] Another suggestion for making numpy's functions generic
Hi Travis, On Mon, Oct 19, 2009 at 6:29 PM, Travis Oliphant oliph...@enthought.com wrote: On Oct 17, 2009, at 7:49 AM, Darren Dale wrote: [...] When calling numpy functions: 1) __input_prepare__ provides an opportunity to operate on the inputs to yield versions that are compatible with the operation (they should obviously not be modified in place) 2) the output array is established 3) __array_prepare__ is used to determine the class of the output array, as well as any metadata that needs to be established before the operation proceeds 4) the ufunc performs its operations 5) __array_wrap__ provides an opportunity to update the output array based on the results of the computation Comments, criticisms? If PEP 3124^ were already a part of the standard library, that could serve as the basis for generalizing numpy's functions. But I think the PEP will not be approved in its current form, and it is unclear when and if the author will revisit the proposal. The scheme I'm imagining might be sufficient for our purposes. This seems like it could work. So, basically ufuncs will take any object as input and call its __input_prepare__ method? This should return a sub-class of an ndarray? ufuncs would call __input_prepare__ on the input declaring the highest __array_priority__, just like ufuncs do with __array_wrap__, passing a tuple of inputs and the ufunc itself (provided for context). __input_prepare__ would return a tuple of inputs that the ufunc would use for computation. I'm not sure if these need to be arrays or not; I think I can give a better answer once I start the implementation (next few days, I think). Darren
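Since __input_prepare__ was a proposal and never landed in numpy, the dispatch described above can only be mocked in pure Python. A sketch under that assumption, with all names (apply_ufunc, Meters) illustrative:

```python
import numpy as np

# Pure-Python mock of the proposed dispatch: the input declaring the
# highest __array_priority__ gets to preprocess all of the inputs before
# the ufunc runs, mirroring how __array_wrap__ is selected today.
def apply_ufunc(ufunc, *inputs):
    leader = max(inputs, key=lambda x: getattr(x, "__array_priority__", 0.0))
    prepare = getattr(leader, "__input_prepare__", None)
    if prepare is not None:
        # the ufunc itself is passed along for context
        inputs = prepare(inputs, ufunc)
    return ufunc(*[np.asarray(x) for x in inputs])

class Meters(np.ndarray):
    __array_priority__ = 10.0

    def __input_prepare__(self, inputs, ufunc):
        # a real quantities class would rescale incompatible units here
        return inputs

m = np.array([1.0, 2.0]).view(Meters)
result = apply_ufunc(np.add, m, np.array([3.0, 4.0]))
print(result)  # [4. 6.]
```

Converting the prepared inputs with np.asarray before the ufunc runs is what breaks the potential recursion: the raw ndarrays no longer trigger the overriding hook.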
Re: [Numpy-discussion] Another suggestion for making numpy's functions generic
On Mon, Oct 19, 2009 at 3:10 AM, Sebastian Walter sebastian.wal...@gmail.com wrote: On Sat, Oct 17, 2009 at 2:49 PM, Darren Dale dsdal...@gmail.com wrote: numpy's functions, especially ufuncs, have had some ability to support subclasses through the ndarray.__array_wrap__ method, which provides masked arrays or quantities (for example) with an opportunity to set the class and metadata of the output array at the end of an operation. An example is q1 = Quantity(1, 'meter') q2 = Quantity(2, 'meters') numpy.add(q1, q2) # yields Quantity(3, 'meters') At SciPy2009 we committed a change to the numpy trunk that provides a chance to determine the class and some metadata of the output *before* the ufunc performs its calculation, but after the output array has been established (and its data is still uninitialized). Consider: q1 = Quantity(1, 'meter') q2 = Quantity(2, 'J') numpy.add(q1, q2, q1) # or equivalently: # q1 += q2 With only __array_wrap__, the attempt to propagate the units happens after q1's data was updated in place: too late to raise an error; the data is now corrupted. __array_prepare__ solves that problem: an exception can be raised in time. Now I'd like to suggest one more improvement to numpy to make its functions more generic. Consider one more example: q1 = Quantity(1, 'meter') q2 = Quantity(2, 'feet') numpy.add(q1, q2) In this case, I'd like an opportunity to operate on the input arrays on the way in to the ufunc, to rescale the second input to meters. I think it would be a hack to try to stuff this capability into __array_prepare__. One form of this particular example is already supported in quantities, q1 + q2, by overriding the __add__ method to rescale the second input, but there are ufuncs that do not have an associated special method. So I'd like to look into adding another check for a special method, perhaps called __input_prepare__.
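[Editor's note: the __array_wrap__ behavior described above, reattaching class and metadata to the ufunc output, can be sketched with a toy class. This is not the real quantities package; the units handling is deliberately minimal, and the return_scalar parameter is accepted for compatibility with newer numpy versions:]

```python
import numpy as np

class Quantity(np.ndarray):
    # Toy sketch of the __array_wrap__ pattern: a unit string rides
    # along and is reattached to the ufunc output after computation.
    def __new__(cls, data, units=''):
        obj = np.asarray(data, dtype=float).view(cls)
        obj.units = units
        return obj

    def __array_finalize__(self, obj):
        self.units = getattr(obj, 'units', '')

    def __array_wrap__(self, out_arr, context=None, return_scalar=False):
        # Real dimensional analysis would inspect context (the
        # ufunc and inputs); here we simply propagate our units.
        result = np.asarray(out_arr).view(Quantity)
        result.units = self.units
        return result

q1 = Quantity([1.0], 'meter')
q2 = Quantity([2.0], 'meter')
total = np.add(q1, q2)
```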
My time is really tight for the next month, so I'd rather not start if there are strong objections, but otherwise, I'd like to try to get it in in time for numpy-1.4. (Has a timeline been established?) I think it will be not too difficult to document this overall scheme: When calling numpy functions: 1) __input_prepare__ provides an opportunity to operate on the inputs to yield versions that are compatible with the operation (they should obviously not be modified in place) 2) the output array is established 3) __array_prepare__ is used to determine the class of the output array, as well as any metadata that needs to be established before the operation proceeds 4) the ufunc performs its operations 5) __array_wrap__ provides an opportunity to update the output array based on the results of the computation Comments, criticisms? If PEP 3124 were already a part of the standard library, that could serve as the basis for generalizing numpy's functions. But I think the PEP will not be approved in its current form, and it is unclear when and if the author will revisit the proposal. The scheme I'm imagining might be sufficient for our purposes. I'm all for generic (u)funcs since they might come in handy for me, since I'm doing lots of operations on arrays of polynomials. I don't quite get the reasoning though. Could you correct me where I get it wrong? * the class Quantity derives from numpy.ndarray * Quantity overrides __add__, __mul__ etc. and you get the correct behaviour for q1 = Quantity(1, 'meter') q2 = Quantity(2, 'J') by raising an exception when performing q1+=q2 No, Quantity does not override __iadd__ to catch this. Quantity implements __array_prepare__ to perform the dimensional analysis based on the identity of the ufunc and the inputs, and set the class and dimensionality of the output array, or raise an error when dimensional analysis fails. This approach lets quantities support all ufuncs (in principle), not just built-in numerical operations.
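[Editor's note: the in-time error described above, raised before the output buffer is touched, can be sketched with a hand-rolled check of the kind __array_prepare__ was meant to host. check_units and UnitError are invented names, and the check is invoked by hand rather than by a ufunc:]

```python
import numpy as np

class UnitError(TypeError):
    pass

def check_units(ufunc, units_a, units_b):
    # Stand-in for the dimensional analysis a Quantity subclass
    # would run inside __array_prepare__, *before* any data is
    # written: additive ufuncs require matching units.
    if ufunc in (np.add, np.subtract) and units_a != units_b:
        raise UnitError('cannot %s %s and %s'
                        % (ufunc.__name__, units_a, units_b))
    return units_a

ok = check_units(np.add, 'meter', 'meter')
try:
    check_units(np.add, 'meter', 'J')
    failed = False
except UnitError:
    failed = True
```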
It should also make it easier to subclass from MaskedArray, so we could have a MaskedQuantity without having to establish yet another suite of ufuncs specific to quantities or masked quantities. * The problem is that numpy.add(q1, q2, q1) would corrupt q1 before raising an exception That was solved by the addition of __array_prepare__ to numpy back in August. What I am proposing now is supporting operations on arrays that would be compatible if we had a chance to transform them on the way into the ufunc, like meter + foot. Darren
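[Editor's note: the meter + foot case motivating __input_prepare__ can be sketched as a plain function, since the hook itself was never added to numpy. FACTORS and input_prepare are invented names, and the conversion table is a toy:]

```python
import numpy as np

# Toy conversion table; a real units package would derive factors
# from a dimensional model rather than hard-code them.
FACTORS = {('foot', 'meter'): 0.3048, ('meter', 'meter'): 1.0}

def input_prepare(inputs, target):
    # What the proposed __input_prepare__ hook would do for
    # meter + foot: rescale every input to a common unit, returning
    # new arrays and leaving the originals untouched.
    return [np.asarray(v) * FACTORS[(u, target)] for v, u in inputs]

a, b = input_prepare([(1.0, 'meter'), (2.0, 'foot')], 'meter')
total = np.add(a, b)  # 1 meter + 2 feet, computed in meters
```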
Re: [Numpy-discussion] Subclassing record array
On Sun, Oct 18, 2009 at 12:22 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Sat, Oct 17, 2009 at 9:13 AM, Loïc BERTHE berthe.l...@gmail.com wrote: Hi, I would like to create my own class of record array to deal with units. Here is the code I used, inspired from http://docs.scipy.org/doc/numpy-1.3.x/user/basics.subclassing.html#slightly-more-realistic-example-attribute-added-to-existing-array :

[code]
from numpy import *

class BlocArray(rec.recarray):
    """Recarray with units and pretty print"""

    fmt_dict = {'S': '%10s', 'f': '%10.6G', 'i': '%10d'}

    def __new__(cls, data, titles=None, units=None):
        # guess format for each column
        data2 = []
        for line in zip(*data):
            try:
                data2.append(cast[int](line))        # integers
            except ValueError:
                try:
                    data2.append(cast[float](line))  # reals
                except ValueError:
                    data2.append(cast[str](line))    # characters
        # create the array
        dt = dtype(zip(titles, [line.dtype for line in data2]))
        obj = rec.array(data2, dtype=dt).view(cls)
        # add custom attributes
        obj.units = units or []
        obj._fmt = ''.join(obj.fmt_dict[d[1][1]] for d in dt.descr) + '\n'
        obj._head = '%10s ' * len(dt.names) % dt.names + '\n'
        obj._head += '%10s ' * len(dt.names) % tuple('(%s)' % u for u in units) + '\n'
        # Finally, we must return the newly created object:
        return obj

titles = ['Name', 'Nb', 'Price']
units = ['/', '/', 'Eur']
data = [['fish', '1', '12.25'], ['egg', '6', '0.85'], ['TV', 1, '125']]
bloc = BlocArray(data, titles=titles, units=units)

In [544]: bloc
Out[544]:
      Name         Nb      Price
       (/)        (/)      (Eur)
      fish          1      12.25
       egg          6       0.85
        TV          1        125
[/code]

It's almost working, but I have some issues:

- I can't access data through indexing:

In [563]: bloc['Price']
/home/loic/Python/numpy/test.py in genexpr((r,))
     50
     51     def __repr__(self):
---> 52         return self._head + ''.join(self._fmt % tuple(r) for r in self)
TypeError: 'numpy.float64' object is not iterable

So I think that overloading the __repr__ method is not that easy.

- I can't access data through attributes now:

In [564]: bloc.Nb
AttributeError:
'BlocArray' object has no attribute 'Nb'

- I can't use 'T' as a field in these arrays, as the T method is already here as a shortcut for transpose.

Have you any hints to make this work? On adding units in general, you might want to contact Darren Dale, who has been working in that direction also and has added some infrastructure in svn to make it easier. He also gave a short presentation at scipy2009 on that problem, which has been worked on before. No sense in reinventing the wheel here. The units package I have been working on is called quantities. It is available at the python package index, and the project is hosted at launchpad as python-quantities. If quantities isn't a good fit, please let me know why. At least the code can provide some example of how to subclass ndarray. Darren
Re: [Numpy-discussion] __array_wrap__
On Wed, Sep 30, 2009 at 2:57 AM, Pauli Virtanen pav...@iki.fi wrote: Tue, 29 Sep 2009 14:55:44 -0400, Neal Becker wrote: This seems to work now, but I'm wondering if Charles is correct, that inheritance isn't such a great idea here. The advantage of inheritance is I don't have to implement forwarding all the functions, a pretty big advantage. (I wonder if there is some way to do some of these as a generic 'mixin'?) The usual approach is to use __getattr__ to forward many routines with little extra work. ... with a side effect of making the API opaque and breaking tab completion in ipython. Darren
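[Editor's note: the __getattr__ delegation Pauli describes can be sketched as below, with a __dir__ added to address the tab-completion drawback Darren raises. ArrayForwarder is an invented name for illustration:]

```python
import numpy as np

class ArrayForwarder:
    # Wrap an ndarray and forward unknown attributes to it, instead
    # of inheriting.  Defining __dir__ keeps introspection (and
    # ipython tab completion) working on the forwarded names.
    def __init__(self, data):
        self._arr = np.asarray(data)

    def __getattr__(self, name):
        # Only called when normal lookup fails; forward to the array.
        return getattr(self._arr, name)

    def __dir__(self):
        return sorted(set(super().__dir__()) | set(dir(self._arr)))

w = ArrayForwarder([3, 1, 2])
total = w.sum()              # forwarded to ndarray.sum
visible = 'mean' in dir(w)   # ndarray methods show up in dir()
```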
Re: [Numpy-discussion] question about future support for python-3
On Tue, Sep 8, 2009 at 9:02 PM, David Cournapeau courn...@gmail.com wrote: On Wed, Sep 9, 2009 at 9:37 AM, Darren Dale dsdal...@gmail.com wrote: Hi David, I already gave my own opinion on py3k, which can be summarized as: - it is a huge effort, and no core numpy/scipy developer has expressed the urge to transition to py3k, since py3k does not bring much for scientific computing. - very few packages with a significant portion of C have been ported to my knowledge, hence very little experience on how to do it. AFAIK, only small packages have been ported. Even big, pure python projects have not been ported. The only big C project to have been ported is python itself, and it broke compatibility and used a different source tree than python 2. - it remains to be seen whether we can do the py3k support in the same source tree as the one used for python >= 2.4. Having two source trees would make the effort even much bigger, well over the current developers' capacity IMHO. The only area where I could see the PSF helping is the point 2: more documentation, more stories about 2-3 transition. I'm surprised to hear you say that. I would think additional developer and/or financial resources would be useful, for all of the reasons you listed. If there was enough resources to pay someone very familiar with the numpy codebase for a long time, then yes, it could be useful - but I assume that's out of the question. This would be very expensive as it would require several full months IMO. The PSF could help for the point 3, by porting other projects to py3k and documenting it. The only example I know so far is pycog2 (http://mail.python.org/pipermail/python-porting/2008-December/10.html). Paying people to do documentation about porting C code seems like a good way to spend money: it would be useful outside the numpy community, and would presumably be less costly. Another topic concerning documentation is API compatibility.
The python devs have requested projects not use the 2-3 transition as an excuse to change their APIs, but numpy is maybe a special case. I'm thinking about PEP3118. Is numpy going to transition to python 3 and then down the road transition again to the new buffer protocol? What is the strategy here? My underinformed impression is that there isn't one, since every time PEP3118 is considered in the context of the 2-3 transition somebody helpfully reminds the list that we aren't supposed to break APIs. Numpy is a critical python library; perhaps the transition presents an opportunity, if the community can yield a little on numpy's C api. For example, in the long run, what would it take to get numpy (or the core thereof) into the standard library, and can we take steps now in that direction? Would the numpy devs be receptive to comments from the python devs on the existing numpy codebase? I'm willing to pitch in and work on the transition, not because I need python-3 right now, but because the transition needs to happen and it would benefit everyone in the long run. But I would like to know that we are making the most of the opportunity, and have considered our options. Darren
Re: [Numpy-discussion] question about future support for python-3
On Wed, Sep 9, 2009 at 11:25 AM, Charles R Harrischarlesr.har...@gmail.com wrote: On Wed, Sep 9, 2009 at 7:15 AM, Darren Dale dsdal...@gmail.com wrote: On Tue, Sep 8, 2009 at 9:02 PM, David Cournapeaucourn...@gmail.com wrote: On Wed, Sep 9, 2009 at 9:37 AM, Darren Daledsdal...@gmail.com wrote: Hi David, I already gave my own opinion on py3k, which can be summarized as: - it is a huge effort, and no core numpy/scipy developer has expressed the urge to transition to py3k, since py3k does not bring much for scientific computing. - very few packages with a significant portion of C have been ported to my knowledge, hence very little experience on how to do it. AFAIK, only small packages have been ported. Even big, pure python projects have not been ported. The only big C project to have been ported is python itself, and it broke compatibility and used a different source tree than python 2. - it remains to be seen whether we can do the py3k support in the same source tree as the one use for python = 2.4. Having two source trees would make the effort even much bigger, well over the current developers capacity IMHO. The only area where I could see the PSF helping is the point 2: more documentation, more stories about 2-3 transition. I'm surprised to hear you say that. I would think additional developer and/or financial resources would be useful, for all of the reasons you listed. If there was enough resources to pay someone very familiar with numpy codebase for a long time, then yes, it could be useful - but I assume that's out of the question. This would be very expensive as it would requires several full months IMO. The PSF could help for the point 3, by porting other projects to py3k and documenting it. The only example I know so far is pycog2 (http://mail.python.org/pipermail/python-porting/2008-December/10.html). 
Paying people to do documentation about porting C code seems like a good way to spend money: it would be useful outside numpy community, and would presumably be less costly. Another topic concerning documentation is API compatibility. The python devs have requested projects not use the 2-3 transition as an excuse to change their APIs, but numpy is maybe a special case. I'm thinking about PEP3118. Is numpy going to transition to python 3 and then down the road transition again to the new buffer protocol? What is the strategy here? My underinformed impression is that there isn't one, since every time PEP3118 is considered in the context of the 2-3 transition somebody helpfully reminds the list that we aren't supposed to break APIs. Numpy is a critical python library, perhaps the transition presents an opportunity, if the community can yield a little on numpy's C api. For example, in the long run, what would it take to get numpy (or the core thereof) into the standard library, and can we take steps now in that direction? Would the numpy devs be receptive to comments from the python devs on the existing numpy codebase? I'm willing to pitch in and work on the transition, not because I need python-3 right now, but because the transition needs to happen and it would benefit everyone in the long run. But I would like to know that we are making the most of the opportunity, and have considered our options. Making numpy more buffer centric is an interesting idea and might be where we want to go with the ufuncs, but the new buffer protocol didn't go in until python 2.6. If there was no rush I'd go with Fernando and wait until we could be all python 2.6 all the time. I wonder what such a timeframe would look like, what would decide when to require python-2.6 for future releases of packages. Could a maintenance-only branch be created for the numpy-1.4 or 1.5 series, and then future development require 2.6 or 3.1? 
However, if anyone has the time to work on getting the c-code up to snuff and finding out what the problems are, I'm all for that. I have some notes on the transition in the src directory and if you do anything please keep them current. I will have a look, thank you for putting those notes together. Darren
[Numpy-discussion] question about future support for python-3
I'm not a core numpy developer and don't want to step on anybody's toes here. But I was wondering if anyone had considered approaching the Python Software Foundation about support to help get numpy working with python-3? Thanks, Darren
Re: [Numpy-discussion] question about future support for python-3
Hi David, On Tue, Sep 8, 2009 at 3:56 PM, David Warde-Farley d...@cs.toronto.edu wrote: Hey Darren, On 8-Sep-09, at 3:21 PM, Darren Dale wrote: I'm not a core numpy developer and don't want to step on anybody's toes here. But I was wondering if anyone had considered approaching the Python Software Foundation about support to help get numpy working with python-3? It's a great idea, but word on the grapevine is they lost a LOT of money on PyCon 2009 due to lower than expected turnout (recession, etc.); worth a try, perhaps, but I wouldn't hold my breath. I'm blissfully ignorant of the grapevine. But if the numpy project could make use of additional resources to speed along the transition, and if the PSF is in a position to help (either now or in the future), both parties could benefit from such an arrangement. Darren
Re: [Numpy-discussion] question about future support for python-3
Hi David, On Tue, Sep 8, 2009 at 8:08 PM, David Cournapeau courn...@gmail.com wrote: On Wed, Sep 9, 2009 at 4:21 AM, Darren Dale dsdal...@gmail.com wrote: I'm not a core numpy developer and don't want to step on anybody's toes here. But I was wondering if anyone had considered approaching the Python Software Foundation about support to help get numpy working with python-3? I already gave my own opinion on py3k, which can be summarized as: - it is a huge effort, and no core numpy/scipy developer has expressed the urge to transition to py3k, since py3k does not bring much for scientific computing. - very few packages with a significant portion of C have been ported to my knowledge, hence very little experience on how to do it. AFAIK, only small packages have been ported. Even big, pure python projects have not been ported. The only big C project to have been ported is python itself, and it broke compatibility and used a different source tree than python 2. - it remains to be seen whether we can do the py3k support in the same source tree as the one used for python >= 2.4. Having two source trees would make the effort even much bigger, well over the current developers' capacity IMHO. The only area where I could see the PSF helping is the point 2: more documentation, more stories about 2-3 transition. I'm surprised to hear you say that. I would think additional developer and/or financial resources would be useful, for all of the reasons you listed. Darren
Re: [Numpy-discussion] segfaults when passing ndarray subclass to ufunc with out=None
Hi Stefan, I think Chuck applied the patch after I filed a ticket at the trac website, http://projects.scipy.org/numpy/ticket/1022 . I just tried running the script I posted with the most recent checkout and numpy raised an error instead of segfaulting, so I think this issue is clear. Thank you for following up. Darren 2009/8/30 Stéfan van der Walt ste...@sun.ac.za: Hi, Darren Is this problem still present? If so, we should fix it before 1.4 is released. Regards Stéfan -- Forwarded message -- From: Darren Dale dsdal...@gmail.com Date: 2009/3/8 Subject: Re: [Numpy-discussion] segfaults when passing ndarray subclass to ufunc with out=None To: Discussion of Numerical Python numpy-discussion@scipy.org On Sun, Feb 8, 2009 at 12:49 PM, Darren Dale dsdal...@gmail.com wrote: I am seeing some really strange behavior when I try to pass an ndarray subclass and out=None to numpy's ufuncs. This example will reproduce the problem with svn numpy; the first print statement yields 1 as expected, the second yields type 'builtin_function_or_method' and the third yields a segmentation fault:

import numpy as np

class MyArray(np.ndarray):
    __array_priority__ = 20

    def __new__(cls):
        return np.asarray(1).view(cls).copy()

    def __repr__(self):
        return 'my_array'
    __str__ = __repr__

    def __mul__(self, other):
        return super(MyArray, self).__mul__(other)

    def __rmul__(self, other):
        return super(MyArray, self).__rmul__(other)

mine = MyArray()
print np.multiply(1, 1, None)
x = np.multiply(mine, mine, None)
print type(x)
print x

I think I might have found a fix for this.
The following patch allows my script to run without a segfault:

$ svn diff
Index: umath_ufunc_object.inc
===================================================================
--- umath_ufunc_object.inc      (revision 6566)
+++ umath_ufunc_object.inc      (working copy)
@@ -3212,13 +3212,10 @@
             output_wrap[i] = wrap;
             if (j < nargs) {
                 obj = PyTuple_GET_ITEM(args, j);
-                if (obj == Py_None) {
-                    continue;
-                }
                 if (PyArray_CheckExact(obj)) {
                     output_wrap[i] = Py_None;
                 }
-                else {
+                else if (obj != Py_None) {
                     PyObject *owrap = PyObject_GetAttrString(obj, "__array_wrap__");
                     incref = 0;
                     if (!(owrap) || !(PyCallable_Check(owrap))) {

That call to continue skipped this bit of code in the loop, which is apparently important:

if (incref) {
    Py_XINCREF(output_wrap[i]);
}

I've tested the trunk on 64 bit linux, with and without this patch applied, and I get the same result in both cases: 1 known failure, 11 skips. Is there any chance someone could consider applying this patch before 1.3 ships? Darren -- In our description of nature, the purpose is not to disclose the real essence of the phenomena but only to track down, so far as it is possible, relations between the manifold aspects of our experience - Niels Bohr It is a bad habit of physicists to take their most successful abstractions to be real properties of our world. - N. David Mermin Once we have granted that any physical theory is essentially only a model for the world of experience, we must renounce all hope of finding anything like the correct theory ... simply because the totality of experience is never accessible to us. - Hugh Everett III
Re: [Numpy-discussion] Merge of date-time branch completed
On Fri, Aug 28, 2009 at 12:47 PM, Travis Oliphant oliph...@enthought.com wrote: Hello folks, In keeping with the complaint that the pace of NumPy development is too fast, I've finished the merge of the datetime branch to the core. The trunk builds and all the (previous) tests pass for me. There are several tasks remaining to be done (the current status is definitely still alpha): * write many unit tests for the desired behavior (especially for the many different kinds of dates supported) * finish coercion between datetimes and timedeltas with different frequencies * improve the ufuncs that support datetime and timedelta so that they look at the frequency information. I haven't been following development on datetime. Can you use __array_prepare__ and __array_wrap__ to do this? __array_prepare__ was committed to the trunk during the scipy sprints. Darren
Re: [Numpy-discussion] Commit privileges for Darren Dale
On Mon, Jul 27, 2009 at 12:06 AM, Charles R Harris charlesr.har...@gmail.com wrote: Hi All, In case it got buried in the thread, Darren is asking for commit privileges. I think it's a good idea. Thank you for saying so. What can I do to help move this to the point where I can actually start committing? I would like to get my array_prepare patch into svn soon so I can bundle a new beta of Quantities in time for scipy-09. I'm going to be on vacation July 31-August 9, should I wait until I get back before checking it in (assuming I'm granted commit rights)? Darren
Re: [Numpy-discussion] Commit privileges for Darren Dale
On Tue, Jul 28, 2009 at 10:59 AM, David Cournapeau courn...@gmail.com wrote: On Tue, Jul 28, 2009 at 9:47 PM, Darren Dale dsdal...@gmail.com wrote: On Mon, Jul 27, 2009 at 12:06 AM, Charles R Harris charlesr.har...@gmail.com wrote: Hi All, In case it got buried in the thread, Darren is asking for commit privileges. I think it's a good idea. Thank you for saying so. What can I do to help move this to the point where I can actually start committing? I would like to get my array_prepare patch into svn soon so I can bundle a new beta of Quantities in time for scipy-09. I'm going to be on vacation July 31-August 9, should I wait until I get back before checking it in (assuming I'm granted commit rights)? Why not put your code under a svn branch or a bzr/git/whatever import of the trunk, take care of it, and then commit it after you come back? I am unfortunately not in a position to comment on your code, I am not familiar with this part of numpy, but I would like someone else to take 'responsibility' that your code is OK if possible. I would also feel more comfortable if someone knowledgeable would have a look, since I don't have a lot of experience with the c api. But I'm having difficulty soliciting a response to my requests for review. Darren
Re: [Numpy-discussion] suggestion for generalizing numpy functions
On Thu, Jul 23, 2009 at 12:54 PM, Darren Dale dsdal...@gmail.com wrote: On Tue, Jul 21, 2009 at 10:11 AM, Darren Dale dsdal...@gmail.com wrote: On Tue, Jul 21, 2009 at 7:44 AM, Darren Dale dsdal...@gmail.com wrote: 2009/7/20 Stéfan van der Walt ste...@sun.ac.za: Hi Chuck 2009/7/17 Charles R Harris charlesr.har...@gmail.com: PyObject* PyTuple_GetItem(PyObject *p, Py_ssize_t pos) Return value: Borrowed reference. Return the object at position pos in the tuple pointed to by p. If pos is out of bounds, return NULL and set an IndexError exception. It's a borrowed reference so you need to call Py_INCREF on it. I find this Python C-API documentation useful. Have you had a look over the rest of the code? I think this would make a good addition. Travis mentioned Contexts for doing something similar, but I don't know enough about that concept to compare the two. I think contexts would be very different from what is already in place. For now, it would be nice to make this one small improvement to the existing ufunc infrastructure, and maybe consider contexts (which I still don't understand) at a later time. I have improved the code slightly and added a few tests, and will post a new patch later this morning. I just need to add some documentation. Here is a better patch, which includes a few additional tests and adds some documentation. It also attempts to improve the docstring and sphinx docs for __array_wrap__, which may have been a little bit misleading. There is also some whitespace cleanup in a few places. Would someone please review my work and commit the patch if it is acceptable? Pierre or Travis, would either of you have a chance to look over the implementation and the documentation changes, since you two seem to be most familiar with ufuncs and subclassing ndarray? It looks like part of my patch has been clobbered by changes introduced in svn 7184-7191. What else should I be doing so a patch like this can be committed relatively quickly?
Could I please obtain commit privileges so I can commit this feature to svn myself? Darren