Re: [Numpy-discussion] Update on using scons to build numpy
Matthew Brett wrote:
> > Hi, A quick email to give an update on my work to build numpy with
> > scons. A few days ago I finished making my former work a separate
> > package from numpy: it was more work than I expected because of
> > bootstrapping issues, but I can now build numpy again with the new
> > package on Linux.
> Just to thank you very much for your work on this.

Thanks. If you are willing to test, please submit any bugs in Launchpad, or on this ML.

cheers,

David

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Nose testing for numpy
Hi,

I've just finished moving the scipy tests over to nose. Thinking about it, it seems to me a good idea to do the same for numpy. The advantages of doing this now are that numpy and scipy would be in parallel, that we could continue to have one testing system for both, and that it would be clear to both numpy and scipy developers that they should use the nose test framework rather than NumpyTest. At the moment you can still find yourself using the numpy test framework in scipy, and the tests will work - but it should be deprecated. Ideally, to make a clean switch, I think using numpytest should raise an error.

One issue is that the scipy nose test labels use decorators, for which the standard syntax requires python 2.4. To avoid this, we can just apply the decorators by hand:

    def test_func(): pass
    test_func = dec.slow(test_func)

It's a rather moot point, as the decorators are mainly used to define test level; there's only one extra test found for numpy.test(10) compared to numpy.test().

I think I could do the port in quite a short time. Any thoughts?

Matthew
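A minimal sketch of the pre-2.4 decorator-application pattern described above. The `slow` label here is a stand-in for nose's `dec.slow`, written out so the example is self-contained; it is not the real nose module:

```python
# Python 2.3-compatible decorator application: instead of the
# "@slow" syntax (which needs Python 2.4), wrap the function by
# plain assignment after its definition.

def slow(func):
    # Stand-in for a nose test label: tag the function so a test
    # collector could filter it out of fast runs.
    func.slow = True
    return func

def test_big_matrix():
    pass

# Equivalent of decorating with @slow, without 2.4 syntax:
test_big_matrix = slow(test_big_matrix)

print(test_big_matrix.slow)  # attribute set by the label
```

The assignment form is exactly what the `@` syntax desugars to, which is why the two are interchangeable here.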
Re: [Numpy-discussion] casting
Robert Kern wrote:
> Neal Becker wrote:
>> numpy frequently refers to 'casting'. I'm not sure that term is ever
>> defined. I believe it has the same meaning as in C. In that case, it
>> is unfortunately used to mean two different things. There are casts
>> that do not change the underlying bits (such as a pointer cast), and
>> there are casts that actually convert to different bits (such as
>> float to double). I think numpy means the latter: when the underlying
>> data of an array is one type, a cast to another type means actually
>> reallocating and converting the data.
> Yes, that is usually what people mean when they use "casting" in the
> context of numpy. It is the more frequently performed operation of the
> two. The former can be accomplished with the .view(dtype) method of
> ndarrays.
>> It often occurs that I have an algorithm that can take any integral
>> type, because it is written with C++ templates. In that case, I don't
>> want to use PyArray_FROMANY, because I don't want to unnecessarily
>> convert the array data. Instead, I'd like to inquire what the
>> preferred type of the data is. The solution I'm exploring is to use a
>> function I call 'preferred_array_type'. This uses the
>> __array_struct__ interface to find the native data type. I chose to
>> use this interface because then it will work with both numpy arrays
>> and other array-like types. Any thoughts on all of this?
> I'm not sure what you mean by "preferred type" of the data. Do you
> mean the dtype of the array as it comes in? There are several
> functions and function macros in the numpy C API which take differing
> amounts of information. For example:
>
> * PyArray_FROM_O(PyObject* obj) just takes an object.
> * PyArray_FROM_OF(PyObject* obj, int req) takes an object and flags
>   like NPY_CONTIGUOUS.
> * PyArray_FROM_OT(PyObject* obj, int typenum) takes an object and a
>   type number.
> * PyArray_FROM_OTF(PyObject* obj, int typenum, int req) takes an
>   object, a type number, and flags.

Let me try again to explain.
I don't want to convert to some type first - that would be a waste. I need to find out what the native data type of the input array is first. Also, I'd like to allow that the input is not a PyArray, but could be something conforming to the __array_struct__ interface. So, I need to find the native data type _first_, and _then_ call the appropriate PyArray_FROM... function.

Further, I don't believe this requirement is unique. I would think it would be needed any time a user wants to create a function that can accept a numpy array and would like to avoid unnecessary data conversion. This is particularly true when the underlying function uses C++ templates to make the data type a template parameter (and so can operate on any - or a range - of data types).
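The two senses of "casting" distinguished earlier in the thread can be seen directly from Python - `.view()` reinterprets the same bits, while `.astype()` converts the data:

```python
import numpy as np

# Sense 1: reinterpret the same bits as another type (no conversion).
# float64 and int64 have the same itemsize, so a view is legal.
x = np.array([1.0, 2.0])
bits = x.view(np.int64)        # same buffer, read as 64-bit ints

# Sense 2: convert the data, producing new bits in a new buffer.
ints = x.astype(np.int64)

assert np.shares_memory(bits, x)   # the view shares x's memory
assert ints.tolist() == [1, 2]     # astype really converted the values
```

The view is free but only changes interpretation; the astype allocates and converts, which is the operation people usually mean by "casting" in numpy.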
Re: [Numpy-discussion] casting
Neal Becker wrote:
> I don't want to convert to some type first - that would be a waste. I
> need to find out what the native data type of the input array is
> first. Also, I'd like to allow that the input is not a PyArray, but
> could be something conforming to the __array_struct__ interface. So, I
> need to find the native data type _first_, and _then_ call the
> appropriate PyArray_FROM... function.

I'm sorry, I still think we're talking past each other. What do you mean by "native data type"? If you just want to get an ndarray without specifying a type, use PyArray_FROM_O(). That's what it's for. You don't need to know the data type beforehand.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
Re: [Numpy-discussion] casting
> I'm sorry, I still think we're talking past each other. What do you
> mean by "native data type"? If you just want to get an ndarray without
> specifying a type, use PyArray_FROM_O(). That's what it's for. You
> don't need to know the data type beforehand.

What I have wanted in the past (and what I thought Neal was after) is a way to choose which function to call according to the typecode of the data as it is currently in memory. I don't want to convert (or cast, or even touch the data) but just call a type-specific function instead. C++ templates can take some of the tedium out of that, but in some cases the algorithms may be different too - guessing which sort algorithm to use springs to mind. Rather than saying "give me the right kind of array", I think there is an interest in saying "choose which function is best for this data". Something like:

    PyArrayObject* array = (PyArrayObject*) PyArray_FROM_O( (PyObject*) O );
    int type = array->descr->type_num;
    switch (type) {
        case NPY_BYTE:  signed_func(array);   break;
        case NPY_UBYTE: unsigned_func(array); break;
        /* etc. */
    }

It sort of implies having a C++ type hierarchy for numpy arrays and casting array to be a PyFloatArray or PyDoubleArray etc. The extra confusion might be due to the way arrays can be laid out in memory - indexing into array slices is not always obvious. Also, if you want to make sure your inner loop goes over the fast index, you might want an algorithm which reads the strides when it runs.

Sorry if I've only added to the confusion.

Cheers,
Jon
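The same dispatch-on-typecode idea can be sketched at the Python level with a table keyed on the dtype kind. The handler names here are hypothetical placeholders, not numpy API:

```python
import numpy as np

# Dispatch on the type of the data as it already is in memory,
# without converting it first. Handlers are illustrative only.
def handle_signed(arr):
    return "signed"

def handle_unsigned(arr):
    return "unsigned"

def handle_float(arr):
    return "float"

HANDLERS = {"i": handle_signed, "u": handle_unsigned, "f": handle_float}

def dispatch(obj):
    arr = np.asarray(obj)   # no copy if obj is already an ndarray
    return HANDLERS[arr.dtype.kind](arr)

print(dispatch(np.zeros(3, dtype=np.uint8)))   # -> unsigned
```

This mirrors the C switch on `type_num`: the data is inspected, never converted, and each branch can use a different algorithm.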
Re: [Numpy-discussion] casting
Jon Wright wrote:
> Rather than saying "give me the right kind of array", I think there is
> an interest in saying "choose which function is best for this data".

This is close to what I'm doing. If I really can handle any type, then FROM_O is fine. Commonly, it's a little more complicated.
Here is an example, in pseudo-code (the real code is in C++):

    if native_type_of(x) in (int, long):
        do_something_with(convert_to_long_array(x))
    elif native_type_of(x) is complex:
        do_something_with(convert_to_complex_array(x))

In the above, native_type_of means:

    if hasattr(x, '__array_struct__'):
        array_if = PyCObject_AsVoidPtr(x.__array_struct__)
        return map_array_if_to_typenum(array_if)

So this means: for any numpy array, or any type supporting the __array_struct__ protocol, find out the native data type. I don't want to use FROM_O here, because I really can only handle certain types. If I used FROM_O, then after calling it, if the type was not one I could handle, I'd have to call FromAny and convert anyway. Or, in the case above, if I was given an array of int32 which I'd want to handle as long (int64), I'd have to convert it again.
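A runnable Python sketch of the pattern in the pseudo-code above: inspect the native dtype first, then convert exactly once to the nearest type the algorithm handles. The `process` function and the int64/complex128 targets are illustrative assumptions, not the poster's actual code:

```python
import numpy as np

# Inspect the native dtype first, then convert only as needed.
# The two branches stand in for the C++ template kernels.
def process(x):
    arr = np.asarray(x)            # no copy for an existing ndarray
    if arr.dtype.kind in "iu":     # any integer: promote once, to int64
        return ("integer", arr.astype(np.int64, copy=False))
    elif arr.dtype.kind == "c":    # any complex: promote to complex128
        return ("complex", arr.astype(np.complex128, copy=False))
    else:
        raise TypeError("unsupported dtype %r" % arr.dtype)

kind, data = process(np.arange(5, dtype=np.int32))
print(kind, data.dtype)            # -> integer int64
```

Because the dtype is checked before any conversion, an int32 input is converted once (int32 to int64) rather than twice, which is the waste the poster wants to avoid.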
Re: [Numpy-discussion] Nose testing for numpy
On Jan 14, 2008 5:21 AM, Matthew Brett [EMAIL PROTECTED] wrote:
> Hi, I've just finished moving the scipy tests over to nose. Thinking
> about it, it seems to me a good idea to do the same for numpy. Any
> thoughts?

A big +1 from me.

Cheers,

f
[Numpy-discussion] RFC: out of range slice indexes
I've never liked that python silently ignores slices with out of range indexes. I believe this is a source of bugs (it has been for me). It goes completely counter to the python philosophy. I vote to ban them from numpy.

    >>> from numpy import array
    >>> x = array(xrange(10))
    >>> x[11]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    IndexError: index out of bounds
    >>> x[:12] = 2
    >>> x
    array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
    >>> len(x)
    10

Silently ignoring the error in x[:12] is a bad idea, IMO. If it meant to _extend_ x to have length 12, at least _that_ would be reasonable (but I'm not advocating that position). I believe that out of bounds indexes should always throw IndexError. We can't change that in Python now, but maybe we can in numpy.
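For comparison, plain Python sequences show the same asymmetry being objected to - scalar indexes are checked, slice bounds are silently clipped. A pure-Python illustration:

```python
# Python sequences clip out-of-range slice bounds instead of raising,
# while out-of-range scalar indexes do raise.
x = list(range(10))

try:
    x[11]
    raised = False
except IndexError:
    raised = True

assert raised            # scalar index past the end raises
assert x[:12] == x       # slice end past the end is silently clipped
x[:12] = [2] * 10        # assigning 10 items replaces the clipped slice
assert len(x) == 10      # the list keeps its length here
```

(One difference: a list *can* grow if the assigned sequence is longer than the clipped slice, whereas an ndarray is fixed-size and requires the shapes to match.)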
Re: [Numpy-discussion] Nose testing for numpy
An added advantage is that it makes it much easier to run doctests:

    numpy.test(doctests=True)

On Jan 14, 2008 11:36 AM, Fernando Perez [EMAIL PROTECTED] wrote:
> A big +1 from me.
Re: [Numpy-discussion] RFC: out of range slice indexes
Neal Becker wrote:
> I believe that out of bounds indexes should always throw IndexError.
> We can't change that in Python now, but maybe we can in numpy.

-1. Regardless of the merits if we had a blank slate, there is code that depends on this, specifically my code. It simplifies certain operations that would otherwise need tedious special-case handling.

--
Robert Kern
Re: [Numpy-discussion] RFC: out of range slice indexes
Neal Becker wrote:
> Robert Kern wrote:
>> -1. Regardless of the merits if we had a blank slate, there is code
>> that depends on this, specifically my code. It simplifies certain
>> operations that would otherwise need tedious special-case handling.
> For example?

    def ichunk(arr, chunk_size=10):
        for i in range(0, len(arr), chunk_size):
            yield arr[i:i+chunk_size]

--
Robert Kern
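A quick run of that generator shows why the clipping behavior helps: the end index of the final slice runs past the end of the sequence, and clipping yields the short last chunk with no special case:

```python
def ichunk(arr, chunk_size=10):
    # The last slice's end index may exceed len(arr); slice clipping
    # means no special case is needed for the final short chunk.
    for i in range(0, len(arr), chunk_size):
        yield arr[i:i + chunk_size]

chunks = list(ichunk(list(range(25)), chunk_size=10))
print([len(c) for c in chunks])   # -> [10, 10, 5]
```

If out-of-range slice bounds raised IndexError, the last iteration (i = 20, slice end 30) would need an explicit `min(i + chunk_size, len(arr))`.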
Re: [Numpy-discussion] RFC: out of range slice indexes
Robert Kern wrote:
> -1. Regardless of the merits if we had a blank slate, there is code
> that depends on this, specifically my code. It simplifies certain
> operations that would otherwise need tedious special-case handling.

For example?
Re: [Numpy-discussion] RFC: out of range slice indexes
On Jan 14, 2008 12:37 PM, Neal Becker [EMAIL PROTECTED] wrote:
> I believe that out of bounds indexes should always throw IndexError.
> We can't change that in Python now, but maybe we can in numpy.

-1. Regardless of the possible merit of this on its face, I think this is an area where we should maintain compatibility with Python sequences. Not to mention that it would likely break a bunch of code, including my own.
Re: [Numpy-discussion] Nose testing for numpy
Matthew Brett wrote:
> Hi, I've just finished moving the scipy tests over to nose. Thinking
> about it, it seems to me a good idea to do the same for numpy.

We talked about this at the SciPy Sprint. Eventually, we will get there. However, if we do it before 1.0.5, it will require nose to run the NumPy tests, and I'm reluctant to make that kind of change prior to 1.1. So, the strategy is to support what scipy needs for nose testing inside of NumPy for now, and wait until 1.1 to require nose for all of the NumPy tests. Yes, it is not as clean, but I really hesitate to make a wholesale switch at version 1.0.5.

-Travis O.
Re: [Numpy-discussion] Nose testing for numpy
Hi,

> We talked about this at the SciPy Sprint. Eventually, we will get
> there. However, if we do it before 1.0.5, it will require nose to run
> the NumPy tests, and I'm reluctant to make that kind of change prior
> to 1.1.

Ah, sorry - I had heard of the conclusion, but thought it was due to the Python 2.4 dependency.

Matthew
Re: [Numpy-discussion] casting
Neal Becker wrote:
> So this means: for any numpy array, or any type supporting the
> __array_struct__ protocol, find out the native data type. I don't want
> to use FROM_O here, because I really can only handle certain types. If
> I used FROM_O, then after calling it, if the type was not one I could
> handle, I'd have to call FromAny and convert anyway.

Okay, I think I see now. I'm not sure what numpy could do to make your code more elegant. I would recommend just using PyArray_FROM_O() or PyArray_EnsureArray() to get a real ndarray object. They should not copy when the object is already an ndarray or when it satisfies the array interface; however, they will also handle the cases where you have a Python-level __array_interface__ or just nested sequences. You can look at the PyArray_Descr directly, dispatch the types that you can handle directly, and then convert for the types which you can't. You will not get any extraneous conversions or copies of data.

--
Robert Kern
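The no-copy behavior described above can be checked from Python, where `np.asarray` plays roughly the role of PyArray_FROM_O at the C level:

```python
import numpy as np

# np.asarray, like PyArray_FROM_O in C, returns the input unchanged
# when it is already an ndarray - no conversion, no copy.
a = np.arange(4)
b = np.asarray(a)
assert b is a                  # same object, nothing copied

# Nested sequences are handled too, by building a new array.
c = np.asarray([[1, 2], [3, 4]])
assert c.shape == (2, 2)
```

With the array in hand, the caller can inspect `b.dtype` (the Python-level view of PyArray_Descr) and only then decide whether a conversion is actually required.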