Re: [Numpy-discussion] Update on using scons to build numpy

2008-01-14 Thread David Cournapeau
Matthew Brett wrote:
 Hi,

   
 A quick email to give an update on my work to build numpy with
 scons. A few days ago I finished making my former work a separate
 package from numpy: it was more work than I expected because of
 bootstrapping issues, but I can now build numpy again with the new
 package on Linux.
 

 Just to thank you very much for your work on this.
   
Thanks. If you are willing to test, please submit any bugs to Launchpad
or on this ML.

cheers,

David


[Numpy-discussion] Nose testing for numpy

2008-01-14 Thread Matthew Brett
Hi,

I've just finished moving the scipy tests over to nose.

Thinking about it, it seems to me to be a good idea to do the same for numpy.

The advantages of doing this now are that numpy and scipy would be in
parallel, that we can continue to have one testing system for both,
and that it would be clear to both numpy and scipy developers that
they should not use NumpyTest but the nose test framework.  At the
moment, you can still find yourself using the numpy test framework in
scipy, and the tests will work - but it should be deprecated.
Ideally, to make a clean switch, I think using numpytest should raise
an error.

One issue is that the scipy nose test labels use decorators, for which
the standard syntax requires python 2.4.  To avoid this, we can just
apply the decorators with the

def test_func(): pass
test_func = dec.slow(test_func)

syntax.  It's a rather moot point: the decorators are mainly used
to define test level, and there's only one extra test found for
numpy.test(10) compared to numpy.test().
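
For concreteness, here is a minimal sketch of how such a label decorator
might work -- this is an assumption on my part, not necessarily how
numpy.testing's dec.slow is implemented; it just tags the function with
an attribute that the test collector can filter on:

def slow(func):
    # Tag the test so the runner can include or exclude it by label.
    func.slow = True
    return func

def test_big_matrix():
    pass
test_big_matrix = slow(test_big_matrix)   # Python 2.3-compatible application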

I think I could do the port in quite a short time.

Any thoughts?

Matthew


Re: [Numpy-discussion] casting

2008-01-14 Thread Neal Becker
Robert Kern wrote:

 Neal Becker wrote:
 numpy frequently refers to 'casting'.  I'm not sure if that term is ever
 defined.  I believe it has the same meaning as in C.  In that case, it is
 unfortunately used to mean 2 different things.  There are casts that do
 not change the underlying bits (such as a pointer cast), and there are
 casts that actually convert to different bits (such as float -> double).
 
 I think numpy means the latter.  When an array's underlying data
 is one type, a cast to another type means actually reallocating and
 converting the data.
 
 Yes, that is usually what people mean when they use _casting_ in the
 context of numpy. It is the more frequently performed operation of the
 two. The former can be accomplished with the .view(dtype) method of
 ndarrays.
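
To illustrate the view-versus-cast distinction concretely (a minimal
sketch, using plain numpy):

import numpy as np

a = np.arange(4, dtype=np.int32)
b = a.view(np.uint32)      # reinterprets the same bits: no copy, no conversion
c = a.astype(np.float64)   # allocates a new buffer and converts each value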
 
 It often occurs that I have an algorithm that can take any integral type,
 because it is written with C++ templates.  In that case, I don't want to
 use PyArray_FROMANY, because I don't want to unnecessarily convert the
 array data.  Instead, I'd like to inquire what the preferred type of
 the data is.
 
 The solution I'm exploring is to use a function I
 call 'preferred_array_type'.  This uses the __array_struct__ interface to
 find the native data type.  I chose to use this interface, because then
 it will work with both numpy arrays and other array-like types.
 
 Any thoughts on all of this?
 
 I'm not sure what you mean by "preferred type of the data". Do you mean
 the dtype of the array as it comes in? There are several functions and
 function macros in the numpy C API which take differing amounts of
 information. For example,
 
   * PyArray_FROM_O(PyObject* obj) just takes an object.
   * PyArray_FROM_OF(PyObject* obj, int req) takes an object and flags like
 NPY_CONTIGUOUS.
   * PyArray_FROM_OT(PyObject* obj, int typenum) takes an object and a type
 number.
   * PyArray_FROM_OTF(PyObject* obj, int typenum, int req) takes an object,
 type, and flags.
 

Let me try again to explain.  I don't want to convert to some type first -
that would be a waste.  I need to find out the native data type of
the input array first.  Also, I'd like to allow that the input is not a
PyArray, but could be something conforming to the __array_struct__
interface.  So, I need to find the native data type _first_, _then_ call
the appropriate PyArray_FROM...

Further, I don't believe this requirement is unique.  I would think it is
needed any time a user wants to write a function that accepts a numpy
array while avoiding unnecessary data conversion.  This is particularly
true when the underlying function uses C++ templates that take the data
type as a template parameter (and so can operate on any of a range of
data types).
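
To make that concrete, here is a rough Python-level sketch of the same
logic; the C version would read the PyArrayInterface struct obtained from
__array_struct__ instead, and do_int/do_complex are hypothetical stand-ins
for the real templated routines:

import numpy as np

def do_int(a):     print('int path %s' % a.dtype)       # hypothetical
def do_complex(a): print('complex path %s' % a.dtype)   # hypothetical

def native_typestr(x):
    # Query the element type without converting or copying any data.
    return x.__array_interface__['typestr']   # e.g. '<i4', '<c16'

def process(x):
    kind = native_typestr(x)[1]   # 'i'/'u' integer, 'f' float, 'c' complex
    if kind in 'iu':
        do_int(np.asarray(x, dtype=np.int64))
    elif kind == 'c':
        do_complex(np.asarray(x, dtype=np.complex128))
    else:
        raise TypeError('unsupported kind %r' % kind)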



Re: [Numpy-discussion] casting

2008-01-14 Thread Robert Kern
Neal Becker wrote:
 Robert Kern wrote:
 
 Neal Becker wrote:
 numpy frequently refers to 'casting'.  I'm not sure if that term is ever
 defined.  I believe it has the same meaning as in C.  In that case, it is
 unfortunately used to mean 2 different things.  There are casts that do
 not change the underlying bits (such as a pointer cast), and there are
 casts that actually convert to different bits (such as float -> double).

 I think numpy means the latter.  When an array's underlying data
 is one type, a cast to another type means actually reallocating and
 converting the data.
 Yes, that is usually what people mean when they use _casting_ in the
 context of numpy. It is the more frequently performed operation of the
 two. The former can be accomplished with the .view(dtype) method of
 ndarrays.

 It often occurs that I have an algorithm that can take any integral type,
 because it is written with C++ templates.  In that case, I don't want to
 use PyArray_FROMANY, because I don't want to unnecessarily convert the
 array data.  Instead, I'd like to inquire what the preferred type of
 the data is.

 The solution I'm exploring is to use a function I
 call 'preferred_array_type'.  This uses the __array_struct__ interface to
 find the native data type.  I chose to use this interface, because then
 it will work with both numpy arrays and other array-like types.

 Any thoughts on all of this?
 I'm not sure what you mean by "preferred type of the data". Do you mean
 the dtype of the array as it comes in? There are several functions and
 function macros in the numpy C API which take differing amounts of
 information. For example,

   * PyArray_FROM_O(PyObject* obj) just takes an object.
   * PyArray_FROM_OF(PyObject* obj, int req) takes an object and flags like
 NPY_CONTIGUOUS.
   * PyArray_FROM_OT(PyObject* obj, int typenum) takes an object and a type
 number.
   * PyArray_FROM_OTF(PyObject* obj, int typenum, int req) takes an object,
 type, and flags.

 
 Let me try again to explain.  I don't want to convert to some type first -
 that would be a waste.  I need to find out the native data type of
 the input array first.  Also, I'd like to allow that the input is not a
 PyArray, but could be something conforming to the __array_struct__
 interface.  So, I need to find the native data type _first_, _then_ call
 the appropriate PyArray_FROM...

I'm sorry, I still think we're talking past each other. What do you mean by
"native data type"? If you just want to get an ndarray without specifying a
type, use PyArray_FROM_O(). That's what it's for. You don't need to know the 
data type beforehand.

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth.
   -- Umberto Eco


Re: [Numpy-discussion] casting

2008-01-14 Thread Jon Wright
 I'm sorry, I still think we're talking past each other. What do you mean by
 "native data type"? If you just want to get an ndarray without specifying a
 type, use PyArray_FROM_O(). That's what it's for. You don't need to know the 
 data type beforehand.

What I have wanted in the past (and what I thought Neal was after) is a 
way to choose which function to call according to the typecode of the 
data as it is currently in memory. I don't want to convert (or cast or 
even touch the data) but just call a type specific function instead. C++ 
templates can take some of the tedium out of that, but in some cases 
algorithms may be different too. Guessing which sort algorithm to use 
springs to mind.

Rather than saying "give me the right kind of array", I think there is
an interest in saying "choose which function is the best for this data".
Something like:

   PyArrayObject* array = (PyArrayObject*) PyArray_FROM_O( (PyObject*) O );
   int type = array->descr->type_num;
   switch (type){
      case NPY_BYTE  : signed_func(array);   break;
      case NPY_UBYTE : unsigned_func(array); break;
      // etc
   }

It sort of implies having a C++ type hierarchy for numpy arrays and 
casting array to be a PyFloatArray or PyDoubleArray etc?

The extra confusion might be due to the way arrays can be laid out in 
memory - indexing into array slices is not always obvious. Also if you 
want to make sure your inner loop goes over the fast index you might 
want an algorithm which reads the strides when it runs.
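
For example (a small sketch): the strides tell you which axis is the fast
one in memory, and slicing changes them without touching the data:

import numpy as np

a = np.zeros((1000, 1000))   # C order, float64, so strides are in bytes
print(a.strides)             # (8000, 8) -- the last axis is contiguous
b = a[:, ::2]                # a view on the same memory
print(b.strides)             # (8000, 16) -- no longer unit-stride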

Sorry if I've only added to the confusion.

Cheers,

Jon




Re: [Numpy-discussion] casting

2008-01-14 Thread Neal Becker
Jon Wright wrote:

 I'm sorry, I still think we're talking past each other. What do you mean
 by "native data type"? If you just want to get an ndarray without
 specifying a type, use PyArray_FROM_O(). That's what it's for. You don't
 need to know the data type beforehand.
 
 What I have wanted in the past (and what I thought Neal was after) is a
 way to choose which function to call according to the typecode of the
 data as it is currently in memory. I don't want to convert (or cast or
 even touch the data) but just call a type specific function instead. C++
 templates can take some of the tedium out of that, but in some cases
 algorithms may be different too. Guessing which sort algorithm to use
 springs to mind.
 
 Rather than saying "give me the right kind of array", I think there is
 an interest in saying "choose which function is the best for this data".
 Something like:
 
 PyArrayObject* array = (PyArrayObject*) PyArray_FROM_O( (PyObject*) O );
 int type = array->descr->type_num;
 switch (type){
    case NPY_BYTE  : signed_func(array);   break;
    case NPY_UBYTE : unsigned_func(array); break;
    // etc
 }
 
 It sort of implies having a C++ type hierarchy for numpy arrays and
 casting array to be a PyFloatArray or PyDoubleArray etc?
 
 The extra confusion might be due to the way arrays can be laid out in
 memory - indexing into array slices is not always obvious. Also if you
 want to make sure your inner loop goes over the fast index you might
 want an algorithm which reads the strides when it runs.
 
 Sorry if I've only added to the confusion.
 
 Cheers,
 
 Jon

This is close to what I'm doing.  If I really can handle any type, then
FROM_O is fine.

Commonly, it's a little more complicated.

Here is an example, in pseudo-code (real code is in C++):

if native_type_of(x) in (int, long):
    do_something_with(convert_to_long_array(x))
elif native_type_of(x) is complex:
    do_something_with(convert_to_complex_array(x))

In the above, native_type_of means:
if hasattr(x, '__array_struct__'):
    array_if = PyCObject_AsVoidPtr(x.__array_struct__)
    return map_array_if_to_typenum(array_if)

So this means: for any numpy array, or any type supporting the
__array_struct__ protocol, find out the native data type.

I don't want to use FROM_O here, because I really can only handle certain
types.  If I used FROM_O, then after calling FROM_O, if the type was not
one I could handle, I'd have to call FromAny and convert it.  Or, in the
case above, if I was given an array of int32 which I'd want to handle as
long (int64), I'd have to convert it again.
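
In Python terms, what I want behaves like this minimal sketch (the names
are mine): convert at most once, and only when the incoming dtype is not
one I handle natively:

import numpy as np

def as_long_array(x):
    a = np.asarray(x)               # wraps without copying when possible
    if a.dtype == np.int64:
        return a                    # already the native type: no conversion
    if a.dtype.kind in 'iu':
        return a.astype(np.int64)   # one conversion, e.g. int32 -> int64
    raise TypeError('expected an integer array, got %s' % a.dtype)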



Re: [Numpy-discussion] Nose testing for numpy

2008-01-14 Thread Fernando Perez
On Jan 14, 2008 5:21 AM, Matthew Brett [EMAIL PROTECTED] wrote:
 Hi,

 I've just finished moving the scipy tests over to nose.

 Thinking about it, it seems to me to be a good idea to do the same for numpy.

 Any thoughts?

A big +1 from me.

Cheers,

f


[Numpy-discussion] RFC: out of range slice indexes

2008-01-14 Thread Neal Becker
I've never liked that python silently ignores slices with out of range
indexes.  I believe this is a source of bugs (it has been for me).  It goes
completely counter to the python philosophy.

I vote to ban them from numpy.
>>> from numpy import array
>>> x = array (xrange (10))
>>> x[11]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: index out of bounds
>>> x[:12] = 2
>>> x
array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
>>> len (x)
10

Silently ignoring the error in x[:12] is a bad idea, IMO.  If it meant to
_extend_ x to have length 12, at least _that_ would be reasonable (but I'm
not advocating that position).

I believe that out of bounds indexes should always throw IndexError.  We
can't change that in Python now, but maybe we can in numpy.
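
To illustrate the behavior I'm proposing, a minimal sketch of a
bounds-checked slice (the helper name is mine):

def checked_slice(arr, start, stop):
    n = len(arr)
    if not (0 <= start <= stop <= n):
        raise IndexError('slice %d:%d out of bounds for length %d'
                         % (start, stop, n))
    return arr[start:stop]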



Re: [Numpy-discussion] Nose testing for numpy

2008-01-14 Thread Matthew Brett
An added advantage is that it makes it much easier to run doctests:

numpy.test(doctests=True)


On Jan 14, 2008 11:36 AM, Fernando Perez [EMAIL PROTECTED] wrote:
 On Jan 14, 2008 5:21 AM, Matthew Brett [EMAIL PROTECTED] wrote:
  Hi,
 
  I've just finished moving the scipy tests over to nose.
 
  Thinking about it, it seems to me to be a good idea to do the same for 
  numpy.

  Any thoughts?

 A big +1 from me.

 Cheers,

 f



Re: [Numpy-discussion] RFC: out of range slice indexes

2008-01-14 Thread Robert Kern
Neal Becker wrote:
 I've never liked that python silently ignores slices with out of range
 indexes.  I believe this is a source of bugs (it has been for me).  It goes
 completely counter to the python philosophy.
 
 I vote to ban them from numpy.
 >>> from numpy import array
 >>> x = array (xrange (10))
 >>> x[11]
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 IndexError: index out of bounds
 >>> x[:12] = 2
 >>> x
 array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
 >>> len (x)
 10
 
 Silently ignoring the error in x[:12] is a bad idea, IMO.  If it meant to
 _extend_ x to have length 12, at least _that_ would be reasonable (but I'm
 not advocating that position).
 
 I believe that out of bounds indexes should always throw IndexError.  We
 can't change that in Python now, but maybe we can in numpy.

-1. Regardless of the merits if we had a blank slate, there is code that
depends on this, specifically my code. It simplifies certain operations that
would otherwise need tedious special-case handling.

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth.
   -- Umberto Eco


Re: [Numpy-discussion] RFC: out of range slice indexes

2008-01-14 Thread Robert Kern
Neal Becker wrote:
 Robert Kern wrote:
 
 Neal Becker wrote:
 I've never liked that python silently ignores slices with out of range
 indexes.  I believe this is a source of bugs (it has been for me).  It
 goes completely counter to the python philosophy.

 I vote to ban them from numpy.
 >>> from numpy import array
 >>> x = array (xrange (10))
 >>> x[11]
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 IndexError: index out of bounds
 >>> x[:12] = 2
 >>> x
 array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
 >>> len (x)
 10

 Silently ignoring the error in x[:12] is a bad idea, IMO.  If it meant to
 _extend_ x to have length 12, at least _that_ would be reasonable (but
 I'm not advocating that position).

 I believe that out of bounds indexes should always throw IndexError.  We
 can't change that in Python now, but maybe we can in numpy.
 -1. Regardless of the merits if we had a blank slate, there is code that
 depends on this, specifically my code. It simplifies certain operations
 that would otherwise need tedious special-case handling.
 
 For example?

def ichunk(arr, chunk_size=10):
    for i in range(0, len(arr), chunk_size):
        yield arr[i:i+chunk_size]
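
For instance, the clipping is exactly what lets the final short chunk fall
out for free:

import numpy as np

x = np.arange(25)
print([len(c) for c in ichunk(x)])   # [10, 10, 5] -- the last slice,
                                     # x[20:30], is clipped to x[20:25]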

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth.
   -- Umberto Eco


Re: [Numpy-discussion] RFC: out of range slice indexes

2008-01-14 Thread Neal Becker
Robert Kern wrote:

 Neal Becker wrote:
 I've never liked that python silently ignores slices with out of range
 indexes.  I believe this is a source of bugs (it has been for me).  It
 goes completely counter to the python philosophy.
 
 I vote to ban them from numpy.
 >>> from numpy import array
 >>> x = array (xrange (10))
 >>> x[11]
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 IndexError: index out of bounds
 >>> x[:12] = 2
 >>> x
 array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
 >>> len (x)
 10
 
 Silently ignoring the error in x[:12] is a bad idea, IMO.  If it meant to
 _extend_ x to have length 12, at least _that_ would be reasonable (but
 I'm not advocating that position).
 
 I believe that out of bounds indexes should always throw IndexError.  We
 can't change that in Python now, but maybe we can in numpy.
 
 -1. Regardless of the merits if we had a blank slate, there is code that
 depends on this, specifically my code. It simplifies certain operations
 that would otherwise need tedious special-case handling.
 

For example?



Re: [Numpy-discussion] RFC: out of range slice indexes

2008-01-14 Thread Timothy Hochberg
On Jan 14, 2008 12:37 PM, Neal Becker [EMAIL PROTECTED] wrote:

 I've never liked that python silently ignores slices with out of range
 indexes.  I believe this is a source of bugs (it has been for me).  It
 goes
 completely counter to the python philosophy.

 I vote to ban them from numpy.
  >>> from numpy import array
  >>> x = array (xrange (10))
  >>> x[11]
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  IndexError: index out of bounds
  >>> x[:12] = 2
  >>> x
  array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
  >>> len (x)
  10

 Silently ignoring the error in x[:12] is a bad idea, IMO.  If it meant to
 _extend_ x to have length 12, at least _that_ would be reasonable (but I'm
 not advocating that position).

 I believe that out of bounds indexes should always throw IndexError.  We
 can't change that in Python now, but maybe we can in numpy.



-1. Regardless of the possible merit of this on its face, I think this is
an area where we should maintain compatibility with Python sequences. Not to
mention that it would likely break a bunch of code, including my own.





Re: [Numpy-discussion] Nose testing for numpy

2008-01-14 Thread Travis E. Oliphant
Matthew Brett wrote:
 Hi,

 I've just finished moving the scipy tests over to nose.

 Thinking about it, it seems to me to be a good idea to do the same for numpy.
   
We talked about this at the SciPy Sprint.  Eventually, we will get 
there.  However, if we do it before 1.0.5, it will require nose to run 
the NumPy tests.  I'm concerned about making this kind of change prior to 1.1.

So, the strategy is to support what scipy needs for nose testing inside 
of NumPy for now, and wait until 1.1 to move over to requiring nose for 
all of NumPy's tests.

Yes, it is not as clean, but I really hesitate to make a wholesale 
switch at version 1.0.5.

-Travis O.




Re: [Numpy-discussion] Nose testing for numpy

2008-01-14 Thread Matthew Brett
Hi,

 We talked about this at the SciPy Sprint.  Eventually, we will get
 there.  However, if we do it before 1.0.5, it will require nose to run
 the NumPy tests.  I'm concerned about making this kind of change prior to 1.1.

Ah, sorry, I had heard of the conclusion, but thought it was due to
the 2.4 dependency.

Matthew


Re: [Numpy-discussion] casting

2008-01-14 Thread Robert Kern
Neal Becker wrote:
 Jon Wright wrote:
 
 I'm sorry, I still think we're talking past each other. What do you mean
 by "native data type"? If you just want to get an ndarray without
 specifying a type, use PyArray_FROM_O(). That's what it's for. You don't
 need to know the data type beforehand.
 What I have wanted in the past (and what I thought Neal was after) is a
 way to choose which function to call according to the typecode of the
 data as it is currently in memory. I don't want to convert (or cast or
 even touch the data) but just call a type specific function instead. C++
 templates can take some of the tedium out of that, but in some cases
 algorithms may be different too. Guessing which sort algorithm to use
 springs to mind.

 Rather than saying "give me the right kind of array", I think there is
 an interest in saying "choose which function is the best for this data".
 Something like:

 PyArrayObject* array = (PyArrayObject*) PyArray_FROM_O( (PyObject*) O );
 int type = array->descr->type_num;
 switch (type){
    case NPY_BYTE  : signed_func(array);   break;
    case NPY_UBYTE : unsigned_func(array); break;
    // etc
 }

 It sort of implies having a C++ type hierarchy for numpy arrays and
 casting array to be a PyFloatArray or PyDoubleArray etc?

 The extra confusion might be due to the way arrays can be laid out in
 memory - indexing into array slices is not always obvious. Also if you
 want to make sure your inner loop goes over the fast index you might
 want an algorithm which reads the strides when it runs.

 Sorry if I've only added to the confusion.

 Cheers,

 Jon
 
 This is close to what I'm doing.  If I really can handle any type, then
 FROM_O is fine.
 
 Commonly, it's a little more complicated.
 
 Here is an example, in pseudo-code (real code is in C++):

 if native_type_of(x) in (int, long):
     do_something_with(convert_to_long_array(x))
 elif native_type_of(x) is complex:
     do_something_with(convert_to_complex_array(x))
 
 In the above, native_type_of means:
 if hasattr(x, '__array_struct__'):
     array_if = PyCObject_AsVoidPtr(x.__array_struct__)
     return map_array_if_to_typenum(array_if)
 
 So this means: for any numpy array, or any type supporting the
 __array_struct__ protocol, find out the native data type.
 
 I don't want to use FROM_O here, because I really can only handle certain
 types.  If I used FROM_O, then after calling FROM_O, if the type was not
 one I could handle, I'd have to call FromAny and convert it.  Or, in the
 case above, if I was given an array of int32 which I'd want to handle as
 long (int64), I'd have to convert it again.

Okay, I think I see now. I'm not sure what numpy could do to make your code
more elegant. I would recommend just using PyArray_FROM_O() or
PyArray_EnsureArray() to get a real ndarray object. They should not copy when
the object is already an ndarray or when it satisfies the array interface;
they will also handle the cases where you have a Python-level
__array_interface__ or just nested sequences. You can look at the
PyArray_Descr directly, dispatch the types that you can handle, and then
convert for the types which you can't. You will not get any extraneous
conversions or copies of data.

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth.
   -- Umberto Eco