Re: [Numpy-discussion] Nasty bug using pre-initialized arrays

2008-01-07 Thread Ryan May
Charles R Harris wrote:
 
 
 On Jan 7, 2008 8:47 AM, Ryan May [EMAIL PROTECTED] wrote:
 
 Stuart Brorson wrote:
  I realize NumPy != Matlab, but I'd wager that most users would think
  that this is the natural behavior..
  Well, that behavior won't happen. We won't mutate the dtype of
 the array because
  of assignment. Matlab has copy(-on-write) semantics for things
 like slices while
  we have view semantics. We can't safely do the reallocation of
 memory [1].
 
  That's fair enough.  But then I think NumPy should consistently
  typecheck all assignments and throw an exception if the user attempts
  an assignment which loses information.
 
 
 Yeah, there's no doubt in my mind that this is a bug, if for no other
 reason than this inconsistency:
 
 
 One place where Numpy differs from MatLab is the way memory is handled.
 MatLab is always generating new arrays, so for efficiency it is worth
 preallocating arrays and then filling in the parts. This is not the case
 in Numpy where lists can be used for things that grow and subarrays are
 views. Consequently, preallocating arrays in Numpy should be rare, used
 mainly when the values have to be generated explicitly, which is
 what you see when using the indexes in your first example. As to
 assignment between arrays, it is a mixed question. The problem again is
 memory usage. For large arrays, it makes sense to do automatic
 conversions, as is also the case in functions taking output arrays,
 because the typecast can be pushed down into C where it is time and
 space efficient, whereas explicitly converting the array uses up
 temporary space. However, I can imagine an explicit typecast function,
 something like
 
 a[...] = typecast(b)
 
 that would replace the current behavior. I think the typecast function
 could be implemented by returning a view of b with a castable flag set
 to true, that should supply enough information for the assignment
 operator to do its job. This might be a good addition for Numpy 1.1.

While that seems like an ok idea, I'm still not sure what's wrong with
raising an exception when there will be information loss.  The exception
is already raised with standard python complex objects.  I can think of
many times in my code where explicit looping is a necessity, so
pre-allocating the array is the only way to go.
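
For example, in a plain interpreter session (the exact wording of the
message varies with the Python version):

>>> float(1 + 2j)
Traceback (most recent call last):
  ...
TypeError: can't convert complex to float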

-- 
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma


Re: [Numpy-discussion] Nasty bug using pre-initialized arrays

2008-01-07 Thread Zachary Pincus
 For large arrays, it makes sense to do automatic
 conversions, as is also the case in functions taking output arrays,
 because the typecast can be pushed down into C where it is time and
 space efficient, whereas explicitly converting the array uses up
 temporary space. However, I can imagine an explicit typecast function,
 something like

 a[...] = typecast(b)

 that would replace the current behavior. I think the typecast function
 could be implemented by returning a view of b with a castable flag set
 to true, that should supply enough information for the assignment
 operator to do its job. This might be a good addition for Numpy 1.1.

 While that seems like an ok idea, I'm still not sure what's wrong with
 raising an exception when there will be information loss.  The exception
 is already raised with standard python complex objects.  I can think of
 many times in my code where explicit looping is a necessity, so
 pre-allocating the array is the only way to go.

The issue Charles is dealing with here is how to *suppress* the  
proposed exception in cases (as the several that I described) where  
the information loss is explicitly desired.

With what's in numpy now, you would have to do something like this:
A[...] = B.astype(A.dtype)
to set a portion of A to B, unless you are *certain* that A and B are  
of compatible types.

This is ugly and also bug-prone, seeing as how there's some violation  
of the don't-repeat-yourself principle. (I.e. A is mentioned twice,  
and to change the code to use a different array, you need to change  
the variable name A twice.)

Moreover, and worse, the phrase 'A[...] = B.astype(A.dtype)' creates and
destroys a temporary array the same size as B. It's equivalent to:
temp = B.astype(A.dtype)
A[...] = temp

Which is no good if B is very large. Currently, the type conversion  
in 'A[...] = B' cases is handled implicitly, deep down in the C code  
where it is very very fast, and no temporary array is made.
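
To make the difference concrete, here is a minimal sketch of the two
spellings (the array names and sizes are just placeholders):

import numpy as np

A = np.zeros(5, dtype=np.int32)
B = np.linspace(0.0, 100.0, 5)    # float64 source

A[...] = B.astype(A.dtype)   # allocates a full int32 copy of B first
A[...] = B                   # casts element by element in C, no temporary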

Charles suggests a 'typecast' operator that would set a flag on the  
array B so that trying to convert it would *not* raise an exception,  
allowing for the fast, C-level conversion. (This assumes your  
proposed change in which by default such conversions would raise  
exceptions.) This 'typecast' operation wouldn't do anything but set a  
flag, so it doesn't create a temporary array or do any extra work.
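
In pure Python, the idea might be sketched roughly like this. None of
this is a real NumPy API: the name 'typecast', the marker subclass, and
the policy are invented for illustration, and the real flag would live
in the C-level array struct.

import numpy as np

class _Castable(np.ndarray):
    # Marker type: assignment from such a view would be allowed to
    # narrow the dtype without raising the proposed exception.
    pass

def typecast(arr):
    # ndarray.view(type) returns a new array object sharing arr's
    # data, so no copy is made and no extra work is done.
    return np.asarray(arr).view(_Castable)

A = np.zeros(4, dtype=np.int32)
B = np.arange(4, dtype=np.float64)
A[...] = typecast(B)   # today this casts silently anyway; under the
                       # proposal the marker is what would permit it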

But still, every time that you are not *certain* what the type of a  
result from a given operation is, any code that looks like:
A[i] = calculate(...)
will need to look like this instead:
A[i] = typecast(calculate(...))

I agree with others that such assignments aren't highly common, but  
there will be broken code from this. And as Charles demonstrates,  
getting the details right of how to implement such a change is
non-trivial.

Zach



Re: [Numpy-discussion] Nasty bug using pre-initialized arrays

2008-01-07 Thread Anne Archibald
On 07/01/2008, Charles R Harris [EMAIL PROTECTED] wrote:

 One place where Numpy differs from MatLab is the way memory is handled.
 MatLab is always generating new arrays, so for efficiency it is worth
 preallocating arrays and then filling in the parts. This is not the case in
 Numpy where lists can be used for things that grow and subarrays are views.
 Consequently, preallocating arrays in Numpy should be rare, used mainly
 when the values have to be generated explicitly, which is what you see
 when using the indexes in your first example. As to assignment between
 arrays, it is a mixed question. The problem again is memory usage. For large
 arrays, it makes sense to do automatic conversions, as is also the case in
 functions taking output arrays, because the typecast can be pushed down into
 C where it is time and space efficient, whereas explicitly converting the
 array uses up temporary space. However, I can imagine an explicit typecast
 function, something like

 a[...] = typecast(b)

 that would replace the current behavior. I think the typecast function could
 be implemented by returning a view of b with a castable flag set to true,
 that should supply enough information for the assignment operator to do its
 job. This might be a good addition for Numpy 1.1.

This is introducing a fairly complex mechanism to cover, as far as I
can see, two cases: conversion of complex to real, and conversion of
float to integer. Conversion of complex to real can already be done
explicitly without a temporary:

a[...] = b.real

That leaves only conversion of float to integer.
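
Spelled out, the complex-to-real idiom needs no temporary because
b.real is a view onto the real components, and the discarded imaginary
part is explicit in the code:

import numpy as np

a = np.zeros(4)                        # float64 target
b = np.arange(4) + 1j * np.arange(4)   # complex128 source

a[...] = b.real    # copies only the real parts, no temporary array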

Does this single case cause enough confusion to warrant an exception
and a way around it?

Anne


Re: [Numpy-discussion] Nasty bug using pre-initialized arrays

2008-01-07 Thread Charles R Harris
On Jan 7, 2008 12:08 PM, Anne Archibald [EMAIL PROTECTED] wrote:

 On 07/01/2008, Charles R Harris [EMAIL PROTECTED] wrote:
 
  One place where Numpy differs from MatLab is the way memory is handled.
  MatLab is always generating new arrays, so for efficiency it is worth
  preallocating arrays and then filling in the parts. This is not the case
  in Numpy where lists can be used for things that grow and subarrays are
  views. Consequently, preallocating arrays in Numpy should be rare, used
  mainly when the values have to be generated explicitly, which is what
  you see when using the indexes in your first example. As to assignment
  between arrays, it is a mixed question. The problem again is memory
  usage. For large arrays, it makes sense to do automatic conversions, as
  is also the case in functions taking output arrays, because the typecast
  can be pushed down into C where it is time and space efficient, whereas
  explicitly converting the array uses up temporary space. However, I can
  imagine an explicit typecast function, something like
 
  a[...] = typecast(b)
 
  that would replace the current behavior. I think the typecast function
  could be implemented by returning a view of b with a castable flag set
  to true, that should supply enough information for the assignment
  operator to do its job. This might be a good addition for Numpy 1.1.

 This is introducing a fairly complex mechanism to cover, as far as I
 can see, two cases: conversion of complex to real, and conversion of
 float to integer. Conversion of complex to real can already be done
 explicitly without a temporary:

 a[...] = b.real

 That leaves only conversion of float to integer.

 Does this single case cause enough confusion to warrant an exception
 and a way around it?


I think it could be something like C++ where implicit downcasts in general
are flagged. If we went that way, conversions like uint32 to int32 would
also be flagged. The question is how much care we want to take with types.
Since Numpy is much more type-oriented than, say, MatLab, such type safety
might be a useful addition. Python scalar types would remain castable, so
you wouldn't have to typecast every darn thing.
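
For example, this assignment is silent today but would be flagged under
such a scheme (a small illustrative script):

import numpy as np

a = np.zeros(3, dtype=np.int32)
b = np.array([1, 2, 2**31 + 7], dtype=np.uint32)

a[...] = b    # would be flagged: the last value silently wraps
              # around to the negative int32 -2147483641
a[0] = 7      # a Python scalar would remain castable, no flag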

Chuck


Re: [Numpy-discussion] Nasty bug using pre-initialized arrays

2008-01-05 Thread Stuart Brorson
 I wasn't at all arguing that having complex data chopped and
 downcast into an int or float container was the right thing to do.

 indeed, it is clearly a bad thing to do -- but a bug magnet? I'm not so
 sure; surely anyone who had any idea at all what they were doing with
 complex numbers would notice it right away.

 To the OP: Did you waste any time thinking this was working right? Or
 was your time spent figuring out why numpy would do such a thing? Either
 way, it is wasted time.

Thanks for the question.  I spent about half an hour looking at my other
code trying to figure out why I was getting strange results.  Of
course, I suspected my own code to be the culprit, since NumPy is a
mature package, right?

Then, when I looked closely at the return array given by NumPy, I
noticed that it was real, when I was working with complex numbers.  I
said to myself, "WTF?"  Then a little more investigating revealed that
NumPy was silently truncating my complex array to real.

I respectfully submit that silently chopping the imaginary part *is* a
magnet for bugs, since many dumb NumPy users (like me) aren't
necessarily aware of the typecasting rules.  We're only thinking about
doing math, after all!

Stuart Brorson
Interactive Supercomputing, inc.
135 Beaver Street | Waltham | MA | 02452 | USA
http://www.interactivesupercomputing.com/


Re: [Numpy-discussion] Nasty bug using pre-initialized arrays

2008-01-05 Thread Zachary Pincus
 There are two related hierarchies of datatypes: different kinds (integer,
 floating point, complex floating point) and different precisions within a
 given kind (int8, int16, int32, int64). The term downcasting should
 probably be reserved for the latter only.

 It seems to me that Zach and Scott are defending downcasting of precisions
 within a given kind. It does not necessarily follow that the behavior we
 choose for dtypes within a given kind should be the behavior when we are
 dealing with dtypes across different kinds. We can keep the precision
 downcasting behavior that you want while raising an error when one
 attempts to assign a complex number into a floating point array.

Actually, my points were explicitly about cross-kind conversions! A  
lot of image-processing code features heavy use of converting integer  
images to float for intermediate calculations, and then rescaling and  
down-converting back to the original type for display, etc.

In fact, scipy.ndimage makes use of this, allowing for operations  
with output into user-specified arrays, or arrays of user-specified  
dtype, while (I believe) carrying out some of the intermediate  
calculations at higher precision.
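
A minimal sketch of that pattern (assuming scipy.ndimage and an
arbitrary choice of filter):

import numpy as np
from scipy import ndimage

img = (1000 * np.random.rand(512, 512)).astype(np.uint16)
out = np.empty_like(img)

# The filter does its intermediate math in floating point, but the
# result is written straight into the uint16 output array -- a
# float-to-int cross-kind cast on every element.
ndimage.gaussian_filter(img, sigma=3, output=out)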

As such, pretty much any code that takes a user-specified array and  
assigns to it the result of a (potentially) mixed-mode calculation  
would need to be changed from:

A[i] = calculate(B)
to
A[i] = calculate(B).astype(A.dtype)

with the proliferation of temp arrays that that entails.

I think, but am not sure, that there is a lot of code out there that  
does this, intentionally, which would be broken by this change.  
Explicit is indeed better than implicit, but in this case, finding  
all of the places where mixed-mode conversions happen and tracking  
them down could be a pain on the same scale as const chasing in C++,  
where fixing the error in one place makes it reappear elsewhere in  
the chain of usage, leading to long, painful, and often totally  
pointless debugging sessions.

In the specific case of converting from complex to anything else, I  
can see the desire for a warning to prevent data loss. But converting  
from float to integer is a lot more routine and is used a lot in the  
kind of work that I do, at least.

Zach


Re: [Numpy-discussion] Nasty bug using pre-initialized arrays

2008-01-04 Thread Stuart Brorson
On Fri, 4 Jan 2008, Stuart Brorson wrote:

 I just discovered this today.  It looks like a bug to me.  Please
 flame me mercilessly if I am wrong!  :-)

FWIW, here's what Matlab does:

 A = rand(1, 4) + rand(1, 4)*i

A =

   Columns 1 through 3

0.7833 + 0.7942i   0.6808 + 0.0592i   0.4611 + 0.6029i

   Column 4

0.5678 + 0.0503i

 B = zeros(1, 4)

B =

  0 0 0 0

 for idx=1:4; B(idx) = A(idx); end
 B

B =

   Columns 1 through 3

0.7833 + 0.7942i   0.6808 + 0.0592i   0.4611 + 0.6029i

   Column 4

0.5678 + 0.0503i


I realize NumPy != Matlab, but I'd wager that most users would think
that this is the natural behavior..


Stuart Brorson
Interactive Supercomputing, inc.
135 Beaver Street | Waltham | MA | 02452 | USA
http://www.interactivesupercomputing.com/



Re: [Numpy-discussion] Nasty bug using pre-initialized arrays

2008-01-04 Thread Stuart Brorson
 I just discovered this today.  It looks like a bug to me.  Please
 flame me mercilessly if I am wrong!  :-)

Hmmm... after a little more playing around, I think it's indeed true
that NumPy does a typecast to make the resulting assignment have the
same type as the LHS, regardless of the type of the RHS.  Below I
attach another example, which shows this behavior.

As a naive user, I would not expect that my variables would get
silently typecast in an assignment like this.  But is this behavior
Pythonic?  I'm not sure.  Normally, the Pythonic thing to do when
assigning non-conforming variables is to barf -- throw an exception.
At least that's what I would expect.

Comments?

Stuart Brorson
Interactive Supercomputing, inc.
135 Beaver Street | Waltham | MA | 02452 | USA
http://www.interactivesupercomputing.com/


---  session log  -

In [77]: A = 10*numpy.random.rand(4)

In [78]: B = numpy.zeros((4))

In [79]: B.dtype='int64'

In [80]:

In [80]: for i in range(4):
:   B[i] = A[i]
:

In [81]: A
Out[81]: array([ 1.71327285,  3.48128855,  7.51404178,  8.96775254])

In [82]: B
Out[82]: array([1, 3, 7, 8], dtype=int64)

---  /session log  -


Re: [Numpy-discussion] Nasty bug using pre-initialized arrays

2008-01-04 Thread Pierre GM
On Friday 04 January 2008 16:08:54 Stuart Brorson wrote:
 Sometimes you need to initialize an array using zeros() before doing
 an assignment to it in a loop.  If you assign a complex value to the
 initialized array, the imaginary part of the array is dropped.  Does
 NumPy do a silent type-cast which causes this behavior?  Is this
 typecast a feature?

By default, arrays created with zeros() are float64. If you try to force a 
complex value into one, yes, the imaginary part will be dropped. Use the dtype 
argument of zeros or ones to specify the type of your array, for example:

B =  numpy.zeros((4,), dtype=numpy.complex_)
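
With the array created as complex up front, the loop from the original
report keeps the imaginary parts:

import numpy

A = numpy.random.rand(4) + 1j * numpy.random.rand(4)
B = numpy.zeros((4,), dtype=numpy.complex_)

for i in range(4):
    B[i] = A[i]    # same kind on both sides: nothing is dropped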


[Numpy-discussion] Nasty bug using pre-initialized arrays

2008-01-04 Thread Stuart Brorson
NumPy gurus --

I just discovered this today.  It looks like a bug to me.  Please
flame me mercilessly if I am wrong!  :-)

Sometimes you need to initialize an array using zeros() before doing
an assignment to it in a loop.  If you assign a complex value to the
initialized array, the imaginary part of the array is dropped.  Does
NumPy do a silent type-cast which causes this behavior?  Is this
typecast a feature?

Below I attach a session log showing the bug.  Note that I have boiled
down my complex code to this simple case for ease of comprehension.  [1] 
I will also input this bug into the tracking system.

By the way, this is NumPy 1.0.4:

In [39]: numpy.__version__
Out[39]: '1.0.4'

Cheers,

Stuart Brorson
Interactive Supercomputing, inc.
135 Beaver Street | Waltham | MA | 02452 | USA
http://www.interactivesupercomputing.com/


---  session log  ---

In [29]: A = numpy.random.rand(4) + 1j*numpy.random.rand(4)

In [30]: B = numpy.zeros((4))

In [31]:

In [31]: for i in range(4):
:   B[i] = A[i]
:

In [32]: A
Out[32]: 
array([ 0.12150180+0.00577893j,  0.39792515+0.03607227j,
 0.61933379+0.04506978j,  0.56751678+0.24576083j])

In [33]: B
Out[33]: array([ 0.1215018 ,  0.39792515,  0.61933379,  0.56751678])

---  /session log  ---


[1]  Yes, I know that I should use vectorized code as often as
possible, and that this example is not vectorized.  This is a simple
example illustrating the problem.  Moreover, many times the
computation you wish to perform can't easily be vectorized, leaving
the nasty old for loop as the only choice..


Re: [Numpy-discussion] Nasty bug using pre-initialized arrays

2008-01-04 Thread Robert Kern
Stuart Brorson wrote:
 On Fri, 4 Jan 2008, Stuart Brorson wrote:
 
 I just discovered this today.  It looks like a bug to me.  Please
 flame me mercilessly if I am wrong!  :-)
 
 FWIW, here's what Matlab does:
 
 A = rand(1, 4) + rand(1, 4)*i
 
 A =
 
Columns 1 through 3
 
 0.7833 + 0.7942i   0.6808 + 0.0592i   0.4611 + 0.6029i
 
Column 4
 
 0.5678 + 0.0503i
 
 B = zeros(1, 4)
 
 B =
 
   0 0 0 0
 
 for idx=1:4; B(idx) = A(idx); end
 B
 
 B =
 
Columns 1 through 3
 
 0.7833 + 0.7942i   0.6808 + 0.0592i   0.4611 + 0.6029i
 
Column 4
 
 0.5678 + 0.0503i
 
 
 I realize NumPy != Matlab, but I'd wager that most users would think
 that this is the natural behavior..

Well, that behavior won't happen. We won't mutate the dtype of the array because
of assignment. Matlab has copy(-on-write) semantics for things like slices while
we have view semantics. We can't safely do the reallocation of memory [1].

[1] Well, we do have a .resize() method which will do the reallocation and raise
an exception if there are views lingering about. However, this is only done when
explicitly asked for because this is a feature that is useful in a limited
number of situations. We will not allow it to be done implicitly.
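
A short illustration of the footnote, run as a script (the exact
exception message may vary between versions):

import numpy as np

a = np.zeros(4)
a.resize(8)    # fine: no other references, reallocation is safe

b = np.zeros(4)
v = b[:2]      # a view keeps a reference to b's memory
b.resize(8)    # raises ValueError because a view is lingering about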

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth.
  -- Umberto Eco


Re: [Numpy-discussion] Nasty bug using pre-initialized arrays

2008-01-04 Thread Stuart Brorson
 I realize NumPy != Matlab, but I'd wager that most users would think
 that this is the natural behavior..

 Well, that behavior won't happen. We won't mutate the dtype of the array
 because of assignment. Matlab has copy(-on-write) semantics for things
 like slices while we have view semantics. We can't safely do the
 reallocation of memory [1].

That's fair enough.  But then I think NumPy should consistently
typecheck all assignments and throw an exception if the user attempts
an assignment which loses information.

If you point me to a file where assignments are done (particularly
from array elements to array elements) I can see if I can figure out
how to fix it and then submit a patch.  But I won't promise anything!
My brain hurts already after analyzing this feature.   :-)

Cheers,

Stuart Brorson
Interactive Supercomputing, inc.
135 Beaver Street | Waltham | MA | 02452 | USA
http://www.interactivesupercomputing.com/


Re: [Numpy-discussion] Nasty bug using pre-initialized arrays

2008-01-04 Thread Scott Ransom
On Friday 04 January 2008 05:17:56 pm Stuart Brorson wrote:
  I realize NumPy != Matlab, but I'd wager that most users would
  think that this is the natural behavior..
 
  Well, that behavior won't happen. We won't mutate the dtype of the
  array because of assignment. Matlab has copy(-on-write) semantics
  for things like slices while we have view semantics. We can't
  safely do the reallocation of memory [1].

 That's fair enough.  But then I think NumPy should consistently
 typecheck all assignments and throw an exception if the user attempts
 an assignment which loses information.

 If you point me to a file where assignments are done (particularly
 from array elements to array elements) I can see if I can figure out
 how to fix it and then submit a patch.  But I won't promise anything!
 My brain hurts already after analyzing this feature.   :-)

There is a long history in numeric/numarray/numpy about this feature.  
And for many of us, it really is a feature -- it prevents the automatic 
upcasting of arrays, which is very important if your arrays are huge 
(i.e. comparable in size to your system memory).

For instance in astronomy, where very large 16-bit integer or 32-bit 
float images or data-cubes are common, if you upcast your 32-bit floats 
accidentally because you are doing double precision math (i.e. the 
default in Python) near them, that can cause the program to swap out or 
die horribly.  In fact, this exact example is one of the reasons why 
the Space Telescope people initially developed numarray.  numpy has 
kept that model.  I agree, though, that when using very mixed types 
(i.e. complex and ints, for example), the results can be confusing.
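
A small sketch of the memory argument, with the sizes shrunk to stay
friendly; the point is that the stored array never silently grows to
64 bits:

import numpy as np

img = np.zeros((4096, 4096), dtype=np.float32)   # ~64 MB of pixels
x = np.linspace(0.0, 1.0, 4096)                  # float64

# The right-hand side is double-precision math, but the assignment
# casts it back down to float32 on the fly: the container stays
# 64 MB instead of doubling behind your back.
img[...] = np.sin(np.outer(x, x))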

Scott

-- 
Scott M. Ransom             Address:  NRAO
Phone:  (434) 296-0320                520 Edgemont Rd.
email:  [EMAIL PROTECTED]             Charlottesville, VA 22903 USA
GPG Fingerprint: 06A9 9553 78BE 16DB 407B  FFCA 9BFA B6FF FFD3 2989


Re: [Numpy-discussion] Nasty bug using pre-initialized arrays

2008-01-04 Thread Gael Varoquaux
On Fri, Jan 04, 2008 at 03:53:41PM -0700, Charles R Harris wrote:
Matlab support for different types was sort of kludged on. Matlab was
intended for computational convenience, not control of data types, and
started out as pretty much all doubles and double complex; it didn't
(doesn't?) even have integers.

It has integer arrays, and does not upcast them to double arrays. My
experience is that you are better off avoiding these (unless you talk to
hardware, which is why I use them) because many things won't work with
them.

Cheers,

Gaƫl


Re: [Numpy-discussion] Nasty bug using pre-initialized arrays

2008-01-04 Thread Timothy Hochberg
On Jan 4, 2008 3:28 PM, Scott Ransom [EMAIL PROTECTED] wrote:

 On Friday 04 January 2008 05:17:56 pm Stuart Brorson wrote:
   I realize NumPy != Matlab, but I'd wager that most users would
   think that this is the natural behavior..
  
   Well, that behavior won't happen. We won't mutate the dtype of the
   array because of assignment. Matlab has copy(-on-write) semantics
   for things like slices while we have view semantics. We can't
   safely do the reallocation of memory [1].
 
  That's fair enough.  But then I think NumPy should consistently
  typecheck all assignments and throw an exception if the user attempts
  an assignment which loses information.
 
  If you point me to a file where assignments are done (particularly
  from array elements to array elements) I can see if I can figure out
  how to fix it and then submit a patch.  But I won't promise anything!
  My brain hurts already after analyzing this feature.   :-)

 There is a long history in numeric/numarray/numpy about this feature.
 And for many of us, it really is a feature -- it prevents the automatic
 upcasting of arrays, which is very important if your arrays are huge
 (i.e. comparable in size to your system memory).

 For instance in astronomy, where very large 16-bit integer or 32-bit
 float images or data-cubes are common, if you upcast your 32-bit floats
 accidentally because you are doing double precision math (i.e. the
 default in Python) near them, that can cause the program to swap out or
 die horribly.  In fact, this exact example is one of the reasons why
 the Space Telescope people initially developed numarray.  numpy has
 kept that model.  I agree, though, that when using very mixed types
 (i.e. complex and ints, for example), the results can be confusing.



This isn't a very compelling argument in this case. The concern the numarray
people were addressing was the upcasting of precision. However, there are
two related hierarchies in numpy: one is the kind[1] of data, roughly bool,
int, float, complex. Each kind has various precisions. The numarray folks
were concerned with avoiding upcasting of precision, not with avoiding
upcasting of kinds. And, I can't see much (any?) justification for allowing
automagic downcasting of kind, complex-to-float being the most egregious,
other than backwards compatibility. This is clearly an opportunity for
confusion and likely a magnet for bugs. And, I've yet to see any useful
examples to support this behaviour. I imagine that there are some benefits,
but I doubt that they are compelling enough to justify the current
behaviour.




[1] I can't recall if this is the official terminology; I'm away from my
home computer at the moment and it's hard for me to check. The idea should
be clear, however.



-- 
[EMAIL PROTECTED]


Re: [Numpy-discussion] Nasty bug using pre-initialized arrays

2008-01-04 Thread Zachary Pincus
Hello all,

 That's well and good.  But NumPy should *never* automatically -- and
 silently -- chop the imaginary part off your complex array elements,
 particularly if you are just doing an innocent assignment!
 Doing something drastic like silently throwing half your data away can
 lead to all kinds of bugs in code written by somebody who is unaware
 of this behavior (i.e. most people)!

 It sounds to me like the right thing is to throw an exception instead
 of downcasting a data object.

I'm not sure that I agree! I'd rather not have to litter my code with
casting operations every time I wanted to down-convert data types
(creating yet more temporary arrays...) via assignment, e.g.:

A[i] = calculate(B).astype(A.dtype)
vs.
A[i] = calculate(B)

Further, writing functions to operate on generic user-provided output  
arrays (or arrays of user-provided dtype; both of which are common  
e.g. in scipy.ndimage) becomes more bug-prone, as every assignment  
would need to be modified as above.

This change would break a lot of the image-processing code I've  
written (where one often does computations in float or even double,  
and then re-scales and down-converts the result to integer for  
display), for example.

I guess that this could be rolled in via the geterr/seterr mechanism,  
and could by default just print warnings. I agree that silent  
truncation can be troublesome, but not having to spell every  
conversion out in explicit (and error-prone) detail is also pretty
useful. (And arguably more pythonic.)
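
To sketch what a warn-by-default policy might look like, here is a
pure-Python stand-in; the helper name and the policy are invented for
illustration, and nothing like it exists in numpy today:

import warnings
import numpy as np

def checked_assign(dst, src):
    # Warn, rather than raise, when the assignment crosses dtype
    # kinds in a lossy direction (complex -> float/int, float -> int).
    src = np.asarray(src)
    if src.dtype.kind == 'c' and dst.dtype.kind in 'if':
        warnings.warn('assignment discards imaginary parts', stacklevel=2)
    elif src.dtype.kind == 'f' and dst.dtype.kind == 'i':
        warnings.warn('assignment truncates floats to integers', stacklevel=2)
    dst[...] = src    # the cast still happens in C, with no temporary

A = np.zeros(4)
checked_assign(A, np.arange(4) + 1j)   # warns, then assigns the real parts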

Zach


Re: [Numpy-discussion] Nasty bug using pre-initialized arrays

2008-01-04 Thread Robert Kern
Zachary Pincus wrote:
 Hello all,
 
 That's well and good.  But NumPy should *never* automatically -- and
 silently -- chop the imaginary part off your complex array elements,
 particularly if you are just doing an innocent assignment!
 Doing something drastic like silently throwing half your data away can
 lead to all kinds of bugs in code written by somebody who is unaware
 of this behavior (i.e. most people)!

 It sounds to me like the right thing is to throw an exception instead
 of downcasting a data object.
 
 I'm not sure that I agree! I'd rather not have to litter my code with
 casting operations every time I wanted to down-convert data types
 (creating yet more temporary arrays...) via assignment, e.g.:
 
 A[i] = calculate(B).astype(A.dtype)
 vs.
 A[i] = calculate(B)
 
 Further, writing functions to operate on generic user-provided output  
 arrays (or arrays of user-provided dtype; both of which are common  
 e.g. in scipy.ndimage) becomes more bug-prone, as every assignment  
 would need to be modified as above.
 
 This change would break a lot of the image-processing code I've  
 written (where one often does computations in float or even double,  
 and then re-scales and down-converts the result to integer for  
 display), for example.
 
 I guess that this could be rolled in via the geterr/seterr mechanism,  
 and could by default just print warnings. I agree that silent  
 truncation can be troublesome, but not having to spell every  
 conversion out in explicit (and error-prone) detail is also pretty
 useful. (And arguably more pythonic.)

There's some confusion in the conversation here. Tim already identified it, but
since it's continuing, I'm going to reiterate.

There are two related hierarchies of datatypes: different kinds (integer,
floating point, complex floating point) and different precisions within a given
kind (int8, int16, int32, int64). The term downcasting should probably be
reserved for the latter only.

It seems to me that Zach and Scott are defending downcasting of precisions
within a given kind. It does not necessarily follow that the behavior we choose
for dtypes within a given kind should be the behavior when we are dealing with
dtypes across different kinds. We can keep the precision downcasting behavior
that you want while raising an error when one attempts to assign a complex
number into a floating point array.
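
numpy already exposes the kind axis of this hierarchy directly, so the
distinction is easy to state in code; the predicate below is a rough
illustration that ignores lossless cross-kind directions like int to
float:

import numpy as np

np.dtype(np.int16).kind        # 'i' -- signed integer
np.dtype(np.float32).kind      # 'f' -- floating point
np.dtype(np.complex128).kind   # 'c' -- complex floating point

def crosses_kind(dst_dtype, src_dtype):
    # Same kind: precision downcasting, which Zach and Scott defend.
    # Different kind: e.g. complex into float, which would raise.
    return np.dtype(dst_dtype).kind != np.dtype(src_dtype).kind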

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth.
  -- Umberto Eco


Re: [Numpy-discussion] Nasty bug using pre-initialized arrays

2008-01-04 Thread Chris Barker
Scott Ransom wrote:
 I wasn't at all arguing that having complex data chopped and
 downcast into an int or float container was the right thing to do.

indeed, it is clearly a bad thing to do -- but a bug magnet? I'm not so
sure; surely anyone who had any idea at all what they were doing with
complex numbers would notice it right away.

To the OP: Did you waste any time thinking this was working right? Or
was your time spent figuring out why numpy would do such a thing? Either
way, it is wasted time.

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR             (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

[EMAIL PROTECTED]