Re: [Numpy-discussion] Numpy 1.6 schedule (was: Numpy 2.0 schedule)

Travis Oliphant Sat, 05 Mar 2011 20:13:23 -0800

On Mar 5, 2011, at 5:10 PM, Mark Wiebe wrote:

> On Thu, Mar 3, 2011 at 10:54 PM, Ralf Gommers <[email protected]> 
> wrote:
> <snip>
>  
> >>> I've had a look at the bug tracker, here's a list of tickets for 1.6:
> >>> #1748 (blocker: regression for astype('str'))
> >>> #1619 (issue with dtypes, with patch)
> >>> #1749 (distutils, py 3.2)
> >>> #1601 (distutils, py 3.2)
> >>> #1622 (Solaris segfault, with patch)
> >>> #1713 (Solaris segfault)
> >>> #1631 (Solaris segfault)
> 
> The distutils tickets are resolved.
> 
> >>> Proposed schedule:
> >>> March 15: beta 1
> >>> March 28: rc 1
> >>> April 17: rc 2 (if needed)
> >>> April 24: final release
> 
> Any comments on the schedule or tickets?
> 
> That all looks fine to me. There are a few things that I've changed in the 
> core that could stand some discussion before being finalized in 1.6, mostly 
> due to what was required to make things work without depending on the data 
> type enumeration order. The combination of the numpy and scipy tests was 
> pretty effective, but as Travis mentioned my changes are fairly invasive.
> 
> * When copying array to array, structured types now copy based on field names 
> instead of positions, effectively behaving like a 'dict' instead of a 
> 'labeled tuple'. This behaviour is more intuitive to me, and several fixed 
> bugs such as dtype comparison completely ignoring the structured type data 
> suggest that this changes an area of numpy that has been used in a more 
> limited fashion. It might be worthwhile to introduce a tuple-style flag in a 
> future version which causes data to be copied by position instead of by name, 
> as it is likely useful in some contexts.


This is a semantic change that does make me a tiny bit nervous. Structured 
arrays are actually used quite a bit in the wild, and so this could raise some 
errors. What I don't know is how often sub-parts of structured arrays get 
copied into other structured arrays with a different order to the fields. 
From what I gather, Mark's changes would allow this case and do an arguably 
useful thing. Previously, a copy was only allowed if the structured array 
contained the same fields in the same order. It seems like this is a 
relaxation of a rule and should not raise any errors (unless extant code was 
relying on the previous errors for some reason). 
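
As a rough illustration of the two semantics under discussion, the sketch 
below performs the copy by hand, once by field name and once by position. It 
deliberately avoids plain array-to-array assignment, so it does not depend on 
what any particular NumPy release does. The field names 'a' and 'b' and all 
variable names are made up for the example.

```python
import numpy as np

# Source and destination share field names but list them in a different order.
src = np.array([(1, 2.0)], dtype=[('a', 'i4'), ('b', 'f8')])
dst_dtype = np.dtype([('b', 'f8'), ('a', 'i4')])

# "dict"-style copy: each destination field is filled from the source field
# with the same name.
by_name = np.zeros(src.shape, dtype=dst_dtype)
for name in dst_dtype.names:
    by_name[name] = src[name]

# "labeled tuple"-style copy: the i-th destination field is filled from the
# i-th source field, whatever the names are (values are cast as needed).
by_position = np.zeros(src.shape, dtype=dst_dtype)
for dst_name, src_name in zip(dst_dtype.names, src.dtype.names):
    by_position[dst_name] = src[src_name]
```

With reordered fields the two copies disagree, which is the case existing 
code would need to be checked for.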

> 
> * Array memory layouts are preserved in many cases. This means that if a, b 
> are Fortran ordered, a+b will be as well. It could be made more pervasive, 
> for example ndarray.copy defaults to C-order, and that could be changed to 
> 'K' to preserve the memory layout by default. Any comments about that?

I like this change quite a bit, but it has similar potential "expectation" 
issues. I think the default should be changed to 'K' in NumPy 2.0, but 
perhaps we should preserve C-order for now to avoid the subtle breakages that 
might occur based on changed expectations. What are others' thoughts? 
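
To make the question concrete, here is a small sketch. It assumes a NumPy 
that has the layout-preserving ufunc output and the 'K' order option 
described above; the shapes and variable names are arbitrary.

```python
import numpy as np

# Two Fortran-ordered (column-major) operands.
a = np.asfortranarray(np.arange(6.0).reshape(2, 3))
b = np.asfortranarray(np.ones((2, 3)))

# With layout preservation, the ufunc result follows the inputs' layout.
total = a + b

# ndarray.copy with its default order produces a C-ordered copy ...
c_copy = a.copy()

# ... while order='K' keeps the source layout as closely as possible.
k_copy = a.copy(order='K')

layouts = {
    'a + b is Fortran-ordered': total.flags.f_contiguous,
    'a.copy() is C-ordered': c_copy.flags.c_contiguous,
    "a.copy(order='K') is Fortran-ordered": k_copy.flags.f_contiguous,
}
```

Code that passes the result of copy() to a C routine expecting C-order is 
the kind of caller that a changed default could break.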

> 
> * The ufunc uses a more consistent algorithm for loop selection. The previous 
> algorithm was ad hoc and lacked symmetry, while the new algorithm is based on 
> a simple minimization definition. This change exposed a bug in scipy's 
> ndimage, which did not handle all of the numpy data type enums properly, so 
> it's possible there is more code out there which will be affected similarly.

This change has me the most nervous. I'm looking forward to the more 
consistent algorithm. As I said, the algorithm presently used has been there 
since Numeric in 1995 (I modified it only a little bit to handle scalar-array 
casting rules a bit differently). This kind of change will have different 
corner cases and this should be understood before a release. 
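
For readers unfamiliar with loop selection, the sketch below shows one 
simple rule: walk the ufunc's registered loops in order and take the first 
one that every input can be safely cast to. This is an illustration only. It 
is not the code in NumPy's core, and it is not claimed to be either the old 
or the new algorithm; the helper name first_safe_loop is hypothetical.

```python
import numpy as np

def first_safe_loop(ufunc, *input_dtypes):
    """Return the first registered loop all inputs can be safely cast to.

    Illustration only: this is not the selection code in NumPy's core.
    """
    for loop in ufunc.types:
        # A loop is written like 'hh->h': input type chars, then outputs.
        in_chars = loop.split('->')[0]
        if len(in_chars) != len(input_dtypes):
            continue
        if all(np.can_cast(src, np.dtype(ch))
               for src, ch in zip(input_dtypes, in_chars)):
            return loop
    return None

# int8 and int16 operands: the first loop both can be safely cast to.
chosen = first_safe_loop(np.add, np.dtype(np.int8), np.dtype(np.int16))
```

Corner cases show up when different rules disagree about which loop is 
"closest", which is why the behaviour needs to be pinned down before a 
release.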

I'm also wondering what happened to the optional arguments to ufuncs (are they 
still there)? One of these allowed you to choose the loop yourself and bypass 
the selection algorithm.
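
A sketch of what bypassing the automatic choice can look like, assuming the 
ufunc accepts a dtype keyword. Which optional arguments are still available 
is exactly the open question here, so treat the keyword as an assumption 
rather than a statement about the release.

```python
import numpy as np

x = np.array([1, 2, 3], dtype=np.int32)
y = np.array([4, 5, 6], dtype=np.int32)

# Automatic selection: both operands are int32, so an int32 loop is used.
auto = np.add(x, y)

# Requesting float64 makes the ufunc run a float64 loop, casting the
# integer inputs on the way in.
forced = np.add(x, y, dtype=np.float64)

# The loops a ufunc has registered are listed in its .types attribute.
loops = np.add.types
```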

> 
> In general, I've used the implementation strategy of substituting my code 
> into the core critical paths of numpy to maximize the amount of exercise it 
> gets. While this creates more short-term hiccups as we are seeing now, it 
> also means the new functionality conforms to the current system better and is 
> much more stable since it is getting well tested.

Thanks again for all the good core-algorithm work, Mark. You have been doing 
a great job. 

-Travis



> 
> -Mark
> _______________________________________________
> NumPy-Discussion mailing list
> [email protected]
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

---
Travis Oliphant
Enthought, Inc.
[email protected]
1-512-536-1057
http://www.enthought.com

