Hi,

On Sat, Mar 30, 2013 at 11:55 AM, Sebastian Berg
<sebast...@sipsolutions.net> wrote:
> On Fri, 2013-03-29 at 19:08 -0700, Matthew Brett wrote:
>> Hi,
>>
>> We were teaching today, and found ourselves getting very confused
>> about ravel and shape in numpy.
>>
>> Summary
>> --------------
>>
>> There are two separate ideas needed to understand ordering in ravel and
>> reshape:
>>
>> Idea 1): ravel / reshape can proceed from the last axis to the first,
>> or the first to the last.  This is "ravel index ordering"
>> Idea 2) The physical layout of the array (on disk or in memory) can be
>> "C" or "F" contiguous or neither.
>> This is "memory ordering"
>>
>> The index ordering is usually (but see below) orthogonal to the memory
>> ordering.
>>
>> The 'ravel' and 'reshape' commands use "C" and "F" in the sense of
>> index ordering, and this mixes the two ideas and is confusing.
>>
>> What the current situation looks like
>> ----------------------------------------------------
>>
>> Specifically, we've been rolling this around 4 experienced numpy users
>> and we all predicted at least one of the results below wrongly.
>>
>> This was what we knew, or should have known:
>>
>> In [2]: import numpy as np
>>
>> In [3]: arr = np.arange(10).reshape((2, 5))
>>
>> In [5]: arr.ravel()
>> Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>
>> So, the 'ravel' operation unravels over the last axis (1) first,
>> followed by axis 0.
>>
>> So far so good (even if the opposite to MATLAB, Octave).
>>
>> Then we found the 'order' flag to ravel:
>>
>> In [10]: arr.flags
>> Out[10]:
>>   C_CONTIGUOUS : True
>>   F_CONTIGUOUS : False
>>   OWNDATA : False
>>   WRITEABLE : True
>>   ALIGNED : True
>>   UPDATEIFCOPY : False
>>
>> In [11]: arr.ravel('C')
>> Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>
>> But we soon got confused.  How about this?
>>
>> In [12]: arr_F = np.array(arr, order='F')
>>
>> In [13]: arr_F.flags
>> Out[13]:
>>   C_CONTIGUOUS : False
>>   F_CONTIGUOUS : True
>>   OWNDATA : True
>>   WRITEABLE : True
>>   ALIGNED : True
>>   UPDATEIFCOPY : False
>>
>> In [16]: arr_F
>> Out[16]:
>> array([[0, 1, 2, 3, 4],
>>        [5, 6, 7, 8, 9]])
>>
>> In [17]: arr_F.ravel('C')
>> Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>
>> Right - so the flag 'C' to ravel, has got nothing to do with *memory*
>> ordering, but is to do with *index* ordering.
>>
>> And in fact, we can ask for memory ordering specifically:
>>
>> In [22]: arr.ravel('K')
>> Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>
>> In [23]: arr_F.ravel('K')
>> Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])
>>
>> In [24]: arr.ravel('A')
>> Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>
>> In [25]: arr_F.ravel('A')
>> Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])
>>
>> There are some confusions to get into with the 'order' flag to reshape
>> as well, of the same type.
>>
>> Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering.
>>
>> This is very confusing.  We think the index ordering and memory
>> ordering ideas need to be separated, and specifically, we should avoid
>> using "C" and "F" to refer to index ordering.
>>
>> Proposal
>> -------------
>>
>> * Deprecate the use of "C" and "F" meaning backwards and forwards
>> index ordering for ravel, reshape
>> * Prefer "Z" and "N", being graphical representations of unraveling in
>> 2 dimensions, axis1 first and axis0 first respectively (excellent
>> naming idea by Paul Ivanov)
>>
>> What do y'all think?
>>
>
> Personally I think it is clear enough and that "Z" and "N" would confuse
> me just as much (though I am used to the other names). Also "Z" and "N"
> would seem more like aliases, which would also make sense in the memory
> order context.
> If anything, I would prefer renaming the arguments iteration_order and
> memory_order, but it seems overdoing it...

I am not sure what you mean - at the moment there is one argument
called 'order' that can refer to iteration order or memory order.  Are
you proposing two arguments?

> Maybe the documentation could just be checked if it is always clear
> though. I.e. maybe it does not use "iteration" or "memory" order
> consistently (though I somewhat feel it is usually clear that it must be
> iteration order, since no numpy function cares about the input memory
> order as they will just do a copy if necessary).

Do you really mean this?  Numpy is full of 'order=' flags that refer to memory.

Cheers,

Matthew
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
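
A minimal sketch of the two meanings of 'order' discussed in this thread, assuming the same arr and arr_F as in the quoted session. The np.zeros call and the names same_values, layouts, ravel_C, ravel_F and ravel_K are illustrative additions, not something taken from the thread. It is only meant to show that 'order=' in np.array / np.zeros picks a memory layout, while 'C' / 'F' in ravel pick an index order.

```python
import numpy as np

arr = np.arange(10).reshape((2, 5))   # C-contiguous, as in the quoted session
arr_F = np.array(arr, order='F')      # order= here selects the *memory* layout

# Same values when indexed, different layout in memory.
same_values = np.array_equal(arr, arr_F)
layouts = (arr.flags['C_CONTIGUOUS'], arr_F.flags['F_CONTIGUOUS'])

# Array creation functions accept the same memory-layout flag.
z = np.zeros((2, 5), order='F')

# In ravel, 'C' and 'F' select the *index* order and ignore the layout,
# so both arrays give the same answer for a given flag.
ravel_C = (arr.ravel('C').tolist(), arr_F.ravel('C').tolist())
ravel_F = (arr.ravel('F').tolist(), arr_F.ravel('F').tolist())

# 'K' follows the layout in memory, so here the two arrays differ.
ravel_K = (arr.ravel('K').tolist(), arr_F.ravel('K').tolist())
```

Printing ravel_C, ravel_F and ravel_K side by side is probably the quickest way to see which flags depend on the memory layout and which do not.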