Hi,

On Sat, Mar 30, 2013 at 1:57 PM, <josef.p...@gmail.com> wrote:
> On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett <matthew.br...@gmail.com>
> wrote:
>> Hi,
>>
>> On Sat, Mar 30, 2013 at 4:14 AM, <josef.p...@gmail.com> wrote:
>>> On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett <matthew.br...@gmail.com>
>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> We were teaching today, and found ourselves getting very confused
>>>> about ravel and shape in numpy.
>>>>
>>>> Summary
>>>> --------------
>>>>
>>>> There are two separate ideas needed to understand ordering in ravel and
>>>> reshape:
>>>>
>>>> Idea 1): ravel / reshape can proceed from the last axis to the first,
>>>> or the first to the last. This is "ravel index ordering"
>>>> Idea 2) The physical layout of the array (on disk or in memory) can be
>>>> "C" or "F" contiguous or neither.
>>>> This is "memory ordering"
>>>>
>>>> The index ordering is usually (but see below) orthogonal to the memory
>>>> ordering.
>>>>
>>>> The 'ravel' and 'reshape' commands use "C" and "F" in the sense of
>>>> index ordering, and this mixes the two ideas and is confusing.
>>>>
>>>> What the current situation looks like
>>>> ----------------------------------------------------
>>>>
>>>> Specifically, we've been rolling this around 4 experienced numpy users
>>>> and we all predicted at least one of the results below wrongly.
>>>>
>>>> This was what we knew, or should have known:
>>>>
>>>> In [2]: import numpy as np
>>>>
>>>> In [3]: arr = np.arange(10).reshape((2, 5))
>>>>
>>>> In [5]: arr.ravel()
>>>> Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>>>
>>>> So, the 'ravel' operation unravels over the last axis (1) first,
>>>> followed by axis 0.
>>>>
>>>> So far so good (even if the opposite to MATLAB, Octave).
>>>>
>>>> Then we found the 'order' flag to ravel:
>>>>
>>>> In [10]: arr.flags
>>>> Out[10]:
>>>>   C_CONTIGUOUS : True
>>>>   F_CONTIGUOUS : False
>>>>   OWNDATA : False
>>>>   WRITEABLE : True
>>>>   ALIGNED : True
>>>>   UPDATEIFCOPY : False
>>>>
>>>> In [11]: arr.ravel('C')
>>>> Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>>>
>>>> But we soon got confused. How about this?
>>>>
>>>> In [12]: arr_F = np.array(arr, order='F')
>>>>
>>>> In [13]: arr_F.flags
>>>> Out[13]:
>>>>   C_CONTIGUOUS : False
>>>>   F_CONTIGUOUS : True
>>>>   OWNDATA : True
>>>>   WRITEABLE : True
>>>>   ALIGNED : True
>>>>   UPDATEIFCOPY : False
>>>>
>>>> In [16]: arr_F
>>>> Out[16]:
>>>> array([[0, 1, 2, 3, 4],
>>>>        [5, 6, 7, 8, 9]])
>>>>
>>>> In [17]: arr_F.ravel('C')
>>>> Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>>>
>>>> Right - so the flag 'C' to ravel has got nothing to do with *memory*
>>>> ordering, but is to do with *index* ordering.
>>>>
>>>> And in fact, we can ask for memory ordering specifically:
>>>>
>>>> In [22]: arr.ravel('K')
>>>> Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>>>
>>>> In [23]: arr_F.ravel('K')
>>>> Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])
>>>>
>>>> In [24]: arr.ravel('A')
>>>> Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>>>
>>>> In [25]: arr_F.ravel('A')
>>>> Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])
>>>>
>>>> There are some confusions to get into with the 'order' flag to reshape
>>>> as well, of the same type.
>>>>
>>>> Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering.
>>>>
>>>> This is very confusing. We think the index ordering and memory
>>>> ordering ideas need to be separated, and specifically, we should avoid
>>>> using "C" and "F" to refer to index ordering.
>>>>
>>>> Proposal
>>>> -------------
>>>>
>>>> * Deprecate the use of "C" and "F" meaning backwards and forwards
>>>> index ordering for ravel, reshape
>>>> * Prefer "Z" and "N", being graphical representations of unraveling in
>>>> 2 dimensions, axis1 first and axis0 first respectively (excellent
>>>> naming idea by Paul Ivanov)
>>>>
>>>> What do y'all think?
>>>>
>>>> Cheers,
>>>>
>>>> Matthew
>>>> Paul Ivanov
>>>> JB Poline
>>>> _______________________________________________
>>>> NumPy-Discussion mailing list
>>>> NumPy-Discussion@scipy.org
>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>
>>>
>>>
>>> I always thought "F" and "C" are easy to understand, I always thought about
>>> the content and never about the memory when using it.
>>
>> I can only say that 4 out of 4 experienced numpy developers found
>> themselves unable to predict the behavior of these functions before
>> they saw the output.
>>
>> The problem is always that explaining something makes it clearer for a
>> moment, but, for those who do not have the explanation or who have
>> forgotten it, at least among us here, the outputs were generating
>> groans and / or high fives as we incorrectly or correctly guessed what
>> was going to happen.
>>
>> I think the only way to find out whether this really is confusing or
>> not, is to put someone in front of these functions without any
>> explanation and ask them to predict what is going to come out of the
>> various inputs and flags. Or to try and teach it, which was the
>> problem we were having.
>
> changing the names doesn't make it easier to understand.
> I think the confusion is because the new A and K refer to existing memory
>
>
> ``ravel`` is just stacking columns ('F') or stacking rows ('C'), I
> don't remember having seen any weird cases.
> ------------
>
> I always thought of "order" in array creation as the memory layout we
> want for the *target* array, having nothing to do with the existing
> memory layout (creating a view or copy as needed).
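
For array creation that reading can be checked directly. Here is a minimal sketch, assuming nothing beyond NumPy itself; the names src, as_c and as_f are made up for illustration. It suggests that `order` in np.array picks the memory layout of the new array while the values seen through indexing stay the same:

```python
import numpy as np

# Hypothetical names, for illustration only.
src = np.arange(10).reshape((2, 5))

# In array creation, `order` chooses the memory layout of the *target* array.
as_c = np.array(src, order='C')
as_f = np.array(src, order='F')

# Indexing gives the same values for both copies ...
same_values = np.array_equal(as_c, as_f)

# ... but the strides (bytes to step along each axis) differ.
item = src.itemsize
c_strides = as_c.strides   # last axis is the contiguous one
f_strides = as_f.strides   # first axis is the contiguous one

print(same_values, as_c.flags.c_contiguous, as_f.flags.f_contiguous)
print(c_strides, f_strides)
```
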
In the case of ravel of course F and C in memory aren't relevant.
'F' and 'C' don't refer to target memory layout at all in 'reshape':

In [26]: a = np.arange(10).reshape((2, 5))

In [28]: a.reshape((2, 5), order='F').flags
Out[28]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

So I think that distinction is actively confusing in this case, and more
evidence that this is not the right name for what we mean.

Cheers,

Matthew
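
For anyone who wants to check the two ideas side by side, here is a minimal sketch, assuming a reasonably recent NumPy; the helper names (last_axis_first, via_transpose and so on) are only for illustration. It checks that 'C' and 'F' passed to ravel and reshape select an index ordering and give the same values whatever the memory layout, while 'K' follows the memory layout:

```python
import numpy as np

arr = np.arange(10).reshape((2, 5))   # C-contiguous in memory
arr_F = np.array(arr, order='F')      # same values by index, F-contiguous in memory

last_axis_first = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
first_axis_first = [0, 5, 1, 6, 2, 7, 3, 8, 4, 9]

# Index ordering: 'C' and 'F' ignore how the data sits in memory.
ravel_C = [a.ravel('C').tolist() for a in (arr, arr_F)]
ravel_F = [a.ravel('F').tolist() for a in (arr, arr_F)]

# Memory ordering: 'K' follows the layout, so the two arrays differ.
ravel_K = [a.ravel('K').tolist() for a in (arr, arr_F)]

# reshape with order='F' reads and writes with the first axis varying
# fastest; it is equivalent to a transpose, a 'C' reshape, and a transpose.
reshaped = arr.reshape((5, 2), order='F')
via_transpose = arr.T.reshape((2, 5)).T

# order='F' in reshape does not ask for an F-contiguous result.
same_shape = arr.reshape((2, 5), order='F')

print(ravel_C, ravel_F, ravel_K, sep='\n')
print(reshaped)
print(same_shape.flags.c_contiguous, same_shape.flags.f_contiguous)
```

The transpose identity is one way to explain order='F' in terms of index ordering alone, without mentioning memory.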