Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sun, Mar 31, 2013 at 12:50 AM, Matthew Brett wrote: > Hi, > > On Sat, Mar 30, 2013 at 9:37 PM, wrote: >> On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett >> wrote: >>> Hi, >>> >>> On Sat, Mar 30, 2013 at 7:02 PM, wrote: On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett wrote: > Hi, > > On Sat, Mar 30, 2013 at 7:50 PM, wrote: >> On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle >> wrote: >>> On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett >>> wrote: On Sat, Mar 30, 2013 at 2:20 PM, wrote: > On Sat, Mar 30, 2013 at 4:57 PM, wrote: >> On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett >> wrote: >>> On Sat, Mar 30, 2013 at 4:14 AM, wrote: On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett wrote: > > Ravel and reshape use the tems 'C' and 'F" in the sense of index > ordering. > > This is very confusing. We think the index ordering and memory > ordering ideas need to be separated, and specifically, we should > avoid > using "C" and "F" to refer to index ordering. > > Proposal > - > > * Deprecate the use of "C" and "F" meaning backwards and forwards > index ordering for ravel, reshape > * Prefer "Z" and "N", being graphical representations of > unraveling > in > 2 dimensions, axis1 first and axis0 first respectively (excellent > naming idea by Paul Ivanov) > > What do y'all think? I always thought "F" and "C" are easy to understand, I always thought about the content and never about the memory when using it. >> >> changing the names doesn't make it easier to understand. >> I think the confusion is because the new A and K refer to existing >> memory >> I disagree, I think it's confusing, but I have evidence, and that is that four out of four of us tested ourselves and got it wrong. Perhaps we are particularly dumb or poorly informed, but I think it's rash to assert there is no problem here. >> >> I think you are overcomplicating things or phrased it as a "trick >> question" > > I don't know what you mean by trick question - was there something > over-complicated in the example? 
I deliberately didn't include > various much more confusing examples in "reshape". I meant making the "candidates" think about memory instead of just column versus row stacking. >>> >>> To be specific, we were teaching about reshaping a (I, J, K, N) 4D >>> array, it was an image, with time as the 4th dimension (N time >>> points). Raveling and reshaping 3D and 4D arrays is a common thing >>> to do in neuroimaging, as you can imagine. >>> >>> A student asked what he would get back from raveling this array, a >>> concatenated time series, or something spatial? >>> >>> We showed (I'd worked it out by this time) that the first N values >>> were the time series given by [0, 0, 0, :]. >>> >>> He said - "Oh - I see - so the data is stored as a whole lot of time >>> series one by one, I thought it would be stored as a series of >>> images'. >>> >>> Ironically, this was a Fortran-ordered array in memory, and he was wrong. >>> >>> So, I think the idea of memory ordering and index ordering is very >>> easy to confuse, and comes up naturally. >>> >>> I would like, as a teacher, to be able to say something like: >>> >>> This is what C memory layout is (it's the memory layout that gives >>> arr.flags.C_CONTIGUOUS=True) >>> This is what F memory layout is (it's the memory layout that gives >>> arr.flags.F_CONTIGUOUS=True) >>> It's rather easy to get something that is neither C or F memory layout >>> Numpy does many memory layouts. >>> Ravel and reshape and numpy in general do not care (normally) about C >>> or F layouts, they only care about index ordering. >>> >>> My point, that I'm repeating, is that my job is made harder by >>> 'arr.ravel('F')'. >> >> But once you know that ravel and reshape don't care about memory, the >> ravel is easy to predict (maybe not easy to visualize in 4-D): > > But this assumes that you already know that there's such a thing as > memory layout, and there's such a thing as index ordering, and that > 'C' and 'F' in ravel refer to index ordering. 
Once you have that, > you're golden. I'm arguing it's markedly harder to get this > distinction, and keep it in mind, and teach it, if we are using the > 'C' and 'F" names for both things. No, I think you are still missing my point. I think explaining ravel and reshape F and C is easy (kind of) because the students don't need to know at that stage about m
Re: [Numpy-discussion] Indexing bug
Message: 2
Date: Sat, 30 Mar 2013 11:13:35 -0700
From: Jaime Fernández del Río
Subject: Re: [Numpy-discussion] Indexing bug?
To: Discussion of Numerical Python

On Sat, Mar 30, 2013 at 11:01 AM, Ivan Oseledets wrote:
> I am using numpy 1.6.1,
> and encountered a weird fancy indexing bug:
>
> import numpy as np
> c = np.random.randn(10,200,10);
>
> In [29]: print c[[0,1],:200,:2].shape
> (2, 200, 2)
>
> In [30]: print c[[0,1],:200,[0,1]].shape
> (2, 200)
>
> It means, that here fancy indexing is not working right for a 3d array.

--> It is working fine, review the docs:

http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#advanced-indexing

In your return, item [0, :] is c[0, :, 0] and item [1, :] is c[1, :, 1].

If you want a return of shape (2, 200, 2) where item [i, :, j] is c[i, :, j] you could use slicing:

c[:2, :200, :2]

or something more elaborate like:

c[np.arange(2)[:, None, None], np.arange(200)[:, None], np.arange(2)]

Jaime

---> Oh! So it is not a bug, it is a feature, which is completely incompatible with other array-based languages (MATLAB and Fortran). I cannot find a single explanation of why it is so in numpy. Taking submatrices from a matrix is a common operation, and the syntax above is a very natural way to take submatrices, not some weird diagonal.

i.e.,

c = np.random.randn(100,100)
d = c[[0,3],[2,3]]

should NOT produce two numbers! (and you can not do it using slices!)

In MATLAB and Fortran, c(indi,indj) will produce a 2 x 2 matrix. How can it be done in numpy (and why the complications)?

So, please consider this message as a feature request.

Ivan
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
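One way to get the MATLAB-style submatrix asked for here is np.ix_, which turns the index lists into an open mesh so that they are crossed with each other rather than paired element by element. A minimal sketch follows; the small integer array is made up for illustration (it is not the random array from the thread), and the 3d case uses zeros in place of random data.

```python
import numpy as np

# Hypothetical small array so the selected values are easy to read.
c = np.arange(100 * 100).reshape(100, 100)

# Paired (pointwise) fancy indexing: picks c[0, 2] and c[3, 3].
paired = c[[0, 3], [2, 3]]

# Outer (MATLAB-like) indexing: rows {0, 3} crossed with columns {2, 3}.
outer = c[np.ix_([0, 3], [2, 3])]

# The 3d case from the thread, with zeros standing in for random data.
c3 = np.zeros((10, 200, 10))
diag_shape = c3[[0, 1], :200, [0, 1]].shape
block_shape = c3[np.ix_([0, 1], np.arange(200), [0, 1])].shape

print(paired)                   # two numbers
print(outer)                    # a 2 x 2 block
print(diag_shape, block_shape)  # (2, 200) versus (2, 200, 2)
```

The paired form is what numpy's advanced indexing does by default; np.ix_ is just a convenience for building the broadcastable index arrays that Jaime wrote out by hand.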
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Sat, Mar 30, 2013 at 9:37 PM, wrote: > On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett > wrote: >> Hi, >> >> On Sat, Mar 30, 2013 at 7:02 PM, wrote: >>> On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett >>> wrote: Hi, On Sat, Mar 30, 2013 at 7:50 PM, wrote: > On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle > wrote: >> On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett >> wrote: >>> >>> On Sat, Mar 30, 2013 at 2:20 PM, wrote: >>> > On Sat, Mar 30, 2013 at 4:57 PM, wrote: >>> >> On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett >>> >> wrote: >>> >>> On Sat, Mar 30, 2013 at 4:14 AM, wrote: >>> On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett >>> wrote: >>> > >>> > Ravel and reshape use the tems 'C' and 'F" in the sense of index >>> > ordering. >>> > >>> > This is very confusing. We think the index ordering and memory >>> > ordering ideas need to be separated, and specifically, we should >>> > avoid >>> > using "C" and "F" to refer to index ordering. >>> > >>> > Proposal >>> > - >>> > >>> > * Deprecate the use of "C" and "F" meaning backwards and forwards >>> > index ordering for ravel, reshape >>> > * Prefer "Z" and "N", being graphical representations of >>> > unraveling >>> > in >>> > 2 dimensions, axis1 first and axis0 first respectively (excellent >>> > naming idea by Paul Ivanov) >>> > >>> > What do y'all think? >>> >>> I always thought "F" and "C" are easy to understand, I always >>> thought >>> about >>> the content and never about the memory when using it. >>> >> >>> >> changing the names doesn't make it easier to understand. >>> >> I think the confusion is because the new A and K refer to existing >>> >> memory >>> >> >>> >>> I disagree, I think it's confusing, but I have evidence, and that is >>> that four out of four of us tested ourselves and got it wrong. >>> >>> Perhaps we are particularly dumb or poorly informed, but I think it's >>> rash to assert there is no problem here. 
> > I think you are overcomplicating things or phrased it as a "trick > question" I don't know what you mean by trick question - was there something over-complicated in the example? I deliberately didn't include various much more confusing examples in "reshape". >>> >>> I meant making the "candidates" think about memory instead of just >>> column versus row stacking. >> >> To be specific, we were teaching about reshaping a (I, J, K, N) 4D >> array, it was an image, with time as the 4th dimension (N time >> points). Raveling and reshaping 3D and 4D arrays is a common thing >> to do in neuroimaging, as you can imagine. >> >> A student asked what he would get back from raveling this array, a >> concatenated time series, or something spatial? >> >> We showed (I'd worked it out by this time) that the first N values >> were the time series given by [0, 0, 0, :]. >> >> He said - "Oh - I see - so the data is stored as a whole lot of time >> series one by one, I thought it would be stored as a series of >> images'. >> >> Ironically, this was a Fortran-ordered array in memory, and he was wrong. >> >> So, I think the idea of memory ordering and index ordering is very >> easy to confuse, and comes up naturally. >> >> I would like, as a teacher, to be able to say something like: >> >> This is what C memory layout is (it's the memory layout that gives >> arr.flags.C_CONTIGUOUS=True) >> This is what F memory layout is (it's the memory layout that gives >> arr.flags.F_CONTIGUOUS=True) >> It's rather easy to get something that is neither C or F memory layout >> Numpy does many memory layouts. >> Ravel and reshape and numpy in general do not care (normally) about C >> or F layouts, they only care about index ordering. >> >> My point, that I'm repeating, is that my job is made harder by >> 'arr.ravel('F')'. 
> > But once you know that ravel and reshape don't care about memory, the > ravel is easy to predict (maybe not easy to visualize in 4-D): But this assumes that you already know that there's such a thing as memory layout, and there's such a thing as index ordering, and that 'C' and 'F' in ravel refer to index ordering. Once you have that, you're golden. I'm arguing it's markedly harder to get this distinction, and keep it in mind, and teach it, if we are using the 'C' and 'F" names for both things. > order=C: stack the last dimension, N, time series of one 3d pixels, > then stack the time series of the next pixel... > process pixels by depth and the row by row (like old TVs) > > I assume you did this because your underlying array is C contiguous. > so your ravel('C') is a c-contiguous view (instead of some weird > strides or a copy) So
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett wrote: > Hi, > > On Sat, Mar 30, 2013 at 7:02 PM, wrote: >> On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett >> wrote: >>> Hi, >>> >>> On Sat, Mar 30, 2013 at 7:50 PM, wrote: On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle wrote: > On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett > wrote: >> >> On Sat, Mar 30, 2013 at 2:20 PM, wrote: >> > On Sat, Mar 30, 2013 at 4:57 PM, wrote: >> >> On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett >> >> wrote: >> >>> On Sat, Mar 30, 2013 at 4:14 AM, wrote: >> On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett >> wrote: >> > >> > Ravel and reshape use the tems 'C' and 'F" in the sense of index >> > ordering. >> > >> > This is very confusing. We think the index ordering and memory >> > ordering ideas need to be separated, and specifically, we should >> > avoid >> > using "C" and "F" to refer to index ordering. >> > >> > Proposal >> > - >> > >> > * Deprecate the use of "C" and "F" meaning backwards and forwards >> > index ordering for ravel, reshape >> > * Prefer "Z" and "N", being graphical representations of unraveling >> > in >> > 2 dimensions, axis1 first and axis0 first respectively (excellent >> > naming idea by Paul Ivanov) >> > >> > What do y'all think? >> >> I always thought "F" and "C" are easy to understand, I always >> thought >> about >> the content and never about the memory when using it. >> >> >> >> changing the names doesn't make it easier to understand. >> >> I think the confusion is because the new A and K refer to existing >> >> memory >> >> >> >> I disagree, I think it's confusing, but I have evidence, and that is >> that four out of four of us tested ourselves and got it wrong. >> >> Perhaps we are particularly dumb or poorly informed, but I think it's >> rash to assert there is no problem here. 
I think you are overcomplicating things or phrased it as a "trick question" >>> >>> I don't know what you mean by trick question - was there something >>> over-complicated in the example? I deliberately didn't include >>> various much more confusing examples in "reshape". >> >> I meant making the "candidates" think about memory instead of just >> column versus row stacking. > > To be specific, we were teaching about reshaping a (I, J, K, N) 4D > array, it was an image, with time as the 4th dimension (N time > points). Raveling and reshaping 3D and 4D arrays is a common thing > to do in neuroimaging, as you can imagine. > > A student asked what he would get back from raveling this array, a > concatenated time series, or something spatial? > > We showed (I'd worked it out by this time) that the first N values > were the time series given by [0, 0, 0, :]. > > He said - "Oh - I see - so the data is stored as a whole lot of time > series one by one, I thought it would be stored as a series of > images'. > > Ironically, this was a Fortran-ordered array in memory, and he was wrong. > > So, I think the idea of memory ordering and index ordering is very > easy to confuse, and comes up naturally. > > I would like, as a teacher, to be able to say something like: > > This is what C memory layout is (it's the memory layout that gives > arr.flags.C_CONTIGUOUS=True) > This is what F memory layout is (it's the memory layout that gives > arr.flags.F_CONTIGUOUS=True) > It's rather easy to get something that is neither C or F memory layout > Numpy does many memory layouts. > Ravel and reshape and numpy in general do not care (normally) about C > or F layouts, they only care about index ordering. > > My point, that I'm repeating, is that my job is made harder by > 'arr.ravel('F')'. 
But once you know that ravel and reshape don't care about memory, the ravel is easy to predict (maybe not easy to visualize in 4-D):

order=C: stack the last dimension, N, the time series of one 3d pixel, then stack the time series of the next pixel...
process pixels by depth and then row by row (like old TVs)

I assume you did this because your underlying array is C contiguous, so your ravel('C') is a C-contiguous view (instead of some weird strides or a copy).

I usually prefer time in the first dimension, and stack order=F, then I can start at the front, stack all time periods of the first pixel, keep going and work pixels down the columns, first page, next page, ...
(and I hope I have an F-contiguous array, so my raveled array is also F-contiguous.)

(note: I'm bringing memory back in as optimization, but not to predict the stacking)

Josef
(I think brains are designed for Fortran order and C-ordering in numpy is an accident, except, reading a Western language book is neither)

>
> Cheers,
>
> Matthew
> ___
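The two stacking conventions described above can be checked on a toy array. This is only a sketch: the sizes I, J, K, N are hypothetical, and the array holds consecutive integers rather than image data.

```python
import numpy as np

I, J, K, N = 2, 3, 4, 5  # hypothetical small "image" with N time points

# Time in the last axis, raveled with order='C' (last index varies fastest).
time_last = np.arange(I * J * K * N).reshape(I, J, K, N)
flat_c = time_last.ravel(order="C")

# Time in the first axis, raveled with order='F' (first index varies fastest).
time_first = time_last.transpose((3, 0, 1, 2))
flat_f = time_first.ravel(order="F")

# Either way, the first N values are the time series of pixel (0, 0, 0).
print(np.array_equal(flat_c[:N], time_last[0, 0, 0, :]))
print(np.array_equal(flat_f[:N], time_last[0, 0, 0, :]))

# The order in which pixels follow differs: 'C' moves along the last
# spatial axis next, 'F' moves along the first spatial axis next.
print(np.array_equal(flat_c[N:2 * N], time_last[0, 0, 1, :]))
print(np.array_equal(flat_f[N:2 * N], time_last[1, 0, 0, :]))
```

Nothing in this check looks at how either array is laid out in memory; only the index order passed to ravel determines the result.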
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Sat, Mar 30, 2013 at 9:05 PM, wrote: > On Sat, Mar 30, 2013 at 11:43 PM, Matthew Brett > wrote: >> Hi, >> >> On Sat, Mar 30, 2013 at 7:02 PM, wrote: >>> On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett >>> wrote: Hi, On Sat, Mar 30, 2013 at 7:50 PM, wrote: > On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle > wrote: >> On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett >> wrote: >>> >>> On Sat, Mar 30, 2013 at 2:20 PM, wrote: >>> > On Sat, Mar 30, 2013 at 4:57 PM, wrote: >>> >> On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett >>> >> wrote: >>> >>> On Sat, Mar 30, 2013 at 4:14 AM, wrote: >>> On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett >>> wrote: >>> > >>> > Ravel and reshape use the tems 'C' and 'F" in the sense of index >>> > ordering. >>> > >>> > This is very confusing. We think the index ordering and memory >>> > ordering ideas need to be separated, and specifically, we should >>> > avoid >>> > using "C" and "F" to refer to index ordering. >>> > >>> > Proposal >>> > - >>> > >>> > * Deprecate the use of "C" and "F" meaning backwards and forwards >>> > index ordering for ravel, reshape >>> > * Prefer "Z" and "N", being graphical representations of >>> > unraveling >>> > in >>> > 2 dimensions, axis1 first and axis0 first respectively (excellent >>> > naming idea by Paul Ivanov) >>> > >>> > What do y'all think? >>> >>> I always thought "F" and "C" are easy to understand, I always >>> thought >>> about >>> the content and never about the memory when using it. >>> >> >>> >> changing the names doesn't make it easier to understand. >>> >> I think the confusion is because the new A and K refer to existing >>> >> memory >>> >> >>> >>> I disagree, I think it's confusing, but I have evidence, and that is >>> that four out of four of us tested ourselves and got it wrong. >>> >>> Perhaps we are particularly dumb or poorly informed, but I think it's >>> rash to assert there is no problem here. 
> > I think you are overcomplicating things or phrased it as a "trick > question" I don't know what you mean by trick question - was there something over-complicated in the example? I deliberately didn't include various much more confusing examples in "reshape". >>> >>> I meant making the "candidates" think about memory instead of just >>> column versus row stacking. >>> I don't think I ever get confused about reshape "F" in 2d. >>> But when I work with 3d or larger ndim nd-arrays, I always have to >>> try an example to check my intuition (in general not just reshape). >>> > ravel F and C have *nothing* to do with memory layout. We do agree on this of course - but you said in an earlier mail that you thought of 'C" and 'F' as referring to target memory layout (which they don't in this case) so I think we also agree that "C" and "F" do often refer to memory layout elsewhere in numpy. >>> >>> I guess that wasn't so helpful. >>> (emphasis on *target*, There are very few places where an order >>> keyword refers to *existing* memory layout. >> >> It is helpful because it shows how easy it is to get confused between >> memory order and index order. >> >>> What's reverse index order? 
>>
>> I am not being clear, sorry about that:
>>
>> import numpy as np
>>
>> def ravel_iter_last_fastest(arr):
>>     res = []
>>     for i in range(arr.shape[0]):
>>         for j in range(arr.shape[1]):
>>             for k in range(arr.shape[2]):
>>                 # Iterating over last dimension fastest
>>                 res.append(arr[i, j, k])
>>     return np.array(res)
>>
>>
>> def ravel_iter_first_fastest(arr):
>>     res = []
>>     for k in range(arr.shape[2]):
>>         for j in range(arr.shape[1]):
>>             for i in range(arr.shape[0]):
>>                 # Iterating over first dimension fastest
>>                 res.append(arr[i, j, k])
>>     return np.array(res)
>
> good example
>
> that's just C and F order in the terminology of numpy
> http://docs.scipy.org/doc/numpy/reference/arrays.nditer.html#controlling-iteration-order
> (independent of memory)
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.flatiter.html#numpy.flatiter
>
> I don't think we want to rename a large part of the basic terminology of numpy

Sometimes two ideas get conflated together, and it seems natural to keep together, until people get confused, and you realize that there are two separate ideas.

For example here's a quote from the 'flatiter' doc:

    Iteration is done in C-contiguous style

Now - that seems really ugly to me. For example, 'contiguous' should not be in that sentence, although it's easy to see why it
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sat, Mar 30, 2013 at 11:43 PM, Matthew Brett wrote: > Hi, > > On Sat, Mar 30, 2013 at 7:02 PM, wrote: >> On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett >> wrote: >>> Hi, >>> >>> On Sat, Mar 30, 2013 at 7:50 PM, wrote: On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle wrote: > On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett > wrote: >> >> On Sat, Mar 30, 2013 at 2:20 PM, wrote: >> > On Sat, Mar 30, 2013 at 4:57 PM, wrote: >> >> On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett >> >> wrote: >> >>> On Sat, Mar 30, 2013 at 4:14 AM, wrote: >> On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett >> wrote: >> > >> > Ravel and reshape use the tems 'C' and 'F" in the sense of index >> > ordering. >> > >> > This is very confusing. We think the index ordering and memory >> > ordering ideas need to be separated, and specifically, we should >> > avoid >> > using "C" and "F" to refer to index ordering. >> > >> > Proposal >> > - >> > >> > * Deprecate the use of "C" and "F" meaning backwards and forwards >> > index ordering for ravel, reshape >> > * Prefer "Z" and "N", being graphical representations of unraveling >> > in >> > 2 dimensions, axis1 first and axis0 first respectively (excellent >> > naming idea by Paul Ivanov) >> > >> > What do y'all think? >> >> I always thought "F" and "C" are easy to understand, I always >> thought >> about >> the content and never about the memory when using it. >> >> >> >> changing the names doesn't make it easier to understand. >> >> I think the confusion is because the new A and K refer to existing >> >> memory >> >> >> >> I disagree, I think it's confusing, but I have evidence, and that is >> that four out of four of us tested ourselves and got it wrong. >> >> Perhaps we are particularly dumb or poorly informed, but I think it's >> rash to assert there is no problem here. 
I think you are overcomplicating things or phrased it as a "trick question" >>> >>> I don't know what you mean by trick question - was there something >>> over-complicated in the example? I deliberately didn't include >>> various much more confusing examples in "reshape". >> >> I meant making the "candidates" think about memory instead of just >> column versus row stacking. >> I don't think I ever get confused about reshape "F" in 2d. >> But when I work with 3d or larger ndim nd-arrays, I always have to >> try an example to check my intuition (in general not just reshape). >> >>> ravel F and C have *nothing* to do with memory layout. >>> >>> We do agree on this of course - but you said in an earlier mail that >>> you thought of 'C" and 'F' as referring to target memory layout (which >>> they don't in this case) so I think we also agree that "C" and "F" do >>> often refer to memory layout elsewhere in numpy. >> >> I guess that wasn't so helpful. >> (emphasis on *target*, There are very few places where an order >> keyword refers to *existing* memory layout. > > It is helpful because it shows how easy it is to get confused between > memory order and index order. > >> What's reverse index order? 
>
> I am not being clear, sorry about that:
>
> import numpy as np
>
> def ravel_iter_last_fastest(arr):
>     res = []
>     for i in range(arr.shape[0]):
>         for j in range(arr.shape[1]):
>             for k in range(arr.shape[2]):
>                 # Iterating over last dimension fastest
>                 res.append(arr[i, j, k])
>     return np.array(res)
>
>
> def ravel_iter_first_fastest(arr):
>     res = []
>     for k in range(arr.shape[2]):
>         for j in range(arr.shape[1]):
>             for i in range(arr.shape[0]):
>                 # Iterating over first dimension fastest
>                 res.append(arr[i, j, k])
>     return np.array(res)

good example

that's just C and F order in the terminology of numpy
http://docs.scipy.org/doc/numpy/reference/arrays.nditer.html#controlling-iteration-order
(independent of memory)
http://docs.scipy.org/doc/numpy/reference/generated/numpy.flatiter.html#numpy.flatiter

I don't think we want to rename a large part of the basic terminology of numpy

Josef

>
>
> a = np.arange(24).reshape((2, 3, 4))
>
> print np.all(a.ravel('C') == ravel_iter_last_fastest(a))
> print np.all(a.ravel('F') == ravel_iter_first_fastest(a))
>
> By 'reverse index ordering' I mean 'ravel_iter_last_fastest' above. I
> guess one could argue that this was not 'reverse' but 'forward' index
> ordering, but I am not arguing about which is better, or those names,
> only that it's the order of indices that differs, not the memory
> layout, and that these ideas need to be kept separate.
>
> Cheers,
>
> Matthew
>
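The "independent of memory" remark about the iterator docs can be tried directly. A small sketch, assuming a toy 2 x 3 integer array: np.nditer with order='C' or order='F', and the .flat iterator, visit elements in a fixed index order whether the array is stored in C or Fortran layout.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # C layout in memory
af = np.asfortranarray(a)        # same values, Fortran layout in memory

def visit(arr, order):
    # Collect the elements in the order np.nditer visits them.
    return [int(x) for x in np.nditer(arr, order=order)]

c_order = (visit(a, "C"), visit(af, "C"))
f_order = (visit(a, "F"), visit(af, "F"))

# The .flat iterator always uses C-style index order (last index fastest).
flat_order = (list(a.flat), list(af.flat))

print(c_order)
print(f_order)
print(flat_order)
```

In each pair the two lists are equal, which is the sense in which "C" and "F" name an index order here rather than a memory layout.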
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Sat, Mar 30, 2013 at 7:02 PM, wrote: > On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett > wrote: >> Hi, >> >> On Sat, Mar 30, 2013 at 7:50 PM, wrote: >>> On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle >>> wrote: On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett wrote: > > On Sat, Mar 30, 2013 at 2:20 PM, wrote: > > On Sat, Mar 30, 2013 at 4:57 PM, wrote: > >> On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett > >> wrote: > >>> On Sat, Mar 30, 2013 at 4:14 AM, wrote: > On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett > wrote: > > > > Ravel and reshape use the tems 'C' and 'F" in the sense of index > > ordering. > > > > This is very confusing. We think the index ordering and memory > > ordering ideas need to be separated, and specifically, we should > > avoid > > using "C" and "F" to refer to index ordering. > > > > Proposal > > - > > > > * Deprecate the use of "C" and "F" meaning backwards and forwards > > index ordering for ravel, reshape > > * Prefer "Z" and "N", being graphical representations of unraveling > > in > > 2 dimensions, axis1 first and axis0 first respectively (excellent > > naming idea by Paul Ivanov) > > > > What do y'all think? > > I always thought "F" and "C" are easy to understand, I always thought > about > the content and never about the memory when using it. > >> > >> changing the names doesn't make it easier to understand. > >> I think the confusion is because the new A and K refer to existing > >> memory > >> > > I disagree, I think it's confusing, but I have evidence, and that is > that four out of four of us tested ourselves and got it wrong. > > Perhaps we are particularly dumb or poorly informed, but I think it's > rash to assert there is no problem here. >>> >>> I think you are overcomplicating things or phrased it as a "trick question" >> >> I don't know what you mean by trick question - was there something >> over-complicated in the example? I deliberately didn't include >> various much more confusing examples in "reshape". 
>
> I meant making the "candidates" think about memory instead of just
> column versus row stacking.

To be specific, we were teaching about reshaping a (I, J, K, N) 4D array; it was an image, with time as the 4th dimension (N time points). Raveling and reshaping 3D and 4D arrays is a common thing to do in neuroimaging, as you can imagine.

A student asked what he would get back from raveling this array: a concatenated time series, or something spatial?

We showed (I'd worked it out by this time) that the first N values were the time series given by [0, 0, 0, :].

He said - "Oh - I see - so the data is stored as a whole lot of time series one by one, I thought it would be stored as a series of images".

Ironically, this was a Fortran-ordered array in memory, and he was wrong.

So, I think the ideas of memory ordering and index ordering are very easy to confuse, and the confusion comes up naturally.

I would like, as a teacher, to be able to say something like:

This is what C memory layout is (it's the memory layout that gives arr.flags.C_CONTIGUOUS=True)
This is what F memory layout is (it's the memory layout that gives arr.flags.F_CONTIGUOUS=True)
It's rather easy to get something that is neither C nor F memory layout
Numpy does many memory layouts.
Ravel and reshape and numpy in general do not care (normally) about C or F layouts, they only care about index ordering.

My point, that I'm repeating, is that my job is made harder by 'arr.ravel('F')'.

Cheers,

Matthew
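The teaching points listed above can be demonstrated on a toy array. A hedged sketch: the three arrays below are made up so that they hold the same values in three different memory layouts (C, Fortran, and a strided view that is neither), and the flags are read with dictionary-style access.

```python
import numpy as np

a_c = np.arange(24).reshape(2, 3, 4)   # C memory layout
a_f = np.asfortranarray(a_c)           # same values, F memory layout

# Same values again, seen through a strided slice: neither C nor F layout.
base = np.zeros((2, 3, 8), dtype=a_c.dtype)
base[:, :, ::2] = a_c
a_n = base[:, :, ::2]

layouts = {}
for name, arr in [("C", a_c), ("F", a_f), ("neither", a_n)]:
    layouts[name] = (bool(arr.flags["C_CONTIGUOUS"]),
                     bool(arr.flags["F_CONTIGUOUS"]))
    print(name, layouts[name])

# ravel with 'C' or 'F' gives the same answer for all three layouts.
same_c = all(np.array_equal(a_c.ravel("C"), x.ravel("C")) for x in (a_f, a_n))
same_f = all(np.array_equal(a_c.ravel("F"), x.ravel("F")) for x in (a_f, a_n))
print(same_c, same_f)
```

The flags differ across the three arrays while the raveled results do not, which is the distinction between memory layout and index ordering drawn in the message.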
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Sat, Mar 30, 2013 at 7:02 PM, wrote: > On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett > wrote: >> Hi, >> >> On Sat, Mar 30, 2013 at 7:50 PM, wrote: >>> On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle >>> wrote: On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett wrote: > > On Sat, Mar 30, 2013 at 2:20 PM, wrote: > > On Sat, Mar 30, 2013 at 4:57 PM, wrote: > >> On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett > >> wrote: > >>> On Sat, Mar 30, 2013 at 4:14 AM, wrote: > On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett > wrote: > > > > Ravel and reshape use the tems 'C' and 'F" in the sense of index > > ordering. > > > > This is very confusing. We think the index ordering and memory > > ordering ideas need to be separated, and specifically, we should > > avoid > > using "C" and "F" to refer to index ordering. > > > > Proposal > > - > > > > * Deprecate the use of "C" and "F" meaning backwards and forwards > > index ordering for ravel, reshape > > * Prefer "Z" and "N", being graphical representations of unraveling > > in > > 2 dimensions, axis1 first and axis0 first respectively (excellent > > naming idea by Paul Ivanov) > > > > What do y'all think? > > I always thought "F" and "C" are easy to understand, I always thought > about > the content and never about the memory when using it. > >> > >> changing the names doesn't make it easier to understand. > >> I think the confusion is because the new A and K refer to existing > >> memory > >> > > I disagree, I think it's confusing, but I have evidence, and that is > that four out of four of us tested ourselves and got it wrong. > > Perhaps we are particularly dumb or poorly informed, but I think it's > rash to assert there is no problem here. >>> >>> I think you are overcomplicating things or phrased it as a "trick question" >> >> I don't know what you mean by trick question - was there something >> over-complicated in the example? I deliberately didn't include >> various much more confusing examples in "reshape". 
>
> I meant making the "candidates" think about memory instead of just
> column versus row stacking.
> I don't think I ever get confused about reshape "F" in 2d.
> But when I work with 3d or larger ndim nd-arrays, I always have to
> try an example to check my intuition (in general not just reshape).
>
>>
>>> ravel F and C have *nothing* to do with memory layout.
>>
>> We do agree on this of course - but you said in an earlier mail that
>> you thought of 'C" and 'F' as referring to target memory layout (which
>> they don't in this case) so I think we also agree that "C" and "F" do
>> often refer to memory layout elsewhere in numpy.
>
> I guess that wasn't so helpful.
> (emphasis on *target*, There are very few places where an order
> keyword refers to *existing* memory layout.

It is helpful because it shows how easy it is to get confused between memory order and index order.

> What's reverse index order?

I am not being clear, sorry about that:

import numpy as np

def ravel_iter_last_fastest(arr):
    res = []
    for i in range(arr.shape[0]):
        for j in range(arr.shape[1]):
            for k in range(arr.shape[2]):
                # Iterating over last dimension fastest
                res.append(arr[i, j, k])
    return np.array(res)


def ravel_iter_first_fastest(arr):
    res = []
    for k in range(arr.shape[2]):
        for j in range(arr.shape[1]):
            for i in range(arr.shape[0]):
                # Iterating over first dimension fastest
                res.append(arr[i, j, k])
    return np.array(res)


a = np.arange(24).reshape((2, 3, 4))

print np.all(a.ravel('C') == ravel_iter_last_fastest(a))
print np.all(a.ravel('F') == ravel_iter_first_fastest(a))

By 'reverse index ordering' I mean 'ravel_iter_last_fastest' above. I guess one could argue that this was not 'reverse' but 'forward' index ordering, but I am not arguing about which is better, or those names, only that it's the order of indices that differs, not the memory layout, and that these ideas need to be kept separate.
Cheers,

Matthew
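A minimal sketch of the distinction drawn in this message, assuming a reasonably recent NumPy. The names `a_c` and `a_f` are illustrative only. It builds the same values with two different memory layouts and checks that 'C' and 'F' ravel give the same answer for both, while 'K' (which follows memory) does not.

```python
import numpy as np

a_c = np.arange(24).reshape((2, 3, 4))  # C-contiguous in memory
a_f = np.asfortranarray(a_c)            # same values, F-contiguous in memory

# 'C' and 'F' select an index order, so the memory layout makes no difference.
same_c = np.array_equal(a_c.ravel('C'), a_f.ravel('C'))
same_f = np.array_equal(a_c.ravel('F'), a_f.ravel('F'))

# 'K' follows the memory layout, so here the two arrays ravel differently.
same_k = np.array_equal(a_c.ravel('K'), a_f.ravel('K'))

print(same_c, same_f, same_k)
```

If the index-order reading is right, the first two comparisons should agree and the third should not.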
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett wrote: > Hi, > > On Sat, Mar 30, 2013 at 7:50 PM, wrote: >> On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle >> wrote: >>> On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett >>> wrote: On Sat, Mar 30, 2013 at 2:20 PM, wrote: > On Sat, Mar 30, 2013 at 4:57 PM, wrote: >> On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett >> wrote: >>> On Sat, Mar 30, 2013 at 4:14 AM, wrote: On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett wrote: > > Ravel and reshape use the tems 'C' and 'F" in the sense of index > ordering. > > This is very confusing. We think the index ordering and memory > ordering ideas need to be separated, and specifically, we should > avoid > using "C" and "F" to refer to index ordering. > > Proposal > - > > * Deprecate the use of "C" and "F" meaning backwards and forwards > index ordering for ravel, reshape > * Prefer "Z" and "N", being graphical representations of unraveling > in > 2 dimensions, axis1 first and axis0 first respectively (excellent > naming idea by Paul Ivanov) > > What do y'all think? I always thought "F" and "C" are easy to understand, I always thought about the content and never about the memory when using it. >> >> changing the names doesn't make it easier to understand. >> I think the confusion is because the new A and K refer to existing >> memory >> I disagree, I think it's confusing, but I have evidence, and that is that four out of four of us tested ourselves and got it wrong. Perhaps we are particularly dumb or poorly informed, but I think it's rash to assert there is no problem here. >> >> I think you are overcomplicating things or phrased it as a "trick question" > > I don't know what you mean by trick question - was there something > over-complicated in the example? I deliberately didn't include > various much more confusing examples in "reshape". I meant making the "candidates" think about memory instead of just column versus row stacking. 
I don't think I ever get confused about reshape "F" in 2d.
But when I work with 3d or larger ndim nd-arrays, I always have to try an example to check my intuition (in general, not just reshape).

>
>> ravel F and C have *nothing* to do with memory layout.
>
> We do agree on this of course - but you said in an earlier mail that
> you thought of 'C' and 'F' as referring to target memory layout (which
> they don't in this case) so I think we also agree that "C" and "F" do
> often refer to memory layout elsewhere in numpy.

I guess that wasn't so helpful.
(emphasis on *target*: there are very few places where an order keyword refers to *existing* memory layout. So I'm not tempted to think about existing memory layout when I see ``order``.
Also my examples might have confused the issue: ravel and reshape with C and F are easy to understand without ever looking at memory issues. Memory only comes into play when we want to know whether we get a view or a copy. The examples were only for the cases when I do care about this.)

>
>> I think it's not confusing for beginners that have no idea and never think
>> about memory layout.
>> I've never seen any problems with it in statsmodels and I have seen
>> many developers (GSOC) that are pretty new to python and numpy.
>> (I didn't check the repo history to verify, so IIRC)
>
> Usually you don't need to know what reshape or ravel did because you
> are likely to reshape again and that will use the same algorithm.
>
> For example, I didn't know that that ravel worked in reverse index
> order, started explaining it wrong, and had to check. I use ravel and
> reshape a lot, and have not run into this problem because either a) I
> didn't test my code properly or b) I did reshape after ravel / reshape
> and it reversed what I did first time. So, I don't think it's "we
> haven't noticed any problems" is a good argument in the face of
> "several experienced developers got it wrong when trying to guess what
> it did".
What's reverse index order? In the case of statsmodels, we do care about the stacking order. When we use reshape(..., order='F') or ravel('F'), it's only because we want to have a specific array (not memory) layout (and/or because the raveled array came from R) (aside: 2 cases - for 2d parameter vectors, we ravel and reshape often, and we changed our convention to Fortran order, (parameter in rows, equations in columns, IIRC) The interpretation of the results depends on which way we ravel or reshape. - for panel data (time versus individuals), we need to build matching kronecker product arrays which are block-diagonal if the stacking/``order`` is the right way. None of the cases cares about memory lay
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Sat, Mar 30, 2013 at 7:50 PM, wrote: > On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle > wrote: >> On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett >> wrote: >>> >>> On Sat, Mar 30, 2013 at 2:20 PM, wrote: >>> > On Sat, Mar 30, 2013 at 4:57 PM, wrote: >>> >> On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett >>> >> wrote: >>> >>> On Sat, Mar 30, 2013 at 4:14 AM, wrote: >>> On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett >>> wrote: >>> > >>> > Ravel and reshape use the tems 'C' and 'F" in the sense of index >>> > ordering. >>> > >>> > This is very confusing. We think the index ordering and memory >>> > ordering ideas need to be separated, and specifically, we should >>> > avoid >>> > using "C" and "F" to refer to index ordering. >>> > >>> > Proposal >>> > - >>> > >>> > * Deprecate the use of "C" and "F" meaning backwards and forwards >>> > index ordering for ravel, reshape >>> > * Prefer "Z" and "N", being graphical representations of unraveling >>> > in >>> > 2 dimensions, axis1 first and axis0 first respectively (excellent >>> > naming idea by Paul Ivanov) >>> > >>> > What do y'all think? >>> >>> I always thought "F" and "C" are easy to understand, I always thought >>> about >>> the content and never about the memory when using it. >>> >> >>> >> changing the names doesn't make it easier to understand. >>> >> I think the confusion is because the new A and K refer to existing >>> >> memory >>> >> >>> >>> I disagree, I think it's confusing, but I have evidence, and that is >>> that four out of four of us tested ourselves and got it wrong. >>> >>> Perhaps we are particularly dumb or poorly informed, but I think it's >>> rash to assert there is no problem here. > > I think you are overcomplicating things or phrased it as a "trick question" I don't know what you mean by trick question - was there something over-complicated in the example? I deliberately didn't include various much more confusing examples in "reshape". 
> ravel F and C have *nothing* to do with memory layout.

We do agree on this of course - but you said in an earlier mail that you thought of 'C' and 'F' as referring to target memory layout (which they don't in this case), so I think we also agree that "C" and "F" do often refer to memory layout elsewhere in numpy.

> I think it's not confusing for beginners that have no idea and never think
> about memory layout.
> I've never seen any problems with it in statsmodels and I have seen
> many developers (GSOC) that are pretty new to python and numpy.
> (I didn't check the repo history to verify, so IIRC)

Usually you don't need to know what reshape or ravel did because you are likely to reshape again and that will use the same algorithm.

For example, I didn't know that ravel worked in reverse index order, started explaining it wrong, and had to check. I use ravel and reshape a lot, and have not run into this problem because either a) I didn't test my code properly or b) I did reshape after ravel / reshape and it reversed what I did the first time. So I don't think "we haven't noticed any problems" is a good argument in the face of "several experienced developers got it wrong when trying to guess what it did".

> Even if N, Z were clearer in this case (which I don't think it is and which
> I have no idea what it should stand for), you would have to go for every
> use of ``order`` in numpy to check whether it should be N or F or Z or C,
> and then users would have to check which order name convention is
> used in a specific function.

Right - and this would be silly if and only if it made sense to conflate memory layout and index ordering.

Cheers,

Matthew
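A small sketch of the "reshape after ravel reverses it" point, assuming current NumPy behaviour. The variable names are illustrative. Using the same ``order`` for both steps round-trips the array whichever order is chosen, so a wrong guess about the order only shows up when the two steps disagree.

```python
import numpy as np

arr = np.arange(6).reshape((2, 3))

# Using the same order for ravel and reshape round-trips cleanly.
round_trip_ok = all(
    np.array_equal(arr.ravel(order).reshape(arr.shape, order=order), arr)
    for order in ('C', 'F')
)

# Mixing the two orders does not, which is where a wrong guess would show up.
mixed = arr.ravel('F').reshape(arr.shape, order='C')

print(round_trip_ok)
print(mixed)
```

This may be one reason code that always pairs ravel with a matching reshape never exposes a misunderstanding of which index order was used.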
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle wrote: > On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett > wrote: >> >> On Sat, Mar 30, 2013 at 2:20 PM, wrote: >> > On Sat, Mar 30, 2013 at 4:57 PM, wrote: >> >> On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett >> >> wrote: >> >>> On Sat, Mar 30, 2013 at 4:14 AM, wrote: >> On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett >> wrote: >> > >> > Ravel and reshape use the tems 'C' and 'F" in the sense of index >> > ordering. >> > >> > This is very confusing. We think the index ordering and memory >> > ordering ideas need to be separated, and specifically, we should >> > avoid >> > using "C" and "F" to refer to index ordering. >> > >> > Proposal >> > - >> > >> > * Deprecate the use of "C" and "F" meaning backwards and forwards >> > index ordering for ravel, reshape >> > * Prefer "Z" and "N", being graphical representations of unraveling >> > in >> > 2 dimensions, axis1 first and axis0 first respectively (excellent >> > naming idea by Paul Ivanov) >> > >> > What do y'all think? >> >> I always thought "F" and "C" are easy to understand, I always thought >> about >> the content and never about the memory when using it. >> >> >> >> changing the names doesn't make it easier to understand. >> >> I think the confusion is because the new A and K refer to existing >> >> memory >> >> >> >> I disagree, I think it's confusing, but I have evidence, and that is >> that four out of four of us tested ourselves and got it wrong. >> >> Perhaps we are particularly dumb or poorly informed, but I think it's >> rash to assert there is no problem here. I think you are overcomplicating things or phrased it as a "trick question" ravel F and C have *nothing* to do with memory layout. I think it's not confusing for beginners that have no idea and never think about memory layout. I've never seen any problems with it in statsmodels and I have seen many developers (GSOC) that are pretty new to python and numpy. 
(I didn't check the repo history to verify, so IIRC)

Even if N, Z were clearer in this case (which I don't think it is and which I have no idea what it should stand for), you would have to go for every use of ``order`` in numpy to check whether it should be N or F or Z or C, and then users would have to check which order name convention is used in a specific function.

Josef

> I got all four correct. I think the concept --- at least for ravel --- is
> pretty simple: would you like to read the data off in C ordering or Fortran
> ordering. Since the output array is one-dimensional, its ordering is
> irrelevant.
>
> I don't understand the 'Z' / 'N' suggestion at all. Are they part of some
> pneumonic?
>
> I'd STRONGLY advise against deprecating the 'F' and 'C' options. NumPy
> already suffers from too much bikeshedding with names --- I rarely am able
> to pull out a script I wrote using NumPy even a few years ago and have it
> immediately work.
>
> Cheers,
> Brad
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Sat, Mar 30, 2013 at 4:31 PM, Bradley M. Froehle wrote: > On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett > wrote: >> >> On Sat, Mar 30, 2013 at 2:20 PM, wrote: >> > On Sat, Mar 30, 2013 at 4:57 PM, wrote: >> >> On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett >> >> wrote: >> >>> On Sat, Mar 30, 2013 at 4:14 AM, wrote: >> On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett >> wrote: >> > >> > Ravel and reshape use the tems 'C' and 'F" in the sense of index >> > ordering. >> > >> > This is very confusing. We think the index ordering and memory >> > ordering ideas need to be separated, and specifically, we should >> > avoid >> > using "C" and "F" to refer to index ordering. >> > >> > Proposal >> > - >> > >> > * Deprecate the use of "C" and "F" meaning backwards and forwards >> > index ordering for ravel, reshape >> > * Prefer "Z" and "N", being graphical representations of unraveling >> > in >> > 2 dimensions, axis1 first and axis0 first respectively (excellent >> > naming idea by Paul Ivanov) >> > >> > What do y'all think? >> >> I always thought "F" and "C" are easy to understand, I always thought >> about >> the content and never about the memory when using it. >> >> >> >> changing the names doesn't make it easier to understand. >> >> I think the confusion is because the new A and K refer to existing >> >> memory >> >> >> >> I disagree, I think it's confusing, but I have evidence, and that is >> that four out of four of us tested ourselves and got it wrong. >> >> Perhaps we are particularly dumb or poorly informed, but I think it's >> rash to assert there is no problem here. > > > I got all four correct. Then you are smarted and or better informed than we were. I hope you didn't read my explanation before you tested yourself. Of course if you did read my email first I'd expect you and I to get the answer right first time. 
If you didn't read my email first, and didn't think too hard about it, and still got all the examples right, and you'd get other more confusing examples right that use reshape, then I'd add you as a data point on the other side to the four data points we got yesterday.

> I think the concept --- at least for ravel --- is
> pretty simple: would you like to read the data off in C ordering or Fortran
> ordering. Since the output array is one-dimensional, its ordering is
> irrelevant.

Right - hence my confidence that Josef's way of thinking of 'C' and 'F' as describing the target array output was not a good way to think of it in this case. It is in the case of arr.tostring() though.

> I don't understand the 'Z' / 'N' suggestion at all. Are they part of some
> pneumonic?

Think of the way you'd read off the elements using reverse (last-first) index order for a 2D array: you might imagine something like a Z.

> I'd STRONGLY advise against deprecating the 'F' and 'C' options. NumPy
> already suffers from too much bikeshedding with names --- I rarely am able
> to pull out a script I wrote using NumPy even a few years ago and have it
> immediately work.

I wish we could drop bike-shedding - it's a completely useless word because one person's bike-shedding is another person's necessary clarification. You think this clarification isn't necessary and you think this discussion is bike-shedding.

I'm not suggesting dropping the 'F' and 'C', obviously - can I call that a 'straw man'? I am suggesting changing the name to something much clearer, leaving that name clearly explained in the docs, and leaving 'C' and 'F' as functional synonyms for a very long time.

Cheers,

Matthew
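A sketch of the "Z" and "N" picture for a 2D array, assuming only standard NumPy indexing. The names `z_path` and `n_path` are made up for illustration and are not NumPy API. Each path lists the (row, column) positions in the order they are visited.

```python
import numpy as np

arr = np.arange(6).reshape((2, 3))
rows, cols = arr.shape

# "Z" path: sweep along axis 1 first, then step down axis 0.
z_path = [(i, j) for i in range(rows) for j in range(cols)]
# "N" path: sweep along axis 0 first, then step along axis 1.
n_path = [(i, j) for j in range(cols) for i in range(rows)]

z_values = [int(arr[i, j]) for i, j in z_path]
n_values = [int(arr[i, j]) for i, j in n_path]

print(z_values)  # same elements, same order as arr.ravel('C')
print(n_values)  # same elements, same order as arr.ravel('F')
```

Drawn on the printed array, the first path traces across each row and drops to the next (roughly a Z), while the second runs down each column and jumps to the top of the next (roughly an N).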
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett wrote: > On Sat, Mar 30, 2013 at 2:20 PM, wrote: > > On Sat, Mar 30, 2013 at 4:57 PM, wrote: > >> On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett > wrote: > >>> On Sat, Mar 30, 2013 at 4:14 AM, wrote: > On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett < > matthew.br...@gmail.com> wrote: > > > > Ravel and reshape use the tems 'C' and 'F" in the sense of index > ordering. > > > > This is very confusing. We think the index ordering and memory > > ordering ideas need to be separated, and specifically, we should > avoid > > using "C" and "F" to refer to index ordering. > > > > Proposal > > - > > > > * Deprecate the use of "C" and "F" meaning backwards and forwards > > index ordering for ravel, reshape > > * Prefer "Z" and "N", being graphical representations of unraveling > in > > 2 dimensions, axis1 first and axis0 first respectively (excellent > > naming idea by Paul Ivanov) > > > > What do y'all think? > > I always thought "F" and "C" are easy to understand, I always thought > about > the content and never about the memory when using it. > >> > >> changing the names doesn't make it easier to understand. > >> I think the confusion is because the new A and K refer to existing > memory > >> > > I disagree, I think it's confusing, but I have evidence, and that is > that four out of four of us tested ourselves and got it wrong. > > Perhaps we are particularly dumb or poorly informed, but I think it's > rash to assert there is no problem here. > I got all four correct. I think the concept --- at least for ravel --- is pretty simple: would you like to read the data off in C ordering or Fortran ordering. Since the output array is one-dimensional, its ordering is irrelevant. I don't understand the 'Z' / 'N' suggestion at all. Are they part of some pneumonic? I'd STRONGLY advise against deprecating the 'F' and 'C' options. 
NumPy already suffers from too much bikeshedding with names --- I rarely am able to pull out a script I wrote using NumPy even a few years ago and have it immediately work.

Cheers,
Brad
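A brief sketch of the claim that the ordering of a one-dimensional result is irrelevant, assuming current NumPy flag semantics. Both raveled arrays below are 1-D and contiguous, so each reports itself as C-contiguous and F-contiguous at once; only the sequence of values differs.

```python
import numpy as np

arr = np.arange(10).reshape((2, 5))

flat_c = arr.ravel('C')  # read off along the last axis first
flat_f = arr.ravel('F')  # read off along the first axis first

# A contiguous 1-D array satisfies both contiguity flags.
both_flags = [
    bool(flat.flags['C_CONTIGUOUS'] and flat.flags['F_CONTIGUOUS'])
    for flat in (flat_c, flat_f)
]

print(flat_c)
print(flat_f)
print(both_flags)
```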
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sat, 2013-03-30 at 12:45 -0700, Matthew Brett wrote: > Hi, > > On Sat, Mar 30, 2013 at 11:55 AM, Sebastian Berg > wrote: > > On Fri, 2013-03-29 at 19:08 -0700, Matthew Brett wrote: > >> Hi, > >> > >> We were teaching today, and found ourselves getting very confused > >> about ravel and shape in numpy. > >> > >> > >> What do y'all think? > >> > > > > Personally I think it is clear enough and that "Z" and "N" would confuse > > me just as much (though I am used to the other names). Also "Z" and "N" > > would seem more like aliases, which would also make sense in the memory > > order context. > > If anything, I would prefer renaming the arguments iteration_order and > > memory_order, but it seems overdoing it... > > I am not sure what you mean - at the moment there is one argument > called 'order' that can refer to iteration order or memory order. Are > you proposing two arguments? > Yes that is what I meant. The reason that it is not convincing to me is that if I write `np.reshape(arr, ..., order='Z')`, I may be tempted to also write `np.copy(arr, order='Z')`. I don't see anything against allowing 'Z' as a more memorable 'C' (I also used to forget which was which), but I don't really see enforcing a different _value_ on the same named argument making it clearer. Renaming the argument itself would seem more sensible to me right now, but I cannot think of a decent name, so I would prefer trying to clarify the documentation if necessary. > > Maybe the documentation could just be checked if it is always clear > > though. I.e. maybe it does not use "iteration" or "memory" order > > consistently (though I somewhat feel it is usually clear that it must be > > iteration order, since no numpy function cares about the input memory > > order as they will just do a copy if necessary). > > Do you really mean this? Numpy is full of 'order=' flags that refer to > memory. 
>

I somewhat imagined there were more iteration order flags, and I count empty/ones/.../copy as basically one "array creation" monster...

> Cheers,
>
> Matthew
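A sketch of why the same ``order`` value can mean two different things, assuming current NumPy. In `np.copy` the keyword chooses the memory layout of the result and leaves the values where they are; in `reshape` it chooses the index order used to read and place elements, so the values move.

```python
import numpy as np

arr = np.arange(6).reshape((2, 3))

# order= as memory layout: values unchanged, only the flags differ.
copied = np.copy(arr, order='F')
values_unchanged = np.array_equal(copied, arr)
layout_changed = bool(copied.flags['F_CONTIGUOUS'] and not copied.flags['C_CONTIGUOUS'])

# order= as index order: elements land in different positions.
reshaped_c = arr.reshape((3, 2), order='C')
reshaped_f = arr.reshape((3, 2), order='F')

print(values_unchanged, layout_changed)
print(reshaped_c)
print(reshaped_f)
```

Two separate keyword names, along the lines of the iteration_order / memory_order idea floated above, would make this difference visible at the call site; those names are hypothetical and not part of NumPy.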
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Sat, Mar 30, 2013 at 2:20 PM, wrote: > On Sat, Mar 30, 2013 at 4:57 PM, wrote: >> On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett >> wrote: >>> Hi, >>> >>> On Sat, Mar 30, 2013 at 4:14 AM, wrote: On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett wrote: > > Hi, > > We were teaching today, and found ourselves getting very confused > about ravel and shape in numpy. > > Summary > -- > > There are two separate ideas needed to understand ordering in ravel and > reshape: > > Idea 1): ravel / reshape can proceed from the last axis to the first, > or the first to the last. This is "ravel index ordering" > Idea 2) The physical layout of the array (on disk or in memory) can be > "C" or "F" contiguous or neither. > This is "memory ordering" > > The index ordering is usually (but see below) orthogonal to the memory > ordering. > > The 'ravel' and 'reshape' commands use "C" and "F" in the sense of > index ordering, and this mixes the two ideas and is confusing. > > What the current situation looks like > > > Specifically, we've been rolling this around 4 experienced numpy users > and we all predicted at least one of the results below wrongly. > > This was what we knew, or should have known: > > In [2]: import numpy as np > > In [3]: arr = np.arange(10).reshape((2, 5)) > > In [5]: arr.ravel() > Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) > > So, the 'ravel' operation unravels over the last axis (1) first, > followed by axis 0. > > So far so good (even if the opposite to MATLAB, Octave). > > Then we found the 'order' flag to ravel: > > In [10]: arr.flags > Out[10]: > C_CONTIGUOUS : True > F_CONTIGUOUS : False > OWNDATA : False > WRITEABLE : True > ALIGNED : True > UPDATEIFCOPY : False > > In [11]: arr.ravel('C') > Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) > > But we soon got confused. How about this? 
> > In [12]: arr_F = np.array(arr, order='F') > > In [13]: arr_F.flags > Out[13]: > C_CONTIGUOUS : False > F_CONTIGUOUS : True > OWNDATA : True > WRITEABLE : True > ALIGNED : True > UPDATEIFCOPY : False > > In [16]: arr_F > Out[16]: > array([[0, 1, 2, 3, 4], >[5, 6, 7, 8, 9]]) > > In [17]: arr_F.ravel('C') > Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) > > Right - so the flag 'C' to ravel, has got nothing to do with *memory* > ordering, but is to do with *index* ordering. > > And in fact, we can ask for memory ordering specifically: > > In [22]: arr.ravel('K') > Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) > > In [23]: arr_F.ravel('K') > Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9]) > > In [24]: arr.ravel('A') > Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) > > In [25]: arr_F.ravel('A') > Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9]) > > There are some confusions to get into with the 'order' flag to reshape > as well, of the same type. > > Ravel and reshape use the tems 'C' and 'F" in the sense of index ordering. > > This is very confusing. We think the index ordering and memory > ordering ideas need to be separated, and specifically, we should avoid > using "C" and "F" to refer to index ordering. > > Proposal > - > > * Deprecate the use of "C" and "F" meaning backwards and forwards > index ordering for ravel, reshape > * Prefer "Z" and "N", being graphical representations of unraveling in > 2 dimensions, axis1 first and axis0 first respectively (excellent > naming idea by Paul Ivanov) > > What do y'all think? > > Cheers, > > Matthew > Paul Ivanov > JB Poline > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion I always thought "F" and "C" are easy to understand, I always thought about the content and never about the memory when using it. 
>>> >>> I can only say that 4 out of 4 experienced numpy developers found >>> themselves unable to predict the behavior of these functions before >>> they saw the output. >>> >>> The problem is always that explaining something makes it clearer for a >>> moment, but, for those who do not have the explanation or who have >>> forgotten it, at least among us here, the outputs were generating >>> groans and / or high fives as we incorrectly or correctly guessed what >>> was going to happen. >>> >>> I think the only way to find out whether this really is confusing or >>> not, is to put someone in front of these functio
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Sat, Mar 30, 2013 at 1:57 PM, wrote: > On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett > wrote: >> Hi, >> >> On Sat, Mar 30, 2013 at 4:14 AM, wrote: >>> On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett >>> wrote: Hi, We were teaching today, and found ourselves getting very confused about ravel and shape in numpy. Summary -- There are two separate ideas needed to understand ordering in ravel and reshape: Idea 1): ravel / reshape can proceed from the last axis to the first, or the first to the last. This is "ravel index ordering" Idea 2) The physical layout of the array (on disk or in memory) can be "C" or "F" contiguous or neither. This is "memory ordering" The index ordering is usually (but see below) orthogonal to the memory ordering. The 'ravel' and 'reshape' commands use "C" and "F" in the sense of index ordering, and this mixes the two ideas and is confusing. What the current situation looks like Specifically, we've been rolling this around 4 experienced numpy users and we all predicted at least one of the results below wrongly. This was what we knew, or should have known: In [2]: import numpy as np In [3]: arr = np.arange(10).reshape((2, 5)) In [5]: arr.ravel() Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) So, the 'ravel' operation unravels over the last axis (1) first, followed by axis 0. So far so good (even if the opposite to MATLAB, Octave). Then we found the 'order' flag to ravel: In [10]: arr.flags Out[10]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [11]: arr.ravel('C') Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) But we soon got confused. How about this? 
In [12]: arr_F = np.array(arr, order='F') In [13]: arr_F.flags Out[13]: C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [16]: arr_F Out[16]: array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]) In [17]: arr_F.ravel('C') Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) Right - so the flag 'C' to ravel, has got nothing to do with *memory* ordering, but is to do with *index* ordering. And in fact, we can ask for memory ordering specifically: In [22]: arr.ravel('K') Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) In [23]: arr_F.ravel('K') Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9]) In [24]: arr.ravel('A') Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) In [25]: arr_F.ravel('A') Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9]) There are some confusions to get into with the 'order' flag to reshape as well, of the same type. Ravel and reshape use the tems 'C' and 'F" in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using "C" and "F" to refer to index ordering. Proposal - * Deprecate the use of "C" and "F" meaning backwards and forwards index ordering for ravel, reshape * Prefer "Z" and "N", being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov) What do y'all think? Cheers, Matthew Paul Ivanov JB Poline ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> >>> I always thought "F" and "C" are easy to understand, I always thought about >>> the content and never about the memory when using it. >> >> I can only say that 4 out of 4 experienced numpy developers found >> themselves unable to predict the behavior of these functions before >> they saw the output. 
>> >> The problem is always that explaining something makes it clearer for a >> moment, but, for those who do not have the explanation or who have >> forgotten it, at least among us here, the outputs were generating >> groans and / or high fives as we incorrectly or correctly guessed what >> was going to happen. >> >> I think the only way to find out whether this really is confusing or >> not, is to put someone in front of these functions without any >> explanation and ask them to predict what is going to come out of the >> various inputs and flags. Or to try and teach it, which was the >> problem we were having. > > changin
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sat, Mar 30, 2013 at 4:57 PM, wrote:
> On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett wrote:
>> Hi,
>>
>> On Sat, Mar 30, 2013 at 4:14 AM, wrote:
>>> On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett wrote:
>>>> Hi,
>>>>
>>>> We were teaching today, and found ourselves getting very confused
>>>> about ravel and shape in numpy.
>>>>
>>>> Summary
>>>> -------
>>>>
>>>> There are two separate ideas needed to understand ordering in ravel and
>>>> reshape:
>>>>
>>>> Idea 1): ravel / reshape can proceed from the last axis to the first,
>>>> or the first to the last. This is "ravel index ordering"
>>>> Idea 2) The physical layout of the array (on disk or in memory) can be
>>>> "C" or "F" contiguous or neither.
>>>> This is "memory ordering"
>>>>
>>>> The index ordering is usually (but see below) orthogonal to the memory
>>>> ordering.
>>>>
>>>> The 'ravel' and 'reshape' commands use "C" and "F" in the sense of
>>>> index ordering, and this mixes the two ideas and is confusing.
>>>>
>>>> What the current situation looks like
>>>> -------------------------------------
>>>>
>>>> Specifically, we've been rolling this around 4 experienced numpy users
>>>> and we all predicted at least one of the results below wrongly.
>>>>
>>>> This was what we knew, or should have known:
>>>>
>>>> In [2]: import numpy as np
>>>>
>>>> In [3]: arr = np.arange(10).reshape((2, 5))
>>>>
>>>> In [5]: arr.ravel()
>>>> Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>>>
>>>> So, the 'ravel' operation unravels over the last axis (1) first,
>>>> followed by axis 0.
>>>>
>>>> So far so good (even if the opposite to MATLAB, Octave).
>>>>
>>>> Then we found the 'order' flag to ravel:
>>>>
>>>> In [10]: arr.flags
>>>> Out[10]:
>>>>   C_CONTIGUOUS : True
>>>>   F_CONTIGUOUS : False
>>>>   OWNDATA : False
>>>>   WRITEABLE : True
>>>>   ALIGNED : True
>>>>   UPDATEIFCOPY : False
>>>>
>>>> In [11]: arr.ravel('C')
>>>> Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>>>
>>>> But we soon got confused. How about this?
>>>>
>>>> In [12]: arr_F = np.array(arr, order='F')
>>>>
>>>> In [13]: arr_F.flags
>>>> Out[13]:
>>>>   C_CONTIGUOUS : False
>>>>   F_CONTIGUOUS : True
>>>>   OWNDATA : True
>>>>   WRITEABLE : True
>>>>   ALIGNED : True
>>>>   UPDATEIFCOPY : False
>>>>
>>>> In [16]: arr_F
>>>> Out[16]:
>>>> array([[0, 1, 2, 3, 4],
>>>>        [5, 6, 7, 8, 9]])
>>>>
>>>> In [17]: arr_F.ravel('C')
>>>> Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>>>
>>>> Right - so the flag 'C' to ravel has got nothing to do with *memory*
>>>> ordering, but is to do with *index* ordering.
>>>>
>>>> And in fact, we can ask for memory ordering specifically:
>>>>
>>>> In [22]: arr.ravel('K')
>>>> Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>>>
>>>> In [23]: arr_F.ravel('K')
>>>> Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])
>>>>
>>>> In [24]: arr.ravel('A')
>>>> Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>>>
>>>> In [25]: arr_F.ravel('A')
>>>> Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])
>>>>
>>>> There are some confusions to get into with the 'order' flag to reshape
>>>> as well, of the same type.
>>>>
>>>> Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering.
>>>>
>>>> This is very confusing. We think the index ordering and memory
>>>> ordering ideas need to be separated, and specifically, we should avoid
>>>> using "C" and "F" to refer to index ordering.
>>>>
>>>> Proposal
>>>> --------
>>>>
>>>> * Deprecate the use of "C" and "F" meaning backwards and forwards
>>>>   index ordering for ravel, reshape
>>>> * Prefer "Z" and "N", being graphical representations of unraveling in
>>>>   2 dimensions, axis1 first and axis0 first respectively (excellent
>>>>   naming idea by Paul Ivanov)
>>>>
>>>> What do y'all think?
>>>>
>>>> Cheers,
>>>>
>>>> Matthew
>>>> Paul Ivanov
>>>> JB Poline
>>>> ___
>>>> NumPy-Discussion mailing list
>>>> NumPy-Discussion@scipy.org
>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>
>>> I always thought "F" and "C" are easy to understand, I always thought about
>>> the content and never about the memory when using it.
>>
>> I can only say that 4 out of 4 experienced numpy developers found
>> themselves unable to predict the behavior of these functions before
>> they saw the output.
>>
>> The problem is always that explaining something makes it clearer for a
>> moment, but, for those who do not have the explanation or who have
>> forgotten it, at least among us here, the outputs were generating
>> groans and / or high fives as we incorrectly or correctly guessed what
>> was going to happen.
>>
>> I think the only way to find out whether this really is confusing or
>> not, is to put someone in front of these functions without any
>> explanation and ask them to predict what is going to come out of the
>> various inputs and flags. Or to try and teach it, which was the
>> problem we were having.
>
> changing the
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett wrote:
> Hi,
>
> On Sat, Mar 30, 2013 at 4:14 AM, wrote:
>> On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett wrote:
>>> [snip]
>>
>> I always thought "F" and "C" are easy to understand, I always thought about
>> the content and never about the memory when using it.
>
> I can only say that 4 out of 4 experienced numpy developers found
> themselves unable to predict the behavior of these functions before
> they saw the output.
>
> The problem is always that explaining something makes it clearer for a
> moment, but, for those who do not have the explanation or who have
> forgotten it, at least among us here, the outputs were generating
> groans and / or high fives as we incorrectly or correctly guessed what
> was going to happen.
>
> I think the only way to find out whether this really is confusing or
> not, is to put someone in front of these functions without any
> explanation and ask them to predict what is going to come out of the
> various inputs and flags. Or to try and teach it, which was the
> problem we were having.

changing the names doesn't make it easier to understand.
I think the confusion is because the new A and K refer to existing memory

``ravel`` is just stacking columns ('F') or stacking rows ('C'), I don't remembe
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi,

On Sat, Mar 30, 2013 at 4:14 AM, wrote:
> On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett wrote:
>> [snip]
>
> I always thought "F" and "C" are easy to understand, I always thought about
> the content and never about the memory when using it.

I can only say that 4 out of 4 experienced numpy developers found
themselves unable to predict the behavior of these functions before
they saw the output.

The problem is always that explaining something makes it clearer for a
moment, but, for those who do not have the explanation or who have
forgotten it, at least among us here, the outputs were generating
groans and / or high fives as we incorrectly or correctly guessed what
was going to happen.

I think the only way to find out whether this really is confusing or
not, is to put someone in front of these functions without any
explanation and ask them to predict what is going to come out of the
various inputs and flags. Or to try and teach it, which was the
problem we were having.

Cheers,

Matthew
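
For anyone who wants to try the prediction exercise, here is a minimal sketch that runs every order flag against both layouts from the original post. It reuses the names arr and arr_F from that post; the results dictionary is only an illustration added here, and the 'K' flag assumes a numpy recent enough to support it.

```python
import numpy as np

arr = np.arange(10).reshape((2, 5))   # C-contiguous memory
arr_F = np.array(arr, order='F')      # same values, Fortran-contiguous memory

# Both arrays index identically; only the physical layout differs.
assert np.array_equal(arr, arr_F)

# Ravel each array with every order flag and collect the results.
results = {
    order: (arr.ravel(order).tolist(), arr_F.ravel(order).tolist())
    for order in ('C', 'F', 'A', 'K')
}

for order, (from_c, from_f) in results.items():
    print(order, from_c, from_f)
```

With 'C' and 'F' the two columns of output agree, because those flags describe index order. With 'A' and 'K' they differ, because those flags follow the memory layout of the input.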
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi,

On Sat, Mar 30, 2013 at 11:55 AM, Sebastian Berg wrote:
> On Fri, 2013-03-29 at 19:08 -0700, Matthew Brett wrote:
>> [snip]
>
> Personally I think it is clear enough and that "Z" and "N" would confuse
> me just as much (though I am used to the other names). Also "Z" and "N"
> would seem more like aliases, which would also make sense in the memory
> order context.
> If anything, I would prefer renaming the arguments iteration_order and
> memory_order, but it seems overdoing it...

I am not sure what you mean - at the moment there is one argument
called 'order' that can refer to iteration order or memory order. Are
you proposing two arguments?

> Maybe the documentation could just be checked if it is always clear
> though. I.e. maybe it does not use "iteration" or "memory" order
> consistently (though I somewhat feel it is usually clear that it must be
> iteration order, since no numpy function cares about the input memory
> order as they will just do a copy if necessary).

Do you really mean this? Numpy is full of 'order=' flags that refer to memory.

Cheers,

Matthew
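
To make the two meanings of 'order=' concrete, here is a small sketch rather than anything taken from the thread. The array a and the names f, z, k, by_rows and by_cols are illustrative assumptions. In np.array, np.zeros and ndarray.copy the keyword picks the memory layout, while in reshape it picks the index order.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# order= as memory layout: the values and their indices are unchanged,
# only the strides (bytes to step along each axis) differ.
f = np.array(a, order='F')
z = np.zeros((2, 3), order='F')
k = a.copy(order='F')
assert f.flags['F_CONTIGUOUS'] and z.flags['F_CONTIGUOUS'] and k.flags['F_CONTIGUOUS']
assert np.array_equal(a, f) and np.array_equal(a, k)
print(a.strides, f.strides)

# order= as index order: the same elements land at different indices.
by_rows = a.reshape((3, 2), order='C')
by_cols = a.reshape((3, 2), order='F')
print(by_rows.tolist())   # [[0, 1], [2, 3], [4, 5]]
print(by_cols.tolist())   # [[0, 4], [3, 2], [1, 5]]
```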
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Fri, 2013-03-29 at 19:08 -0700, Matthew Brett wrote:
> [snip]
>
> What do y'all think?

Personally I think it is clear enough and that "Z" and "N" would confuse
me just as much (though I am used to the other names). Also "Z" and "N"
would seem more like aliases, which would also make sense in the memory
order context.
If anything, I would prefer renaming the arguments iteration_order and
memory_order, but it seems overdoing it...
Maybe the documentation could just be checked if it is always clear
though. I.e. maybe it does not use "iteration" or "memory" order
consistently (though I somewhat feel it is usually clear that it must be
iteration order, since no numpy function cares about the input memory
order as they will just do a copy if necessary).

Regards,

Sebastian
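
The "copy if necessary" behaviour can be checked directly. The following is a sketch under the assumption that np.shares_memory is available, which is only true of numpy releases much later than this thread. The names c_from_c, c_from_f and f_from_f are made up for the illustration.

```python
import numpy as np

arr = np.arange(10).reshape(2, 5)     # C-contiguous
arr_F = np.asfortranarray(arr)        # Fortran-contiguous copy

c_from_c = arr.ravel('C')
c_from_f = arr_F.ravel('C')
f_from_f = arr_F.ravel('F')

# Index order 'C' yields the same values whatever the input layout ...
assert np.array_equal(c_from_c, c_from_f)

# ... but only a matching layout can be flattened without copying.
print(np.shares_memory(c_from_c, arr))     # True: a view
print(np.shares_memory(c_from_f, arr_F))   # False: a copy was needed
print(np.shares_memory(f_from_f, arr_F))   # True: a view
```

So the input memory order does not change the values returned for a given index order, only whether the result is a view or a copy.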
Re: [Numpy-discussion] Indexing bug?
On Sat, Mar 30, 2013 at 11:01 AM, Ivan Oseledets wrote:
> I am using numpy 1.6.1,
> and encountered a weird fancy indexing bug:
>
> import numpy as np
> c = np.random.randn(10,200,10);
>
> In [29]: print c[[0,1],:200,:2].shape
> (2, 200, 2)
>
> In [30]: print c[[0,1],:200,[0,1]].shape
> (2, 200)
>
> It means that fancy indexing is not working right here for a 3d array.

It is working fine, review the docs:

http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#advanced-indexing

In your result, item [0, :] is c[0, :, 0] and item [1, :] is c[1, :, 1].

If you want a result of shape (2, 200, 2) where item [i, :, j] is c[i, :, j]
you could use slicing:

c[:2, :200, :2]

or something more elaborate like:

c[np.arange(2)[:, None, None], np.arange(200)[:, None], np.arange(2)]

Jaime

> Is this bug fixed in later versions of numpy?
> I have not checked, since mine is from EPD and is compiled with MKL (and I
> would consider recompiling it myself only if strictly necessary).
>
> Ivan

--
(\__/)
( O.o)
( > <) This is Rabbit. Copy Rabbit into your signature and help him with his
plans for world domination.
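
The pairing behaviour described above can be confirmed with a short script. This is a sketch, not part of the original reply. The names paired, grid and elaborate are illustrative, and np.ix_ is an alternative added here that the thread does not mention.

```python
import numpy as np

c = np.random.randn(10, 200, 10)

# A list combined with slices: the slices keep their own axes.
assert c[[0, 1], :200, :2].shape == (2, 200, 2)

# Two lists are broadcast against each other and paired up elementwise,
# so this selects (0, :, 0) and (1, :, 1), not the full 2 x 2 grid.
paired = c[[0, 1], :200, [0, 1]]
assert paired.shape == (2, 200)
assert np.array_equal(paired[0], c[0, :, 0])
assert np.array_equal(paired[1], c[1, :, 1])

# np.ix_ builds index arrays that broadcast to the full grid.
grid = c[np.ix_([0, 1], np.arange(200), [0, 1])]
assert grid.shape == (2, 200, 2)
assert np.array_equal(grid, c[:2, :200, :2])

# The explicit broadcasting form from the reply gives the same result.
elaborate = c[np.arange(2)[:, None, None], np.arange(200)[:, None], np.arange(2)]
assert np.array_equal(elaborate, grid)
```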
[Numpy-discussion] Indexing bug?
I am using numpy 1.6.1,
and encountered a weird fancy indexing bug:

import numpy as np
c = np.random.randn(10,200,10);

In [29]: print c[[0,1],:200,:2].shape
(2, 200, 2)

In [30]: print c[[0,1],:200,[0,1]].shape
(2, 200)

It means that fancy indexing is not working right here for a 3d array.

Is this bug fixed in later versions of numpy?
I have not checked, since mine is from EPD and is compiled with MKL (and I
would consider recompiling it myself only if strictly necessary).

Ivan
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sat, Mar 30, 2013 at 7:14 AM, wrote:
> On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett wrote:
>> [snip]
>
> I always thought "F" and "C" are easy to understand, I always thought about
> the content and never about the memory when using it.
>
> In my numpy htmlhelp for version 1.5, I don't have a K or A option
>
> >>> np.__version__
> '1.5.1'
> >>> np.arange(5).ravel("K")
> Traceback (most recent call last):
>   File "", line 1, in
> TypeError: order not understood
> >>> np.arange(5).ravel("A")
> array([0, 1, 2, 3, 4])
>
> the C, F in ravel have their twins in reshape
>
> >>> arr = np.arange(10).reshape(2,5, order="C").copy()
> >>> arr
> array([[0, 1, 2, 3, 4],
>        [5, 6, 7, 8, 9]])
> >>> arr.ravel()
> array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
> >>> arr = np.arange(10).reshape(2,5, order="F").copy()
> >>> arr
> array([[0, 2, 4, 6, 8],
>        [1, 3, 5, 7, 9]])
> >>> arr.ravel("F")
> array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>
> For example we use it when we get raveled arrays from R,
> and F for column order and C for row order indexing are pretty
> obvious names when coming from another package (Matlab, R, Gauss)

just a quick search to get an idea

in statsmodels
19 out of 135 ravel are ravel('F')
50 out of 270 reshapes specify: reshape.*order='F' (regular expression)

Josef
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett wrote:
> [snip]
>
> What do y'all think?

I always thought "F" and "C" are easy to understand, I always thought about
the content and never about the memory when using it.

In my numpy htmlhelp for version 1.5, I don't have a K or A option

>>> np.__version__
'1.5.1'
>>> np.arange(5).ravel("K")
Traceback (most recent call last):
  File "", line 1, in
TypeError: order not understood
>>> np.arange(5).ravel("A")
array([0, 1, 2, 3, 4])

the C, F in ravel have their twins in reshape

>>> arr = np.arange(10).reshape(2,5, order="C").copy()
>>> arr
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
>>> arr.ravel()
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> arr = np.arange(10).reshape(2,5, order="F").copy()
>>> arr
array([[0, 2, 4, 6, 8],
       [1, 3, 5, 7, 9]])
>>> arr.ravel("F")
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

For example we use it when we get raveled arrays from R,
and F for column order and C for row order indexing are pretty
obvious names when coming from another package (Matlab, R, Gauss)

Josef
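
The round trip described here, receiving a column-major flattened array and restoring its shape, can be sketched as follows. The array flat is a hypothetical stand-in for data handed over by R, not real output, and the names m, columns and rows are illustrative.

```python
import numpy as np

# Hypothetical stand-in for a 2 x 3 matrix flattened column by column,
# as a column-major package such as R would hand it over.
flat = np.arange(6)

m = flat.reshape((2, 3), order='F')
print(m.tolist())                     # [[0, 2, 4], [1, 3, 5]]

# ravel with the same order undoes the reshape.
assert np.array_equal(m.ravel('F'), flat)

# 'F' stacks the columns end to end; 'C' stacks the rows.
columns = np.concatenate([m[:, j] for j in range(m.shape[1])])
rows = np.concatenate([m[i, :] for i in range(m.shape[0])])
assert np.array_equal(columns, m.ravel('F'))
assert np.array_equal(rows, m.ravel('C'))
```

Nothing in this example depends on how m is laid out in memory; 'F' and 'C' are used purely as index orders.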