Hi,

On Mon, Apr 1, 2013 at 10:23 AM, Sebastian Berg <sebast...@sipsolutions.net> wrote:
> On Sun, 2013-03-31 at 14:04 -0700, Matthew Brett wrote:
>> Hi,
>>
>> On Sun, Mar 31, 2013 at 1:43 PM, <josef.p...@gmail.com> wrote:
>> > On Sun, Mar 31, 2013 at 3:54 PM, Matthew Brett <matthew.br...@gmail.com> wrote:
>> >> Hi,
>> >>
>> >> On Sat, Mar 30, 2013 at 10:38 PM, <josef.p...@gmail.com> wrote:
>> >>> On Sun, Mar 31, 2013 at 12:50 AM, Matthew Brett <matthew.br...@gmail.com> wrote:
>> >>>> Hi,
>> >>>>
>> >>>> On Sat, Mar 30, 2013 at 9:37 PM, <josef.p...@gmail.com> wrote:
>> >>>>> On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett <matthew.br...@gmail.com> wrote:
>> >>>>>> Hi,
>> >>>>>>
>> >>>>>> On Sat, Mar 30, 2013 at 7:02 PM, <josef.p...@gmail.com> wrote:
>> >>>>>>> On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett <matthew.br...@gmail.com> wrote:
>> >>>>>>>> Hi,
>> >>>>>>>>
>> >>>>>>>> On Sat, Mar 30, 2013 at 7:50 PM, <josef.p...@gmail.com> wrote:
>> >>>>>>>>> On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle <brad.froe...@gmail.com> wrote:
>> >>>>>>>>>> On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett <matthew.br...@gmail.com> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Sat, Mar 30, 2013 at 2:20 PM, <josef.p...@gmail.com> wrote:
>> >>>>>>>>>>> > On Sat, Mar 30, 2013 at 4:57 PM, <josef.p...@gmail.com> wrote:
>> >>>>>>>>>>> >> On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett <matthew.br...@gmail.com> wrote:
>> >>>>>>>>>>> >>> On Sat, Mar 30, 2013 at 4:14 AM, <josef.p...@gmail.com> wrote:
>> >>>>>>>>>>> >>>> On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett <matthew.br...@gmail.com> wrote:
>> >>>>>>>>>>> >>>>>
>> >>>>>>>>>>> >>>>> Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering.
>> >>>>>>>>>>> >>>>>
>> >>>>>>>>>>> >>>>> This is very confusing.
>> >>>>>>>>>>> >>>>> We think the index ordering and memory ordering ideas need to be
>> >>>>>>>>>>> >>>>> separated, and specifically, we should avoid using "C" and "F"
>> >>>>>>>>>>> >>>>> to refer to index ordering.
>> >>>>>>>>>>> >>>>>
>> >>>>>>>>>>> >>>>> Proposal
>> >>>>>>>>>>> >>>>> -------------
>> >>>>>>>>>>> >>>>>
>> >>>>>>>>>>> >>>>> * Deprecate the use of "C" and "F" meaning backwards and forwards
>> >>>>>>>>>>> >>>>> index ordering for ravel, reshape
>> >>>>>>>>>>> >>>>> * Prefer "Z" and "N", being graphical representations of unraveling in
>> >>>>>>>>>>> >>>>> 2 dimensions, axis1 first and axis0 first respectively (excellent
>> >>>>>>>>>>> >>>>> naming idea by Paul Ivanov)
>> >>>>>>>>>>> >>>>>
>> >>>>>>>>>>> >>>>> What do y'all think?
>> >>>>>>>>>>> >>>>
>> >>>>>>>>>>> >>>> I always thought "F" and "C" are easy to understand, I always thought about
>> >>>>>>>>>>> >>>> the content and never about the memory when using it.
>> >>>>>>>>>>> >>
>> >>>>>>>>>>> >> changing the names doesn't make it easier to understand.
>> >>>>>>>>>>> >> I think the confusion is because the new A and K refer to existing memory
>> >>>>>>>>>>> >>
>> >>>>>>>>>>>
>> >>>>>>>>>>> I disagree, I think it's confusing, but I have evidence, and that is
>> >>>>>>>>>>> that four out of four of us tested ourselves and got it wrong.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Perhaps we are particularly dumb or poorly informed, but I think it's
>> >>>>>>>>>>> rash to assert there is no problem here.
>> >>>>>>>>>
>> >>>>>>>>> I think you are overcomplicating things or phrased it as a "trick question"
>> >>>>>>>>
>> >>>>>>>> I don't know what you mean by trick question - was there something
>> >>>>>>>> over-complicated in the example?
>> >>>>>>>> I deliberately didn't include various much more confusing examples
>> >>>>>>>> in "reshape".
>> >>>>>>>
>> >>>>>>> I meant making the "candidates" think about memory instead of just
>> >>>>>>> column versus row stacking.
>> >>>>>>
>> >>>>>> To be specific, we were teaching about reshaping a (I, J, K, N) 4D
>> >>>>>> array, it was an image, with time as the 4th dimension (N time
>> >>>>>> points). Raveling and reshaping 3D and 4D arrays is a common thing
>> >>>>>> to do in neuroimaging, as you can imagine.
>> >>>>>>
>> >>>>>> A student asked what he would get back from raveling this array, a
>> >>>>>> concatenated time series, or something spatial?
>> >>>>>>
>> >>>>>> We showed (I'd worked it out by this time) that the first N values
>> >>>>>> were the time series given by [0, 0, 0, :].
>> >>>>>>
>> >>>>>> He said - "Oh - I see - so the data is stored as a whole lot of time
>> >>>>>> series one by one, I thought it would be stored as a series of
>> >>>>>> images".
>> >>>>>>
>> >>>>>> Ironically, this was a Fortran-ordered array in memory, and he was
>> >>>>>> wrong.
>> >>>>>>
>> >>>>>> So, I think the idea of memory ordering and index ordering is very
>> >>>>>> easy to confuse, and comes up naturally.
>> >>>>>>
>> >>>>>> I would like, as a teacher, to be able to say something like:
>> >>>>>>
>> >>>>>> This is what C memory layout is (it's the memory layout that gives
>> >>>>>> arr.flags.C_CONTIGUOUS=True)
>> >>>>>> This is what F memory layout is (it's the memory layout that gives
>> >>>>>> arr.flags.F_CONTIGUOUS=True)
>> >>>>>> It's rather easy to get something that is neither C nor F memory layout
>> >>>>>> Numpy does many memory layouts.
>> >>>>>> Ravel and reshape and numpy in general do not care (normally) about C
>> >>>>>> or F layouts, they only care about index ordering.
>> >>>>>>
>> >>>>>> My point, that I'm repeating, is that my job is made harder by
>> >>>>>> 'arr.ravel('F')'.
>> >>>>>
>> >>>>> But once you know that ravel and reshape don't care about memory, the
>> >>>>> ravel is easy to predict (maybe not easy to visualize in 4-D):
>> >>>>
>> >>>> But this assumes that you already know that there's such a thing as
>> >>>> memory layout, and there's such a thing as index ordering, and that
>> >>>> 'C' and 'F' in ravel refer to index ordering. Once you have that,
>> >>>> you're golden. I'm arguing it's markedly harder to get this
>> >>>> distinction, and keep it in mind, and teach it, if we are using the
>> >>>> 'C' and 'F' names for both things.
>> >>>
>> >>> No, I think you are still missing my point.
>> >>> I think explaining ravel and reshape F and C is easy (kind of) because the
>> >>> students don't need to know at that stage about memory layouts.
>> >>>
>> >>> All they need to know is that we look at n-dimensional objects in
>> >>> C-order or in F-order
>> >>> (whichever index runs fastest)
>> >>
>> >> Would you accept that it may or may not be true that it is desirable
>> >> or practical not to mention memory layouts when teaching numpy?
>> >
>> > I think they should be in two different sections.
>> >
>> > basic usage:
>> > ravel, reshape in pure index order, and indexing, broadcasting, ...
>> >
>> > advanced usage:
>> > memory layout and some ability to predict when you get a view and
>> > when you get a copy.
>>
>> Right - that is what you think - but I was asking - do you agree that
>> it's possible that that is not the best way to teach it?
>>
>> What evidence would you give that it was the best way to teach it?
>>
>> > And I still think words can mean different things in different context
>> > (with a qualifier maybe)
>> > indexing in fortran order
>> > memory in fortran order
>>
>> Right - but you'd probably also accept that using the same word for
>> different and related things is likely to cause confusion? I'm sure
>> we could come up with some experimental evidence for that if you do
>> doubt it.
>>
>> > Disclaimer: I never tried to teach numpy
>> > and with GSOC students my explanations only went a little bit
>> > beyond what they needed to know for the purpose at hand (I hope)
>> >
>> >>
>> >> You believe it is desirable, I believe that it is not - that teaching
>> >> numpy naturally involves some discussion of memory layout.
>> >>
>> >> As evidence:
>> >>
>> >> * My student, without any prompting about memory layouts, is asking about it
>> >> * Travis' numpy book has a very early section on this (section 2.3 -
>> >> memory layout)
>> >> * I often think about memory layouts, and from your discussion, you do
>> >> too. It's uncommon that you don't have to teach something that
>> >> experienced users think about often.
>> >
>> > I'm mentioning memory layout because I'm talking to you.
>> > I wouldn't talk about memory layout if I would try to explain ravel,
>> > reshape and indexing for the first time to a student.
>> >
>> >> * The most common use of 'order' only refers to memory layout. For
>> >> example np.array "order" doesn't refer to index ordering but to memory
>> >> layout.
>> >
>> > No, as I tried to show with the statsmodels example.
>> > I don't require GSOC students (that are relatively new to numpy) to
>> > understand much about memory layout.
>> > The only use of ``order`` in statsmodels refers to *index* order in
>> > ravel and reshape.
>> >
>> >> * The current docstring of 'reshape' cannot be explained without
>> >> referring to memory order.
>> >
>> > really ?
>> > I thought reshape only refers to *index* order for "F" and "C"
>>
>> Here's the docstring for 'reshape':
>>
>> order : {'C', 'F', 'A'}, optional
>>     Determines whether the array data should be viewed as in C
>>     (row-major) order, FORTRAN (column-major) order, or the C/FORTRAN
>>     order should be preserved.
>>
>> The 'A' option cannot be explained without reference to 'C' or 'F'
>> *memory* layout - i.e.
>> a different meaning of the 'C' and 'F' in the indexing interpretation.
>>
>> Actually, as a matter of interest - how would you explain the behavior
>> of 'A' when the array is neither 'C' nor 'F' memory layout? Maybe that
>> could be a good test case?
>>
>
> The 'A' means C-order unless `ndarray.flags.fnc == True` (which means
> "fortran not C"). The detail about "not C" should not matter really for
> copies, for reshape it should maybe be mentioned more clearly. Though
> honestly, reshaping with 'A' seems so weird to me, I doubt anyone ever
> does it. As for ravel... you can probably just as well use 'K' instead
> which is even less restrictive.
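
For concreteness, here is a minimal sketch of the rule described just above, as I understand it. The array names are made up for illustration, and it assumes a reasonably recent NumPy; treat it as a check of the description rather than a statement of how ravel is implemented.

```python
import numpy as np

# Hypothetical example arrays; the names are illustrative only.
c_arr = np.arange(6).reshape(2, 3)   # C-contiguous memory
f_arr = np.asfortranarray(c_arr)     # same values, Fortran-contiguous memory
neither = c_arr[:, ::2]              # strided view, neither C nor F contiguous

# 'C' and 'F' are index orderings: same result whatever the memory layout.
assert np.array_equal(c_arr.ravel('C'), f_arr.ravel('C'))
assert np.array_equal(c_arr.ravel('F'), f_arr.ravel('F'))

# 'A' consults the memory layout: Fortran index order only when
# flags.fnc ("Fortran, not C") is True, and C index order otherwise.
for name, arr in [('c_arr', c_arr), ('f_arr', f_arr), ('neither', neither)]:
    index_order = 'F' if arr.flags.fnc else 'C'
    assert np.array_equal(arr.ravel('A'), arr.ravel(index_order))
    print(name, 'fnc =', arr.flags.fnc, 'ravel("A") =', arr.ravel('A'))
```

If that is right, the array that is neither C nor F contiguous is simply raveled in C index order, and 'A' is the one option whose result changes with how the data happens to sit in memory.
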

I was arguing that it is not possible to explain the docstring(s)
without reference to memory order - I guess you agree.

Cheers,

Matthew
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion