Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Sat, Apr 6, 2013 at 3:15 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Apr 6, 2013 at 1:35 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Sat, Apr 6, 2013 at 7:22 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Apr 6, 2013 at 1:51 AM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Sat, Apr 6, 2013 at 4:47 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 7:39 PM, josef.p...@gmail.com wrote: It's not *any* cost, this goes deep and wide, it's one of the basic concepts of numpy that you want to rename. The proposal I last made was to change the default name to 'layout' after some period to be agreed - say - P - with suitable warning in the docstring up until that time, and after, and leave 'order' as an alias forever. The above paragraph is simply incorrect. Your last proposal also included deprecation warnings and a future backwards compatibility break by removing 'order'. If you now say you're not proposing steps 3 and 4 anymore, then you're back to what I called option (2) - duplicate keywords forever. Which for me is undesirable, for reasons I already mentioned. You might not have read my follow-up proposing to drop steps 3 and 4 if you felt they were unacceptable. P.S. being called short-sighted and damaging numpy by responding to a proposal you now say you didn't make is pretty damn annoying. No, I did make that proposal, and in the spirit of negotiation and consensus, I subsequently modified my proposal, as I hope you'd expect in this situation. You have had clear NOs to the various incarnations of your proposal from 3 active developers of this community, not once but two or three times from each of those developers. Furthermore you have got only a couple of +0.5s, after 90 emails no one else seems to feel that this is a change we really have to have this change. Therefore I don't expect another modification of your proposal, I expect you to drop it. 
OK - I think I have a better understanding of the 'model' now.
As another poster said, this thread has run its course. The technical issues are clear, and apparently we're going to have to agree to disagree about the seriousness of the confusion. Please please go and fix the docs in the way you deem best, and leave it at that. And triple please not another governance thread.
https://github.com/numpy/numpy/pull/3294
Cheers,
Matthew
___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
As one lurker to another, thanks for calling it out. Over-argumentative and personality-centric threads like these have actually led me to distance myself from the numpy community. I do not know how common it is now because I do not follow it closely anymore. It used to be quite common at one point in time. I came down to check after a while, and lo, there it is again.
If a mail is put forward as a question ("I find this confusing, is it confusing for you?"), it ought not to devolve into a shouting match atop moral high horses: "So you think I am stupid, do you?" "Too smart, are you?" "How dare you express that it doesn't bother you as much when it bothers me and my documented case of 4 people." "I have four, how many do you have?"
If something is posed as a question, one should be open to the answers. Sometimes it is better not to pose it as a question at all, but to offer alternatives and ask for preference.
I am not siding with any of the technical options provided, just requesting that the discourse not devolve into these personality-oriented contests. It gets too loud and noisy.
Thank you
On Sat, Apr 6, 2013 at 12:18 PM, matti picus matti.pi...@gmail.com wrote:
as a lurker, may I say that this discussion seems to have become non-productive? It seems all agree that docs need improvement, perhaps a first step would be to suggest doc improvements, and then the need for renaming may become self-evident, or not. aww darn, ruined my lurker status. Matti Picus
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sat, Apr 6, 2013 at 4:47 AM, Matthew Brett matthew.br...@gmail.com wrote:
Hi, On Fri, Apr 5, 2013 at 7:39 PM, josef.p...@gmail.com wrote:
It's not *any* cost, this goes deep and wide, it's one of the basic concepts of numpy that you want to rename.
The proposal I last made was to change the default name to 'layout' after some period to be agreed - say - P - with suitable warning in the docstring up until that time, and after, and leave 'order' as an alias forever.
The above paragraph is simply incorrect. Your last proposal also included deprecation warnings and a future backwards compatibility break by removing 'order'. If you now say you're not proposing steps 3 and 4 anymore, then you're back to what I called option (2) - duplicate keywords forever. Which for me is undesirable, for reasons I already mentioned.
Ralf
P.S. being called short-sighted and damaging numpy by responding to a proposal you now say you didn't make is pretty damn annoying.
P.P.S. expect an identical response from me to future proposals that include backwards compatibility breaks of heavily used functions for something that's not a functional enhancement or bug fix. Such proposals are just not OK.
P.P.P.S. I'm not sure exactly what you mean by default keyword. If layout overrules order and layout's default value is not None, you're still proposing a backwards compatibility break.
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
as a lurker, may I say that this discussion seems to have become non-productive? It seems all agree that docs need improvement, perhaps a first step would be to suggest doc improvements, and then the need for renaming may become self-evident, or not. aww darn, ruined my lurker status. Matti Picus
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi,
On Sat, Apr 6, 2013 at 1:51 AM, Ralf Gommers ralf.gomm...@gmail.com wrote:
On Sat, Apr 6, 2013 at 4:47 AM, Matthew Brett matthew.br...@gmail.com wrote:
Hi, On Fri, Apr 5, 2013 at 7:39 PM, josef.p...@gmail.com wrote:
It's not *any* cost, this goes deep and wide, it's one of the basic concepts of numpy that you want to rename.
The proposal I last made was to change the default name to 'layout' after some period to be agreed - say - P - with suitable warning in the docstring up until that time, and after, and leave 'order' as an alias forever.
The above paragraph is simply incorrect. Your last proposal also included deprecation warnings and a future backwards compatibility break by removing 'order'. If you now say you're not proposing steps 3 and 4 anymore, then you're back to what I called option (2) - duplicate keywords forever. Which for me is undesirable, for reasons I already mentioned.
You might not have read my follow-up proposing to drop steps 3 and 4 if you felt they were unacceptable.
P.S. being called short-sighted and damaging numpy by responding to a proposal you now say you didn't make is pretty damn annoying.
No, I did make that proposal, and in the spirit of negotiation and consensus, I subsequently modified my proposal, as I hope you'd expect in this situation. I am honestly sorry that I offended you. In hindsight, although I do worry that numpy feels as if it does resist reasonable change more strongly than is healthy, I was probably responding to my feeling that you were trying to veto the discussion rather than joining it, and I really should have put it that way instead. I am sorry about that.
P.P.S. expect an identical response from me to future proposals that include backwards compatibility breaks of heavily used functions for something that's not a functional enhancement or bug fix. Such proposals are just not OK.
It seems to me that each change has to be considered on its merit, and strict rules of that sort are not very useful.
You are again implying that this change is not important, and obviously there I don't agree. I addressed the level and timing of backwards compatibility breakage in my comments to Josef. You haven't responded to me on that.
P.P.P.S. I'm not sure exactly what you mean by default keyword. If layout overrules order and layout's default value is not None, you're still proposing a backwards compatibility break.
I mean that, until the expiry of some agreed period 'P', the docstring would read
    def ravel(self, order='C', **kwargs)
where kwargs can only contain 'layout', and 'layout' and 'order' cannot both be defined, and after the expiry of 'P'
    def ravel(self, layout='C', **kwargs)
where kwargs can only contain 'order', and 'layout' and 'order' cannot both be defined.
At least that's my proposal, I'm happy to change it if there is a better solution.
Cheers,
Matthew
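A minimal sketch of the two signatures described in the message above, assuming plain nested lists in place of arrays. The names `_UNSET`, `_resolve`, `_flatten`, `ravel_before_p` and `ravel_after_p` are hypothetical and are not part of numpy. A sentinel default is one possible way to tell "not passed" apart from an explicit 'C', which is needed to enforce "cannot both be defined"; the docstring could still advertise 'C' as the default.

```python
_UNSET = object()  # sentinel: distinguishes "argument not passed" from an explicit 'C'


def _resolve(primary, primary_name, kwargs, alias_name, default="C"):
    """Return the value given via the named keyword or its alias, never both."""
    unknown = set(kwargs) - {alias_name}
    if unknown:
        raise TypeError("unexpected keyword argument(s): " + ", ".join(sorted(unknown)))
    alias = kwargs.get(alias_name, _UNSET)
    if primary is not _UNSET and alias is not _UNSET:
        raise TypeError("%r and %r cannot both be given" % (primary_name, alias_name))
    if primary is not _UNSET:
        return primary
    if alias is not _UNSET:
        return alias
    return default


def _flatten(rows, how):
    """Flatten a 2-D nested list; 'C' = last index fastest, 'F' = first index fastest."""
    if how == "C":
        return [x for row in rows for x in row]
    if how == "F":
        return [row[j] for j in range(len(rows[0])) for row in rows]
    raise ValueError("only 'C' and 'F' are handled in this sketch")


def ravel_before_p(rows, order=_UNSET, **kwargs):
    # Until period 'P' expires: 'order' is the named keyword, 'layout' the alias.
    return _flatten(rows, _resolve(order, "order", kwargs, "layout"))


def ravel_after_p(rows, layout=_UNSET, **kwargs):
    # After 'P': 'layout' is the named keyword, 'order' stays as the alias.
    return _flatten(rows, _resolve(layout, "layout", kwargs, "order"))
```

In this sketch `ravel_before_p(rows, order="F")` and `ravel_before_p(rows, layout="F")` give the same result, and passing both keywords raises `TypeError`. One visible difference between the phases is the meaning of a positional second argument, which binds to whichever keyword is named in the signature.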
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi Ralf,
Ralf Gommers, on 2013-04-06 10:51, wrote:
P.P.S. expect an identical response from me to future proposals that include backwards compatibility breaks of heavily used functions for something that's not a functional enhancement or bug fix. Such proposals are just not OK.
but it is a functional enhancement or bug fix - the ambiguity in the effect of order= values in several places only serves to confuse two different ideas into one.
-- Paul Ivanov 314 address only used for lists, off-list direct email at: http://pirsquared.org | GPG/PGP key id: 0x0F3E28F7
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sat, Apr 6, 2013 at 8:16 PM, Paul Ivanov pivanov...@gmail.com wrote:
Hi Ralf, Ralf Gommers, on 2013-04-06 10:51, wrote:
P.P.S. expect an identical response from me to future proposals that include backwards compatibility breaks of heavily used functions for something that's not a functional enhancement or bug fix. Such proposals are just not OK.
but it is a functional enhancement or bug fix - the ambiguity in the effect of order= values in several places only serves to confuse two different ideas into one.
That sentence makes zero sense. The reason you can't decide whether it's a bug fix or enhancement is because it's neither. What ambiguity there is can be solved with documentation only, there's nothing new you can do with these functions after introducing a new keyword and there is no bug.
Ralf
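For readers following the technical point, the two ideas being contrasted in this thread (index order versus memory layout) can be illustrated with a toy model. This is only a sketch: `Tiny2D` is a hypothetical stand-in written for illustration, not numpy's ndarray. The corresponding numpy calls are `a.ravel(order='F')`, where the keyword selects the order in which indices are traversed, and `a.copy(order='F')`, where the same keyword selects how elements are laid out in memory.

```python
from itertools import product


class Tiny2D:
    """Toy 2-D array: a flat buffer plus strides (counted in elements, not bytes)."""

    def __init__(self, rows, layout="C"):
        self.shape = (len(rows), len(rows[0]))
        n0, n1 = self.shape
        # 'C' layout: rows are contiguous; 'F' layout: columns are contiguous.
        self.strides = (n1, 1) if layout == "C" else (1, n0)
        self.buffer = [None] * (n0 * n1)
        for i, j in product(range(n0), range(n1)):
            self.buffer[i * self.strides[0] + j * self.strides[1]] = rows[i][j]

    def __getitem__(self, idx):
        i, j = idx
        return self.buffer[i * self.strides[0] + j * self.strides[1]]

    def ravel(self, order="C"):
        # Index order: which index varies fastest while reading elements out.
        n0, n1 = self.shape
        if order == "C":
            return [self[i, j] for i in range(n0) for j in range(n1)]
        return [self[i, j] for j in range(n1) for i in range(n0)]

    def copy(self, layout="C"):
        # Memory layout: same values at the same indices, different buffer order.
        n0, n1 = self.shape
        rows = [[self[i, j] for j in range(n1)] for i in range(n0)]
        return Tiny2D(rows, layout=layout)


a = Tiny2D([[0, 1, 2], [3, 4, 5]])
f = a.copy(layout="F")
print(a.buffer, f.buffer)          # the buffers differ
print(a.ravel("F"), f.ravel("F"))  # the raveled results do not
```

In this model the layout chosen in `copy` changes only the hidden buffer, while the order chosen in `ravel` changes the visible result regardless of layout, which is the distinction the proposed keyword rename was meant to make explicit.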
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sat, Apr 6, 2013 at 7:22 PM, Matthew Brett matthew.br...@gmail.comwrote: Hi, On Sat, Apr 6, 2013 at 1:51 AM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Sat, Apr 6, 2013 at 4:47 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 7:39 PM, josef.p...@gmail.com wrote: It's not *any* cost, this goes deep and wide, it's one of the basic concepts of numpy that you want to rename. The proposal I last made was to change the default name to 'layout' after some period to be agreed - say - P - with suitable warning in the docstring up until that time, and after, and leave 'order' as an alias forever. The above paragraph is simply incorrect. Your last proposal also included deprecation warnings and a future backwards compatibility break by removing 'order'. If you now say you're not proposing steps 3 and 4 anymore, then you're back to what I called option (2) - duplicate keywords forever. Which for me is undesirable, for reasons I already mentioned. You might not have read my follow-up proposing to drop steps 3 and 4 if you felt they were unacceptable. P.S. being called short-sighted and damaging numpy by responding to a proposal you now say you didn't make is pretty damn annoying. No, I did make that proposal, and in the spirit of negotiation and consensus, I subsequently modified my proposal, as I hope you'd expect in this situation. You have had clear NOs to the various incarnations of your proposal from 3 active developers of this community, not once but two or three times from each of those developers. Furthermore you have got only a couple of +0.5s, after 90 emails no one else seems to feel that this is a change we really have to have this change. Therefore I don't expect another modification of your proposal, I expect you to drop it. As another poster said, this thread has run its course. The technical issues are clear, and apparently we're going to have to agree to disagree about the seriousness of the confusion. 
Please please go and fix the docs in the way you deem best, and leave it at that. And triple please not another governance thread. I'm am honestly sorry that I offended you. Thank you. I apologize as well if my tone of the last message was too strong. Ralf In hindsight, although I do worry that numpy feels as if it does resist reasonable change more strongly than is healthy, I was probably responding to my feeling that you were trying to veto the discussion rather than joining it, and I really should have put it that way instead. I am sorry about that. P.P.S. expect an identical response from me to future proposals that include backwards compatibility breaks of heavily used functions for something that's not a functional enhancement or bug fix. Such proposals are just not OK. It seems to me that each change has to be considered on its merit, and strict rules of that sort are not very useful. You are again implying that this change is not important, and obviously there I don't agree. I addressed the level and timing of backwards compatibility breakage in my comments to Josef. You haven't responded to me on that. P.P.P.S. I'm not sure exactly what you mean by default keyword. If layout overrules order and layout's default value is not None, you're still proposing a backwards compatibility break. I mean, that until the expiry of some agreed period 'P' - the docstring would read def ravel(self, order='C', **kwargs) where kwargs can only contain 'layout', and 'layout', 'order' cannot both be defined and after the expiry of 'P' def ravel(self, layout='C', **kwargs) where kwargs can only contain 'order', and 'layout', 'order' cannot both be defined At least that's my proposal, I'm happy to change it if there is a better solution. 
Cheers, Matthew
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Sat, Apr 6, 2013 at 1:35 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Sat, Apr 6, 2013 at 7:22 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Apr 6, 2013 at 1:51 AM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Sat, Apr 6, 2013 at 4:47 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 7:39 PM, josef.p...@gmail.com wrote: It's not *any* cost, this goes deep and wide, it's one of the basic concepts of numpy that you want to rename. The proposal I last made was to change the default name to 'layout' after some period to be agreed - say - P - with suitable warning in the docstring up until that time, and after, and leave 'order' as an alias forever. The above paragraph is simply incorrect. Your last proposal also included deprecation warnings and a future backwards compatibility break by removing 'order'. If you now say you're not proposing steps 3 and 4 anymore, then you're back to what I called option (2) - duplicate keywords forever. Which for me is undesirable, for reasons I already mentioned. You might not have read my follow-up proposing to drop steps 3 and 4 if you felt they were unacceptable. P.S. being called short-sighted and damaging numpy by responding to a proposal you now say you didn't make is pretty damn annoying. No, I did make that proposal, and in the spirit of negotiation and consensus, I subsequently modified my proposal, as I hope you'd expect in this situation. You have had clear NOs to the various incarnations of your proposal from 3 active developers of this community, not once but two or three times from each of those developers. Furthermore you have got only a couple of +0.5s, after 90 emails no one else seems to feel that this is a change we really have to have this change. Therefore I don't expect another modification of your proposal, I expect you to drop it. OK - I think I have a better understanding of the 'model' now. As another poster said, this thread has run its course. 
The technical issues are clear, and apparently we're going to have to agree to disagree about the seriousness of the confusion. Please please go and fix the docs in the way you deem best, and leave it at that. And triple please not another governance thread.
The governance threads happen because of the lack of governance, as this thread shows. I don't agree that decisions should be taken like this (+1, -1, No!, Yes!). I think they should be taken by negotiation and agreement. You disagree, but on whose authority, I do not know, and we have no way of resolving that, because there is - no governance thread.
I am honestly sorry that I offended you.
Thank you. I apologize as well if my tone of the last message was too strong.
Thank you in turn, that is generous of you,
Best,
Matthew
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg sebast...@sipsolutions.net wrote: Hey On Thu, 2013-04-04 at 14:20 -0700, Matthew Brett wrote: Hi, On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith n...@pobox.com wrote: snip Maybe we should go through and rename order to something more descriptive in each case, so we'd have a.reshape(..., index_order=C) a.copy(memory_order=F) etc.? I'd like to propose this instead: a.reshape(..., order=C) a.copy(layout=F) I actually like this, makes the point clearer that it has to do with memory layout and implies contiguity, plus it is short and from the numpy perspective copy, etc. are the ones that add additional info to order and not reshape (because IMO memory order is something new users should not worry about at first). A and K orders will still have their quirks with np.array and copy=True/False, but for many functions they are esoteric anyway. It will be one hell of a deprecation though, but I am +0.5 for adding an alias for now (maybe someone knows an even better name?), but I think that in this case, it probably really is better to wait with actual deprecation warnings for a few versions, since it touches a *lot* of code. Plus I think at the point of starting deprecation warnings (and best earlier) numpy should provide an automatic fixer script... The only counter point that remains for me is the difficulty of deprecation, since I think the new name idea is very clean. And this is unfortunately even more invasive then the index_order proposal. I completely agree that we'd have to be gentle with the change. The problem we'd want to avoid is people innocently using 'layout' and finding to their annoyance that the code doesn't work with other people's numpy. How about: Step 1: 'order' remains as named keyword, layout added as alias, comment on the lines of layout will become the default keyword for this option in later versions of numpy; please consider updating any code that does not need to remain backwards compatible'. 
Step 2: default keyword becomes 'layout' with 'order' as alias, comment like "order is an alias for 'layout' to maintain backwards compatibility with numpy <= 1.7.1; please update any code that does not need to maintain backwards compatibility with these numpy versions".
Step 3: Add deprecation warning for 'order': "order will be removed as an alias in future versions of numpy".
Step 4: (distant future) Remove alias?
Cheers,
Matthew
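Steps 1 to 4 above could be sketched with a small decorator. This is a rough illustration under stated assumptions, not how numpy implements keyword handling: `accept_alias` and the `copy_step*` functions are hypothetical names, and each function simply returns the value it received so the keyword plumbing is easy to see.

```python
import functools
import warnings


def accept_alias(alias, target, warn=False):
    """Forward keyword `alias` to `target`; optionally warn that `alias` is deprecated."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if alias in kwargs:
                if target in kwargs:
                    raise TypeError("%r and %r cannot both be given" % (alias, target))
                if warn:
                    warnings.warn(
                        "%r is deprecated, use %r instead" % (alias, target),
                        DeprecationWarning,
                        stacklevel=2,
                    )
                kwargs[target] = kwargs.pop(alias)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@accept_alias("layout", "order")             # Step 1: 'order' named, 'layout' alias
def copy_step1(data, order="C"):
    return order


@accept_alias("order", "layout")             # Step 2: 'layout' named, 'order' alias
def copy_step2(data, layout="C"):
    return layout


@accept_alias("order", "layout", warn=True)  # Step 3: 'order' still works but warns
def copy_step3(data, layout="C"):
    return layout


def copy_step4(data, layout="C"):            # Step 4: alias removed, 'order' is an error
    return layout
```

Only the move from Step 3 to Step 4 breaks existing callers; Steps 1 and 2 accept both spellings silently, which corresponds to the "duplicate keywords forever" option discussed elsewhere in the thread if the process stops there.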
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett matthew.br...@gmail.comwrote: Hi, On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg sebast...@sipsolutions.net wrote: Hey On Thu, 2013-04-04 at 14:20 -0700, Matthew Brett wrote: Hi, On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith n...@pobox.com wrote: snip Maybe we should go through and rename order to something more descriptive in each case, so we'd have a.reshape(..., index_order=C) a.copy(memory_order=F) etc.? I'd like to propose this instead: a.reshape(..., order=C) a.copy(layout=F) I actually like this, makes the point clearer that it has to do with memory layout and implies contiguity, plus it is short and from the numpy perspective copy, etc. are the ones that add additional info to order and not reshape (because IMO memory order is something new users should not worry about at first). A and K orders will still have their quirks with np.array and copy=True/False, but for many functions they are esoteric anyway. It will be one hell of a deprecation though, but I am +0.5 for adding an alias for now (maybe someone knows an even better name?), but I think that in this case, it probably really is better to wait with actual deprecation warnings for a few versions, since it touches a *lot* of code. Plus I think at the point of starting deprecation warnings (and best earlier) numpy should provide an automatic fixer script... The only counter point that remains for me is the difficulty of deprecation, since I think the new name idea is very clean. And this is unfortunately even more invasive then the index_order proposal. I completely agree that we'd have to be gentle with the change. The problem we'd want to avoid is people innocently using 'layout' and finding to their annoyance that the code doesn't work with other people's numpy. 
How about: Step 1: 'order' remains as named keyword, layout added as alias, comment on the lines of layout will become the default keyword for this option in later versions of numpy; please consider updating any code that does not need to remain backwards compatible'. Step 2: default keyword becomes 'layout' with 'order' as alias, comment like order is an alias for 'layout' to maintain backwards compatibility with numpy = 1.7.1', please update any code that does not need to maintain backwards compatibility with these numpy versions' Step 3: Add deprecation warning for 'order', order will be removed as an alias in future versions of numpy Step 4: (distant future) Remove alias ? A very strong -1 from me. Now we're talking about deprecation warnings and a backwards compatibility break after all. I thought we agreed that this was a very bad idea, so why are you proposing it now? Here's how I see it: deprecation of order is a no go. Therefore we have two choices here: 1. Simply document the current order keyword better and leave it at that. 2. Add a layout (or index_order) keyword, and live with both order and layout keywords forever. (2) is at least as confusing as (1), more work and poor design. Therefore I propose to go with (1). Ralf ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg sebast...@sipsolutions.net wrote: Hey On Thu, 2013-04-04 at 14:20 -0700, Matthew Brett wrote: Hi, On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith n...@pobox.com wrote: snip Maybe we should go through and rename order to something more descriptive in each case, so we'd have a.reshape(..., index_order=C) a.copy(memory_order=F) etc.? I'd like to propose this instead: a.reshape(..., order=C) a.copy(layout=F) I actually like this, makes the point clearer that it has to do with memory layout and implies contiguity, plus it is short and from the numpy perspective copy, etc. are the ones that add additional info to order and not reshape (because IMO memory order is something new users should not worry about at first). A and K orders will still have their quirks with np.array and copy=True/False, but for many functions they are esoteric anyway. It will be one hell of a deprecation though, but I am +0.5 for adding an alias for now (maybe someone knows an even better name?), but I think that in this case, it probably really is better to wait with actual deprecation warnings for a few versions, since it touches a *lot* of code. Plus I think at the point of starting deprecation warnings (and best earlier) numpy should provide an automatic fixer script... The only counter point that remains for me is the difficulty of deprecation, since I think the new name idea is very clean. And this is unfortunately even more invasive then the index_order proposal. I completely agree that we'd have to be gentle with the change. The problem we'd want to avoid is people innocently using 'layout' and finding to their annoyance that the code doesn't work with other people's numpy. 
How about: Step 1: 'order' remains as named keyword, layout added as alias, comment on the lines of layout will become the default keyword for this option in later versions of numpy; please consider updating any code that does not need to remain backwards compatible'. Step 2: default keyword becomes 'layout' with 'order' as alias, comment like order is an alias for 'layout' to maintain backwards compatibility with numpy = 1.7.1', please update any code that does not need to maintain backwards compatibility with these numpy versions' Step 3: Add deprecation warning for 'order', order will be removed as an alias in future versions of numpy Step 4: (distant future) Remove alias ? A very strong -1 from me. Now we're talking about deprecation warnings and a backwards compatibility break after all. I thought we agreed that this was a very bad idea, so why are you proposing it now? Here's how I see it: deprecation of order is a no go. Therefore we have two choices here: 1. Simply document the current order keyword better and leave it at that. 2. Add a layout (or index_order) keyword, and live with both order and layout keywords forever. (2) is at least as confusing as (1), more work and poor design. Therefore I propose to go with (1). You are saying that deprecation of 'order' at any stage in the next 10 years of numpy's lifetime is a no go? I think that is short-sighted and I think it will damage numpy. Believe me, I have as much investment in backward compatibility as you do. All the three libraries that I spend a long time maintaining need to test against old numpy versions - but - for heaven's sake - only back to numpy 1.2 or numpy 1.3. We don't support Python 2.5 any more, and I don't think we need to maintain compatibility with Numeric either. 
If you are saying that we need to maintain compatibility for 10 years at a stretch, then we will have to accept that numpy will gradually decay into a legacy library, because it is certain that, if we stay static, someone else with more ambition will do a better job. There is a cost to being averse to any change at all, no matter how gradually it is managed.
Best,
Matthew
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett matthew.br...@gmail.comwrote: Hi, On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg sebast...@sipsolutions.net wrote: Hey On Thu, 2013-04-04 at 14:20 -0700, Matthew Brett wrote: Hi, On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith n...@pobox.com wrote: snip Maybe we should go through and rename order to something more descriptive in each case, so we'd have a.reshape(..., index_order=C) a.copy(memory_order=F) etc.? I'd like to propose this instead: a.reshape(..., order=C) a.copy(layout=F) I actually like this, makes the point clearer that it has to do with memory layout and implies contiguity, plus it is short and from the numpy perspective copy, etc. are the ones that add additional info to order and not reshape (because IMO memory order is something new users should not worry about at first). A and K orders will still have their quirks with np.array and copy=True/False, but for many functions they are esoteric anyway. It will be one hell of a deprecation though, but I am +0.5 for adding an alias for now (maybe someone knows an even better name?), but I think that in this case, it probably really is better to wait with actual deprecation warnings for a few versions, since it touches a *lot* of code. Plus I think at the point of starting deprecation warnings (and best earlier) numpy should provide an automatic fixer script... The only counter point that remains for me is the difficulty of deprecation, since I think the new name idea is very clean. And this is unfortunately even more invasive then the index_order proposal. I completely agree that we'd have to be gentle with the change. The problem we'd want to avoid is people innocently using 'layout' and finding to their annoyance that the code doesn't work with other people's numpy. 
How about: Step 1: 'order' remains as named keyword, layout added as alias, comment on the lines of layout will become the default keyword for this option in later versions of numpy; please consider updating any code that does not need to remain backwards compatible'. Step 2: default keyword becomes 'layout' with 'order' as alias, comment like order is an alias for 'layout' to maintain backwards compatibility with numpy = 1.7.1', please update any code that does not need to maintain backwards compatibility with these numpy versions' Step 3: Add deprecation warning for 'order', order will be removed as an alias in future versions of numpy Step 4: (distant future) Remove alias ? A very strong -1 from me. Now we're talking about deprecation warnings and a backwards compatibility break after all. I thought we agreed that this was a very bad idea, so why are you proposing it now? Here's how I see it: deprecation of order is a no go. Therefore we have two choices here: 1. Simply document the current order keyword better and leave it at that. 2. Add a layout (or index_order) keyword, and live with both order and layout keywords forever. (2) is at least as confusing as (1), more work and poor design. Therefore I propose to go with (1). You are saying that deprecation of 'order' at any stage in the next 10 years of numpy's lifetime is a no go? For something like this? Yes. I think that is short-sighted and I think it will damage numpy. It will damage numpy to be conservative and not change a name for a little bit of clarity for some people that avoids reading the docs maybe a little more carefully? There's a lot of things that can damage numpy, but this isn't even close in my book. Too few developers, continuous backwards compatibility issues, faster alternative libraries surpassing numpy - that's the kind of thing that causes damage. Believe me, I have as much investment in backward compatibility as you do. 
All the three libraries that I spend a long time maintaining need to test against old numpy versions - but - for heaven's sake - only back to numpy 1.2 or numpy 1.3. We don't support Python 2.5 any more, and I don't think we need to maintain compatibility with Numeric either. Really? This is from 3 months ago: http://article.gmane.org/gmane.comp.python.numeric.general/52632. It's now 2013, we are probably dropping numarray compat in 1.8. Not exactly 10 years, but of the same order. If you are saying that we need to maintain compatibility for 10 years at a stretch, then we will have to accept that numpy will gradually decay into a legacy library, because it is certain that, if we stay static, someone else with more ambition will do a better job. There is a cost to being averse to any change at all, no matter how gradually it
[Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg sebast...@sipsolutions.net wrote: Hey On Thu, 2013-04-04 at 14:20 -0700, Matthew Brett wrote: Hi, On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith n...@pobox.com wrote: snip Maybe we should go through and rename order to something more descriptive in each case, so we'd have a.reshape(..., index_order=C) a.copy(memory_order=F) etc.? I'd like to propose this instead: a.reshape(..., order=C) a.copy(layout=F) I actually like this, makes the point clearer that it has to do with memory layout and implies contiguity, plus it is short and from the numpy perspective copy, etc. are the ones that add additional info to order and not reshape (because IMO memory order is something new users should not worry about at first). A and K orders will still have their quirks with np.array and copy=True/False, but for many functions they are esoteric anyway. It will be one hell of a deprecation though, but I am +0.5 for adding an alias for now (maybe someone knows an even better name?), but I think that in this case, it probably really is better to wait with actual deprecation warnings for a few versions, since it touches a *lot* of code. Plus I think at the point of starting deprecation warnings (and best earlier) numpy should provide an automatic fixer script... The only counter point that remains for me is the difficulty of deprecation, since I think the new name idea is very clean. And this is unfortunately even more invasive then the index_order proposal. I completely agree that we'd have to be gentle with the change. 
The problem we'd want to avoid is people innocently using 'layout' and finding to their annoyance that the code doesn't work with other people's numpy. How about: Step 1: 'order' remains as named keyword, layout added as alias, comment on the lines of layout will become the default keyword for this option in later versions of numpy; please consider updating any code that does not need to remain backwards compatible'. Step 2: default keyword becomes 'layout' with 'order' as alias, comment like order is an alias for 'layout' to maintain backwards compatibility with numpy = 1.7.1', please update any code that does not need to maintain backwards compatibility with these numpy versions' Step 3: Add deprecation warning for 'order', order will be removed as an alias in future versions of numpy Step 4: (distant future) Remove alias ? A very strong -1 from me. Now we're talking about deprecation warnings and a backwards compatibility break after all. I thought we agreed that this was a very bad idea, so why are you proposing it now? Here's how I see it: deprecation of order is a no go. Therefore we have two choices here: 1. Simply document the current order keyword better and leave it at that. 2. Add a layout (or index_order) keyword, and live with both order and layout keywords forever. (2) is at least as confusing as (1), more work and poor design. Therefore I propose to go with (1). You are saying that deprecation of 'order' at any stage in the next 10 years of numpy's lifetime is a no go? For something like this? Yes. You are saying I think that I am wrong in thinking this is an important change that will make numpy easier to explain and use in the long term. You'd probably expect me to disagree, and I do. I think I am right in thinking the change is important - I've tried to make that case in this thread, as well as I can. I think that is short-sighted and I think it will damage numpy. 
It will damage numpy to be conservative and not change a name for a little bit of clarity for some people that avoids reading the docs maybe a little more carefully? There's a lot of things that can damage numpy, but this isn't even close in my book. Too few developers, continuous backwards compatibility issues, faster alternative libraries surpassing numpy - that's the kind of thing that causes damage. We've talked about consensus on this list. Of course it can be very hard to achieve. Believe me, I have as much investment in backward compatibility as you do. All the three libraries that I spend a long time maintaining need to test against old numpy versions - but - for heaven's sake - only back to numpy 1.2 or numpy 1.3. We don't support Python 2.5 any more, and I don't think we need to maintain compatibility with Numeric either. Really? This is from 3 months ago:
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Fri, Apr 5, 2013 at 3:09 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg sebast...@sipsolutions.net wrote: Hey On Thu, 2013-04-04 at 14:20 -0700, Matthew Brett wrote: Hi, On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith n...@pobox.com wrote: snip Maybe we should go through and rename order to something more descriptive in each case, so we'd have a.reshape(..., index_order=C) a.copy(memory_order=F) etc.? I'd like to propose this instead: a.reshape(..., order=C) a.copy(layout=F) I actually like this, makes the point clearer that it has to do with memory layout and implies contiguity, plus it is short and from the numpy perspective copy, etc. are the ones that add additional info to order and not reshape (because IMO memory order is something new users should not worry about at first). A and K orders will still have their quirks with np.array and copy=True/False, but for many functions they are esoteric anyway. It will be one hell of a deprecation though, but I am +0.5 for adding an alias for now (maybe someone knows an even better name?), but I think that in this case, it probably really is better to wait with actual deprecation warnings for a few versions, since it touches a *lot* of code. Plus I think at the point of starting deprecation warnings (and best earlier) numpy should provide an automatic fixer script... The only counter point that remains for me is the difficulty of deprecation, since I think the new name idea is very clean. And this is unfortunately even more invasive then the index_order proposal. I completely agree that we'd have to be gentle with the change. 
The problem we'd want to avoid is people innocently using 'layout' and finding to their annoyance that the code doesn't work with other people's numpy. How about: Step 1: 'order' remains as named keyword, layout added as alias, comment on the lines of layout will become the default keyword for this option in later versions of numpy; please consider updating any code that does not need to remain backwards compatible'. Step 2: default keyword becomes 'layout' with 'order' as alias, comment like order is an alias for 'layout' to maintain backwards compatibility with numpy = 1.7.1', please update any code that does not need to maintain backwards compatibility with these numpy versions' Step 3: Add deprecation warning for 'order', order will be removed as an alias in future versions of numpy Step 4: (distant future) Remove alias ? A very strong -1 from me. Now we're talking about deprecation warnings and a backwards compatibility break after all. I thought we agreed that this was a very bad idea, so why are you proposing it now? Here's how I see it: deprecation of order is a no go. Therefore we have two choices here: 1. Simply document the current order keyword better and leave it at that. 2. Add a layout (or index_order) keyword, and live with both order and layout keywords forever. (2) is at least as confusing as (1), more work and poor design. Therefore I propose to go with (1). You are saying that deprecation of 'order' at any stage in the next 10 years of numpy's lifetime is a no go? For something like this? Yes. You are saying I think that I am wrong in thinking this is an important change that will make numpy easier to explain and use in the long term. You'd probably expect me to disagree, and I do. I think I am right in thinking the change is important - I've tried to make that case in this thread, as well as I can. I think that is short-sighted and I think it will damage numpy. 
It will damage numpy to be conservative and not change a name for a little bit of clarity for some people that avoids reading the docs maybe a little more carefully? There's a lot of things that can damage numpy, but this isn't even close in my book. Too few developers, continuous backwards compatibility issues, faster alternative libraries surpassing numpy - that's the kind of thing that causes damage. We've talked about consensus on this list. Of course it can be very hard to achieve. Believe me, I have as much investment in backward compatibility as you do. All the three libraries that I spend a long time maintaining need to test against old numpy versions - but - for heaven's sake - only back to numpy 1.2 or numpy 1.3. We don't support Python 2.5 any more, and I don't think we need to maintain
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Fri, Apr 5, 2013 at 4:27 PM, josef.p...@gmail.com wrote: On Fri, Apr 5, 2013 at 6:09 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg sebast...@sipsolutions.net wrote: Hey snip I completely agree that we'd have to be gentle with the change. The problem we'd want to avoid is people innocently using 'layout' and finding to their annoyance that the code doesn't work with other people's numpy. How about: Step 1: 'order' remains as named keyword, layout added as alias, comment on the lines of layout will become the default keyword for this option in later versions of numpy; please consider updating any code that does not need to remain backwards compatible'. Step 2: default keyword becomes 'layout' with 'order' as alias, comment like order is an alias for 'layout' to maintain backwards compatibility with numpy = 1.7.1', please update any code that does not need to maintain backwards compatibility with these numpy versions' Step 3: Add deprecation warning for 'order', order will be removed as an alias in future versions of numpy Step 4: (distant future) Remove alias ? A very strong -1 from me. Now we're talking about deprecation warnings and a backwards compatibility break after all. I thought we agreed that this was a very bad idea, so why are you proposing it now? Here's how I see it: deprecation of order is a no go. Therefore we have two choices here: 1. Simply document the current order keyword better and leave it at that. 2. Add a layout (or index_order) keyword, and live with both order and layout keywords forever. (2) is at least as confusing as (1), more work and poor design. 
Therefore I propose to go with (1). You are saying that deprecation of 'order' at any stage in the next 10 years of numpy's lifetime is a no go? For something like this? Yes. You are saying I think that I am wrong in thinking this is an important change that will make numpy easier to explain and use in the long term. You'd probably expect me to disagree, and I do. I think I am right in thinking the change is important - I've tried to make that case in this thread, as well as I can. I think that is short-sighted and I think it will damage numpy. It will damage numpy to be conservative and not change a name for a little bit of clarity for some people that avoids reading the docs maybe a little more carefully? There's a lot of things that can damage numpy, but this isn't even close in my book. Too few developers, continuous backwards compatibility issues, faster alternative libraries surpassing numpy - that's the kind of thing that causes damage. We're talked about consensus on this list. Of course it can be very hard to achieve. So far the consensus is that the documentation needs improvement. The only thing all of the No camp agree with is documentation improvement, I think that's fair. After that ??? Well I think we have: Flat-no - the change not important, almost any cost is too high You Ralf Bradley Mid-no - maybe something could work, but not sure we've seen it yet. Chris Middle - current situation can be confusing, maybe one of the proposed solutions would be acceptable Sebastian Nathaniel Mid-yes - previous apparent vote for argument name change Éric Depagne Andrew Jaffe (sorry if I misrepresent you) And then me. I am trying to be balanced. Unlike others, I think better names would have a significant impact on how coherent numpy is to explain and use. It seems to me that a change would be beneficial in the long term, and I'm confident we can agree on a schedule for that change that would be acceptable. But you know that. 
So - as I understand our 'model' - our job is to try and come to some shared agreement, if we possibly can. It has been good and encouraging for me at least to see that we have developed our ideas over the course of this thread. Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Fri, Apr 5, 2013 at 9:50 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 4:27 PM, josef.p...@gmail.com wrote: On Fri, Apr 5, 2013 at 6:09 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg sebast...@sipsolutions.net wrote: Hey snip I completely agree that we'd have to be gentle with the change. The problem we'd want to avoid is people innocently using 'layout' and finding to their annoyance that the code doesn't work with other people's numpy. How about: Step 1: 'order' remains as named keyword, layout added as alias, comment on the lines of layout will become the default keyword for this option in later versions of numpy; please consider updating any code that does not need to remain backwards compatible'. Step 2: default keyword becomes 'layout' with 'order' as alias, comment like order is an alias for 'layout' to maintain backwards compatibility with numpy = 1.7.1', please update any code that does not need to maintain backwards compatibility with these numpy versions' Step 3: Add deprecation warning for 'order', order will be removed as an alias in future versions of numpy Step 4: (distant future) Remove alias ? A very strong -1 from me. Now we're talking about deprecation warnings and a backwards compatibility break after all. I thought we agreed that this was a very bad idea, so why are you proposing it now? Here's how I see it: deprecation of order is a no go. Therefore we have two choices here: 1. Simply document the current order keyword better and leave it at that. 2. Add a layout (or index_order) keyword, and live with both order and layout keywords forever. 
(2) is at least as confusing as (1), more work and poor design. Therefore I propose to go with (1). You are saying that deprecation of 'order' at any stage in the next 10 years of numpy's lifetime is a no go? For something like this? Yes. You are saying I think that I am wrong in thinking this is an important change that will make numpy easier to explain and use in the long term. You'd probably expect me to disagree, and I do. I think I am right in thinking the change is important - I've tried to make that case in this thread, as well as I can. I think that is short-sighted and I think it will damage numpy. It will damage numpy to be conservative and not change a name for a little bit of clarity for some people that avoids reading the docs maybe a little more carefully? There's a lot of things that can damage numpy, but this isn't even close in my book. Too few developers, continuous backwards compatibility issues, faster alternative libraries surpassing numpy - that's the kind of thing that causes damage. We're talked about consensus on this list. Of course it can be very hard to achieve. So far the consensus is that the documentation needs improvement. The only thing all of the No camp agree with is documentation improvement, I think that's fair. After that ??? Well I think we have: Flat-no - the change not important, almost any cost is too high It's not *any* cost, this goes deep and wide, it's one of the basic concepts of numpy that you want to rename. Note, I'm just a user of numpy My main objection was to N and Z, which would have affected me (and statsmodels developers) I don't really care about the layout change. I have no or almost no code depending on it. And, I don't have to implement it, nor do I have to struggle with the low level numpy behavior that would be affected by this. (And renaming doesn't change the concept.) Josef You Ralf Bradley Mid-no - maybe something could work, but not sure we've seen it yet. 
Chris Middle - current situation can be confusing, maybe one of the proposed solutions would be acceptable Sebastian Nathaniel Mid-yes - previous apparent vote for argument name change Éric Depagne Andrew Jaffe (sorry if I misrepresent you) And then me. I am trying to be balanced. Unlike others, I think better names would have a significant impact on how coherent numpy is to explain and use. It seems to me that a change would be beneficial in the long term, and I'm confident we can agree on a schedule for that change that would be acceptable. But you know that. So - as I understand our 'model' - our job is to try and come to some shared agreement, if we possibly can. It has been good and encouraging for me at least to see that we have developed our ideas over the course of this thread. Cheers, Matthew
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Fri, Apr 5, 2013 at 7:39 PM, josef.p...@gmail.com wrote: On Fri, Apr 5, 2013 at 9:50 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 4:27 PM, josef.p...@gmail.com wrote: On Fri, Apr 5, 2013 at 6:09 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg sebast...@sipsolutions.net wrote: Hey snip I completely agree that we'd have to be gentle with the change. The problem we'd want to avoid is people innocently using 'layout' and finding to their annoyance that the code doesn't work with other people's numpy. How about: Step 1: 'order' remains as named keyword, layout added as alias, comment on the lines of layout will become the default keyword for this option in later versions of numpy; please consider updating any code that does not need to remain backwards compatible'. Step 2: default keyword becomes 'layout' with 'order' as alias, comment like order is an alias for 'layout' to maintain backwards compatibility with numpy = 1.7.1', please update any code that does not need to maintain backwards compatibility with these numpy versions' Step 3: Add deprecation warning for 'order', order will be removed as an alias in future versions of numpy Step 4: (distant future) Remove alias ? A very strong -1 from me. Now we're talking about deprecation warnings and a backwards compatibility break after all. I thought we agreed that this was a very bad idea, so why are you proposing it now? Here's how I see it: deprecation of order is a no go. Therefore we have two choices here: 1. Simply document the current order keyword better and leave it at that. 2. 
Add a layout (or index_order) keyword, and live with both order and layout keywords forever. (2) is at least as confusing as (1), more work and poor design. Therefore I propose to go with (1). You are saying that deprecation of 'order' at any stage in the next 10 years of numpy's lifetime is a no go? For something like this? Yes. You are saying I think that I am wrong in thinking this is an important change that will make numpy easier to explain and use in the long term. You'd probably expect me to disagree, and I do. I think I am right in thinking the change is important - I've tried to make that case in this thread, as well as I can. I think that is short-sighted and I think it will damage numpy. It will damage numpy to be conservative and not change a name for a little bit of clarity for some people that avoids reading the docs maybe a little more carefully? There's a lot of things that can damage numpy, but this isn't even close in my book. Too few developers, continuous backwards compatibility issues, faster alternative libraries surpassing numpy - that's the kind of thing that causes damage. We're talked about consensus on this list. Of course it can be very hard to achieve. So far the consensus is that the documentation needs improvement. The only thing all of the No camp agree with is documentation improvement, I think that's fair. After that ??? Well I think we have: Flat-no - the change not important, almost any cost is too high It's not *any* cost, this goes deep and wide, it's one of the basic concepts of numpy that you want to rename. The proposal I last made was to change the default name to 'layout' after some period to be agreed - say - P - with suitable warning in the docstring up until that time, and after, and leave 'order' as an alias forever. 
The only problem I can see with this, is that if someone, after period P, does not read the docstring, and uses 'layout' instead of 'order', then they will find that their code is not backwards compatible with versions of numpy of greater age than P. They can fix this, forever, by reverting to 'order'. That's certainly not zero cost, but it's not much cost either, and the cost will depend on P. Note, I'm just a user of numpy My main objection was to N and Z, which would have affected me (and statsmodels developers) Right. I don't really care about the layout change. I have no or almost no code depending on it. And, I don't have to implement it, nor do I have to struggle with the low level numpy behavior that would be affected by this. (And renaming doesn't change the concept.) No, right, the renaming is to clarify and distinguish the concepts. Cheers, Matthew
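For readers following the thread, the "alias forever" idea discussed in this message could look roughly like the sketch below. This is only an illustration: `copy_compat` is a hypothetical wrapper written for this example, and `layout` is the keyword proposed in the thread, not something numpy accepts. The only real numpy call used is `np.copy(a, order=...)`.

```python
import numpy as np

# Sentinel so we can tell "not passed" apart from any real value.
_UNSET = object()


def copy_compat(a, order=_UNSET, layout=_UNSET):
    """Hypothetical wrapper: accept 'layout' as an alias for 'order'.

    Neither name is deprecated here; passing both is treated as an error.
    """
    if order is not _UNSET and layout is not _UNSET:
        raise TypeError("pass either 'order' or 'layout', not both")
    chosen = layout if layout is not _UNSET else order
    if chosen is _UNSET:
        chosen = 'K'  # same default as np.copy
    return np.copy(a, order=chosen)


a = np.arange(6).reshape(2, 3)
new_style = copy_compat(a, layout='F')  # proposed spelling
old_style = copy_compat(a, order='F')   # existing spelling keeps working
print(new_style.flags.f_contiguous, old_style.flags.f_contiguous)
```

Code written with `order` runs against both old and new versions under such a scheme, while code written with `layout` only runs where the alias exists, which is the backwards-compatibility cost mentioned above.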
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Fri, Apr 5, 2013 at 10:47 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 7:39 PM, josef.p...@gmail.com wrote: On Fri, Apr 5, 2013 at 9:50 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 4:27 PM, josef.p...@gmail.com wrote: On Fri, Apr 5, 2013 at 6:09 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg sebast...@sipsolutions.net wrote: Hey snip I completely agree that we'd have to be gentle with the change. The problem we'd want to avoid is people innocently using 'layout' and finding to their annoyance that the code doesn't work with other people's numpy. How about: Step 1: 'order' remains as named keyword, layout added as alias, comment on the lines of layout will become the default keyword for this option in later versions of numpy; please consider updating any code that does not need to remain backwards compatible'. Step 2: default keyword becomes 'layout' with 'order' as alias, comment like order is an alias for 'layout' to maintain backwards compatibility with numpy = 1.7.1', please update any code that does not need to maintain backwards compatibility with these numpy versions' Step 3: Add deprecation warning for 'order', order will be removed as an alias in future versions of numpy Step 4: (distant future) Remove alias ? A very strong -1 from me. Now we're talking about deprecation warnings and a backwards compatibility break after all. I thought we agreed that this was a very bad idea, so why are you proposing it now? Here's how I see it: deprecation of order is a no go. Therefore we have two choices here: 1. 
Simply document the current order keyword better and leave it at that. 2. Add a layout (or index_order) keyword, and live with both order and layout keywords forever. (2) is at least as confusing as (1), more work and poor design. Therefore I propose to go with (1). You are saying that deprecation of 'order' at any stage in the next 10 years of numpy's lifetime is a no go? For something like this? Yes. You are saying I think that I am wrong in thinking this is an important change that will make numpy easier to explain and use in the long term. You'd probably expect me to disagree, and I do. I think I am right in thinking the change is important - I've tried to make that case in this thread, as well as I can. I think that is short-sighted and I think it will damage numpy. It will damage numpy to be conservative and not change a name for a little bit of clarity for some people that avoids reading the docs maybe a little more carefully? There's a lot of things that can damage numpy, but this isn't even close in my book. Too few developers, continuous backwards compatibility issues, faster alternative libraries surpassing numpy - that's the kind of thing that causes damage. We're talked about consensus on this list. Of course it can be very hard to achieve. So far the consensus is that the documentation needs improvement. The only thing all of the No camp agree with is documentation improvement, I think that's fair. After that ??? Well I think we have: Flat-no - the change not important, almost any cost is too high It's not *any* cost, this goes deep and wide, it's one of the basic concepts of numpy that you want to rename. The proposal I last made was to change the default name to 'layout' after some period to be agreed - say - P - with suitable warning in the docstring up until that time, and after, and leave 'order' as an alias forever. 
The only problem I can see with this, is that if someone, after period P, does not read the docstring, and uses 'layout' instead of 'order', then they will find that their code is not backwards compatible with versions of numpy of greater age than P. They can fix this, forever, by reverting to 'order'. That's certainly not zero cost, but it's not much cost either, and the cost will depend on P. You edit large parts of the numpy tutorial and explanations, you add a second keyword to (rough guess) 10 functions and a similar number of methods (even wilder guess), the methods are in C, so you have to change it both on the c and the python level. Two keywords will confuse users for a long time (and which one is in the tutorial documentation) I'm just guessing and I have no idea about the c-level. Josef Note, I'm just a user of numpy My main objection was to N and Z, which would have affected me (and statsmodels developers) Right.
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Fri, Apr 5, 2013 at 8:31 PM, josef.p...@gmail.com wrote: On Fri, Apr 5, 2013 at 10:47 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 7:39 PM, josef.p...@gmail.com wrote: On Fri, Apr 5, 2013 at 9:50 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 4:27 PM, josef.p...@gmail.com wrote: On Fri, Apr 5, 2013 at 6:09 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg sebast...@sipsolutions.net wrote: Hey snip I completely agree that we'd have to be gentle with the change. The problem we'd want to avoid is people innocently using 'layout' and finding to their annoyance that the code doesn't work with other people's numpy. How about: Step 1: 'order' remains as named keyword, layout added as alias, comment on the lines of layout will become the default keyword for this option in later versions of numpy; please consider updating any code that does not need to remain backwards compatible'. Step 2: default keyword becomes 'layout' with 'order' as alias, comment like order is an alias for 'layout' to maintain backwards compatibility with numpy = 1.7.1', please update any code that does not need to maintain backwards compatibility with these numpy versions' Step 3: Add deprecation warning for 'order', order will be removed as an alias in future versions of numpy Step 4: (distant future) Remove alias ? A very strong -1 from me. Now we're talking about deprecation warnings and a backwards compatibility break after all. I thought we agreed that this was a very bad idea, so why are you proposing it now? Here's how I see it: deprecation of order is a no go. 
Therefore we have two choices here: 1. Simply document the current order keyword better and leave it at that. 2. Add a layout (or index_order) keyword, and live with both order and layout keywords forever. (2) is at least as confusing as (1), more work and poor design. Therefore I propose to go with (1). You are saying that deprecation of 'order' at any stage in the next 10 years of numpy's lifetime is a no go? For something like this? Yes. You are saying I think that I am wrong in thinking this is an important change that will make numpy easier to explain and use in the long term. You'd probably expect me to disagree, and I do. I think I am right in thinking the change is important - I've tried to make that case in this thread, as well as I can. I think that is short-sighted and I think it will damage numpy. It will damage numpy to be conservative and not change a name for a little bit of clarity for some people that avoids reading the docs maybe a little more carefully? There's a lot of things that can damage numpy, but this isn't even close in my book. Too few developers, continuous backwards compatibility issues, faster alternative libraries surpassing numpy - that's the kind of thing that causes damage. We're talked about consensus on this list. Of course it can be very hard to achieve. So far the consensus is that the documentation needs improvement. The only thing all of the No camp agree with is documentation improvement, I think that's fair. After that ??? Well I think we have: Flat-no - the change not important, almost any cost is too high It's not *any* cost, this goes deep and wide, it's one of the basic concepts of numpy that you want to rename. The proposal I last made was to change the default name to 'layout' after some period to be agreed - say - P - with suitable warning in the docstring up until that time, and after, and leave 'order' as an alias forever. 
The only problem I can see with this, is that if someone, after period P, does not read the docstring, and uses 'layout' instead of 'order', then they will find that their code is not backwards compatible with versions of numpy of greater age than P. They can fix this, forever, by reverting to 'order'. That's certainly not zero cost, but it's not much cost either, and the cost will depend on P. You edit large parts of the numpy tutorial and explanations, We agree that these concepts need to be clarified in the explanations. For the docs, we would first add the keyword as an alias and note it so. you add a second keyword to (rough guess) 10 functions and a similar number of methods (even wilder guess), the methods are in C, so you have to change it both on the c and the python level. I'm OK to do the code changes, I don't think that's a concern at the
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Wed, Apr 3, 2013 at 6:13 PM, Matthew Brett matthew.br...@gmail.com wrote:

We all agree that 'order' is used with two different and orthogonal meanings in numpy.

well, not entirely orthogonal -- they are the same concept, used in different contexts, so there is some benefit to their having similarity. So I'd advocate for using the same flag names in any case -- i.e. C and F in both cases.

I think we are now more or less agreeing that:

np.reshape(a, (3, 4), index_order='F')

is at least as clear as:

np.reshape(a, (3, 4), order='F')

sure. The trick is:

np.reshape(a, (3, 4), index_order='A')

which is mingling index_order and memory order.

I believe our job here is to come to some consensus.

yup.

In that spirit, I think we do agree on these statements above.

with the caveats I just added...

Now we have the cost / benefit.

Benefit : Some people may find it easier to understand numpy when these constructs are separated.

Cost : There might be some confusion because we have changed the default keywords.

Benefit --- What proportion of people would find it easier to understand with the order constructs separated?

It's not just numbers -- it's depth of confusion -- if, once you get it, you remember it for the rest of your numpy use, then it's no big deal. However, if you need to re-think and test every time you re-visit reshape or ravel, then there's a significant benefit.

We are talking about separating the concepts, but I think it takes more than a keyword change to do that -- the 'A' and 'K' flags mingle the concepts, and are going to be confusing with new keywords -- maybe even more so (it says index_order, but the docstring talks about memory order).

Does anyone think we should deprecate the 'A' and 'K' flags?

Before you answer that -- does anyone see a use case for the 'A' and 'K' flags that can't be reasonably easily accomplished with .view() or asarray() or ???

If we get rid of the 'A' and 'K' flags, I think the docstring will be clearer, and there may be less need for two names for the different order concepts (though we could change the flags and the keywords...)

The ravel docstring would look something like this:

index_order : {'C','F', 'A', 'K'}, optional ...
This keyword used to be called simply 'order', and you can also use the keyword 'order' to specify index_order (this parameter).

The problem would then be that, for a while, there will be older code and docs using 'order' instead of 'index_order'. I think this would not cause much trouble. Reading the docstring will explain the change. The old code will continue to work.

not a killer, I agree.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/ORR (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
chris.bar...@noaa.gov
___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi,

On Thu, Apr 4, 2013 at 11:45 AM, Matthew Brett matthew.br...@gmail.com wrote:

Hi,

On Thu, Apr 4, 2013 at 9:21 AM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote:

On Wed, Apr 3, 2013 at 6:13 PM, Matthew Brett matthew.br...@gmail.com wrote:

We all agree that 'order' is used with two different and orthogonal meanings in numpy.

Brief thank you for your helpful and thoughtful discussion.

well, not entirely orthogonal -- they are the same concept, used in different contexts,

Here's a further clarification, in the hope that it is helpful:

Input and output index orderings are orthogonal - I can read the data with C index ordering and return an array that is index ordered any-old-how. F and C are used in the sense of F contiguous and C contiguous - where contiguous is not the same concept as index ordering. So I think it's hard to say these concepts are not orthogonal, simply in the technical sense that order='F' could mean:

* read my data using F-style index ordering
* return my data in an array using F-style index ordering
* (related to above) return my data in F-contiguous memory layout

Sorry this is not well-put and should increase confusion rather than decrease it. I'll try again if I may.

What do we mean by 'Fortran' 'order'? Two things :

* np.array(a, order='F') - Fortran contiguous : the array memory is contiguous, the strides vector is strictly increasing
* np.ravel(a, order='F') - first-to-last index ordering used to recover values from the array

They are related in the sense that Fortran contiguous layout in memory means that returning the elements as stored in memory gives the same answer as first-to-last index ordering. They are different in the sense that first-to-last index ordering applies to any memory layout - is orthogonal to memory layout. In particular 'contiguous' has no meaning for first-to-last or last-to-first index ordering.

So - to restate in other words - this :

np.reshape(a, (3, 4), order='F')

could reasonably mean one of two orthogonal things

1) Retrieve data from the array using first-to-last indexing, return any memory layout you like
2) Retrieve data from the array using the default last-to-first index ordering, and return memory in F-contiguous layout

Cheers, Matthew
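The two readings of 'F' described above can be checked directly. What follows is a minimal sketch that is not part of the original message; the array `a` and the variable names are invented for illustration. It suggests that `np.array(..., order='F')` changes only the memory layout (every index still returns the same value), while `np.ravel(..., order='F')` changes only the order in which values are read, whatever the layout happens to be.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)       # C-contiguous by default

# Meaning 1: memory layout. Same values at the same indices, different strides.
f = np.array(a, order='F')
print(f.flags['F_CONTIGUOUS'])        # True
print(np.array_equal(a, f))           # True

# Meaning 2: index ordering. First index varies fastest when reading values.
r_c = np.ravel(a, order='C')
r_f = np.ravel(a, order='F')
print(r_f)                            # [ 0  4  8  1  5  9  2  6 10  3  7 11]

# The index ordering does not depend on the memory layout of the input.
print(np.array_equal(np.ravel(f, order='F'), r_f))   # True
print(np.array_equal(np.ravel(f, order='C'), r_c))   # True
```

The last two lines are the point of the exercise: `f` and `a` have different layouts, yet raveling either one with the same index order gives the same result.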
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Thu, Apr 4, 2013 at 3:40 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Thu, Apr 4, 2013 at 11:45 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Thu, Apr 4, 2013 at 9:21 AM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: On Wed, Apr 3, 2013 at 6:13 PM, Matthew Brett matthew.br...@gmail.com wrote: We all agree that 'order' is used with two different and orthogonal meanings in numpy. Brief thank you for your helpful and thoughtful discussion. well, not entirely orthogonal -- they are the some concept, used in different contexts, Here's a further clarification, in the hope that it is helpful: Input and output index orderings are orthogonal - I can read the data with C index ordering and return an array that is index ordered any-old-how. F and C are used in the sense of F contiguous and C contiguous - where contiguous is not the same concept as index ordering. So I think it's hard to say these concepts are not orthogonal, simply in the technical sense that order='F could mean: * read my data using F-style index ordering * return my data in an array using F-style index ordering * (related to above) return my data in F-contiguous memory layout Sorry this is not well-put and should increase confusion rather than decrease it. I'll try again if I may. What do we mean by 'Fortran' 'order'. Two things : * np.array(a, order='F') - Fortran contiguous : the array memory is contiguous, the strides vector is strictly increasing * np.ravel(a, order='F') - first-to-last index ordering used to recover values from the array They are related in the sense that Fortran contiguous layout in memory means that returning the elements as stored in memory gives the same answer as first to last index ordering. They are different in the sense that first-to-last index ordering applies to any memory layout - is orthogonal to memory layout. In particular 'contiguous' has no meaning for first-to-last or last-to-first index ordering. 
So - to restate in other words - this : np.reshape(a, (3, 4), order='F') could reasonably mean one of two orthogonal things 1) Retrieve data from the array using first-to-last indexing, return any memory layout you like 2) Retrieve data from the array using the default last-to-first index ordering, and return memory in F-contiguous layout no to interpretation 2) reshape and ravel (in contrast to flatten) just return a view (if possible) (with possible some strange strides) docstring: numpy.reshape(a, newshape, order='C') Gives a new shape to an array without changing its data functions that return views versus functions that create new arrays Josef Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
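Josef's point about views versus new arrays can be illustrated with a short sketch. It is not from the thread, and the names are made up; it only assumes `np.shares_memory`, which is available in current NumPy releases. The idea is that `ravel` hands back a view whenever the requested index order matches the existing layout, while `flatten` always copies.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)      # C-contiguous
t = a.T                              # F-contiguous view of the same memory

r = a.ravel()                        # C index order on C layout
fl = a.flatten()                     # flatten always makes a copy
rt_c = t.ravel(order='C')            # C index order on F layout
rt_f = t.ravel(order='F')            # F index order on F layout

print(np.shares_memory(a, r))        # True: a view
print(np.shares_memory(a, fl))       # False: a copy
print(np.shares_memory(t, rt_c))     # False: a copy was needed
print(np.shares_memory(t, rt_f))     # True: a view
```

In none of these calls does `order` ask for a particular memory layout of the result; it only says how the existing values are read.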
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Thu, Apr 4, 2013 at 12:54 PM, josef.p...@gmail.com wrote: On Thu, Apr 4, 2013 at 3:40 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Thu, Apr 4, 2013 at 11:45 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Thu, Apr 4, 2013 at 9:21 AM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: On Wed, Apr 3, 2013 at 6:13 PM, Matthew Brett matthew.br...@gmail.com wrote: We all agree that 'order' is used with two different and orthogonal meanings in numpy. Brief thank you for your helpful and thoughtful discussion. well, not entirely orthogonal -- they are the some concept, used in different contexts, Here's a further clarification, in the hope that it is helpful: Input and output index orderings are orthogonal - I can read the data with C index ordering and return an array that is index ordered any-old-how. F and C are used in the sense of F contiguous and C contiguous - where contiguous is not the same concept as index ordering. So I think it's hard to say these concepts are not orthogonal, simply in the technical sense that order='F could mean: * read my data using F-style index ordering * return my data in an array using F-style index ordering * (related to above) return my data in F-contiguous memory layout Sorry this is not well-put and should increase confusion rather than decrease it. I'll try again if I may. What do we mean by 'Fortran' 'order'. Two things : * np.array(a, order='F') - Fortran contiguous : the array memory is contiguous, the strides vector is strictly increasing * np.ravel(a, order='F') - first-to-last index ordering used to recover values from the array They are related in the sense that Fortran contiguous layout in memory means that returning the elements as stored in memory gives the same answer as first to last index ordering. They are different in the sense that first-to-last index ordering applies to any memory layout - is orthogonal to memory layout. 
In particular 'contiguous' has no meaning for first-to-last or last-to-first index ordering. So - to restate in other words - this : np.reshape(a, (3, 4), order='F') could reasonably mean one of two orthogonal things 1) Retrieve data from the array using first-to-last indexing, return any memory layout you like 2) Retrieve data from the array using the default last-to-first index ordering, and return memory in F-contiguous layout no to interpretation 2) reshape and ravel (in contrast to flatten) just return a view (if possible) (with possible some strange strides) 'No' meaning what? That it is not possible that it could mean that? Obviously we're not arguing about whether it does mean that, we're arguing about whether such an interpretation would make sense. Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Thu, Apr 4, 2013 at 4:02 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Thu, Apr 4, 2013 at 12:54 PM, josef.p...@gmail.com wrote: On Thu, Apr 4, 2013 at 3:40 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Thu, Apr 4, 2013 at 11:45 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Thu, Apr 4, 2013 at 9:21 AM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: On Wed, Apr 3, 2013 at 6:13 PM, Matthew Brett matthew.br...@gmail.com wrote: We all agree that 'order' is used with two different and orthogonal meanings in numpy. Brief thank you for your helpful and thoughtful discussion. well, not entirely orthogonal -- they are the some concept, used in different contexts, Here's a further clarification, in the hope that it is helpful: Input and output index orderings are orthogonal - I can read the data with C index ordering and return an array that is index ordered any-old-how. F and C are used in the sense of F contiguous and C contiguous - where contiguous is not the same concept as index ordering. So I think it's hard to say these concepts are not orthogonal, simply in the technical sense that order='F could mean: * read my data using F-style index ordering * return my data in an array using F-style index ordering * (related to above) return my data in F-contiguous memory layout Sorry this is not well-put and should increase confusion rather than decrease it. I'll try again if I may. What do we mean by 'Fortran' 'order'. Two things : * np.array(a, order='F') - Fortran contiguous : the array memory is contiguous, the strides vector is strictly increasing * np.ravel(a, order='F') - first-to-last index ordering used to recover values from the array They are related in the sense that Fortran contiguous layout in memory means that returning the elements as stored in memory gives the same answer as first to last index ordering. 
They are different in the sense that first-to-last index ordering applies to any memory layout - is orthogonal to memory layout. In particular 'contiguous' has no meaning for first-to-last or last-to-first index ordering. So - to restate in other words - this : np.reshape(a, (3, 4), order='F') could reasonably mean one of two orthogonal things 1) Retrieve data from the array using first-to-last indexing, return any memory layout you like 2) Retrieve data from the array using the default last-to-first index ordering, and return memory in F-contiguous layout no to interpretation 2) reshape and ravel (in contrast to flatten) just return a view (if possible) (with possible some strange strides) 'No' meaning what? That it is not possible that it could mean that? Obviously we're not arguing about whether it does mean that, we're arguing about whether such an interpretation would make sense. 'No' means: I don't think it makes sense given the current behavior of numpy with respect to functions that are designed to return views (and copy memory only if there is no way to make a view) One objective of functions that create views is *not* to change the underlying memory. So in most cases, requesting a specific contiguity (memory order) for a new array, when you actually want a view with strides, doesn't sound like an obvious explanation for order. --- slightly more difficult: order = I don't care (aka. order=K) means: I want a view in whichever order of the values, but please try harder not to copy any memory This also doesn't refer to the memory of a *new* array, if it is really necessary to copy. Josef Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Thu, Apr 4, 2013 at 1:33 PM, josef.p...@gmail.com wrote: On Thu, Apr 4, 2013 at 4:02 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Thu, Apr 4, 2013 at 12:54 PM, josef.p...@gmail.com wrote: On Thu, Apr 4, 2013 at 3:40 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Thu, Apr 4, 2013 at 11:45 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Thu, Apr 4, 2013 at 9:21 AM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: On Wed, Apr 3, 2013 at 6:13 PM, Matthew Brett matthew.br...@gmail.com wrote: We all agree that 'order' is used with two different and orthogonal meanings in numpy. Brief thank you for your helpful and thoughtful discussion. well, not entirely orthogonal -- they are the some concept, used in different contexts, Here's a further clarification, in the hope that it is helpful: Input and output index orderings are orthogonal - I can read the data with C index ordering and return an array that is index ordered any-old-how. F and C are used in the sense of F contiguous and C contiguous - where contiguous is not the same concept as index ordering. So I think it's hard to say these concepts are not orthogonal, simply in the technical sense that order='F could mean: * read my data using F-style index ordering * return my data in an array using F-style index ordering * (related to above) return my data in F-contiguous memory layout Sorry this is not well-put and should increase confusion rather than decrease it. I'll try again if I may. What do we mean by 'Fortran' 'order'. Two things : * np.array(a, order='F') - Fortran contiguous : the array memory is contiguous, the strides vector is strictly increasing * np.ravel(a, order='F') - first-to-last index ordering used to recover values from the array They are related in the sense that Fortran contiguous layout in memory means that returning the elements as stored in memory gives the same answer as first to last index ordering. 
They are different in the sense that first-to-last index ordering applies to any memory layout - is orthogonal to memory layout. In particular 'contiguous' has no meaning for first-to-last or last-to-first index ordering. So - to restate in other words - this : np.reshape(a, (3, 4), order='F') could reasonably mean one of two orthogonal things 1) Retrieve data from the array using first-to-last indexing, return any memory layout you like 2) Retrieve data from the array using the default last-to-first index ordering, and return memory in F-contiguous layout no to interpretation 2) reshape and ravel (in contrast to flatten) just return a view (if possible) (with possible some strange strides) 'No' meaning what? That it is not possible that it could mean that? Obviously we're not arguing about whether it does mean that, we're arguing about whether such an interpretation would make sense. 'No' means: I don't think it makes sense given the current behavior of numpy with respect to functions that are designed to return views (and copy memory only if there is no way to make a view) OK - so no-one is suggesting that it is a good option, only that the concept makes sense. As I was saying before - for most of us it is still possible to get confused between two different meanings of the same word even if one of the meanings would (for complicated reasons) be less likely than the other. Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Thu, 2013-04-04 at 12:40 -0700, Matthew Brett wrote:

Hi,

<snip>

So - to restate in other words - this :

np.reshape(a, (3, 4), order='F')

could reasonably mean one of two orthogonal things

1) Retrieve data from the array using first-to-last indexing, return any memory layout you like
2) Retrieve data from the array using the default last-to-first index ordering, and return memory in F-contiguous layout

Yes, it could mean both. I am simply not sure if it helps enough to warrant the trouble. So if it still interests someone, I feel the docs are more important, but I am neutral to changing this. I don't quite see a big gain, so I am just worried that it bugs a lot of people, either because of the change itself or because of having to remember the different name (you can argue that is good, but if it bugs most people maybe it does not help either).

As to being confused: did anyone ever see an np.reshape(arr, ..., order='F') followed by code that assumes the result is F-contiguous (when the original arr is not known to be contiguous)? If that actually created a real bug somewhere, that might convince me that it is worth walking through the trouble and complaints. I guess I just don't believe it really happens in the real world.

- Sebastian

Cheers, Matthew
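For anyone who wants to see the situation Sebastian describes, here is a small sketch (not from the thread; `arr`, `res` and `safe` are illustrative names). It suggests that reshaping with order='F' says nothing about the layout of the result: when a strided view is possible, the result need not be F-contiguous, and code that requires F-contiguous memory has to ask for it separately.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)             # C-contiguous input
res = np.reshape(arr, (3, 2, 2), order='F')   # values read in F index order

print(np.shares_memory(arr, res))    # True: a strided view, no copy
print(res.flags['F_CONTIGUOUS'])     # False
print(res[1, 1, 1] == arr[1, 3])     # True: column index 3 = 1 + 2*1

# Asking for the layout is a separate step.
safe = np.asfortranarray(res)
print(safe.flags['F_CONTIGUOUS'])    # True
print(np.array_equal(safe, res))     # True
```

Whether anyone has been bitten by this in practice is exactly the open question in the message above; the sketch only shows that the assumption would be unsafe.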
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Thu, Apr 4, 2013 at 1:53 PM, Sebastian Berg sebast...@sipsolutions.net wrote: On Thu, 2013-04-04 at 12:40 -0700, Matthew Brett wrote: Hi, snip So - to restate in other words - this : np.reshape(a, (3, 4), order='F') could reasonably mean one of two orthogonal things 1) Retrieve data from the array using first-to-last indexing, return any memory layout you like 2) Retrieve data from the array using the default last-to-first index ordering, and return memory in F-contiguous layout Yes, it could mean both. I am simply not sure if it helps enough to warrant the trouble. So if it still interests someone, I feel the docs are more important, but I am neutral to changing this. I don't think the docs enter the discussion, because we all agree that changing the docs is a good idea. I don't quite see a big gain, so I am just worried that it bugs a lot of people either because of changing or because of having to remember the different name (you can argue that is good, but if it bugs most maybe it does not help either). As to being confused. Did anyone ever see a np.reshape(arr, ..., order='F') and then continuing assuming the result is F-contiguous (when the original arr is not known to be contiguous)? If that actually create a real bug somewhere, that might actually convince me that it is worth it to walk through trouble and complaints. I guess I just don't believe it really happens in the real world. There are two aspects here; 1) Making numpy easier to understand and teach. 2) Avoiding bugs I'm thinking primarily of the first. I would hate to teach the thing in the current state. As I've said many times before, I found it very confusing, others have said so too. The more confusing it is, the more likely people will make mistakes. Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi,

On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith n...@pobox.com wrote:

<snip>

Maybe we should go through and rename order to something more descriptive in each case, so we'd have

a.reshape(..., index_order='C')
a.copy(memory_order='F')

etc.?

I'd like to propose this instead:

a.reshape(..., order='C')
a.copy(layout='F')

This fits well with the terms we've been using during the discussion. It reduces the changes to only one of the two meanings. Thinking about it, I feel that this would have been considerably clearer to me as I learned numpy.

Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
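To make the proposal concrete, here is a rough sketch of what accepting `layout` as an alias might look like. Everything in it is hypothetical: NumPy has no `layout` keyword, and `copy_with_layout` is an invented wrapper, not an existing or planned function. It only illustrates the idea of one keyword for memory layout, with the old name still accepted.

```python
import numpy as np

def copy_with_layout(a, layout=None, order=None):
    """Hypothetical wrapper: 'layout' is an invented alias for 'order'."""
    if layout is not None and order is not None:
        raise TypeError("pass only one of 'layout' and 'order'")
    # Prefer the new name, fall back to the old one, default to C layout.
    chosen = layout if layout is not None else order
    if chosen is None:
        chosen = 'C'
    return np.array(a, order=chosen, copy=True)

a = np.arange(6).reshape(2, 3)
new_style = copy_with_layout(a, layout='F')   # proposed spelling
old_style = copy_with_layout(a, order='F')    # existing spelling still works
print(new_style.flags['F_CONTIGUOUS'], old_style.flags['F_CONTIGUOUS'])
```

Old code keeps working under this scheme, which is the property the alias discussion in this thread turns on.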
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Thu, Apr 4, 2013 at 11:26 AM, josef.p...@gmail.com wrote: Before you answer that -- does anyone see a use case for the 'A' and 'K' flags that can't be reasonably easily accomplished with .view() or asarray() or ??? What order does a[a2] use to create the returned 1-D array? ... However, I never needed to know and never cared a[a2] = 5 a[a2] = b[a2] Now, after this thread, I know about K, does that use case use ravel() or reshape() under the hood? and there might be cases where it would be appropriate to minimize copying memory, hmm -- yes, that makes sense, and perhaps compelling enough to keep them around (at least with perhaps better docs). -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR(206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Thu, Apr 4, 2013 at 5:54 PM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote:

On Thu, Apr 4, 2013 at 11:26 AM, josef.p...@gmail.com wrote:

Before you answer that -- does anyone see a use case for the 'A' and 'K' flags that can't be reasonably easily accomplished with .view() or asarray() or ???

What order does a[a2] use to create the returned 1-D array? ... However, I never needed to know and never cared

a[a2] = 5
a[a2] = b[a2]

Now, after this thread, I know about K, does that use case use ravel() or reshape() under the hood?

only ravel has 'K' as far as I saw in the current documentation.

An example for ravel('K') would be if axis=None in functions and we only have elementwise or reduce operations. All the code I've seen uses just ravel() in this case; ravel('K') would instead have a better chance to avoid array copying:

if axis is None:
    x = x.ravel('K')
    return ((x - x.mean(0))**2).sum(0)

but it's dangerous because, if there is a second array, it might not ravel('K') the same way

x.ravel('K') - y.ravel('K')

sounds fun

similar if x[mask] wouldn't select a fixed order, then

a[a2] = b[a2]

would also be fun

fun := find the bug that I have hidden in this code

The only reason to use reshape with 'A', I can think of, is if the array (matrix) is symmetric, or if it's a square picture and we never care whether it's upright or sideways. reshape(.., order='A') and ravel('A') should roundtrip, I guess.

Josef

and there might be cases where it would be appropriate to minimize copying memory,

hmm -- yes, that makes sense, and perhaps compelling enough to keep them around (at least with perhaps better docs).

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/ORR (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
chris.bar...@noaa.gov
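The hazard Josef describes with two arrays, and the fixed order of mask indexing, can be reproduced in a few lines. This is a sketch with made-up arrays `x` and `y`, not code from the thread: both hold the same values, but one is stored in C layout and the other in F layout.

```python
import numpy as np

x = np.arange(6.).reshape(2, 3)      # C layout
y = np.asfortranarray(x)             # same values, F layout
mask = x > 1

# Boolean indexing reads in C index order whatever the layout is,
# so a[a2] = b[a2] style assignments line up.
print(np.array_equal(x[mask], y[mask]))      # True

# ravel('K') follows memory order, so the two results do not line up.
xk, yk = x.ravel('K'), y.ravel('K')
print(xk)                                    # [0. 1. 2. 3. 4. 5.]
print(yk)                                    # [0. 3. 1. 4. 2. 5.]
print(np.any(xk - yk != 0))                  # True: the hidden bug

# A fixed index order is safe for elementwise work on two arrays.
print(np.all(x.ravel('C') - y.ravel('C') == 0))   # True
```

So 'K' looks safe only when a single array goes through an elementwise or order-independent reduction, which matches the axis=None use case above.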
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Catching up with numpy 1.6

'No' means: I don't think it makes sense given the current behavior of numpy with respect to functions that are designed to return views (and copy memory only if there is no way to make a view)

One objective of functions that create views is *not* to change the underlying memory. So in most cases, requesting a specific contiguity (memory order) for a new array, when you actually want a view with strides, doesn't sound like an obvious explanation for order.

why I'm baffled:

To me views are just a specific way of looking at an existing array, or parts of it, similar to an iterator but with an n-dimensional shape. ravel is just like calling list(iterator), the iterator determines how we read the existing array. So, asking about the output memory order made no sense to me. What's the output of an iterator?

I (and statsmodels) are still on numpy 1.5 but not for much longer. So I'm trying to read up.

http://docs.scipy.org/doc/numpy/reference/arrays.nditer.html#single-array-iteration explains the case for 'K' : for elementwise operations just run the fastest way through the array

The old flat and flatiter were always c-order.

>>> a = np.arange(4*5).reshape(4,5)
>>> b = np.array(a, order='F')
>>> np.fromiter(np.nditer(b, order='K'), int)
array([ 0, 5, 10, 15, 1, 6, 11, 16, 2, 7, 12, 17, 3, 8, 13, 18, 4, 9, 14, 19])
>>> np.fromiter(np.nditer(a, order='K'), int)
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19])

Is ravel('K') good for anything ?
>>> def f(x):
...     '''A function that only works in 1d'''
...     if x.ndim > 1: raise ValueError
...     return np.round(np.piecewise(x, [x < 0, x >= 0], [lambda x: np.sqrt(-x), lambda x: np.sqrt(x)]))

>>> b = np.array(np.arange(4*5.).reshape(4,5), order='F')
>>> b
array([[  0.,   1.,   2.,   3.,   4.],
       [  5.,   6.,   7.,   8.,   9.],
       [ 10.,  11.,  12.,  13.,  14.],
       [ 15.,  16.,  17.,  18.,  19.]])
>>> f(b[:,:2])
Traceback (most recent call last):
  File "<pyshell#184>", line 1, in <module>
    f(b[:,:2])
  File "<pyshell#183>", line 2, in f
    if x.ndim > 1: raise ValueError
ValueError

ravel and reshape with 'K' doesn't roundtrip

>>> (b.ravel('K')).reshape(b.shape, order='K')
array([[  0.,   5.,  10.,  15.,   1.],
       [  6.,  11.,  16.,   2.,   7.],
       [ 12.,  17.,   3.,   8.,  13.],
       [ 18.,   4.,   9.,  14.,  19.]])

but we can do inplace transformations with it

>>> e = b[:,:2].ravel()
>>> e.flags.owndata
True
>>> e = b[:,:2].ravel('K')
>>> e.flags.owndata
False
>>> e[:] = f(e)
>>> b
array([[  0.,   1.,   2.,   3.,   4.],
       [  2.,   2.,   7.,   8.,   9.],
       [  3.,   3.,  12.,  13.,  14.],
       [  4.,   4.,  17.,  18.,  19.]])
>>> e[:] = f(e)
>>> b
array([[  0.,   1.,   2.,   3.,   4.],
       [  1.,   1.,   7.,   8.,   9.],
       [  2.,   2.,  12.,  13.,  14.],
       [  2.,   2.,  17.,  18.,  19.]])

(A few hours of experimenting is more than I wanted to know, 99.5% of my cases are order='C' or order='F')

nditer also has an interesting section on Iterator-Allocated Output Arrays

Josef
I found the scissors
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Tue, Apr 2, 2013 at 9:09 PM, Matthew Brett matthew.br...@gmail.com wrote:

Hi,

On Tue, Apr 2, 2013 at 7:09 PM, josef.p...@gmail.com wrote:

On Tue, Apr 2, 2013 at 5:52 PM, Nathaniel Smith n...@pobox.com wrote:

On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett matthew.br...@gmail.com wrote:

This is like observing that if I say "go North" then it's ambiguous about whether I want you to drive or walk, and concluding that we need new words for the directions depending on what sort of vehicle you use. So "go North" means "drive North", "go htuoS" means "walk North", etc. Totally silly. Makes much more sense to have one set of words for directions, and then make clear from context what the directions are used for -- "drive North", "walk North". Or "iterate C-wards", "store F-wards".

C and Z mean exactly the same thing -- they describe a way of unraveling a cube into a straight line. The difference is what we do with the resulting straight line. That's why I'm suggesting that the distinction should be made in the name of the argument.

Could you unpack that for the 'ravel' docstring? Because these options all refer to the way of unraveling and not the memory layout that results.

Z/C/row-major/whatever-you-want-to-call-it is a general strategy for converting between a 1-dim representation and an n-dim representation. In the case of memory storage, the 1-dim representation is the flat space of pointer arithmetic. In the case of ravel, the 1-dim representation is the flat space of a 1-dim indexed array. But the 1-dim-to-n-dim part is the same in both cases.

I think that's why you're seeing people baffled by your proposal -- to them the C refers to this general strategy, and what's different is the context where it gets applied. So giving the same strategy two different names is silly; if anything it's the contexts that should have different names.
And once we get into memory optimization (and avoiding copies and preserving contiguity), it is necessary to keep both orders in mind: is the memory order F, and am I iterating/raveling in F order (or slicing columns)? I think having two separate keywords gives the impression we can choose two different things at the same time.

I guess it could not make sense to do this: np.ravel(a, index_order='C', memory_order='F') It could make sense to do this: np.reshape(a, (3,4), index_order='F', memory_order='F') but that just points out the inherent confusion between the uses of 'order', and in this case, the fact that you can only do: np.reshape(a, (3, 4), index_order='F') correctly distinguishes between the meanings.

So, if index_order and memory_order are never in the same function, then the context should be enough. It was always enough for me.

np.reshape(a, (3,4), index_order='F', memory_order='F') really hurts my head because you mix a function that operates on views, indexing and shapes with memory creation (or I have no idea what memory_order should do in this case).

np.asarray(a.reshape(3, 4, order='F'), order='F')

or the example here
http://docs.scipy.org/doc/numpy/reference/generated/numpy.asfortranarray.html?highlight=asfortranarray#numpy.asfortranarray
http://docs.scipy.org/doc/numpy/reference/generated/numpy.asarray.html
keeps functions with index_order and functions with memory_order nicely separated.

(It might be useful but very confusing to add memory_order to every function that creates a view if possible and a copy if necessary: "If you have to make a copy, then I want F memory order, otherwise give me a view". But I cannot find a candidate function right now, except for ravel and reshape; see first notes in docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html )

a day later (haven't changed my mind): isn't specifying index order in the Parameter section enough as an explanation?
something like:

```
def ravel

Parameters
order : index order
    how the array is stacked into a 1d array.
    F means we stack by columns (fortran order, first index first),
    C means we stack by rows (c-order, last index first)
```

most array *creation* functions explicitly mention memory layout in the docstring

Josef

Best, Matthew
___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
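Editorial sketch (not part of the original message): a small example of what "stack by rows" and "stack by columns" mean for `ravel`, and a check that the answer does not depend on how the input happens to be stored. The variable names are illustrative.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
# a is [[0, 1, 2],
#       [3, 4, 5]]

by_rows = a.ravel(order='C')   # last index varies fastest
by_cols = a.ravel(order='F')   # first index varies fastest

# Same values stored in Fortran memory order: the raveled results are unchanged,
# because 'C' and 'F' here describe index order, not memory layout.
a_f = np.asfortranarray(a)
same_for_f_layout = (np.array_equal(a_f.ravel(order='C'), by_rows)
                     and np.array_equal(a_f.ravel(order='F'), by_cols))
```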
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Wed, Apr 3, 2013 at 6:24 AM, Sebastian Berg sebast...@sipsolutions.net wrote: the context where it gets applied. So giving the same strategy two different names is silly; if anything it's the contexts that should have different names. Yup, thats how I think about it too... me too... But I would really love if someone would try to make the documentation simpler! yes, I think this is where the solution lies. There is also never a mention of contiguity, even though when we refer to memory order, then having a C/F contiguous array is often the reason why good point -- in fact, I have no idea what would happen in many of these cases for a discontiguous array (or one with arbitrarily weird strides...) Also 'A' seems often explained not quite correctly (though that does not matter (except for reshape, where its explanation is fuzzy), it will matter more in the future -- even if I don't expect 'A' to be actually used). I wonder about having a 'A' option in reshape at all -- what the heck does it mean? why do we need it? Again, I come back to the fact that memory order is kind-of orthogonal to index order. So for reshape (or ravel, which is really just a special case of reshape...) the 'A' flag and 'K' flag (huh?) is pretty dangerous, and prone to error. I think of it this way: Much of the beauty of numpy is that it presents a consistent interface to various forms of strided data -- that way, folks can write code that works the same way for any ndarray, while still being able to have internal storage be efficient for the use at hand -- i.e. C order for the common case, Fortran order for interaction with libraries that expect that order (or for algorithms that are more efficient in that order, though that's mostly external libs..), and non-contiguous data so one can work on sub-parts of arrays without copying data around. 
In most places, the numpy API hides the internal memory order -- this is a good thing, most people have no need to think about it (or most code, anyway), and you can write code that works (even if not optimally) for any (strided) memory layout. All is good.

There are times when you really need to understand, or control or manipulate the memory layout, to make sure your routines are optimized, or the data is in the right form to pass off to an external lib, or to make sense of raw data read from a file, or... That's what we have .view() and friends for.

However, the 'A' and 'K' flags mix and match these concepts -- and I think that's dangerous. It would be easy for a user to use the 'A' flag, and have everything work fine and dandy with all their test cases, only to have it blow up when someone passes in a different-than-expected array. So really, they should only be used in cases where the code has checked memory order beforehand, or in a really well-defined interface where you know exactly what you're getting. In those cases, it makes the code far clearer and less error-prone to do your re-arranging of the memory in a separate step, rather than built in to a ravel() or reshape() call.

[note] -- I wrote earlier that I wasn't confused by the ravel() examples -- true for the 'C' and 'F' flags, but I'm still not at all clear what 'A' and 'K' would give me -- particularly for 'A' and reshape()

So I think the cause of the confusion here is not that we use order in two different contexts, nor the fact that 'C' and 'F' may not mean anything to some people, but that we are conflating two different processes in one function, and with one flag.

My (maybe) proposal: we deprecate the 'A' and 'K' flags in ravel() and reshape(). (maybe even deprecate ravel() -- does it add anything to reshape?) If not deprecate, at least encourage people in the docs not to use them, and rather do their memory-structure manipulations with .view or stride manipulation, or...
I'm still trying to figure out when you'd want the 'A' flag -- it seems at the end of your operation you will want: The resulting array to be a particular shape, with the elements in a particular order and You _may_ want the in-memory layout a certain way. but 'A' can't ensure both of those. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR(206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
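Editorial sketch (not part of Chris's message): the hazard he describes with 'A' can be reproduced with the two arrays from the start of the thread. `flatten_any` is a hypothetical helper standing in for user code that was only ever tested with C-ordered input.

```python
import numpy as np

def flatten_any(x):
    """Hypothetical helper that flattens with order='A'."""
    return np.ravel(x, order='A')

arr = np.arange(10).reshape(2, 5)   # C-contiguous
arr_f = np.asfortranarray(arr)      # same values, F-contiguous

# The two inputs compare equal element by element...
same_values = np.array_equal(arr, arr_f)

# ...but the flattened outputs differ, because 'A' follows the memory layout.
out_c = flatten_any(arr)
out_f = flatten_any(arr_f)
```

A caller who only sees the indexed values has no way to predict which of the two results comes back, which is the sense in which 'A' cannot guarantee both element order and layout.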
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Wed, Apr 3, 2013 at 5:19 AM, josef.p...@gmail.com wrote: On Tue, Apr 2, 2013 at 9:09 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Tue, Apr 2, 2013 at 7:09 PM, josef.p...@gmail.com wrote: On Tue, Apr 2, 2013 at 5:52 PM, Nathaniel Smith n...@pobox.com wrote: On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett matthew.br...@gmail.com wrote: This is like observing that if I say go North then it's ambiguous about whether I want you to drive or walk, and concluding that we need new words for the directions depending on what sort of vehicle you use. So go North means drive North, go htuoS means walk North, etc. Totally silly. Makes much more sense to have one set of words for directions, and then make clear from context what the directions are used for -- drive North, walk North. Or iterate C-wards, store F-wards. C and Z mean exactly the same thing -- they describe a way of unraveling a cube into a straight line. The difference is what we do with the resulting straight line. That's why I'm suggesting that the distinction should be made in the name of the argument. Could you unpack that for the 'ravel' docstring? Because these options all refer to the way of unraveling and not the memory layout that results. Z/C/column-major/whatever-you-want-to-call-it is a general strategy for converting between a 1-dim representation and a n-dim representation. In the case of memory storage, the 1-dim representation is the flat space of pointer arithmetic. In the case of ravel, the 1-dim representation is the flat space of a 1-dim indexed array. But the 1-dim-to-n-dim part is the same in both cases. I think that's why you're seeing people baffled by your proposal -- to them the C refers to this general strategy, and what's different is the context where it gets applied. So giving the same strategy two different names is silly; if anything it's the contexts that should have different names. 
And once we get into memory optimization (and avoiding copies and preserving contiguity), it is necessary to keep both orders in mind: is the memory order F, and am I iterating/raveling in F order (or slicing columns)? I think having two separate keywords gives the impression we can choose two different things at the same time.

I guess it could not make sense to do this: np.ravel(a, index_order='C', memory_order='F') It could make sense to do this: np.reshape(a, (3,4), index_order='F', memory_order='F') but that just points out the inherent confusion between the uses of 'order', and in this case, the fact that you can only do: np.reshape(a, (3, 4), index_order='F') correctly distinguishes between the meanings.

So, if index_order and memory_order are never in the same function, then the context should be enough. It was always enough for me.

It was not enough for me or the three others who will publicly admit to the shame of finding it confusing without further thought. Again, I just can't see a reason not to separate these ideas. We are not arguing about backwards compatibility here, only about clarity. I guess you do accept that some people, other than yourself, might be less likely to get tripped up by: np.reshape(a, (3, 4), index_order='F') than np.reshape(a, (3, 4), order='F') ?

np.reshape(a, (3,4), index_order='F', memory_order='F') really hurts my head because you mix a function that operates on views, indexing and shapes with memory creation (or I have no idea what memory_order should do in this case).

Right. I think you may now be close to my own discomfort when faced with working out (fast) what: np.reshape(a, (3,4), order='F') means, given 'order' means two different things, and both might be relevant here. Or are you saying that my brain should have quickly calculated that 'order' would be difficult to understand as memory layout and therefore rejected that and seen immediately that index order was the meaning?
Speaking as a psychologist, I don't think that's the way it works. Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Wed, Apr 3, 2013 at 8:52 AM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: On Wed, Apr 3, 2013 at 6:24 AM, Sebastian Berg sebast...@sipsolutions.net wrote: the context where it gets applied. So giving the same strategy two different names is silly; if anything it's the contexts that should have different names. Yup, thats how I think about it too... me too... But I would really love if someone would try to make the documentation simpler! yes, I think this is where the solution lies. No question that better docs would be an improvement, let's all agree on that. We all agree that 'order' is used with two different and orthogonal meanings in numpy. I think we are now more or less agreeing that: np.reshape(a, (3, 4), index_order='F') is at least as clear as: np.reshape(a, (3, 4), order='F') Do I have that right so far? Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
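Editorial sketch (not part of the original message): the "two different and orthogonal meanings" of `order` can be seen by passing the same `order='F'` to `np.reshape` and to `np.array`. The variable names are illustrative.

```python
import numpy as np

a = np.arange(12)

# Meaning 1 (index order): order= decides how elements are placed into the new shape.
r_c = np.reshape(a, (3, 4), order='C')
r_f = np.reshape(a, (3, 4), order='F')

# Meaning 2 (memory layout): order= decides how the result is stored,
# and leaves the indexed values untouched.
m_f = np.array(r_c, order='F')

values_changed_by_reshape_order = not np.array_equal(r_c, r_f)
values_kept_by_array_order = np.array_equal(r_c, m_f)
layout_changed_by_array_order = m_f.flags.f_contiguous and not m_f.flags.c_contiguous
```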
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Wed, Apr 3, 2013 at 11:39 AM, Matthew Brett matthew.br...@gmail.com wrote:

It was not enough for me or the three others who will publicly admit to the shame of finding it confusing without further thought.

I would submit that some of the confusion came from the fact that with ravel(), and the 'A' and 'K' flags, you are forced to figure out BOTH index_order and memory_order -- with one flag -- I know I'm still not clear what I'd get in complex situations.

Again, I just can't see a reason not to separate these ideas.

I agree, but really separating them -- ideally having a given function only deal with one or the other, not both at once.

We are not arguing about backwards compatibility here, only about clarity.

while it could be changed while strictly maintaining backward compatibility -- it is a change that would need to filter through the docs, examples, random blog posts, Stack Overflow questions, etc. Is that worth it? I'm not convinced.

Right. I think you may now be close to my own discomfort when faced with working out (fast) what: np.reshape(a, (3,4), order='F')

I still think it's because you know too much ;-)

-Chris

-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Wed, Apr 3, 2013 at 11:52 PM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: On Wed, Apr 3, 2013 at 11:39 AM, Matthew Brett matthew.br...@gmail.com wrote:

It was not enough for me or the three others who will publicly admit to the shame of finding it confusing without further thought.

I would submit that some of the confusion came from the fact that with ravel(), and the 'A' and 'K' flags, you are forced to figure out BOTH index_order and memory_order -- with one flag -- I know I'm still not clear what I'd get in complex situations.

Again, I just can't see a reason not to separate these ideas.

I agree, but really separating them -- ideally having a given function only deal with one or the other, not both at once.

We are not arguing about backwards compatibility here, only about clarity.

while it could be changed while strictly maintaining backward compatibility -- it is a change that would need to filter through the docs, examples, random blog posts, Stack Overflow questions, etc.

Not only that, we would then also be in the situation of having `order` *and* `xxx_order` keywords. This is also confusing, at least as much as the current situation imho.

Ralf

Is that worth it? I'm not convinced.

Right. I think you may now be close to my own discomfort when faced with working out (fast) what: np.reshape(a, (3,4), order='F')

I still think it's because you know too much ;-)

-Chris

-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Wed, Apr 3, 2013 at 9:13 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Wed, Apr 3, 2013 at 11:44 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Wed, Apr 3, 2013 at 8:52 AM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: On Wed, Apr 3, 2013 at 6:24 AM, Sebastian Berg sebast...@sipsolutions.net wrote:

the context where it gets applied. So giving the same strategy two different names is silly; if anything it's the contexts that should have different names.

Yup, thats how I think about it too...

me too...

But I would really love if someone would try to make the documentation simpler!

yes, I think this is where the solution lies.

No question that better docs would be an improvement, let's all agree on that. We all agree that 'order' is used with two different and orthogonal meanings in numpy. I think we are now more or less agreeing that: np.reshape(a, (3, 4), index_order='F') is at least as clear as: np.reshape(a, (3, 4), order='F')

I believe our job here is to come to some consensus. In that spirit, I think we do agree on these statements above. Now we have the cost / benefit.

Benefit : Some people may find it easier to understand numpy when these constructs are separated.

Cost : There might be some confusion because we have changed the default keywords.

Benefit
-------

What proportion of people would find it easier to understand with the order constructs separated? Clearly Chris and Josef and Sebastian - you estimate I think no change in your understanding, because your understanding was near complete already. At least I, Paul Ivanov, JB Poline found the current state strikingly confusing. I think we have other votes for that position here. It's difficult to estimate the proportions now because my original email and the subsequent discussion are based on the distinction already being made. So, it is hard for us to be objective about whether a new user is likely to get confused.
At least it seems reasonable to say that some moderate proportion of users will get confused. In that situation, it seems to me the long-term benefit for separating these ideas is relatively high. The benefit will continue over the long term.

Cost
----

The ravel docstring would look something like this:

    index_order : {'C','F', 'A', 'K'}, optional
        ... This keyword used to be called simply 'order', and you can also use the keyword 'order' to specify index_order (this parameter).

The problem would then be that, for a while, there will be older code and docs using 'order' instead of 'index_order'. I think this would not cause much trouble. Reading the docstring will explain the change. The old code will continue to work. This cost will decrease to zero over time.

So, if we are planning for the long-term for numpy, I believe the benefit to the change considerably outweighs the cost. I'm happy to do the code changes, so that's not an issue.

Cheers, Matthew
___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
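Editorial sketch (not part of Matthew's message, and not NumPy's API): one way the proposed alias could behave. `ravel_sketch` and its `index_order` keyword are hypothetical; NumPy's real `ravel` only accepts `order`. Rejecting calls that pass both keywords is an assumption made here, since the proposal does not say what should happen in that case.

```python
import numpy as np

def ravel_sketch(a, index_order=None, order=None):
    """Hypothetical ravel that accepts 'index_order' with 'order' as an alias."""
    if index_order is not None and order is not None:
        # Assumption: passing both spellings at once is treated as an error.
        raise TypeError("pass either index_order or order, not both")
    chosen = index_order if index_order is not None else order
    if chosen is None:
        chosen = 'C'   # same default as np.ravel
    return np.ravel(a, order=chosen)

a = np.arange(6).reshape(2, 3)
new_spelling = ravel_sketch(a, index_order='F')
old_spelling = ravel_sketch(a, order='F')
default = ravel_sketch(a)
```

Old code using `order=` keeps working under this sketch, which is the backwards-compatibility property the proposal relies on.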
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi all,

Since we're mentioning obvious and non-obvious naming,

I think you agree that there is potential for confusion, and there doesn't seem any reason to continue with that confusion if we can come up with a clearer name. So here is a compromise proposal. How about:

* Preferring the names 'c-style' and 'f-style' for the indexing order case (ravel, reshape, flatiter)

This naming scheme is obvious for the ones that have been doing some coding for a long time, but the names tend not to speak to anyone else. Why not use names that are a little bit more explicit (and of course, keep the legacy naming available), and use 'row-first' and 'column-first' (or anything else that may be more explicit)?

Cheers, Éric.

* Leaving 'C' and 'F' as functional shortcuts, so there is no possible backwards-compatibility problem.

Would you object to that?

Cheers, Matthew
___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion

An AZERTY keyboard is worth two -- Éric Depagne e...@depagne.org
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sat, Mar 30, 2013 at 2:08 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, We were teaching today, and found ourselves getting very confused about ravel and shape in numpy. Summary -- There are two separate ideas needed to understand ordering in ravel and reshape: Idea 1): ravel / reshape can proceed from the last axis to the first, or the first to the last. This is ravel index ordering Idea 2) The physical layout of the array (on disk or in memory) can be C or F contiguous or neither. This is memory ordering The index ordering is usually (but see below) orthogonal to the memory ordering. The 'ravel' and 'reshape' commands use C and F in the sense of index ordering, and this mixes the two ideas and is confusing. What the current situation looks like Specifically, we've been rolling this around 4 experienced numpy users and we all predicted at least one of the results below wrongly. This was what we knew, or should have known: In [2]: import numpy as np In [3]: arr = np.arange(10).reshape((2, 5)) In [5]: arr.ravel() Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) So, the 'ravel' operation unravels over the last axis (1) first, followed by axis 0. So far so good (even if the opposite to MATLAB, Octave). Then we found the 'order' flag to ravel: In [10]: arr.flags Out[10]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [11]: arr.ravel('C') Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) But we soon got confused. How about this? In [12]: arr_F = np.array(arr, order='F') In [13]: arr_F.flags Out[13]: C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [16]: arr_F Out[16]: array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]) In [17]: arr_F.ravel('C') Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) Right - so the flag 'C' to ravel, has got nothing to do with *memory* ordering, but is to do with *index* ordering. 
And in fact, we can ask for memory ordering specifically:

In [22]: arr.ravel('K')
Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [23]: arr_F.ravel('K')
Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])
In [24]: arr.ravel('A')
Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [25]: arr_F.ravel('A')
Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

There are some confusions to get into with the 'order' flag to reshape as well, of the same type. Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering.

Proposal
--------

* Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape
* Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov)

What do y'all think?

Surely it should be Z and ᴎ? ;-)

I knew what your examples would produce, but only because I've bumped into this before. When you do reshapes of various sorts (ravel() == reshape((-1,))), then, like you say, there are two totally different sets of coordinate mapping in play:

chunk of memory  <-1->  virtual array layout  <-2->  new array layout
 (C pointers)             (Python indexes)            (Python indexes)

Mapping (1) is determined by the array strides, and you have to think about it when you interface with C code, but at the Python level it's pretty much irrelevant; all operations are defined at the virtual array layout level. Further confusing the issue is the fact that the vast majority of legal memory-virtual array mappings are *neither* C- nor F-ordered. Strides are very flexible.
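Editorial sketch (not part of Nathaniel's message): mapping (1) can be inspected through `strides`, and a simple step slice is enough to produce a layout that is neither C- nor F-contiguous. The variable names are illustrative, and strides are shown in units of elements rather than bytes.

```python
import numpy as np

c = np.arange(20).reshape(4, 5)   # C-contiguous
f = np.asfortranarray(c)          # same values, F-contiguous
s = c[::2, ::2]                   # every other row and column of c (a view)

item = c.itemsize
c_strides = tuple(st // item for st in c.strides)   # steps per axis, in elements
f_strides = tuple(st // item for st in f.strides)
s_strides = tuple(st // item for st in s.strides)

s_is_neither = (not s.flags.c_contiguous) and (not s.flags.f_contiguous)
```

All three arrays index like ordinary 2-d arrays at the Python level; only the memory-to-index mapping differs.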
Further further confusing the issue is that mapping (2) actually consists of two mappings: if you have an array with shape (3, 4, 5) and reshape it to (4, 15), then the way you work out the overall mapping is by first mapping the (3, 4, 5) onto a flat 1-d space with 60 elements, and then mapping *that* to the (4, 15) space. Anyway, I agree that this is very confusing; certainly it confused me. If you bump into these two mappings just in passing, and separately, then it's very easy to miss the fact that they have nothing to do with each other. And I agree that using exactly the same terminology for both of them is part of what causes this. I even kind of like the Z/N naming scheme (I still have to look up what C/F actually mean every time, I'm ashamed to say). But I don't see how the proposed solution helps, because the problem isn't that mapping (1) and (2) use different ordering schemes -- the column-major/row-major distinction really does apply to both equally. Using different names for those seems like it will
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Proposal - * Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape * Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov) What do y'all think? Personally I think it is clear enough and that Z and N would confuse me just as much (though I am used to the other names). Also Z and N would seem more like aliases, which would also make sense in the memory order context. If anything, I would prefer renaming the arguments iteration_order and memory_order, but it seems overdoing it... Maybe the documentation could just be checked if it is always clear though. I.e. maybe it does not use iteration or memory order consistently (though I somewhat feel it is usually clear that it must be iteration order, since no numpy function cares about the input memory order as they will just do a copy if necessary). I have been using both C and Fortran for 25 or so years. Despite that, I have to sit and think every time I need to know which way the arrays are stored, basically by remembering that in fortran you do (I,J,*) for an assumed-size array. So I *love* the idea of 'Z' and 'N' which I understood immediately. Andrew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Mon, Apr 1, 2013 at 10:15 PM, Matthew Brett matthew.br...@gmail.com wrote:

Thank you for the compliment, it's more enjoyable than other potential explanations of my confusion (sigh). But, I don't think that is the explanation.

well, the core explanation is these are difficult and intertwined concepts... And yes, better names and better docs can help.

Last, as soon as we came to the distinction between index order and memory layout, it was clear. We all agreed that this was an important distinction that would improve numpy if we made it.

yup.

I think you agree that there is potential for confusion, and there doesn't seem any reason to continue with that confusion if we can come up with a clearer name.

well, changing an API is not to be taken lightly -- we are not discussing how we'd do it if we were to start from fresh here. So any change should make things enough better that it is worth dealing with the process of the change.

So here is a compromise proposal.
* Preferring the names 'c-style' and 'f-style' for the indexing order case (ravel, reshape, flatiter)
* Leaving 'C' and 'F' as functional shortcuts, so there is no possible backwards-compatibility problem.

seems reasonable enough -- though even with the backward compatibility, users will be faced with many, many older examples and docs that use 'C' and 'F', while the new ones refer to the new names -- might this be cause for even more confusion (at least for a few years...)? leaving me with an equivocal +0 on that

another thought:

Definition: np.ravel(a, order='C')

A 1-D array, containing the elements of the input, is returned. A copy is made only if needed.

Parameters
----------
a : array_like
    Input array. The elements in ``a`` are read in the order specified by `order`, and packed as a 1-D array.
order : {'C','F', 'A', 'K'}, optional
    The elements of ``a`` are read in this order. 'C' means to view the elements in C (row-major) order. 'F' means to view the elements in Fortran (column-major) order.
    'A' means to view the elements in 'F' order if a is Fortran contiguous, 'C' order otherwise. 'K' means to view the elements in the order they occur in memory, except for reversing the data when strides are negative. By default, 'C' order is used.

Does ravel need to support the 'A' and 'K' options? It's kind of an advanced use, and really more suited to .view(), perhaps? What I'm getting at is that this version of ravel() conflates the two concepts: virtual ordering and memory ordering in one function -- maybe they should be considered as two different functions altogether -- I think that would make for less confusion.

Éric Depagne wrote:

'row-first' and 'column-first' (or anything else that may be more explicit) ?

I like more explicit, but 'row-first' and 'column-first' have two issues: 1) what about higher dimension arrays?, and 2) the row and column convention is only that -- a convention -- I guess it's the way numpy prints, which gives it some meaning, but there are times when arrays are ordered: (col, row), rather than (row, col) (PIL uses that format for instance)

I like the Z and N, and maybe even if they aren't used as flag names, they could be used in the docstring -- nice and ascii safe

Nathaniel wrote:

To see this, note that semantically it would be perfectly possible for .reshape() to take *two* order= arguments: one to specify the coordinate space mapping (2), and the other to specify the desired memory layout used by the result array (1). Of course we shouldn't actually do this, because in the unlikely event that someone actually wanted both of these they could just call asarray() on the output of reshape().

exactly -- my point about keeping the raveling with virtual order separate from raveling with memory order -- it's really not critical that you can do both with one function call.

-Chris

-- Christopher Barker, Ph.D.
Oceanographer Emergency Response Division NOAA/NOS/ORR(206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
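Editorial sketch (not part of the original message): the two-call approach Nathaniel mentions, `asarray()` on the output of `reshape()`, written out for all four combinations. `reshape_then_layout` is a hypothetical helper used only to show that the two choices are independent.

```python
import numpy as np

def reshape_then_layout(a, shape, index_order, memory_order):
    """Hypothetical helper: reshape with one order, then store with another."""
    reshaped = np.reshape(a, shape, order=index_order)   # decides the values' arrangement
    return np.asarray(reshaped, order=memory_order)      # decides only the storage

a = np.arange(12)
results = {
    (i, m): reshape_then_layout(a, (3, 4), i, m)
    for i in ('C', 'F')
    for m in ('C', 'F')
}

# Indexed values depend only on the index order...
values_follow_index_order = (
    np.array_equal(results[('C', 'C')], results[('C', 'F')])
    and np.array_equal(results[('F', 'C')], results[('F', 'F')])
    and not np.array_equal(results[('C', 'C')], results[('F', 'C')])
)
# ...while contiguity depends only on the requested memory order.
layout_follows_memory_order = all(
    r.flags.c_contiguous if m == 'C' else r.flags.f_contiguous
    for (i, m), r in results.items()
)
```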
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi,

On Tue, Apr 2, 2013 at 7:32 AM, Nathaniel Smith n...@pobox.com wrote:
On Sat, Mar 30, 2013 at 2:08 AM, Matthew Brett matthew.br...@gmail.com wrote:

Hi,

We were teaching today, and found ourselves getting very confused about ravel and shape in numpy.

Summary
--

There are two separate ideas needed to understand ordering in ravel and reshape:

Idea 1): ravel / reshape can proceed from the last axis to the first, or the first to the last. This is ravel index ordering.
Idea 2): The physical layout of the array (on disk or in memory) can be C or F contiguous or neither. This is memory ordering.

The index ordering is usually (but see below) orthogonal to the memory ordering.

The 'ravel' and 'reshape' commands use C and F in the sense of index ordering, and this mixes the two ideas and is confusing.

What the current situation looks like

Specifically, we've been rolling this around 4 experienced numpy users and we all predicted at least one of the results below wrongly.

This was what we knew, or should have known:

In [2]: import numpy as np

In [3]: arr = np.arange(10).reshape((2, 5))

In [5]: arr.ravel()
Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

So, the 'ravel' operation unravels over the last axis (1) first, followed by axis 0.

So far so good (even if the opposite to MATLAB, Octave).

Then we found the 'order' flag to ravel:

In [10]: arr.flags
Out[10]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

In [11]: arr.ravel('C')
Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

But we soon got confused. How about this?

In [12]: arr_F = np.array(arr, order='F')

In [13]: arr_F.flags
Out[13]:
  C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

In [16]: arr_F
Out[16]:
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [17]: arr_F.ravel('C')
Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Right - so the flag 'C' to ravel has got nothing to do with *memory* ordering, but is to do with *index* ordering.

And in fact, we can ask for memory ordering specifically:

In [22]: arr.ravel('K')
Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [23]: arr_F.ravel('K')
Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

In [24]: arr.ravel('A')
Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [25]: arr_F.ravel('A')
Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

There are some confusions to get into with the 'order' flag to reshape as well, of the same type.

Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering.

This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering.

Proposal
-

* Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape
* Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov)

What do y'all think?

Surely it should be Z and ᴎ? ;-)

I knew what your examples would produce, but only because I've bumped into this before.
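A compact restatement of the sessions quoted above, as a sketch only. The variable names are illustrative and not from the thread; the calls are the same `ravel` and `np.array` calls the sessions use.

```python
import numpy as np

arr_c = np.arange(10).reshape((2, 5))   # C-contiguous in memory
arr_f = np.array(arr_c, order='F')      # same values, Fortran-contiguous in memory

# 'C' and 'F' passed to ravel select an index ordering, so the result
# does not depend on how the input happens to be laid out in memory.
c_from_c = arr_c.ravel('C')
c_from_f = arr_f.ravel('C')
f_from_c = arr_c.ravel('F')
f_from_f = arr_f.ravel('F')

# 'K' follows the memory layout instead, so here the two inputs differ.
k_from_c = arr_c.ravel('K')
k_from_f = arr_f.ravel('K')

print(c_from_c, c_from_f)
print(f_from_c, f_from_f)
print(k_from_c, k_from_f)
```

The first two printed pairs match each other; only the 'K' pair differs between the two inputs.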
When you do reshapes of various sorts (ravel() == reshape((-1,))), then, like you say, there are two totally different sets of coordinate mapping in play:

chunk of memory  -1-  virtual array layout  -2-  new array layout
 (C pointers)    ---   (Python indexes)     ---  (Python indexes)

Mapping (1) is determined by the array strides, and you have to think about it when you interface with C code, but at the Python level it's pretty much irrelevant; all operations are defined at the virtual array layout level.

Further confusing the issue is the fact that the vast majority of legal memory-virtual array mappings are *neither* C- nor F-ordered. Strides are very flexible.

Further further confusing the issue is that mapping (2) actually consists of two mappings: if you have an array with shape (3, 4, 5) and reshape it to (4, 15), then the way you work out the overall mapping is by first mapping the (3, 4, 5) onto a flat 1-d space with 60 elements, and then mapping *that* to the (4, 15) space.

Anyway, I agree that this is very confusing; certainly it confused me. If you bump into these two mappings just in passing, and separately, then it's very easy to miss the fact that they have nothing to do with each other. And I agree that using exactly the same terminology for both of them is part of what causes this. I even kind of like the Z/N naming scheme (I still have to look up what C/F actually mean every time, I'm ashamed to say).

But I don't see how the proposed solution helps, because the problem isn't that mapping (1) and (2) use different ordering schemes -- the
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi,

On Tue, Apr 2, 2013 at 12:29 PM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote:
On Mon, Apr 1, 2013 at 10:15 PM, Matthew Brett matthew.br...@gmail.com wrote:

Thank you for the compliment, it's more enjoyable than other potential explanations of my confusion (sigh). But, I don't think that is the explanation.

well, the core explanation is these are difficult and intertwined concepts... And yes, better names and better docs can help.

Last, as soon as we came to the distinction between index order and memory layout, it was clear. We all agreed that this was an important distinction that would improve numpy if we made it.

yup.

I think you agree that there is potential for confusion, and there doesn't seem any reason to continue with that confusion if we can come up with a clearer name.

well, changing an API is not to be taken lightly -- we are not discussing how we'd do it if we were to start from fresh here. So any change should make things enough better that it is worth dealing with the process of the change.

Yes, for sure. I was only trying to point out that we are not talking about breaking backwards compatibility.

So here is a compromise proposal.

* Preferring the names 'c-style' and 'f-style' for the indexing order case (ravel, reshape, flatiter)
* Leaving 'C' and 'F' as functional shortcuts, so there is no possible backwards-compatibility problem.

seems reasonable enough -- though even with the backward compatibility, users will be faced with many, many older examples and docs that use 'C' and 'F', while the new ones refer to the new names -- might this be cause for even more confusion (at least for a few years...)

I doubt it would be 'even more' confusion. They would only have to read the docstrings to work out what is meant, and I believe, with better names, they'd be less likely to fall into the traps I fell into, at least.

leaving me with an equivocal +0 on that

another thought:

Definition: np.ravel(a, order='C')

A 1-D array, containing the elements of the input, is returned. A copy is made only if needed.

Parameters
--
a : array_like
    Input array. The elements in ``a`` are read in the order specified by `order`, and packed as a 1-D array.
order : {'C','F', 'A', 'K'}, optional
    The elements of ``a`` are read in this order. 'C' means to view the elements in C (row-major) order. 'F' means to view the elements in Fortran (column-major) order. 'A' means to view the elements in 'F' order if a is Fortran contiguous, 'C' order otherwise. 'K' means to view the elements in the order they occur in memory, except for reversing the data when strides are negative. By default, 'C' order is used.

Does ravel need to support the 'A' and 'K' options? It's kind of an advanced use, and really more suited to .view(), perhaps?

What I'm getting at is that this version of ravel() conflates the two concepts: virtual ordering and memory ordering in one function -- maybe they should be considered as two different functions altogether -- I think that would make for less confusion.

I think it would conceal the confusion only. If we don't have 'A' and 'K' in there, it allows us to keep the dream of a world where 'C' only refers to index ordering, but *only for this docstring*. As soon as somebody does ``np.array(arr, order='C')`` they will find themselves in conceptual trouble again.

Cheers,

Matthew
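To make the last point concrete, here is a small sketch of what `order` does in `np.array` as opposed to `ravel`. The dtype is fixed to `int64` only so that the stride values are predictable; the variable names are illustrative.

```python
import numpy as np

arr = np.arange(10, dtype=np.int64).reshape((2, 5))
arr_c = np.array(arr, order='C')
arr_f = np.array(arr, order='F')

# Indexing and values are identical for the two copies ...
same_values = np.array_equal(arr_c, arr_f)

# ... only the strides (bytes to step along each axis) differ.
strides_c = arr_c.strides
strides_f = arr_f.strides

# ravel's 'A' option is one of the two that look at that layout.
a_from_c = arr_c.ravel('A')
a_from_f = arr_f.ravel('A')

print(same_values, strides_c, strides_f)
print(a_from_c, a_from_f)
```

Here `order` in `np.array` changes nothing a Python-level index can see, while 'A' in `ravel` returns different sequences for the two copies.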
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Tue, Apr 2, 2013 at 2:04 PM, Matthew Brett matthew.br...@gmail.com wrote:

I think it would conceal the confusion only. If we don't have 'A' and 'K' in there, it allows us to keep the dream of a world where 'C' only refers to index ordering, but *only for this docstring*. As soon as somebody does ``np.array(arr, order='C')`` they will find themselves in conceptual trouble again.

I still don't see why order is not a general concept, whether it refers to memory or indexing/iterating. The qualifier can be made clear in the docstrings (or from the context).

It's all over the documentation:

we can iterate in F-order over an array that is in C-order (*), or vice-versa
(*) or just some strides
http://docs.scipy.org/doc/numpy/reference/arrays.nditer.html
http://docs.scipy.org/doc/numpy/reference/generated/numpy.nditer.html#numpy.nditer

pure shape
http://docs.scipy.org/doc/numpy/reference/routines.array-manipulation.html#changing-array-shape

shape and copy
http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.flatten.html#numpy.ndarray.flatten

memory
http://docs.scipy.org/doc/numpy/reference/routines.array-manipulation.html#changing-kind-of-array
http://docs.scipy.org/doc/numpy/reference/routines.array-creation.html#from-existing-data

Josef
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Tue, Apr 2, 2013 at 6:59 PM, Matthew Brett matthew.br...@gmail.com wrote:
On Tue, Apr 2, 2013 at 7:32 AM, Nathaniel Smith n...@pobox.com wrote:

Maybe we should go through and rename order to something more descriptive in each case, so we'd have

a.reshape(..., index_order=C)
a.copy(memory_order=F)

etc.?

That seems like a good idea. If you are proposing it, I am +1.

Well, I'm just throwing it out there as an idea, but if people like it, nothing better turns up, and someone implements it, then I'm not going to say no...

This way if you just bumped into these while reading code, it would still be immediately obvious that they were dealing with totally different concepts. Compare to reading along without the docs and seeing

a.reshape(..., order=Z)
a.copy(order=C)

That'd just leave me even more baffled than the current system -- I'd start thinking that Z and C somehow were different options for the same order= option, so they must somehow mean ways of ordering elements?

I don't think you'd be more baffled than the current system, which, as you say, conflates two orthogonal concepts. Rather, I think it would cause the user to stop, as they should, and consider what concept order is using in this case. I don't find it difficult to explain this:

There are two different but related concepts of 'order'

1) The memory layout of the array
2) The index ordering used to unravel the array

If you see 'Z' or 'N' for 'order' - that refers to index ordering.
If you see 'C' or 'F' for order - that refers to memory layout.

Sure, you can write it down like this, but compare to this system:

If you see 'Z' or 'N' for 'order' - that refers to memory ordering.
If you see 'C' or 'F' for order - that refers to index layout.

Now suppose I forget which system we actually use -- how do you remember which system is which? It's totally arbitrary. Now I have even more things to remember. And I'm certainly not going to work out this distinction just from seeing these used once or twice in someone else's code.

This is like observing that if I say go North then it's ambiguous about whether I want you to drive or walk, and concluding that we need new words for the directions depending on what sort of vehicle you use. So go North means drive North, go htuoS means walk North, etc. Totally silly. Makes much more sense to have one set of words for directions, and then make clear from context what the directions are used for -- drive North, walk North. Or iterate C-wards, store F-wards.

C and Z mean exactly the same thing -- they describe a way of unraveling a cube into a straight line. The difference is what we do with the resulting straight line. That's why I'm suggesting that the distinction should be made in the name of the argument.

-n
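For reference, the two calls being compared behave like this with the `order` keyword as it exists today. This is only a sketch with an illustrative 2x3 array; `index_order` and `memory_order` remain proposals in this thread and are not used here.

```python
import numpy as np

a = np.arange(6).reshape((2, 3))      # [[0, 1, 2], [3, 4, 5]]

# In reshape, order picks how indices are traversed, so values move.
reshaped_c = a.reshape((3, 2), order='C')
reshaped_f = a.reshape((3, 2), order='F')

# In copy, order picks the memory layout; every value stays at its index.
copied_f = a.copy(order='F')

print(reshaped_c.tolist())
print(reshaped_f.tolist())
print(copied_f.tolist(), copied_f.flags.f_contiguous)
```

The same keyword therefore changes the visible result in one call and only the flags in the other, which is the distinction the renamed arguments are meant to spell out.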
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi,

On Tue, Apr 2, 2013 at 4:07 PM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote:
On Tue, Apr 2, 2013 at 11:37 AM, josef.p...@gmail.com wrote:

I still don't see why order is not a general concept, whether it refers to memory or indexing/iterating.

I agree -- the ordering concept is the same, it's _what_ is being ordered that's different. So I say we stick with 'C' and 'F' -- numpy users will need to figure out what it means eventually in any case

I'm not quite sure what you are arguing. I thought we all agreed that the index ordering idea is *orthogonal* to the memory layout idea? Not so?

Cheers,

Matthew
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett matthew.br...@gmail.com wrote:

This is like observing that if I say go North then it's ambiguous about whether I want you to drive or walk, and concluding that we need new words for the directions depending on what sort of vehicle you use. So go North means drive North, go htuoS means walk North, etc. Totally silly. Makes much more sense to have one set of words for directions, and then make clear from context what the directions are used for -- drive North, walk North. Or iterate C-wards, store F-wards.

C and Z mean exactly the same thing -- they describe a way of unraveling a cube into a straight line. The difference is what we do with the resulting straight line. That's why I'm suggesting that the distinction should be made in the name of the argument.

Could you unpack that for the 'ravel' docstring? Because these options all refer to the way of unraveling and not the memory layout that results.

Z/C/row-major/whatever-you-want-to-call-it is a general strategy for converting between a 1-dim representation and a n-dim representation. In the case of memory storage, the 1-dim representation is the flat space of pointer arithmetic. In the case of ravel, the 1-dim representation is the flat space of a 1-dim indexed array. But the 1-dim-to-n-dim part is the same in both cases.

I think that's why you're seeing people baffled by your proposal -- to them the C refers to this general strategy, and what's different is the context where it gets applied. So giving the same strategy two different names is silly; if anything it's the contexts that should have different names.

-n
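A sketch of the "same strategy, two contexts" argument, assuming an illustrative shape (3, 4, 5) and index (1, 2, 3). One-byte elements are used so that byte offsets and element positions are the same number.

```python
import numpy as np

shape = (3, 4, 5)
index = (1, 2, 3)

# Context 1: position in a raveled (1-d indexed) array.
flat_c = int(np.ravel_multi_index(index, shape, order='C'))   # 1*20 + 2*5 + 3
flat_f = int(np.ravel_multi_index(index, shape, order='F'))   # 1 + 2*3 + 3*12

# Context 2: byte offset in memory for arrays stored in each layout.
mem_c = np.zeros(shape, dtype=np.int8, order='C')
mem_f = np.zeros(shape, dtype=np.int8, order='F')
offset_c = sum(i * s for i, s in zip(index, mem_c.strides))
offset_f = sum(i * s for i, s in zip(index, mem_f.strides))

print(flat_c, offset_c)
print(flat_f, offset_f)
```

Each line prints the same number twice: the arithmetic that maps an n-dim index to a flat position is identical whether the flat space is a raveled array or a block of memory.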
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Tue, Apr 2, 2013 at 5:52 PM, Nathaniel Smith n...@pobox.com wrote:

I think that's why you're seeing people baffled by your proposal -- to them the C refers to this general strategy, and what's different is the context where it gets applied. So giving the same strategy two different names is silly; if anything it's the contexts that should have different names.

And once we get into memory optimization (and avoiding copies and preserving contiguity), it is necessary to keep both orders in mind: is memory order in F, and am I iterating/raveling in F order (or slicing columns)?

I think having two separate keywords gives the impression we can choose two different things at the same time.

Josef
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Tue, Apr 2, 2013 at 7:09 PM, josef.p...@gmail.com wrote:

I think having two separate keywords gives the impression we can choose two different things at the same time.

as an aside (math): numpy.flatten made it into the Wikipedia page
http://en.wikipedia.org/wiki/Vectorization_%28mathematics%29#Programming_language
(and how it's different from R and Matlab/Octave, but doesn't mention: use order=F to get the same behavior as math and the others)

and the corresponding code in statsmodels (tools for vector autoregressive models by Wes)

Josef
baffled?
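The aside about the math convention can be checked directly. A minimal sketch with an illustrative 2x2 matrix: the vec operator stacks columns, which corresponds to F index order rather than NumPy's default.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])

vec_A = A.flatten(order='F')    # column stacking, as in vec(A)
default = A.flatten()           # NumPy's default reads row by row

print(vec_A.tolist())
print(default.tolist())
```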
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi,

On Tue, Apr 2, 2013 at 5:52 PM, Nathaniel Smith n...@pobox.com wrote:

I think that's why you're seeing people baffled by your proposal -- to them the C refers to this general strategy, and what's different is the context where it gets applied. So giving the same strategy two different names is silly; if anything it's the contexts that should have different names.

Thanks - but I guess we all agree that np.array(a, order='C') and np.ravel(a, order='F') are using the term 'order' in two different and orthogonal senses, and the discussion is about whether it is possible to get confused about these two senses and, if so, what we should do about it.

Just to repeat what you're suggesting:

np.array(a, memory_order='C')
np.ravel(a, index_order='C')
np.ravel(a, index_order='K')

That makes sense to me. I guess we'd have to do something like:

def ravel(a, index_order='C', **kwargs):

Where kwargs must be empty if the second arg is specified, otherwise it can contain only one key, 'order' and 'index_order'. Thus: np.ravel(a, index_order='C') will work for the foreseeable future.

Cheers,

Matthew
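One way the signature sketched above might be filled in. This is a hypothetical wrapper, not NumPy's API: the name `ravel_compat` is invented here, and the `None` default is an assumption made so the function can tell whether `index_order` was actually passed.

```python
import numpy as np

def ravel_compat(a, index_order=None, **kwargs):
    """Hypothetical wrapper: accept index_order, keep order as an alias."""
    unexpected = set(kwargs) - {'order'}
    if unexpected:
        raise TypeError('unexpected keyword(s): %s' % sorted(unexpected))
    if index_order is not None and 'order' in kwargs:
        raise TypeError("pass either 'index_order' or 'order', not both")
    if index_order is None:
        # Fall back to the old keyword, then to the current default.
        index_order = kwargs.get('order', 'C')
    return np.ravel(a, order=index_order)

arr = np.arange(10).reshape((2, 5))
new_style = ravel_compat(arr, index_order='F')
old_style = ravel_compat(arr, order='F')
print(new_style, old_style)
```

Both spellings give the same result, and passing both at once raises `TypeError`, so existing code using `order` would keep working under this scheme.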
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi,

On Tue, Apr 2, 2013 at 7:09 PM, josef.p...@gmail.com wrote:

And once we get into memory optimization (and avoiding copies and preserving contiguity), it is necessary to keep both orders in mind: is memory order in F, and am I iterating/raveling in F order (or slicing columns)?

I think having two separate keywords gives the impression we can choose two different things at the same time.

I guess it could not make sense to do this:

np.ravel(a, index_order='C', memory_order='F')

It could make sense to do this:

np.reshape(a, (3,4), index_order='F', memory_order='F')

but that just points out the inherent confusion between the uses of 'order', and in this case, the fact that you can only do:

np.reshape(a, (3, 4), index_order='F')

correctly distinguishes between the meanings.

Best,

Matthew
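With today's API the two choices in that reshape example are made in separate steps. A sketch using the existing `order` keyword and the existing helpers `np.ascontiguousarray` and `np.asfortranarray`; the 12-element input is illustrative.

```python
import numpy as np

a = np.arange(12)

# Index ordering: decides which value lands at which (row, column).
b = np.reshape(a, (3, 4), order='F')

# Memory layout: chosen afterwards, without moving any value.
b_c = np.ascontiguousarray(b)
b_f = np.asfortranarray(b)

print(b.tolist())
print(np.array_equal(b, b_c), np.array_equal(b, b_f))
print(b_c.flags.c_contiguous, b_f.flags.f_contiguous)
```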
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sun, 2013-03-31 at 14:04 -0700, Matthew Brett wrote:
Hi, On Sun, Mar 31, 2013 at 1:43 PM, josef.p...@gmail.com wrote:
On Sun, Mar 31, 2013 at 3:54 PM, Matthew Brett matthew.br...@gmail.com wrote:
Hi, On Sat, Mar 30, 2013 at 10:38 PM, josef.p...@gmail.com wrote:
On Sun, Mar 31, 2013 at 12:50 AM, Matthew Brett matthew.br...@gmail.com wrote:
Hi, On Sat, Mar 30, 2013 at 9:37 PM, josef.p...@gmail.com wrote:
On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett matthew.br...@gmail.com wrote:
Hi, On Sat, Mar 30, 2013 at 7:02 PM, josef.p...@gmail.com wrote:
On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett matthew.br...@gmail.com wrote:
Hi, On Sat, Mar 30, 2013 at 7:50 PM, josef.p...@gmail.com wrote:
On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle brad.froe...@gmail.com wrote:
On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com wrote:
On Sat, Mar 30, 2013 at 2:20 PM, josef.p...@gmail.com wrote:
On Sat, Mar 30, 2013 at 4:57 PM, josef.p...@gmail.com wrote:
On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com wrote:
On Sat, Mar 30, 2013 at 4:14 AM, josef.p...@gmail.com wrote:
On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote:

Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering.

This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering.

Proposal
-

* Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape
* Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov)

What do y'all think?

I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it.

changing the names doesn't make it easier to understand. I think the confusion is because the new A and K refer to existing memory

I disagree, I think it's confusing, but I have evidence, and that is that four out of four of us tested ourselves and got it wrong. Perhaps we are particularly dumb or poorly informed, but I think it's rash to assert there is no problem here.

I think you are overcomplicating things or phrased it as a trick question

I don't know what you mean by trick question - was there something over-complicated in the example? I deliberately didn't include various much more confusing examples in reshape.

I meant making the candidates think about memory instead of just column versus row stacking.

To be specific, we were teaching about reshaping a (I, J, K, N) 4D array, it was an image, with time as the 4th dimension (N time points). Raveling and reshaping 3D and 4D arrays is a common thing to do in neuroimaging, as you can imagine.

A student asked what he would get back from raveling this array, a concatenated time series, or something spatial?

We showed (I'd worked it out by this time) that the first N values were the time series given by [0, 0, 0, :].

He said - 'Oh - I see - so the data is stored as a whole lot of time series one by one, I thought it would be stored as a series of images'.

Ironically, this was a Fortran-ordered array in memory, and he was wrong.

So, I think the idea of memory ordering and index ordering is very easy to confuse, and comes up naturally.

I would like, as a teacher, to be able to say something like:

This is what C memory layout is (it's the memory layout that gives arr.flags.C_CONTIGUOUS=True)
This is what F memory layout is (it's the memory layout that gives arr.flags.F_CONTIGUOUS=True)
It's rather easy to get something that is neither C or F memory layout
Numpy does many memory layouts.
Ravel and reshape and numpy in general do not care (normally) about C or F layouts, they only care about index ordering.
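The classroom example can be reproduced on a small scale. This is a sketch with made-up dimensions standing in for the (I, J, K, N) image; the real data from the class is not available.

```python
import numpy as np

# Hypothetical small "image": shape (I, J, K, N) with N time points.
I, J, K, N = 2, 3, 4, 5
img_c = np.arange(I * J * K * N).reshape((I, J, K, N))
img_f = np.asfortranarray(img_c)     # same values, Fortran layout in memory

# Default ravel uses C index order: the last axis (time) varies fastest,
# whatever the memory layout of the input is.
first_n_c = img_c.ravel()[:N]
first_n_f = img_f.ravel()[:N]
series = img_c[0, 0, 0, :]

print(first_n_c.tolist(), first_n_f.tolist(), series.tolist())
print(img_f.flags.f_contiguous)
```

All three lists are the same time series, even though the second array is stored in memory "image by image" rather than "time series by time series".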
My point, that I'm repeating, is that my job is made harder by 'arr.ravel('F')'.

But once you know that ravel and reshape don't care about memory, the ravel is easy to predict (maybe not easy to visualize in 4-D):

But this assumes that you already know that there's such a thing as memory layout, and there's such a thing as index ordering, and that 'C' and 'F' in ravel refer to index ordering. Once you have that, you're golden. I'm arguing it's markedly harder to get this distinction, and keep it in mind, and teach it, if we are using the 'C' and 'F' names for both things.

No, I think you are still missing my point. I think explaining ravel and reshape F and C is easy (kind of) because the students don't need to know at that stage about memory layouts. All they need to
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Mon, Apr 1, 2013 at 10:23 AM, Sebastian Berg sebast...@sipsolutions.net wrote: On Sun, 2013-03-31 at 14:04 -0700, Matthew Brett wrote: Hi, On Sun, Mar 31, 2013 at 1:43 PM, josef.p...@gmail.com wrote: On Sun, Mar 31, 2013 at 3:54 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 10:38 PM, josef.p...@gmail.com wrote: On Sun, Mar 31, 2013 at 12:50 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 9:37 PM, josef.p...@gmail.com wrote: On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 7:02 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 7:50 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle brad.froe...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 2:20 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:57 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:14 AM, josef.p...@gmail.com wrote: On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote: Ravel and reshape use the tems 'C' and 'F in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering. Proposal - * Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape * Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov) What do y'all think? I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it. 
changing the names doesn't make it easier to understand. I think the confusion is because the new A and K refer to existing memory I disagree, I think it's confusing, but I have evidence, and that is that four out of four of us tested ourselves and got it wrong. Perhaps we are particularly dumb or poorly informed, but I think it's rash to assert there is no problem here. I think you are overcomplicating things or phrased it as a trick question I don't know what you mean by trick question - was there something over-complicated in the example? I deliberately didn't include various much more confusing examples in reshape. I meant making the candidates think about memory instead of just column versus row stacking. To be specific, we were teaching about reshaping a (I, J, K, N) 4D array, it was an image, with time as the 4th dimension (N time points). Raveling and reshaping 3D and 4D arrays is a common thing to do in neuroimaging, as you can imagine. A student asked what he would get back from raveling this array, a concatenated time series, or something spatial? We showed (I'd worked it out by this time) that the first N values were the time series given by [0, 0, 0, :]. He said - Oh - I see - so the data is stored as a whole lot of time series one by one, I thought it would be stored as a series of images'. Ironically, this was a Fortran-ordered array in memory, and he was wrong. So, I think the idea of memory ordering and index ordering is very easy to confuse, and comes up naturally. I would like, as a teacher, to be able to say something like: This is what C memory layout is (it's the memory layout that gives arr.flags.C_CONTIGUOUS=True) This is what F memory layout is (it's the memory layout that gives arr.flags.F_CONTIGUOUS=True) It's rather easy to get something that is neither C or F memory layout Numpy does many memory layouts. Ravel and reshape and numpy in general do not care (normally) about C or F layouts, they only care about index ordering. 
My point, that I'm repeating, is that my job is made harder by 'arr.ravel('F')'. But once you know that ravel and reshape don't care about memory, the ravel is easy to predict (maybe not easy to visualize in 4-D): But this assumes that you already know that there's such a thing as memory layout, and there's such a thing as index ordering, and that 'C' and 'F' in ravel refer to index ordering. Once you have that, you're golden. I'm arguing it's markedly harder to get this distinction, and keep it in mind, and teach it, if we are using the 'C' and 'F' names for both things. No, I think you are still missing my point. I think explaining ravel and reshape F and C is easy (kind of) because
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Mon, Apr 1, 2013 at 3:10 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Mon, Apr 1, 2013 at 10:23 AM, Sebastian Berg sebast...@sipsolutions.net wrote: On Sun, 2013-03-31 at 14:04 -0700, Matthew Brett wrote: Hi, On Sun, Mar 31, 2013 at 1:43 PM, josef.p...@gmail.com wrote: On Sun, Mar 31, 2013 at 3:54 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 10:38 PM, josef.p...@gmail.com wrote: On Sun, Mar 31, 2013 at 12:50 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 9:37 PM, josef.p...@gmail.com wrote: On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 7:02 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 7:50 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle brad.froe...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 2:20 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:57 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:14 AM, josef.p...@gmail.com wrote: On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote: Ravel and reshape use the tems 'C' and 'F in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering. Proposal - * Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape * Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov) What do y'all think? 
I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it. changing the names doesn't make it easier to understand. I think the confusion is because the new A and K refer to existing memory I disagree, I think it's confusing, but I have evidence, and that is that four out of four of us tested ourselves and got it wrong. Perhaps we are particularly dumb or poorly informed, but I think it's rash to assert there is no problem here. I think you are overcomplicating things or phrased it as a trick question I don't know what you mean by trick question - was there something over-complicated in the example? I deliberately didn't include various much more confusing examples in reshape. I meant making the candidates think about memory instead of just column versus row stacking. To be specific, we were teaching about reshaping a (I, J, K, N) 4D array, it was an image, with time as the 4th dimension (N time points). Raveling and reshaping 3D and 4D arrays is a common thing to do in neuroimaging, as you can imagine. A student asked what he would get back from raveling this array, a concatenated time series, or something spatial? We showed (I'd worked it out by this time) that the first N values were the time series given by [0, 0, 0, :]. He said - Oh - I see - so the data is stored as a whole lot of time series one by one, I thought it would be stored as a series of images'. Ironically, this was a Fortran-ordered array in memory, and he was wrong. So, I think the idea of memory ordering and index ordering is very easy to confuse, and comes up naturally. I would like, as a teacher, to be able to say something like: This is what C memory layout is (it's the memory layout that gives arr.flags.C_CONTIGUOUS=True) This is what F memory layout is (it's the memory layout that gives arr.flags.F_CONTIGUOUS=True) It's rather easy to get something that is neither C or F memory layout Numpy does many memory layouts. 
Ravel and reshape and numpy in general do not care (normally) about C or F layouts, they only care about index ordering. My point, that I'm repeating, is that my job is made harder by 'arr.ravel('F')'. But once you know that ravel and reshape don't care about memory, the ravel is easy to predict (maybe not easy to visualize in 4-D): But this assumes that you already know that there's such a thing as memory layout, and there's such a thing as index ordering, and that 'C' and 'F' in ravel refer to index ordering. Once you have that, you're golden. I'm arguing it's markedly harder to get this distinction, and keep it in mind, and teach it, if we are using the 'C' and 'F' names for both things. No, I think you are still missing
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Mon, Apr 1, 2013 at 4:51 PM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: Hi folks, I've been teaching Python lately, have taught numpy a couple times (formally), and am preparing a lecture about it over the next couple weeks -- so I'm taking an interest here. I've been a regular numpy user for a long time, though as it happens, rarely use ravel() (side note, what's always confused me the most is that it seems to me that ravel() _unravels_ the array - but that's a side note...) So I ignored the first post, then fired up IPython, read the docstring, and played with ravel a bit -- it behaved EXACTLY like I expected -- at least for 2-d. Matthew, I expect your group may have gotten tied up by the fact that you know too much! kind of like how I have a hard time getting my iphone to work, and my computer-illiterate wife has no problem at all. Thank you for the compliment, it's more enjoyable than other potential explanations of my confusion (sigh). But, I don't think that is the explanation. First, there were three of us with different levels of experience getting confused on this. Second, I think we all agree that: So: yes, I do think it's a bit confusing and unfortunate that the order parameter has two somewhat different meanings, - so there is a good reason that we could get confused. Last, as soon as we came to the distinction between index order and memory layout, it was clear. We all agreed that this was an important distinction that would improve numpy if we made it. Before I sent the email I did wonder aloud whether people would read the email, understand the distinction, and then fail to see the problem. It is hard to imagine yourself before you understood something. but they are, in fact, used fairly similarly. And while the idea of fortran or C ordering of arrays may be a foreign concept to folks that have not used fortran or C (or most critically, tried to interface the two...) it's a common enough concept that it's a reasonable shorthand.
As for 'should we teach memory order at all to newbies?' I usually do teach memory order early on, partly that's because I really like to emphasize that numpy arrays are both a really nice Python data structure and set of functions, but also a wrapper around a block of data -- for the latter, you need to talk about order. Also, even with pure-python, knowing a bit about whether arrays are contiguous or not is important (and views, and...). You can do a lot with numpy without thinking about memory order at all, but to really make it dance, you need to know about it. In short -- I don't think the situation is too bad, and not bad enough to change any names or flags, but if someone wants to add a bit to the ravel docstring to clarify it, I'm all for it. I think you agree that there is potential for confusion, and there doesn't seem any reason to continue with that confusion if we can come up with a clearer name. So here is a compromise proposal. How about: * Preferring the names 'c-style' and 'f-style' for the indexing order case (ravel, reshape, flatiter) * Leaving 'C' and 'F' as functional shortcuts, so there is no possible backwards-compatibility problem. Would you object to that? Cheers, Matthew
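To make the compromise proposal above concrete, here is a minimal sketch of what accepting both spellings could look like. This is not numpy API: the helper name `ravel_index_order`, the parameter name `index_order`, and the alias table are all hypothetical, invented only for illustration. The sketch simply maps the proposed long names onto the existing 'C' and 'F' values before calling `np.ravel`.

```python
import numpy as np

# Hypothetical alias table: the proposed long names map onto the
# existing single-letter values, which remain valid on their own.
_INDEX_ORDER_ALIASES = {
    'c-style': 'C',
    'f-style': 'F',
    'C': 'C',
    'F': 'F',
}


def ravel_index_order(arr, index_order='c-style'):
    """Ravel `arr` using an index ordering given by a long or short name.

    Illustrative wrapper only; numpy itself does not accept 'c-style'
    or 'f-style' as values for `order`.
    """
    try:
        order = _INDEX_ORDER_ALIASES[index_order]
    except KeyError:
        raise ValueError("unknown index order: %r" % (index_order,))
    return np.ravel(arr, order=order)


arr = np.arange(6).reshape((2, 3))
c_style = ravel_index_order(arr, 'c-style')   # last axis varies fastest
f_style = ravel_index_order(arr, 'f-style')   # first axis varies fastest
legacy = ravel_index_order(arr, 'F')          # old spelling still works
```

Under these assumptions, existing code that passes 'C' or 'F' keeps working unchanged, which is the backwards-compatibility property the proposal asks for.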
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Sat, Mar 30, 2013 at 10:38 PM, josef.p...@gmail.com wrote: On Sun, Mar 31, 2013 at 12:50 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 9:37 PM, josef.p...@gmail.com wrote: On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 7:02 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 7:50 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle brad.froe...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 2:20 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:57 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:14 AM, josef.p...@gmail.com wrote: On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote: Ravel and reshape use the tems 'C' and 'F in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering. Proposal - * Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape * Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov) What do y'all think? I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it. changing the names doesn't make it easier to understand. I think the confusion is because the new A and K refer to existing memory I disagree, I think it's confusing, but I have evidence, and that is that four out of four of us tested ourselves and got it wrong. 
Perhaps we are particularly dumb or poorly informed, but I think it's rash to assert there is no problem here. I think you are overcomplicating things or phrased it as a trick question I don't know what you mean by trick question - was there something over-complicated in the example? I deliberately didn't include various much more confusing examples in reshape. I meant making the candidates think about memory instead of just column versus row stacking. To be specific, we were teaching about reshaping a (I, J, K, N) 4D array, it was an image, with time as the 4th dimension (N time points). Raveling and reshaping 3D and 4D arrays is a common thing to do in neuroimaging, as you can imagine. A student asked what he would get back from raveling this array, a concatenated time series, or something spatial? We showed (I'd worked it out by this time) that the first N values were the time series given by [0, 0, 0, :]. He said - Oh - I see - so the data is stored as a whole lot of time series one by one, I thought it would be stored as a series of images'. Ironically, this was a Fortran-ordered array in memory, and he was wrong. So, I think the idea of memory ordering and index ordering is very easy to confuse, and comes up naturally. I would like, as a teacher, to be able to say something like: This is what C memory layout is (it's the memory layout that gives arr.flags.C_CONTIGUOUS=True) This is what F memory layout is (it's the memory layout that gives arr.flags.F_CONTIGUOUS=True) It's rather easy to get something that is neither C or F memory layout Numpy does many memory layouts. Ravel and reshape and numpy in general do not care (normally) about C or F layouts, they only care about index ordering. My point, that I'm repeating, is that my job is made harder by 'arr.ravel('F')'. 
But once you know that ravel and reshape don't care about memory, the ravel is easy to predict (maybe not easy to visualize in 4-D): But this assumes that you already know that there's such a thing as memory layout, and there's such a thing as index ordering, and that 'C' and 'F' in ravel refer to index ordering. Once you have that, you're golden. I'm arguing it's markedly harder to get this distinction, and keep it in mind, and teach it, if we are using the 'C' and 'F' names for both things. No, I think you are still missing my point. I think explaining ravel and reshape F and C is easy (kind of) because the students don't need to know at that stage about memory layouts. All they need to know is that we look at n-dimensional objects in C-order or in F-order (whichever index runs fastest) Would you accept that it may or may not be true that it is desirable or practical not to mention memory layouts when teaching numpy? You believe it is desirable, I believe that it is not - that teaching numpy naturally involves some discussion of memory layout.
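The 4D teaching example discussed in this message can be reproduced with a small sketch. The array sizes below are made up for illustration (the thread only says the shape is (I, J, K, N) with time last), and `np.asfortranarray` stands in for however the Fortran-ordered image was actually loaded. The point it shows is that the default ravel returns the time series at [0, 0, 0, :] first whether the memory is C or F ordered, while the memory of the F-ordered array actually starts with values along the first axis.

```python
import numpy as np

# Hypothetical small sizes; the thread does not give the real ones.
I, J, K, N = 2, 3, 4, 5

img_C = np.arange(I * J * K * N).reshape((I, J, K, N))  # C memory layout
img_F = np.asfortranarray(img_C)                         # same values, F memory layout

# Default ravel uses C *index* ordering: the last axis (time) varies fastest,
# so the first N values are the time series at voxel [0, 0, 0].
first_C = img_C.ravel()[:N]
first_F = img_F.ravel()[:N]

# order='K' reads elements in the order they occur in memory. For the
# F-ordered array, the first I values run along axis 0, not along time.
mem_F = img_F.ravel(order='K')[:I]
```

So the student's inference about storage ("a whole lot of time series one by one") matches what ravel returns, but not how the F-ordered array is laid out in memory.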
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sun, Mar 31, 2013 at 3:54 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 10:38 PM, josef.p...@gmail.com wrote: On Sun, Mar 31, 2013 at 12:50 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 9:37 PM, josef.p...@gmail.com wrote: On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 7:02 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 7:50 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle brad.froe...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 2:20 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:57 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:14 AM, josef.p...@gmail.com wrote: On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote: Ravel and reshape use the tems 'C' and 'F in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering. Proposal - * Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape * Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov) What do y'all think? I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it. changing the names doesn't make it easier to understand. 
I think the confusion is because the new A and K refer to existing memory I disagree, I think it's confusing, but I have evidence, and that is that four out of four of us tested ourselves and got it wrong. Perhaps we are particularly dumb or poorly informed, but I think it's rash to assert there is no problem here. I think you are overcomplicating things or phrased it as a trick question I don't know what you mean by trick question - was there something over-complicated in the example? I deliberately didn't include various much more confusing examples in reshape. I meant making the candidates think about memory instead of just column versus row stacking. To be specific, we were teaching about reshaping a (I, J, K, N) 4D array, it was an image, with time as the 4th dimension (N time points). Raveling and reshaping 3D and 4D arrays is a common thing to do in neuroimaging, as you can imagine. A student asked what he would get back from raveling this array, a concatenated time series, or something spatial? We showed (I'd worked it out by this time) that the first N values were the time series given by [0, 0, 0, :]. He said - Oh - I see - so the data is stored as a whole lot of time series one by one, I thought it would be stored as a series of images'. Ironically, this was a Fortran-ordered array in memory, and he was wrong. So, I think the idea of memory ordering and index ordering is very easy to confuse, and comes up naturally. I would like, as a teacher, to be able to say something like: This is what C memory layout is (it's the memory layout that gives arr.flags.C_CONTIGUOUS=True) This is what F memory layout is (it's the memory layout that gives arr.flags.F_CONTIGUOUS=True) It's rather easy to get something that is neither C or F memory layout Numpy does many memory layouts. Ravel and reshape and numpy in general do not care (normally) about C or F layouts, they only care about index ordering. 
My point, that I'm repeating, is that my job is made harder by 'arr.ravel('F')'. But once you know that ravel and reshape don't care about memory, the ravel is easy to predict (maybe not easy to visualize in 4-D): But this assumes that you already know that there's such a thing as memory layout, and there's such a thing as index ordering, and that 'C' and 'F' in ravel refer to index ordering. Once you have that, you're golden. I'm arguing it's markedly harder to get this distinction, and keep it in mind, and teach it, if we are using the 'C' and 'F' names for both things. No, I think you are still missing my point. I think explaining ravel and reshape F and C is easy (kind of) because the students don't need to know at that stage about memory layouts. All they need to know is that we look at n-dimensional objects in C-order or in F-order (whichever index runs fastest) Would you accept that it may or may not be true that it is desirable or practical not to mention memory layouts when teaching numpy? I think they should be in two different
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sun, Mar 31, 2013 at 10:43 PM, josef.p...@gmail.com wrote: On Sun, Mar 31, 2013 at 3:54 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 10:38 PM, josef.p...@gmail.com wrote: On Sun, Mar 31, 2013 at 12:50 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 9:37 PM, josef.p...@gmail.com wrote: On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 7:02 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 7:50 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle brad.froe...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 2:20 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:57 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:14 AM, josef.p...@gmail.com wrote: On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote: Ravel and reshape use the tems 'C' and 'F in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering. Proposal - * Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape * Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov) What do y'all think? I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it. changing the names doesn't make it easier to understand. 
I think the confusion is because the new A and K refer to existing memory I disagree, I think it's confusing, but I have evidence, and that is that four out of four of us tested ourselves and got it wrong. Perhaps we are particularly dumb or poorly informed, but I think it's rash to assert there is no problem here. I think you are overcomplicating things or phrased it as a trick question I don't know what you mean by trick question - was there something over-complicated in the example? I deliberately didn't include various much more confusing examples in reshape. I meant making the candidates think about memory instead of just column versus row stacking. To be specific, we were teaching about reshaping a (I, J, K, N) 4D array, it was an image, with time as the 4th dimension (N time points). Raveling and reshaping 3D and 4D arrays is a common thing to do in neuroimaging, as you can imagine. A student asked what he would get back from raveling this array, a concatenated time series, or something spatial? We showed (I'd worked it out by this time) that the first N values were the time series given by [0, 0, 0, :]. He said - Oh - I see - so the data is stored as a whole lot of time series one by one, I thought it would be stored as a series of images'. Ironically, this was a Fortran-ordered array in memory, and he was wrong. So, I think the idea of memory ordering and index ordering is very easy to confuse, and comes up naturally. I would like, as a teacher, to be able to say something like: This is what C memory layout is (it's the memory layout that gives arr.flags.C_CONTIGUOUS=True) This is what F memory layout is (it's the memory layout that gives arr.flags.F_CONTIGUOUS=True) It's rather easy to get something that is neither C or F memory layout Numpy does many memory layouts. Ravel and reshape and numpy in general do not care (normally) about C or F layouts, they only care about index ordering. 
My point, that I'm repeating, is that my job is made harder by 'arr.ravel('F')'. But once you know that ravel and reshape don't care about memory, the ravel is easy to predict (maybe not easy to visualize in 4-D): But this assumes that you already know that there's such a thing as memory layout, and there's such a thing as index ordering, and that 'C' and 'F' in ravel refer to index ordering. Once you have that, you're golden. I'm arguing it's markedly harder to get this distinction, and keep it in mind, and teach it, if we are using the 'C' and 'F' names for both things. No, I think you are still missing my point. I think explaining ravel and reshape F and C is easy (kind of) because the students don't need to know at that stage about memory layouts. All they need to know is that we look at n-dimensional objects in C-order or in F-order (whichever index runs
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Sun, Mar 31, 2013 at 1:43 PM, josef.p...@gmail.com wrote: On Sun, Mar 31, 2013 at 3:54 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 10:38 PM, josef.p...@gmail.com wrote: On Sun, Mar 31, 2013 at 12:50 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 9:37 PM, josef.p...@gmail.com wrote: On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 7:02 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 7:50 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle brad.froe...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 2:20 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:57 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:14 AM, josef.p...@gmail.com wrote: On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote: Ravel and reshape use the tems 'C' and 'F in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering. Proposal - * Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape * Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov) What do y'all think? I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it. changing the names doesn't make it easier to understand. 
I think the confusion is because the new A and K refer to existing memory I disagree, I think it's confusing, but I have evidence, and that is that four out of four of us tested ourselves and got it wrong. Perhaps we are particularly dumb or poorly informed, but I think it's rash to assert there is no problem here. I think you are overcomplicating things or phrased it as a trick question I don't know what you mean by trick question - was there something over-complicated in the example? I deliberately didn't include various much more confusing examples in reshape. I meant making the candidates think about memory instead of just column versus row stacking. To be specific, we were teaching about reshaping a (I, J, K, N) 4D array, it was an image, with time as the 4th dimension (N time points). Raveling and reshaping 3D and 4D arrays is a common thing to do in neuroimaging, as you can imagine. A student asked what he would get back from raveling this array, a concatenated time series, or something spatial? We showed (I'd worked it out by this time) that the first N values were the time series given by [0, 0, 0, :]. He said - Oh - I see - so the data is stored as a whole lot of time series one by one, I thought it would be stored as a series of images'. Ironically, this was a Fortran-ordered array in memory, and he was wrong. So, I think the idea of memory ordering and index ordering is very easy to confuse, and comes up naturally. I would like, as a teacher, to be able to say something like: This is what C memory layout is (it's the memory layout that gives arr.flags.C_CONTIGUOUS=True) This is what F memory layout is (it's the memory layout that gives arr.flags.F_CONTIGUOUS=True) It's rather easy to get something that is neither C or F memory layout Numpy does many memory layouts. Ravel and reshape and numpy in general do not care (normally) about C or F layouts, they only care about index ordering. 
My point, that I'm repeating, is that my job is made harder by 'arr.ravel('F')'. But once you know that ravel and reshape don't care about memory, the ravel is easy to predict (maybe not easy to visualize in 4-D): But this assumes that you already know that there's such a thing as memory layout, and there's such a thing as index ordering, and that 'C' and 'F' in ravel refer to index ordering. Once you have that, you're golden. I'm arguing it's markedly harder to get this distinction, and keep it in mind, and teach it, if we are using the 'C' and 'F' names for both things. No, I think you are still missing my point. I think explaining ravel and reshape F and C is easy (kind of) because the students don't need to know at that stage about memory layouts. All they need to know is that we look at n-dimensional objects in C-order or in F-order (whichever index runs fastest) Would you accept that it may or may not be true that it is desirable or practical not to mention
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote:

Hi,

We were teaching today, and found ourselves getting very confused about ravel and shape in numpy.

Summary
-------

There are two separate ideas needed to understand ordering in ravel and reshape:

Idea 1): ravel / reshape can proceed from the last axis to the first, or the first to the last. This is ravel index ordering

Idea 2) The physical layout of the array (on disk or in memory) can be C or F contiguous or neither. This is memory ordering

The index ordering is usually (but see below) orthogonal to the memory ordering.

The 'ravel' and 'reshape' commands use C and F in the sense of index ordering, and this mixes the two ideas and is confusing.

What the current situation looks like
-------------------------------------

Specifically, we've been rolling this around among 4 experienced numpy users and we all predicted at least one of the results below wrongly.

This was what we knew, or should have known:

In [2]: import numpy as np
In [3]: arr = np.arange(10).reshape((2, 5))
In [5]: arr.ravel()
Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

So, the 'ravel' operation unravels over the last axis (1) first, followed by axis 0.

So far so good (even if the opposite to MATLAB, Octave).

Then we found the 'order' flag to ravel:

In [10]: arr.flags
Out[10]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

In [11]: arr.ravel('C')
Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

But we soon got confused. How about this?

In [12]: arr_F = np.array(arr, order='F')
In [13]: arr_F.flags
Out[13]:
  C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

In [16]: arr_F
Out[16]:
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [17]: arr_F.ravel('C')
Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Right - so the flag 'C' to ravel has got nothing to do with *memory* ordering, but is to do with *index* ordering.
And in fact, we can ask for memory ordering specifically:

In [22]: arr.ravel('K')
Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [23]: arr_F.ravel('K')
Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

In [24]: arr.ravel('A')
Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [25]: arr_F.ravel('A')
Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

There are some confusions to get into with the 'order' flag to reshape as well, of the same type.

Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering.

Proposal
--------

* Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape
* Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov)

What do y'all think?

Cheers,

Matthew
Paul Ivanov
JB Poline

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion

I always thought F and C are easy to understand; I always thought about the content and never about the memory when using it.
In my numpy htmlhelp for version 1.5, I don't have a K or A option

>>> np.__version__
'1.5.1'
>>> np.arange(5).ravel('K')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: order not understood
>>> np.arange(5).ravel('A')
array([0, 1, 2, 3, 4])

the C, F in ravel have their twins in reshape

>>> arr = np.arange(10).reshape(2,5, order='C').copy()
>>> arr
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
>>> arr.ravel()
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> arr = np.arange(10).reshape(2,5, order='F').copy()
>>> arr
array([[0, 2, 4, 6, 8],
       [1, 3, 5, 7, 9]])
>>> arr.ravel('F')
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

For example, we use it when we get raveled arrays from R, and F for column order and C for row order indexing are pretty obvious names when coming from another package (Matlab, R, Gauss)

Josef
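The R use case mentioned above might look something like the sketch below. The vector flat_from_r is invented for illustration and simply stands in for a 2 x 5 matrix flattened by a column-major package; no actual R interface is involved.

```python
import numpy as np

# Hypothetical flat vector, as a column-major package (R, MATLAB) would
# hand over a 2 x 5 matrix: column 0 first, then column 1, and so on.
flat_from_r = np.array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

# order='F' fills the first index fastest, which recovers the matrix.
mat = flat_from_r.reshape((2, 5), order='F')
print(mat)

# ravel('F') undoes reshape(order='F'); ravel('C') undoes reshape(order='C').
round_trip_f = mat.ravel('F')
round_trip_c = mat.ravel('C').reshape((2, 5), order='C')
```

In both directions the order argument describes which index runs fastest, which is the "twins" relationship between ravel and reshape described in this message.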
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sat, Mar 30, 2013 at 7:14 AM, josef.p...@gmail.com wrote: On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, We were teaching today, and found ourselves getting very confused about ravel and shape in numpy. Summary -- There are two separate ideas needed to understand ordering in ravel and reshape: Idea 1): ravel / reshape can proceed from the last axis to the first, or the first to the last. This is ravel index ordering Idea 2) The physical layout of the array (on disk or in memory) can be C or F contiguous or neither. This is memory ordering The index ordering is usually (but see below) orthogonal to the memory ordering. The 'ravel' and 'reshape' commands use C and F in the sense of index ordering, and this mixes the two ideas and is confusing. What the current situation looks like Specifically, we've been rolling this around 4 experienced numpy users and we all predicted at least one of the results below wrongly. This was what we knew, or should have known: In [2]: import numpy as np In [3]: arr = np.arange(10).reshape((2, 5)) In [5]: arr.ravel() Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) So, the 'ravel' operation unravels over the last axis (1) first, followed by axis 0. So far so good (even if the opposite to MATLAB, Octave). Then we found the 'order' flag to ravel: In [10]: arr.flags Out[10]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [11]: arr.ravel('C') Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) But we soon got confused. How about this? 
In [12]: arr_F = np.array(arr, order='F') In [13]: arr_F.flags Out[13]: C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [16]: arr_F Out[16]: array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]) In [17]: arr_F.ravel('C') Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) Right - so the flag 'C' to ravel, has got nothing to do with *memory* ordering, but is to do with *index* ordering. And in fact, we can ask for memory ordering specifically: In [22]: arr.ravel('K') Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) In [23]: arr_F.ravel('K') Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9]) In [24]: arr.ravel('A') Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) In [25]: arr_F.ravel('A') Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9]) There are some confusions to get into with the 'order' flag to reshape as well, of the same type. Ravel and reshape use the tems 'C' and 'F in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering. Proposal - * Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape * Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov) What do y'all think? Cheers, Matthew Paul Ivanov JB Poline ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it. 
In my numpy htmlhelp for version 1.5, I don't have a K or A option

>>> np.__version__
'1.5.1'
>>> np.arange(5).ravel('K')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: order not understood
>>> np.arange(5).ravel('A')
array([0, 1, 2, 3, 4])

the C, F in ravel have their twins in reshape

>>> arr = np.arange(10).reshape(2,5, order='C').copy()
>>> arr
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
>>> arr.ravel()
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> arr = np.arange(10).reshape(2,5, order='F').copy()
>>> arr
array([[0, 2, 4, 6, 8],
       [1, 3, 5, 7, 9]])
>>> arr.ravel('F')
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

For example, we use it when we get raveled arrays from R, and F for column order and C for row order indexing are pretty obvious names when coming from another package (Matlab, R, Gauss)

just a quick search to get an idea, in statsmodels:

19 out of 135 ravel are ravel('F')
50 out of 270 reshapes specify: reshape.*order='F' (regular expression)

Josef
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Fri, 2013-03-29 at 19:08 -0700, Matthew Brett wrote: Hi, We were teaching today, and found ourselves getting very confused about ravel and shape in numpy. Summary -- There are two separate ideas needed to understand ordering in ravel and reshape: Idea 1): ravel / reshape can proceed from the last axis to the first, or the first to the last. This is ravel index ordering Idea 2) The physical layout of the array (on disk or in memory) can be C or F contiguous or neither. This is memory ordering The index ordering is usually (but see below) orthogonal to the memory ordering. The 'ravel' and 'reshape' commands use C and F in the sense of index ordering, and this mixes the two ideas and is confusing. What the current situation looks like Specifically, we've been rolling this around 4 experienced numpy users and we all predicted at least one of the results below wrongly. This was what we knew, or should have known: In [2]: import numpy as np In [3]: arr = np.arange(10).reshape((2, 5)) In [5]: arr.ravel() Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) So, the 'ravel' operation unravels over the last axis (1) first, followed by axis 0. So far so good (even if the opposite to MATLAB, Octave). Then we found the 'order' flag to ravel: In [10]: arr.flags Out[10]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [11]: arr.ravel('C') Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) But we soon got confused. How about this? In [12]: arr_F = np.array(arr, order='F') In [13]: arr_F.flags Out[13]: C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [16]: arr_F Out[16]: array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]) In [17]: arr_F.ravel('C') Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) Right - so the flag 'C' to ravel, has got nothing to do with *memory* ordering, but is to do with *index* ordering. 
And in fact, we can ask for memory ordering specifically:

In [22]: arr.ravel('K')
Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [23]: arr_F.ravel('K')
Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

In [24]: arr.ravel('A')
Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [25]: arr_F.ravel('A')
Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

There are some confusions to get into with the 'order' flag to reshape as well, of the same type.

Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering.

Proposal
--------

* Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape
* Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov)

What do y'all think?

Personally I think it is clear enough and that Z and N would confuse me just as much (though I am used to the other names). Also Z and N would seem more like aliases, which would also make sense in the memory order context. If anything, I would prefer renaming the arguments iteration_order and memory_order, but that seems like overdoing it...

Maybe the documentation could just be checked to see whether it is always clear, though. I.e. maybe it does not use iteration or memory order consistently (though I somewhat feel it is usually clear that it must be iteration order, since no numpy function cares about the input memory order as they will just do a copy if necessary).

Regards,

Sebastian

Cheers,

Matthew
Paul Ivanov
JB Poline
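The remark that functions "will just do a copy if necessary" can be illustrated with a small sketch. It assumes a NumPy version that provides np.shares_memory, and the variable names are chosen here for illustration. Whether ravel returns a view or a copy is an implementation detail, so the view checks below describe typical behaviour rather than a guarantee.

```python
import numpy as np

arr = np.arange(10).reshape((2, 5))   # C-contiguous
arr_f = np.asfortranarray(arr)        # same values, F-contiguous

# Raveling in the index order that matches the memory layout can
# return a view onto the same data...
c_view = np.shares_memory(arr, arr.ravel('C'))
f_view = np.shares_memory(arr_f, arr_f.ravel('F'))

# ...while the mismatched combination has to copy, yet the values are
# exactly what the index order alone predicts.
c_of_f = arr_f.ravel('C')
copied = not np.shares_memory(arr_f, c_of_f)
same_values = np.array_equal(c_of_f, arr.ravel('C'))

print(c_view, f_view, copied, same_values)
```

The input memory layout affects only whether data is copied, not which values come back, which is consistent with reading 'C' and 'F' in ravel as iteration order.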
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Sat, Mar 30, 2013 at 11:55 AM, Sebastian Berg sebast...@sipsolutions.net wrote: On Fri, 2013-03-29 at 19:08 -0700, Matthew Brett wrote: Hi, We were teaching today, and found ourselves getting very confused about ravel and shape in numpy. Summary -- There are two separate ideas needed to understand ordering in ravel and reshape: Idea 1): ravel / reshape can proceed from the last axis to the first, or the first to the last. This is ravel index ordering Idea 2) The physical layout of the array (on disk or in memory) can be C or F contiguous or neither. This is memory ordering The index ordering is usually (but see below) orthogonal to the memory ordering. The 'ravel' and 'reshape' commands use C and F in the sense of index ordering, and this mixes the two ideas and is confusing. What the current situation looks like Specifically, we've been rolling this around 4 experienced numpy users and we all predicted at least one of the results below wrongly. This was what we knew, or should have known: In [2]: import numpy as np In [3]: arr = np.arange(10).reshape((2, 5)) In [5]: arr.ravel() Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) So, the 'ravel' operation unravels over the last axis (1) first, followed by axis 0. So far so good (even if the opposite to MATLAB, Octave). Then we found the 'order' flag to ravel: In [10]: arr.flags Out[10]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [11]: arr.ravel('C') Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) But we soon got confused. How about this? 
In [12]: arr_F = np.array(arr, order='F') In [13]: arr_F.flags Out[13]: C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [16]: arr_F Out[16]: array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]) In [17]: arr_F.ravel('C') Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) Right - so the flag 'C' to ravel, has got nothing to do with *memory* ordering, but is to do with *index* ordering. And in fact, we can ask for memory ordering specifically: In [22]: arr.ravel('K') Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) In [23]: arr_F.ravel('K') Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9]) In [24]: arr.ravel('A') Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) In [25]: arr_F.ravel('A') Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9]) There are some confusions to get into with the 'order' flag to reshape as well, of the same type. Ravel and reshape use the tems 'C' and 'F in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering. Proposal - * Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape * Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov) What do y'all think? Personally I think it is clear enough and that Z and N would confuse me just as much (though I am used to the other names). Also Z and N would seem more like aliases, which would also make sense in the memory order context. If anything, I would prefer renaming the arguments iteration_order and memory_order, but it seems overdoing it... I am not sure what you mean - at the moment there is one argument called 'order' that can refer to iteration order or memory order. Are you proposing two arguments? Maybe the documentation could just be checked if it is always clear though. 
I.e. maybe it does not use iteration or memory order consistently (though I somewhat feel it is usually clear that it must be iteration order, since no numpy function cares about the input memory order as they will just do a copy if necessary). Do you really mean this? Numpy is full of 'order=' flags that refer to memory. Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
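A short sketch of the two meanings of 'order=' under discussion, using only standard NumPy calls; the small 2 x 3 array and the variable names are illustrative. In array creation and copying the keyword selects the memory layout and leaves the values where they are, while in reshape it selects the index order and changes where the values land.

```python
import numpy as np

arr = np.arange(6).reshape((2, 3))

# Here order= selects the memory layout of the result, not the values.
as_f = np.array(arr, order='F')
copy_f = arr.copy(order='F')
zeros_f = np.zeros((2, 3), order='F')

layout_only = (
    np.array_equal(as_f, arr)
    and np.array_equal(copy_f, arr)
    and as_f.flags.f_contiguous
    and copy_f.flags.f_contiguous
    and zeros_f.flags.f_contiguous
)

# In reshape, order= selects the index order, so the values move.
reshaped_f = arr.reshape((3, 2), order='F')
reshaped_c = arr.reshape((3, 2), order='C')

print(layout_only)
print(reshaped_f)
print(reshaped_c)
```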
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Sat, Mar 30, 2013 at 4:14 AM, josef.p...@gmail.com wrote: On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, We were teaching today, and found ourselves getting very confused about ravel and shape in numpy. Summary -- There are two separate ideas needed to understand ordering in ravel and reshape: Idea 1): ravel / reshape can proceed from the last axis to the first, or the first to the last. This is ravel index ordering Idea 2) The physical layout of the array (on disk or in memory) can be C or F contiguous or neither. This is memory ordering The index ordering is usually (but see below) orthogonal to the memory ordering. The 'ravel' and 'reshape' commands use C and F in the sense of index ordering, and this mixes the two ideas and is confusing. What the current situation looks like Specifically, we've been rolling this around 4 experienced numpy users and we all predicted at least one of the results below wrongly. This was what we knew, or should have known: In [2]: import numpy as np In [3]: arr = np.arange(10).reshape((2, 5)) In [5]: arr.ravel() Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) So, the 'ravel' operation unravels over the last axis (1) first, followed by axis 0. So far so good (even if the opposite to MATLAB, Octave). Then we found the 'order' flag to ravel: In [10]: arr.flags Out[10]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [11]: arr.ravel('C') Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) But we soon got confused. How about this? 
In [12]: arr_F = np.array(arr, order='F') In [13]: arr_F.flags Out[13]: C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [16]: arr_F Out[16]: array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]) In [17]: arr_F.ravel('C') Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) Right - so the flag 'C' to ravel, has got nothing to do with *memory* ordering, but is to do with *index* ordering. And in fact, we can ask for memory ordering specifically: In [22]: arr.ravel('K') Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) In [23]: arr_F.ravel('K') Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9]) In [24]: arr.ravel('A') Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) In [25]: arr_F.ravel('A') Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9]) There are some confusions to get into with the 'order' flag to reshape as well, of the same type. Ravel and reshape use the tems 'C' and 'F in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering. Proposal - * Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape * Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov) What do y'all think? Cheers, Matthew Paul Ivanov JB Poline ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it. I can only say that 4 out of 4 experienced numpy developers found themselves unable to predict the behavior of these functions before they saw the output. 
The problem is always that explaining something makes it clearer for a moment, but, for those who do not have the explanation or who have forgotten it, at least among us here, the outputs were generating groans and / or high fives as we incorrectly or correctly guessed what was going to happen. I think the only way to find out whether this really is confusing or not, is to put someone in front of these functions without any explanation and ask them to predict what is going to come out of the various inputs and flags. Or to try and teach it, which was the problem we were having. Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 4:14 AM, josef.p...@gmail.com wrote: On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, We were teaching today, and found ourselves getting very confused about ravel and shape in numpy. Summary -- There are two separate ideas needed to understand ordering in ravel and reshape: Idea 1): ravel / reshape can proceed from the last axis to the first, or the first to the last. This is ravel index ordering Idea 2) The physical layout of the array (on disk or in memory) can be C or F contiguous or neither. This is memory ordering The index ordering is usually (but see below) orthogonal to the memory ordering. The 'ravel' and 'reshape' commands use C and F in the sense of index ordering, and this mixes the two ideas and is confusing. What the current situation looks like Specifically, we've been rolling this around 4 experienced numpy users and we all predicted at least one of the results below wrongly. This was what we knew, or should have known: In [2]: import numpy as np In [3]: arr = np.arange(10).reshape((2, 5)) In [5]: arr.ravel() Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) So, the 'ravel' operation unravels over the last axis (1) first, followed by axis 0. So far so good (even if the opposite to MATLAB, Octave). Then we found the 'order' flag to ravel: In [10]: arr.flags Out[10]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [11]: arr.ravel('C') Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) But we soon got confused. How about this? 
In [12]: arr_F = np.array(arr, order='F') In [13]: arr_F.flags Out[13]: C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [16]: arr_F Out[16]: array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]) In [17]: arr_F.ravel('C') Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) Right - so the flag 'C' to ravel, has got nothing to do with *memory* ordering, but is to do with *index* ordering. And in fact, we can ask for memory ordering specifically: In [22]: arr.ravel('K') Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) In [23]: arr_F.ravel('K') Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9]) In [24]: arr.ravel('A') Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) In [25]: arr_F.ravel('A') Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9]) There are some confusions to get into with the 'order' flag to reshape as well, of the same type. Ravel and reshape use the tems 'C' and 'F in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering. Proposal - * Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape * Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov) What do y'all think? Cheers, Matthew Paul Ivanov JB Poline ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it. I can only say that 4 out of 4 experienced numpy developers found themselves unable to predict the behavior of these functions before they saw the output. 
The problem is always that explaining something makes it clearer for a moment, but, for those who do not have the explanation or who have forgotten it, at least among us here, the outputs were generating groans and / or high fives as we incorrectly or correctly guessed what was going to happen. I think the only way to find out whether this really is confusing or not, is to put someone in front of these functions without any explanation and ask them to predict what is going to come out of the various inputs and flags. Or to try and teach it, which was the problem we were having. changing the names doesn't make it easier to understand. I think the confusion is because the new A and K refer to existing memory ``ravel`` is just stacking columns ('F') or stacking rows ('C'), I don't remember having seen any weird cases. I always thought of order in array creation is the way we want to have the memory layout of the *target* array and has nothing to do with existing memory layout (creating view or copy as needed). reshape, and ravel are *views* if possible, memory might just be some weird strides (and can be ignored unless you want to do
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sat, Mar 30, 2013 at 4:57 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 4:14 AM, josef.p...@gmail.com wrote: On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, We were teaching today, and found ourselves getting very confused about ravel and shape in numpy. Summary -- There are two separate ideas needed to understand ordering in ravel and reshape: Idea 1): ravel / reshape can proceed from the last axis to the first, or the first to the last. This is ravel index ordering Idea 2) The physical layout of the array (on disk or in memory) can be C or F contiguous or neither. This is memory ordering The index ordering is usually (but see below) orthogonal to the memory ordering. The 'ravel' and 'reshape' commands use C and F in the sense of index ordering, and this mixes the two ideas and is confusing. What the current situation looks like Specifically, we've been rolling this around 4 experienced numpy users and we all predicted at least one of the results below wrongly. This was what we knew, or should have known: In [2]: import numpy as np In [3]: arr = np.arange(10).reshape((2, 5)) In [5]: arr.ravel() Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) So, the 'ravel' operation unravels over the last axis (1) first, followed by axis 0. So far so good (even if the opposite to MATLAB, Octave). Then we found the 'order' flag to ravel: In [10]: arr.flags Out[10]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [11]: arr.ravel('C') Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) But we soon got confused. How about this? 
In [12]: arr_F = np.array(arr, order='F') In [13]: arr_F.flags Out[13]: C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [16]: arr_F Out[16]: array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]) In [17]: arr_F.ravel('C') Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) Right - so the flag 'C' to ravel, has got nothing to do with *memory* ordering, but is to do with *index* ordering. And in fact, we can ask for memory ordering specifically: In [22]: arr.ravel('K') Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) In [23]: arr_F.ravel('K') Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9]) In [24]: arr.ravel('A') Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) In [25]: arr_F.ravel('A') Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9]) There are some confusions to get into with the 'order' flag to reshape as well, of the same type. Ravel and reshape use the tems 'C' and 'F in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering. Proposal - * Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape * Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov) What do y'all think? Cheers, Matthew Paul Ivanov JB Poline ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it. I can only say that 4 out of 4 experienced numpy developers found themselves unable to predict the behavior of these functions before they saw the output. 
The problem is always that explaining something makes it clearer for a moment, but, for those who do not have the explanation or who have forgotten it, at least among us here, the outputs were generating groans and / or high fives as we incorrectly or correctly guessed what was going to happen. I think the only way to find out whether this really is confusing or not, is to put someone in front of these functions without any explanation and ask them to predict what is going to come out of the various inputs and flags. Or to try and teach it, which was the problem we were having. changing the names doesn't make it easier to understand. I think the confusion is because the new A and K refer to existing memory ``ravel`` is just stacking columns ('F') or stacking rows ('C'), I don't remember having seen any weird cases. example from our statistics use: rows are observations/time periods, columns are variables/individuals using F or C, we can stack either by time-periods (observations) or individuals (cross-section units) that's easy to understand. A and K are pretty useless for
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Sat, Mar 30, 2013 at 1:57 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 4:14 AM, josef.p...@gmail.com wrote: On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, We were teaching today, and found ourselves getting very confused about ravel and shape in numpy. Summary -- There are two separate ideas needed to understand ordering in ravel and reshape: Idea 1): ravel / reshape can proceed from the last axis to the first, or the first to the last. This is ravel index ordering Idea 2) The physical layout of the array (on disk or in memory) can be C or F contiguous or neither. This is memory ordering The index ordering is usually (but see below) orthogonal to the memory ordering. The 'ravel' and 'reshape' commands use C and F in the sense of index ordering, and this mixes the two ideas and is confusing. What the current situation looks like Specifically, we've been rolling this around 4 experienced numpy users and we all predicted at least one of the results below wrongly. This was what we knew, or should have known: In [2]: import numpy as np In [3]: arr = np.arange(10).reshape((2, 5)) In [5]: arr.ravel() Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) So, the 'ravel' operation unravels over the last axis (1) first, followed by axis 0. So far so good (even if the opposite to MATLAB, Octave). Then we found the 'order' flag to ravel: In [10]: arr.flags Out[10]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [11]: arr.ravel('C') Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) But we soon got confused. How about this? 
In [12]: arr_F = np.array(arr, order='F') In [13]: arr_F.flags Out[13]: C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [16]: arr_F Out[16]: array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]) In [17]: arr_F.ravel('C') Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) Right - so the flag 'C' to ravel, has got nothing to do with *memory* ordering, but is to do with *index* ordering. And in fact, we can ask for memory ordering specifically: In [22]: arr.ravel('K') Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) In [23]: arr_F.ravel('K') Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9]) In [24]: arr.ravel('A') Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) In [25]: arr_F.ravel('A') Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9]) There are some confusions to get into with the 'order' flag to reshape as well, of the same type. Ravel and reshape use the tems 'C' and 'F in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering. Proposal - * Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape * Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov) What do y'all think? Cheers, Matthew Paul Ivanov JB Poline ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it. I can only say that 4 out of 4 experienced numpy developers found themselves unable to predict the behavior of these functions before they saw the output. 
The problem is always that explaining something makes it clearer for a moment, but, for those who do not have the explanation or who have forgotten it, at least among us here, the outputs were generating groans and / or high fives as we incorrectly or correctly guessed what was going to happen. I think the only way to find out whether this really is confusing or not, is to put someone in front of these functions without any explanation and ask them to predict what is going to come out of the various inputs and flags. Or to try and teach it, which was the problem we were having. changing the names doesn't make it easier to understand. I think the confusion is because the new A and K refer to existing memory ``ravel`` is just stacking columns ('F') or stacking rows ('C'), I don't remember having seen any weird cases. I always thought of order in array creation is the way we want to have the memory layout of the *target* array and has nothing to do with existing memory layout (creating view or copy as needed). In the case of ravel of course F and C in memory
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Sat, Mar 30, 2013 at 2:20 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:57 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 4:14 AM, josef.p...@gmail.com wrote: On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, We were teaching today, and found ourselves getting very confused about ravel and shape in numpy. Summary -- There are two separate ideas needed to understand ordering in ravel and reshape: Idea 1): ravel / reshape can proceed from the last axis to the first, or the first to the last. This is ravel index ordering Idea 2) The physical layout of the array (on disk or in memory) can be C or F contiguous or neither. This is memory ordering The index ordering is usually (but see below) orthogonal to the memory ordering. The 'ravel' and 'reshape' commands use C and F in the sense of index ordering, and this mixes the two ideas and is confusing. What the current situation looks like Specifically, we've been rolling this around 4 experienced numpy users and we all predicted at least one of the results below wrongly. This was what we knew, or should have known: In [2]: import numpy as np In [3]: arr = np.arange(10).reshape((2, 5)) In [5]: arr.ravel() Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) So, the 'ravel' operation unravels over the last axis (1) first, followed by axis 0. So far so good (even if the opposite to MATLAB, Octave). Then we found the 'order' flag to ravel: In [10]: arr.flags Out[10]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [11]: arr.ravel('C') Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) But we soon got confused. How about this? 
In [12]: arr_F = np.array(arr, order='F') In [13]: arr_F.flags Out[13]: C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [16]: arr_F Out[16]: array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]) In [17]: arr_F.ravel('C') Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) Right - so the flag 'C' to ravel, has got nothing to do with *memory* ordering, but is to do with *index* ordering. And in fact, we can ask for memory ordering specifically: In [22]: arr.ravel('K') Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) In [23]: arr_F.ravel('K') Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9]) In [24]: arr.ravel('A') Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) In [25]: arr_F.ravel('A') Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9]) There are some confusions to get into with the 'order' flag to reshape as well, of the same type. Ravel and reshape use the tems 'C' and 'F in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering. Proposal - * Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape * Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov) What do y'all think? Cheers, Matthew Paul Ivanov JB Poline ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it. I can only say that 4 out of 4 experienced numpy developers found themselves unable to predict the behavior of these functions before they saw the output. 
The problem is always that explaining something makes it clearer for a moment, but, for those who do not have the explanation or who have forgotten it, at least among us here, the outputs were generating groans and / or high fives as we incorrectly or correctly guessed what was going to happen. I think the only way to find out whether this really is confusing or not, is to put someone in front of these functions without any explanation and ask them to predict what is going to come out of the various inputs and flags. Or to try and teach it, which was the problem we were having. changing the names doesn't make it easier to understand. I think the confusion is because the new A and K refer to existing memory ``ravel`` is just stacking columns ('F') or stacking rows ('C'), I don't remember having seen any weird cases. example from our statistics use: rows are observations/time periods, columns are variables/individuals using F or C, we can stack either by time-periods (observations) or individuals
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sat, 2013-03-30 at 12:45 -0700, Matthew Brett wrote: Hi, On Sat, Mar 30, 2013 at 11:55 AM, Sebastian Berg sebast...@sipsolutions.net wrote: On Fri, 2013-03-29 at 19:08 -0700, Matthew Brett wrote: Hi, We were teaching today, and found ourselves getting very confused about ravel and shape in numpy. snip What do y'all think? Personally I think it is clear enough and that Z and N would confuse me just as much (though I am used to the other names). Also Z and N would seem more like aliases, which would also make sense in the memory order context. If anything, I would prefer renaming the arguments iteration_order and memory_order, but it seems overdoing it... I am not sure what you mean - at the moment there is one argument called 'order' that can refer to iteration order or memory order. Are you proposing two arguments? Yes that is what I meant. The reason that it is not convincing to me is that if I write `np.reshape(arr, ..., order='Z')`, I may be tempted to also write `np.copy(arr, order='Z')`. I don't see anything against allowing 'Z' as a more memorable 'C' (I also used to forget which was which), but I don't really see enforcing a different _value_ on the same named argument making it clearer. Renaming the argument itself would seem more sensible to me right now, but I cannot think of a decent name, so I would prefer trying to clarify the documentation if necessary. Maybe the documentation could just be checked if it is always clear though. I.e. maybe it does not use iteration or memory order consistently (though I somewhat feel it is usually clear that it must be iteration order, since no numpy function cares about the input memory order as they will just do a copy if necessary). Do you really mean this? Numpy is full of 'order=' flags that refer to memory. I somewhat imagined there were more iteration order flags and I basically count empty/ones/.../copy as basically one array creation monster... 
Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
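The distinction drawn in the message above, between an 'order' that selects a memory layout and an 'order' that selects an iteration (index) order, can be sketched as follows. This is only an illustrative sketch: the array and the variable names (copied, reshaped_c, reshaped_f) are made up for the example and do not come from the thread.

```python
import numpy as np

arr = np.arange(6).reshape((2, 3))

# order= on copy selects the memory layout of the new array;
# the values seen through indexing do not change.
copied = np.copy(arr, order='F')
print(copied.flags['F_CONTIGUOUS'])
print(np.array_equal(copied, arr))

# order= on reshape selects the index order used to read the
# elements out and to write them into the new shape.
reshaped_c = np.reshape(arr, (3, 2), order='C')
reshaped_f = np.reshape(arr, (3, 2), order='F')
print(reshaped_c)
print(reshaped_f)
```

Here the copy differs from the original only in its flags, while the two reshaped arrays differ in their visible values.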
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 2:20 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:57 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:14 AM, josef.p...@gmail.com wrote: On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote: Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering. Proposal - * Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape * Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov) What do y'all think? I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it. Changing the names doesn't make it easier to understand. I think the confusion is because the new A and K refer to existing memory I disagree, I think it's confusing, but I have evidence, and that is that four out of four of us tested ourselves and got it wrong. Perhaps we are particularly dumb or poorly informed, but I think it's rash to assert there is no problem here. I got all four correct. I think the concept --- at least for ravel --- is pretty simple: would you like to read the data off in C ordering or Fortran ordering. Since the output array is one-dimensional, its ordering is irrelevant. I don't understand the 'Z' / 'N' suggestion at all. Are they part of some mnemonic? I'd STRONGLY advise against deprecating the 'F' and 'C' options.
NumPy already suffers from too much bikeshedding with names --- I rarely am able to pull out a script I wrote using NumPy even a few years ago and have it immediately work. Cheers, Brad ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
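The remark that the ordering of a one-dimensional output is irrelevant can be checked with a short sketch. The names arr_f and flat are illustrative only; the point is that a contiguous 1-D array reports both contiguity flags as true, so there is no separate C or F layout to choose for the result.

```python
import numpy as np

# A Fortran-ordered 2-D input, similar to arr_F in the first message
arr_f = np.asfortranarray(np.arange(10).reshape((2, 5)))

flat = arr_f.ravel('C')

# A contiguous 1-D array is C-contiguous and F-contiguous at once
print(flat.flags['C_CONTIGUOUS'], flat.flags['F_CONTIGUOUS'])
print(flat)
```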
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Sat, Mar 30, 2013 at 4:31 PM, Bradley M. Froehle brad.froe...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 2:20 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:57 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:14 AM, josef.p...@gmail.com wrote: On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote: Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering. Proposal - * Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape * Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov) What do y'all think? I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it. Changing the names doesn't make it easier to understand. I think the confusion is because the new A and K refer to existing memory I disagree, I think it's confusing, but I have evidence, and that is that four out of four of us tested ourselves and got it wrong. Perhaps we are particularly dumb or poorly informed, but I think it's rash to assert there is no problem here. I got all four correct. Then you are smarter and/or better informed than we were. I hope you didn't read my explanation before you tested yourself. Of course if you did read my email first I'd expect you and I to get the answer right first time.
If you didn't read my email first, and didn't think too hard about it, and still got all the examples right, and you'd get other more confusing examples right that use reshape, then I'd add you as a data point on the other side to the four data points we got yesterday. I think the concept --- at least for ravel --- is pretty simple: would you like to read the data off in C ordering or Fortran ordering. Since the output array is one-dimensional, its ordering is irrelevant. Right - hence my confidence that Josef's sense of thinking of the 'C' and 'F' being target array output was not a good way to think of it in this case. It is in the case of arr.tostring() though. I don't understand the 'Z' / 'N' suggestion at all. Are they part of some mnemonic? Think of the way you'd read off the elements using reverse (last-first) index order for a 2D array, you might imagine something like a Z. I'd STRONGLY advise against deprecating the 'F' and 'C' options. NumPy already suffers from too much bikeshedding with names --- I rarely am able to pull out a script I wrote using NumPy even a few years ago and have it immediately work. I wish we could drop bike-shedding - it's a completely useless word because one person's bike-shedding is another person's necessary clarification. You think this clarification isn't necessary and you think this discussion is bike-shedding. I'm not suggesting dropping the 'F' and 'C', obviously - can I call that a 'straw man'? I am suggesting changing the name to something much clearer, leaving that name clearly explained in the docs, and leaving 'C' and 'F' as functional synonyms for a very long time. Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
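To make the 'Z' and 'N' picture concrete, here is a small sketch that lists the (row, column) positions in the order each flag visits them for a 2-D array. The names z_path and n_path are hypothetical labels for this illustration, not anything proposed for numpy itself.

```python
import numpy as np

arr = np.arange(6).reshape((2, 3))

# Row and column index of every element, each with the same shape as arr
rows, cols = np.indices(arr.shape)

# Positions in the order they are visited: 'C' sweeps along each row in
# turn (a Z-like path), 'F' sweeps down each column in turn (an N-like path).
z_path = list(zip(rows.ravel('C').tolist(), cols.ravel('C').tolist()))
n_path = list(zip(rows.ravel('F').tolist(), cols.ravel('F').tolist()))

print(z_path)
print(n_path)
```

Drawing a line through the positions in z_path on a grid traces left to right along each row, which is where the 'Z' comes from; n_path runs top to bottom along each column.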
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle brad.froe...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 2:20 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:57 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:14 AM, josef.p...@gmail.com wrote: On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote: Ravel and reshape use the tems 'C' and 'F in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering. Proposal - * Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape * Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov) What do y'all think? I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it. changing the names doesn't make it easier to understand. I think the confusion is because the new A and K refer to existing memory I disagree, I think it's confusing, but I have evidence, and that is that four out of four of us tested ourselves and got it wrong. Perhaps we are particularly dumb or poorly informed, but I think it's rash to assert there is no problem here. I think you are overcomplicating things or phrased it as a trick question ravel F and C have *nothing* to do with memory layout. I think it's not confusing for beginners that have no idea and never think about memory layout. I've never seen any problems with it in statsmodels and I have seen many developers (GSOC) that are pretty new to python and numpy. 
(I didn't check the repo history to verify, so IIRC) Even if N, Z were clearer in this case (which I don't think it is and which I have no idea what it should stand for), you would have to go for every use of ``order`` in numpy to check whether it should be N or F or Z or C, and then users would have to check which order name convention is used in a specific function. Josef I got all four correct. I think the concept --- at least for ravel --- is pretty simple: would you like to read the data off in C ordering or Fortran ordering. Since the output array is one-dimensional, its ordering is irrelevant. I don't understand the 'Z' / 'N' suggestion at all. Are they part of some mnemonic? I'd STRONGLY advise against deprecating the 'F' and 'C' options. NumPy already suffers from too much bikeshedding with names --- I rarely am able to pull out a script I wrote using NumPy even a few years ago and have it immediately work. Cheers, Brad ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Sat, Mar 30, 2013 at 7:50 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle brad.froe...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 2:20 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:57 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:14 AM, josef.p...@gmail.com wrote: On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote: Ravel and reshape use the tems 'C' and 'F in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering. Proposal - * Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape * Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov) What do y'all think? I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it. changing the names doesn't make it easier to understand. I think the confusion is because the new A and K refer to existing memory I disagree, I think it's confusing, but I have evidence, and that is that four out of four of us tested ourselves and got it wrong. Perhaps we are particularly dumb or poorly informed, but I think it's rash to assert there is no problem here. I think you are overcomplicating things or phrased it as a trick question I don't know what you mean by trick question - was there something over-complicated in the example? I deliberately didn't include various much more confusing examples in reshape. ravel F and C have *nothing* to do with memory layout. 
We do agree on this of course - but you said in an earlier mail that you thought of 'C' and 'F' as referring to target memory layout (which they don't in this case) so I think we also agree that C and F do often refer to memory layout elsewhere in numpy. I think it's not confusing for beginners that have no idea and never think about memory layout. I've never seen any problems with it in statsmodels and I have seen many developers (GSOC) that are pretty new to python and numpy. (I didn't check the repo history to verify, so IIRC) Usually you don't need to know what reshape or ravel did because you are likely to reshape again and that will use the same algorithm. For example, I didn't know that ravel worked in reverse index order, started explaining it wrong, and had to check. I use ravel and reshape a lot, and have not run into this problem because either a) I didn't test my code properly or b) I did reshape after ravel / reshape and it reversed what I did first time. So, I don't think 'we haven't noticed any problems' is a good argument in the face of 'several experienced developers got it wrong when trying to guess what it did'. Even if N, Z were clearer in this case (which I don't think it is and which I have no idea what it should stand for), you would have to go for every use of ``order`` in numpy to check whether it should be N or F or Z or C, and then users would have to check which order name convention is used in a specific function. Right - and this would be silly if and only if it made sense to conflate memory layout and index ordering. Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
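The point that a later reshape tends to undo an earlier ravel, and so hides which index order was used, can be sketched like this. The variable names (same_c, same_f, mixed) are made up for the illustration.

```python
import numpy as np

arr = np.arange(6).reshape((2, 3))

# Ravel and reshape with the same index order: the round trip is the
# identity, so a wrong guess about the order goes unnoticed.
same_c = arr.ravel('C').reshape(arr.shape, order='C')
same_f = arr.ravel('F').reshape(arr.shape, order='F')

# Mixing the two orders scrambles the elements.
mixed = arr.ravel('F').reshape(arr.shape, order='C')

print(np.array_equal(same_c, arr), np.array_equal(same_f, arr))
print(mixed)
```

Only the mixed case, or a direct look at the raveled values, reveals the order that was actually used.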
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 7:50 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle brad.froe...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 2:20 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:57 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:14 AM, josef.p...@gmail.com wrote: On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote: Ravel and reshape use the tems 'C' and 'F in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering. Proposal - * Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape * Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov) What do y'all think? I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it. changing the names doesn't make it easier to understand. I think the confusion is because the new A and K refer to existing memory I disagree, I think it's confusing, but I have evidence, and that is that four out of four of us tested ourselves and got it wrong. Perhaps we are particularly dumb or poorly informed, but I think it's rash to assert there is no problem here. I think you are overcomplicating things or phrased it as a trick question I don't know what you mean by trick question - was there something over-complicated in the example? I deliberately didn't include various much more confusing examples in reshape. 
I meant making the candidates think about memory instead of just column versus row stacking. I don't think I ever get confused about reshape F in 2d. But when I work with 3d or larger ndim nd-arrays, I always have to try an example to check my intuition (in general not just reshape). ravel F and C have *nothing* to do with memory layout. We do agree on this of course - but you said in an earlier mail that you thought of 'C and 'F' as referring to target memory layout (which they don't in this case) so I think we also agree that C and F do often refer to memory layout elsewhere in numpy. I guess that wasn't so helpful. (emphasis on *target*, There are very few places where an order keyword refers to *existing* memory layout. So I'm not tempted to think about existing memory layout when I see ``order``. Also my examples might have confused the issue: ravel and reshape, with C and F are easy to understand without ever looking at memory issues. memory only comes into play when we want to know whether we get a view or copy. The examples were only for the cases when I do care about this. ) I think it's not confusing for beginners that have no idea and never think about memory layout. I've never seen any problems with it in statsmodels and I have seen many developers (GSOC) that are pretty new to python and numpy. (I didn't check the repo history to verify, so IIRC) Usually you don't need to know what reshape or ravel did because you are likely to reshape again and that will use the same algorithm. For example, I didn't know that that ravel worked in reverse index order, started explaining it wrong, and had to check. I use ravel and reshape a lot, and have not run into this problem because either a) I didn't test my code properly or b) I did reshape after ravel / reshape and it reversed what I did first time. 
So, I don't think it's we haven't noticed any problems is a good argument in the face of several experienced developers got it wrong when trying to guess what it did. What's reverse index order? In the case of statsmodels, we do care about the stacking order. When we use reshape(..., order='F') or ravel('F'), it's only because we want to have a specific array (not memory) layout (and/or because the raveled array came from R) (aside: 2 cases - for 2d parameter vectors, we ravel and reshape often, and we changed our convention to Fortran order, (parameter in rows, equations in columns, IIRC) The interpretation of the results depends on which way we ravel or reshape. - for panel data (time versus individuals), we need to build matching kronecker product arrays which are block-diagonal if the stacking/``order`` is the right way. None of the cases cares about memory layout, it's just: Do we stack by columns or by rows, i.e. fortran- or c-order? Do we want this in rows or in columns? ) Even if N, Z were clearer in this case (which I don't think it is and which
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi, On Sat, Mar 30, 2013 at 7:02 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Mar 30, 2013 at 7:50 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle brad.froe...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 2:20 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:57 PM, josef.p...@gmail.com wrote: On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com wrote: On Sat, Mar 30, 2013 at 4:14 AM, josef.p...@gmail.com wrote: On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote: Ravel and reshape use the tems 'C' and 'F in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering. Proposal - * Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape * Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov) What do y'all think? I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it. changing the names doesn't make it easier to understand. I think the confusion is because the new A and K refer to existing memory I disagree, I think it's confusing, but I have evidence, and that is that four out of four of us tested ourselves and got it wrong. Perhaps we are particularly dumb or poorly informed, but I think it's rash to assert there is no problem here. I think you are overcomplicating things or phrased it as a trick question I don't know what you mean by trick question - was there something over-complicated in the example? I deliberately didn't include various much more confusing examples in reshape. 
I meant making the candidates think about memory instead of just column versus row stacking. I don't think I ever get confused about reshape F in 2d. But when I work with 3d or larger ndim nd-arrays, I always have to try an example to check my intuition (in general not just reshape). ravel F and C have *nothing* to do with memory layout. We do agree on this of course - but you said in an earlier mail that you thought of 'C' and 'F' as referring to target memory layout (which they don't in this case) so I think we also agree that C and F do often refer to memory layout elsewhere in numpy. I guess that wasn't so helpful. (emphasis on *target*, There are very few places where an order keyword refers to *existing* memory layout. It is helpful because it shows how easy it is to get confused between memory order and index order. What's reverse index order? I am not being clear, sorry about that:

    import numpy as np

    def ravel_iter_last_fastest(arr):
        res = []
        for i in range(arr.shape[0]):
            for j in range(arr.shape[1]):
                for k in range(arr.shape[2]):
                    # Iterating over last dimension fastest
                    res.append(arr[i, j, k])
        return np.array(res)

    def ravel_iter_first_fastest(arr):
        res = []
        for k in range(arr.shape[2]):
            for j in range(arr.shape[1]):
                for i in range(arr.shape[0]):
                    # Iterating over first dimension fastest
                    res.append(arr[i, j, k])
        return np.array(res)

    a = np.arange(24).reshape((2, 3, 4))
    print(np.all(a.ravel('C') == ravel_iter_last_fastest(a)))
    print(np.all(a.ravel('F') == ravel_iter_first_fastest(a)))

By 'reverse index ordering' I mean 'ravel_iter_last_fastest' above. I guess one could argue that this was not 'reverse' but 'forward' index ordering, but I am not arguing about which is better, or those names, only that it's the order of indices that differs, not the memory layout, and that these ideas need to be kept separate. Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
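The same idea can be written for any number of dimensions, and checked against both memory layouts. The sketch below is only an illustration: ravel_by_index is a hypothetical helper, not a numpy function, and it assumes an array with at least one dimension.

```python
import numpy as np

def ravel_by_index(arr, last_fastest=True):
    """Ravel by pure index iteration, never looking at memory layout."""
    if last_fastest:
        # np.ndindex varies the last index fastest
        return np.array([arr[idx] for idx in np.ndindex(*arr.shape)])
    # Iterate over the reversed shape, then flip each index tuple back,
    # so that the first index of arr varies fastest.
    return np.array([arr[idx[::-1]] for idx in np.ndindex(*arr.shape[::-1])])

a_c = np.arange(24).reshape((2, 3, 4))   # C-contiguous in memory
a_f = np.asfortranarray(a_c)             # same values, F-contiguous in memory

# The results depend only on the index order, not on the layout
for a in (a_c, a_f):
    print(np.array_equal(a.ravel('C'), ravel_by_index(a, last_fastest=True)))
    print(np.array_equal(a.ravel('F'), ravel_by_index(a, last_fastest=False)))
```

Since the helper only ever indexes the array, it cannot see the memory layout at all, which is the sense in which 'C' and 'F' here describe index order.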
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi,

On Sat, Mar 30, 2013 at 7:02 PM, josef.p...@gmail.com wrote:
On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett matthew.br...@gmail.com wrote:
Hi,
On Sat, Mar 30, 2013 at 7:50 PM, josef.p...@gmail.com wrote:
On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle brad.froe...@gmail.com wrote:
On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com wrote:
On Sat, Mar 30, 2013 at 2:20 PM, josef.p...@gmail.com wrote:
On Sat, Mar 30, 2013 at 4:57 PM, josef.p...@gmail.com wrote:
On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com wrote:
On Sat, Mar 30, 2013 at 4:14 AM, josef.p...@gmail.com wrote:
On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote:

Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering.

Proposal -

* Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape
* Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov)

What do y'all think?

I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it. Changing the names doesn't make it easier to understand. I think the confusion is because the new A and K refer to existing memory.

I disagree, I think it's confusing, but I have evidence, and that is that four out of four of us tested ourselves and got it wrong. Perhaps we are particularly dumb or poorly informed, but I think it's rash to assert there is no problem here.

I think you are overcomplicating things or phrased it as a trick question.

I don't know what you mean by trick question - was there something over-complicated in the example? I deliberately didn't include various much more confusing examples in reshape.

I meant making the candidates think about memory instead of just column versus row stacking.

To be specific, we were teaching about reshaping a (I, J, K, N) 4D array. It was an image, with time as the 4th dimension (N time points). Raveling and reshaping 3D and 4D arrays is a common thing to do in neuroimaging, as you can imagine.

A student asked what he would get back from raveling this array: a concatenated time series, or something spatial? We showed (I'd worked it out by this time) that the first N values were the time series given by [0, 0, 0, :]. He said: 'Oh, I see, so the data is stored as a whole lot of time series one by one, I thought it would be stored as a series of images'. Ironically, this was a Fortran-ordered array in memory, and he was wrong.

So, I think the ideas of memory ordering and index ordering are very easy to confuse, and the confusion comes up naturally.

I would like, as a teacher, to be able to say something like:

This is what C memory layout is (it's the memory layout that gives arr.flags.C_CONTIGUOUS=True)
This is what F memory layout is (it's the memory layout that gives arr.flags.F_CONTIGUOUS=True)
It's rather easy to get something that is neither C nor F memory layout
Numpy does many memory layouts.
Ravel and reshape and numpy in general do not care (normally) about C or F layouts, they only care about index ordering.

My point, that I'm repeating, is that my job is made harder by 'arr.ravel('F')'.

Cheers,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
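The teaching statements listed in this message can be checked directly. What follows is a minimal sketch, not code from the thread; the array names `a`, `f` and `s` are made up for illustration. It builds the same values in C layout, in Fortran layout, and as a strided view that is neither, then confirms that `ravel('C')` and `ravel('F')` give the same answer whatever the memory layout is.

```python
import numpy as np

a = np.arange(24).reshape((2, 3, 4))  # C-contiguous by construction
f = np.asfortranarray(a)              # same values, Fortran memory layout
s = a[:, ::2, :]                      # strided view, neither C nor F contiguous

# Report the two contiguity flags for each array.
for name, arr in [("a", a), ("f", f), ("s", s)]:
    print(name, arr.flags.c_contiguous, arr.flags.f_contiguous)

# The order argument to ravel selects an index ordering, so the result
# does not depend on how the values are laid out in memory.
same_c = np.array_equal(a.ravel('C'), f.ravel('C'))
same_f = np.array_equal(a.ravel('F'), f.ravel('F'))
print(same_c, same_f)
```

The strided view is one easy way to get an array with both flags False; slicing with a step on any middle axis of an array whose axes all have length greater than 1 will do the same.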
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sat, Mar 30, 2013 at 11:43 PM, Matthew Brett matthew.br...@gmail.com wrote:
Hi,
On Sat, Mar 30, 2013 at 7:02 PM, josef.p...@gmail.com wrote:
On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett matthew.br...@gmail.com wrote:
Hi,
On Sat, Mar 30, 2013 at 7:50 PM, josef.p...@gmail.com wrote:
On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle brad.froe...@gmail.com wrote:
On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com wrote:
On Sat, Mar 30, 2013 at 2:20 PM, josef.p...@gmail.com wrote:
On Sat, Mar 30, 2013 at 4:57 PM, josef.p...@gmail.com wrote:
On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com wrote:
On Sat, Mar 30, 2013 at 4:14 AM, josef.p...@gmail.com wrote:
On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote:

Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering.

Proposal -

* Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape
* Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov)

What do y'all think?

I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it. Changing the names doesn't make it easier to understand. I think the confusion is because the new A and K refer to existing memory.

I disagree, I think it's confusing, but I have evidence, and that is that four out of four of us tested ourselves and got it wrong. Perhaps we are particularly dumb or poorly informed, but I think it's rash to assert there is no problem here.

I think you are overcomplicating things or phrased it as a trick question.

I don't know what you mean by trick question - was there something over-complicated in the example? I deliberately didn't include various much more confusing examples in reshape.

I meant making the candidates think about memory instead of just column versus row stacking.

I don't think I ever get confused about reshape F in 2d. But when I work with 3d or larger ndim nd-arrays, I always have to try an example to check my intuition (in general, not just reshape).

ravel F and C have *nothing* to do with memory layout.

We do agree on this of course - but you said in an earlier mail that you thought of 'C' and 'F' as referring to target memory layout (which they don't in this case), so I think we also agree that C and F do often refer to memory layout elsewhere in numpy.

I guess that wasn't so helpful. (Emphasis on *target*: there are very few places where an order keyword refers to *existing* memory layout.)

It is helpful because it shows how easy it is to get confused between memory order and index order.

What's reverse index order?

I am not being clear, sorry about that:

    import numpy as np

    def ravel_iter_last_fastest(arr):
        res = []
        for i in range(arr.shape[0]):
            for j in range(arr.shape[1]):
                for k in range(arr.shape[2]):
                    # Iterating over last dimension fastest
                    res.append(arr[i, j, k])
        return np.array(res)

    def ravel_iter_first_fastest(arr):
        res = []
        for k in range(arr.shape[2]):
            for j in range(arr.shape[1]):
                for i in range(arr.shape[0]):
                    # Iterating over first dimension fastest
                    res.append(arr[i, j, k])
        return np.array(res)

Good example. That's just C and F order in the terminology of numpy:

http://docs.scipy.org/doc/numpy/reference/arrays.nditer.html#controlling-iteration-order
(independent of memory)
http://docs.scipy.org/doc/numpy/reference/generated/numpy.flatiter.html#numpy.flatiter

I don't think we want to rename a large part of the basic terminology of numpy.

Josef

    a = np.arange(24).reshape((2, 3, 4))
    print(np.all(a.ravel('C') == ravel_iter_last_fastest(a)))
    print(np.all(a.ravel('F') == ravel_iter_first_fastest(a)))

By 'reverse index ordering' I mean 'ravel_iter_last_fastest' above. I guess one could argue that this was not 'reverse' but 'forward' index ordering, but I am not arguing about which is better, or those names, only that it's the order of indices that differs, not the memory layout, and that these ideas need to be kept separate.

Cheers,

Matthew
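The two loop functions in this message are written for 3-D arrays. The sketch below generalizes the same idea to any number of dimensions; it is an illustration under stated assumptions, not code from the thread, and the helper name `ravel_by_index` and its `last_fastest` flag are hypothetical. It uses `itertools.product`, which varies its last argument fastest.

```python
import itertools

import numpy as np


def ravel_by_index(arr, last_fastest=True):
    """Collect elements by walking index tuples (hypothetical helper)."""
    axes = [range(n) for n in arr.shape]
    if last_fastest:
        # product varies its last argument fastest: last axis changes first.
        index_iter = itertools.product(*axes)
    else:
        # Reverse the axes so the first axis changes fastest, then flip
        # each tuple back into (axis0, axis1, ...) order for indexing.
        index_iter = (t[::-1] for t in itertools.product(*axes[::-1]))
    return np.array([arr[t] for t in index_iter])


a4 = np.arange(120).reshape((2, 3, 4, 5))
a4_F = np.asfortranarray(a4)  # same values, different memory layout

ok_c = all(np.array_equal(x.ravel('C'), ravel_by_index(x, True))
           for x in (a4, a4_F))
ok_f = all(np.array_equal(x.ravel('F'), ravel_by_index(x, False))
           for x in (a4, a4_F))
print(ok_c, ok_f)
```

The helper only ever touches the array through `arr[index]`, so by construction it cannot depend on memory layout, which is the separation being argued for here.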
Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett matthew.br...@gmail.com wrote:
Hi,
On Sat, Mar 30, 2013 at 7:02 PM, josef.p...@gmail.com wrote:
On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett matthew.br...@gmail.com wrote:
Hi,
On Sat, Mar 30, 2013 at 7:50 PM, josef.p...@gmail.com wrote:
On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle brad.froe...@gmail.com wrote:
On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com wrote:
On Sat, Mar 30, 2013 at 2:20 PM, josef.p...@gmail.com wrote:
On Sat, Mar 30, 2013 at 4:57 PM, josef.p...@gmail.com wrote:
On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com wrote:
On Sat, Mar 30, 2013 at 4:14 AM, josef.p...@gmail.com wrote:
On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote:

Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering. This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering.

Proposal -

* Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape
* Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov)

What do y'all think?

I always thought F and C are easy to understand, I always thought about the content and never about the memory when using it. Changing the names doesn't make it easier to understand. I think the confusion is because the new A and K refer to existing memory.

I disagree, I think it's confusing, but I have evidence, and that is that four out of four of us tested ourselves and got it wrong. Perhaps we are particularly dumb or poorly informed, but I think it's rash to assert there is no problem here.

I think you are overcomplicating things or phrased it as a trick question.

I don't know what you mean by trick question - was there something over-complicated in the example? I deliberately didn't include various much more confusing examples in reshape.

I meant making the candidates think about memory instead of just column versus row stacking.

To be specific, we were teaching about reshaping a (I, J, K, N) 4D array. It was an image, with time as the 4th dimension (N time points). Raveling and reshaping 3D and 4D arrays is a common thing to do in neuroimaging, as you can imagine.

A student asked what he would get back from raveling this array: a concatenated time series, or something spatial? We showed (I'd worked it out by this time) that the first N values were the time series given by [0, 0, 0, :]. He said: 'Oh, I see, so the data is stored as a whole lot of time series one by one, I thought it would be stored as a series of images'. Ironically, this was a Fortran-ordered array in memory, and he was wrong.

So, I think the ideas of memory ordering and index ordering are very easy to confuse, and the confusion comes up naturally.

I would like, as a teacher, to be able to say something like:

This is what C memory layout is (it's the memory layout that gives arr.flags.C_CONTIGUOUS=True)
This is what F memory layout is (it's the memory layout that gives arr.flags.F_CONTIGUOUS=True)
It's rather easy to get something that is neither C nor F memory layout
Numpy does many memory layouts.
Ravel and reshape and numpy in general do not care (normally) about C or F layouts, they only care about index ordering.

My point, that I'm repeating, is that my job is made harder by 'arr.ravel('F')'.

But once you know that ravel and reshape don't care about memory, the ravel is easy to predict (maybe not easy to visualize in 4-D):

order=C: stack the last dimension, N, the time series of one 3d pixel, then stack the time series of the next pixel... process pixels by depth and then row by row (like old TVs).

I assume you did this because your underlying array is C contiguous, so your ravel('C') is a c-contiguous view (instead of some weird strides or a copy).

I usually prefer time in the first dimension, and stack order=F. Then I can start at the front, stack all time periods of the first pixel, keep going and work pixels down the columns, first page, next page, ... (and I hope I have an F-contiguous array, so my raveled array is also F-contiguous).

(note: I'm bringing memory back in as optimization, but not to predict the stacking)

Josef

(I think brains are designed for Fortran order and C-ordering in numpy is an accident, except, reading a Western language book is neither)

Cheers,

Matthew
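The classroom example and the stacking rule described in this message can be reproduced on a small array. This is a sketch with made-up sizes and names (`I, J, K, N = 2, 3, 4, 5`, `img`, `img_F`), not the data from the course. The array is deliberately given Fortran memory layout, as in the anecdote, to show that the prediction depends only on the index ordering.

```python
import numpy as np

I, J, K, N = 2, 3, 4, 5                      # made-up sizes; N time points
img = np.arange(I * J * K * N).reshape((I, J, K, N))
img_F = np.asfortranarray(img)               # Fortran layout in memory

# order='C': the last axis (time) varies fastest, so the first N values
# are the time series of the first voxel.
first_c = img_F.ravel('C')[:N]

# order='F': the first axis varies fastest, so the first I values run
# along axis 0 at the first time point.
first_f = img_F.ravel('F')[:I]

print(np.array_equal(first_c, img[0, 0, 0, :]))
print(np.array_equal(first_f, img[:, 0, 0, 0]))
```

Whether the raveled result is a view or a copy does depend on the memory layout, which is the optimization point made in the reply; the values and their order do not.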
[Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
Hi,

We were teaching today, and found ourselves getting very confused about ravel and reshape in numpy.

Summary --

There are two separate ideas needed to understand ordering in ravel and reshape:

Idea 1) ravel / reshape can proceed from the last axis to the first, or the first to the last. This is "ravel index ordering".
Idea 2) The physical layout of the array (on disk or in memory) can be C or F contiguous or neither. This is "memory ordering".

The index ordering is usually (but see below) orthogonal to the memory ordering.

The 'ravel' and 'reshape' commands use C and F in the sense of index ordering, and this mixes the two ideas and is confusing.

What the current situation looks like

Specifically, we've been rolling this around among 4 experienced numpy users, and we all predicted at least one of the results below wrongly.

This was what we knew, or should have known:

    In [2]: import numpy as np

    In [3]: arr = np.arange(10).reshape((2, 5))

    In [5]: arr.ravel()
    Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

So, the 'ravel' operation unravels over the last axis (1) first, followed by axis 0.

So far so good (even if the opposite to MATLAB, Octave).

Then we found the 'order' flag to ravel:

    In [10]: arr.flags
    Out[10]:
      C_CONTIGUOUS : True
      F_CONTIGUOUS : False
      OWNDATA : False
      WRITEABLE : True
      ALIGNED : True
      UPDATEIFCOPY : False

    In [11]: arr.ravel('C')
    Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

But we soon got confused. How about this?

    In [12]: arr_F = np.array(arr, order='F')

    In [13]: arr_F.flags
    Out[13]:
      C_CONTIGUOUS : False
      F_CONTIGUOUS : True
      OWNDATA : True
      WRITEABLE : True
      ALIGNED : True
      UPDATEIFCOPY : False

    In [16]: arr_F
    Out[16]:
    array([[0, 1, 2, 3, 4],
           [5, 6, 7, 8, 9]])

    In [17]: arr_F.ravel('C')
    Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Right - so the flag 'C' to ravel has got nothing to do with *memory* ordering, but is to do with *index* ordering.

And in fact, we can ask for memory ordering specifically:

    In [22]: arr.ravel('K')
    Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

    In [23]: arr_F.ravel('K')
    Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

    In [24]: arr.ravel('A')
    Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

    In [25]: arr_F.ravel('A')
    Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

There are some confusions to get into with the 'order' flag to reshape as well, of the same type.

Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering.

This is very confusing. We think the index ordering and memory ordering ideas need to be separated, and specifically, we should avoid using C and F to refer to index ordering.

Proposal -

* Deprecate the use of C and F meaning backwards and forwards index ordering for ravel, reshape
* Prefer Z and N, being graphical representations of unraveling in 2 dimensions, axis1 first and axis0 first respectively (excellent naming idea by Paul Ivanov)

What do y'all think?

Cheers,

Matthew
Paul Ivanov
JB Poline
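The message says that reshape has the same kind of confusion but gives no example. The sketch below is one possible illustration, not something taken from the post; it reuses the names `arr` and `arr_F` and adds hypothetical result names. It relies on the documented behaviour of the `order` argument of `ndarray.reshape`: 'C' and 'F' select an index ordering for both reading and writing, while 'A' picks between them based on the memory layout of the input.

```python
import numpy as np

arr = np.arange(10).reshape((2, 5))   # C layout in memory
arr_F = np.asfortranarray(arr)        # same values, Fortran layout

# 'C' and 'F' give the same answer for both arrays: index ordering only.
r_c = arr_F.reshape((5, 2), order='C')
r_f = arr_F.reshape((5, 2), order='F')
layout_independent = (
    np.array_equal(r_c, arr.reshape((5, 2), order='C'))
    and np.array_equal(r_f, arr.reshape((5, 2), order='F'))
)

# 'A' consults the memory layout, so the two arrays now differ.
r_a_from_c = arr.reshape((5, 2), order='A')
r_a_from_f = arr_F.reshape((5, 2), order='A')

print(layout_independent)
print(r_f.tolist())
print(np.array_equal(r_a_from_c, r_a_from_f))
```

With `order='F'` the elements are read first-axis-fastest (0, 5, 1, 6, ...) and written first-axis-fastest into the new shape, which is why the first column of `r_f` is 0, 5, 1, 6, 2.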