Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-30 Thread Matthew Brett
Hi,

On Sat, Apr 6, 2013 at 3:15 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Sat, Apr 6, 2013 at 1:35 PM, Ralf Gommers ralf.gomm...@gmail.com wrote:



 On Sat, Apr 6, 2013 at 7:22 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Sat, Apr 6, 2013 at 1:51 AM, Ralf Gommers ralf.gomm...@gmail.com
 wrote:
 
 
 
  On Sat, Apr 6, 2013 at 4:47 AM, Matthew Brett matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Fri, Apr 5, 2013 at 7:39 PM,  josef.p...@gmail.com wrote:
  
   It's not *any* cost, this goes deep and wide, it's one of the basic
   concepts of numpy that you want to rename.
 
  The proposal I last made was to change the default name to 'layout'
  after some period to be agreed - say - P - with suitable warning in
  the docstring up until that time, and after, and leave 'order' as an
  alias forever.
 
 
  The above paragraph is simply incorrect. Your last proposal also
  included
  deprecation warnings and a future backwards compatibility break by
  removing
  'order'.
 
  If you now say you're not proposing steps 3 and 4 anymore, then you're
  back
  to what I called option (2) - duplicate keywords forever. Which for me
  is
  undesirable, for reasons I already mentioned.

 You might not have read my follow-up proposing to drop steps 3 and 4
 if you felt they were unacceptable.

  P.S. being called short-sighted and damaging numpy by responding to a
  proposal you now say you didn't make is pretty damn annoying.

 No, I did make that proposal, and in the spirit of negotiation and
 consensus, I subsequently modified my proposal, as I hope you'd expect
 in this situation.


 You have had clear NOs to the various incarnations of your proposal from 3
 active developers of this community, not once but two or three times from
 each of those developers. Furthermore you have got only a couple of +0.5s,
 after 90 emails no one else seems to feel that this is a change we really
 have to have. Therefore I don't expect another modification of
 your proposal, I expect you to drop it.

 OK - I think I have a better understanding of the 'model' now.

 As another poster said, this thread has run its course. The technical issues
 are clear, and apparently we're going to have to agree to disagree about the
 seriousness of the confusion. Please please go and fix the docs in the way
 you deem best, and leave it at that. And triple please not another
 governance thread.

https://github.com/numpy/numpy/pull/3294

Cheers,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-16 Thread srean
As one lurker to another, thanks for calling it out.

Over-argumentative and personality-centric threads like these have
actually led me to distance myself from the numpy community. I do not know
how common it is now because I do not follow it closely anymore. It used to
be quite common at one point in time. I came down to check after a while,
and lo there it is again.

If a mail is put forward as a question, "I find this confusing, is it
confusing for you?", it ought not to devolve into a shouting match atop
moral high horses: "So you think I am stupid, do you? Too smart, are you?
How dare you express that it doesn't bother you as much when it bothers me
and my documented case of 4 people? I have four, how many do you have?"

If something is posed as a question one should be open to the answers.
Sometimes it is better not to pose it as a question at all but offer
alternatives and ask for preference.

I am not siding with any of the technical options provided, just requesting
that the discourse not devolve into these personality oriented contests. It
gets too loud and noisy.

Thank you



On Sat, Apr 6, 2013 at 12:18 PM, matti picus matti.pi...@gmail.com wrote:

 as a lurker, may I say that this discussion seems to have become
 non-productive?

 It seems all agree that the docs need improvement; perhaps a first step would
 be to suggest doc improvements, and then the need for renaming may become
 self-evident, or not.

 aww darn, ruined my lurker status.
 Matti Picus





Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-06 Thread Ralf Gommers
On Sat, Apr 6, 2013 at 4:47 AM, Matthew Brett matthew.br...@gmail.com wrote:

 Hi,

 On Fri, Apr 5, 2013 at 7:39 PM,  josef.p...@gmail.com wrote:
 
  It's not *any* cost, this goes deep and wide, it's one of the basic
  concepts of numpy that you want to rename.

 The proposal I last made was to change the default name to 'layout'
 after some period to be agreed - say - P - with suitable warning in
 the docstring up until that time, and after, and leave 'order' as an
 alias forever.


The above paragraph is simply incorrect. Your last proposal also included
deprecation warnings and a future backwards compatibility break by removing
'order'.

If you now say you're not proposing steps 3 and 4 anymore, then you're back
to what I called option (2) - duplicate keywords forever. Which for me is
undesirable, for reasons I already mentioned.

Ralf

P.S. being called short-sighted and damaging numpy by responding to a
proposal you now say you didn't make is pretty damn annoying.

P.P.S. expect an identical response from me to future proposals that
include backwards compatibility breaks of heavily used functions for
something that's not a functional enhancement or bug fix. Such proposals
are just not OK.

P.P.P.S. I'm not sure exactly what you mean by default keyword. If layout
overrules order and layout's default value is not None, you're still
proposing a backwards compatibility break.


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-06 Thread matti picus
as a lurker, may I say that this discussion seems to have become
non-productive?

It seems all agree that the docs need improvement; perhaps a first step would
be to suggest doc improvements, and then the need for renaming may become
self-evident, or not.

aww darn, ruined my lurker status.
Matti Picus


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-06 Thread Matthew Brett
Hi,

On Sat, Apr 6, 2013 at 1:51 AM, Ralf Gommers ralf.gomm...@gmail.com wrote:



 On Sat, Apr 6, 2013 at 4:47 AM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Fri, Apr 5, 2013 at 7:39 PM,  josef.p...@gmail.com wrote:
 
  It's not *any* cost, this goes deep and wide, it's one of the basic
  concepts of numpy that you want to rename.

 The proposal I last made was to change the default name to 'layout'
 after some period to be agreed - say - P - with suitable warning in
 the docstring up until that time, and after, and leave 'order' as an
 alias forever.


 The above paragraph is simply incorrect. Your last proposal also included
 deprecation warnings and a future backwards compatibility break by removing
 'order'.

 If you now say you're not proposing steps 3 and 4 anymore, then you're back
 to what I called option (2) - duplicate keywords forever. Which for me is
 undesirable, for reasons I already mentioned.

You might not have read my follow-up proposing to drop steps 3 and 4
if you felt they were unacceptable.

 P.S. being called short-sighted and damaging numpy by responding to a
 proposal you now say you didn't make is pretty damn annoying.

No, I did make that proposal, and in the spirit of negotiation and
consensus, I subsequently modified my proposal, as I hope you'd expect
in this situation.

I am honestly sorry that I offended you. In hindsight, although I do
worry that numpy feels as if it does resist reasonable change more
strongly than is healthy, I was probably responding to my feeling that
you were trying to veto the discussion rather than joining it, and I
really should have put it that way instead.  I am sorry about that.

 P.P.S. expect an identical response from me to future proposals that include
 backwards compatibility breaks of heavily used functions for something
 that's not a functional enhancement or bug fix. Such proposals are just not
 OK.

It seems to me that each change has to be considered on its merit, and
strict rules of that sort are not very useful.  You are again implying
that this change is not important, and obviously there I don't agree.
I addressed the level and timing of backwards compatibility breakage
in my comments to Josef.   You haven't responded to me on that.

 P.P.P.S. I'm not sure exactly what you mean by default keyword. If layout
 overrules order and layout's default value is not None, you're still
 proposing a backwards compatibility break.

I mean, that until the expiry of some agreed period 'P' - the
docstring would read

def ravel(self, order='C', **kwargs)

where  kwargs can only contain 'layout',  and 'layout', 'order' cannot
both be defined

and after the expiry of 'P'

def ravel(self, layout='C', **kwargs)

where  kwargs can only contain 'order',  and 'layout', 'order' cannot
both be defined

At least that's my proposal, I'm happy to change it if there is a
better solution.
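
For illustration, the keyword handling described above could be sketched roughly as follows. This is only a sketch under stated assumptions, not numpy code: `resolve_layout` is a hypothetical helper name, and it uses None as the "not given" marker instead of 'C' in the signature, so that passing both names can be detected (the point raised in the P.P.P.S.).

```python
def resolve_layout(order=None, **kwargs):
    """Hypothetical helper: accept 'order' or its alias 'layout', never both.

    None means "not given"; the effective default is 'C'.
    """
    layout = kwargs.pop('layout', None)
    if kwargs:
        # Only 'layout' is allowed to arrive through **kwargs.
        names = ", ".join(sorted(kwargs))
        raise TypeError("unexpected keyword argument(s): " + names)
    if order is not None and layout is not None:
        raise TypeError("specify only one of 'order' and 'layout'")
    if layout is not None:
        return layout
    if order is not None:
        return order
    return 'C'


print(resolve_layout())             # neither given, falls back to 'C'
print(resolve_layout(order='F'))    # old spelling
print(resolve_layout(layout='F'))   # new spelling
```

After period 'P' the same helper would serve with the two names swapped in the signature; existing calls that pass either keyword by name would behave the same.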

Cheers,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-06 Thread Paul Ivanov
Hi Ralf,

Ralf Gommers, on 2013-04-06 10:51,  wrote:
 P.P.S. expect an identical response from me to future proposals that
 include backwards compatibility breaks of heavily used functions for
 something that's not a functional enhancement or bug fix. Such proposals
 are just not OK.

but it is a functional enhancement or bug fix - the ambiguity in
the effect of order= values in several places only serves to
conflate two different ideas into one.
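
The two ideas can be separated with a small example. This is a minimal sketch, assuming NumPy is installed; the variable names are illustrative only. In `ravel` the `order` value selects the index order in which elements are read out, while in `copy` it selects the memory layout of the result.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)      # [[0, 1, 2], [3, 4, 5]], C-contiguous

# Meaning 1: index order. 'order' picks which index varies fastest
# when the elements are read out.
c_read = a.ravel(order='C')         # last index fastest:  0 1 2 3 4 5
f_read = a.ravel(order='F')         # first index fastest: 0 3 1 4 2 5

# Meaning 2: memory layout. 'order' picks how the new copy is stored;
# the values seen through indexing do not change.
f_copy = a.copy(order='F')
same_values = np.array_equal(a, f_copy)          # True
is_f_contiguous = f_copy.flags['F_CONTIGUOUS']   # True

# Index order is independent of memory layout: reading the
# Fortran-layout copy in 'C' index order gives the same sequence.
c_read_of_f_copy = f_copy.ravel(order='C')       # 0 1 2 3 4 5
```

The last line is the case that tends to surprise: the array is stored in Fortran layout, yet `order='C'` still refers to the index order of the read.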

-- 
Paul Ivanov
314 address only used for lists,  off-list direct email at:
http://pirsquared.org | GPG/PGP key id: 0x0F3E28F7 


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-06 Thread Ralf Gommers
On Sat, Apr 6, 2013 at 8:16 PM, Paul Ivanov pivanov...@gmail.com wrote:

 Hi Ralf,

 Ralf Gommers, on 2013-04-06 10:51,  wrote:
  P.P.S. expect an identical response from me to future proposals that
  include backwards compatibility breaks of heavily used functions for
  something that's not a functional enhancement or bug fix. Such proposals
  are just not OK.

 but it is a functional enhancement or bug fix - the ambiguity in
 the effect of order= values in several places only serves to
 conflate two different ideas into one.


That sentence makes zero sense. The reason you can't decide whether it's a
bug fix or enhancement is because it's neither. What ambiguity there is can
be solved with documentation only; there's nothing new you can do with
these functions after introducing a new keyword, and there is no bug.

Ralf


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-06 Thread Ralf Gommers
On Sat, Apr 6, 2013 at 7:22 PM, Matthew Brett matthew.br...@gmail.com wrote:

 Hi,

 On Sat, Apr 6, 2013 at 1:51 AM, Ralf Gommers ralf.gomm...@gmail.com
 wrote:
 
 
 
  On Sat, Apr 6, 2013 at 4:47 AM, Matthew Brett matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Fri, Apr 5, 2013 at 7:39 PM,  josef.p...@gmail.com wrote:
  
   It's not *any* cost, this goes deep and wide, it's one of the basic
   concepts of numpy that you want to rename.
 
  The proposal I last made was to change the default name to 'layout'
  after some period to be agreed - say - P - with suitable warning in
  the docstring up until that time, and after, and leave 'order' as an
  alias forever.
 
 
  The above paragraph is simply incorrect. Your last proposal also included
  deprecation warnings and a future backwards compatibility break by
 removing
  'order'.
 
  If you now say you're not proposing steps 3 and 4 anymore, then you're
 back
  to what I called option (2) - duplicate keywords forever. Which for me is
  undesirable, for reasons I already mentioned.

 You might not have read my follow-up proposing to drop steps 3 and 4
 if you felt they were unacceptable.

  P.S. being called short-sighted and damaging numpy by responding to a
  proposal you now say you didn't make is pretty damn annoying.

 No, I did make that proposal, and in the spirit of negotiation and
 consensus, I subsequently modified my proposal, as I hope you'd expect
 in this situation.


You have had clear NOs to the various incarnations of your proposal from 3
active developers of this community, not once but two or three times from
each of those developers. Furthermore you have got only a couple of +0.5s,
after 90 emails no one else seems to feel that this is a change we really
have to have. Therefore I don't expect another modification of
your proposal, I expect you to drop it.

As another poster said, this thread has run its course. The technical
issues are clear, and apparently we're going to have to agree to disagree
about the seriousness of the confusion. Please please go and fix the docs
in the way you deem best, and leave it at that. And triple please not
another governance thread.

I am honestly sorry that I offended you.


Thank you. I apologize as well if my tone of the last message was too
strong.

Ralf

In hindsight, although I do
 worry that numpy feels as if it does resist reasonable change more
 strongly than is healthy, I was probably responding to my feeling that
 you were trying to veto the discussion rather than joining it, and I
 really should have put it that way instead.  I am sorry about that.

  P.P.S. expect an identical response from me to future proposals that
 include
  backwards compatibility breaks of heavily used functions for something
  that's not a functional enhancement or bug fix. Such proposals are just
 not
  OK.

 It seems to me that each change has to be considered on its merit, and
 strict rules of that sort are not very useful.  You are again implying
 that this change is not important, and obviously there I don't agree.
 I addressed the level and timing of backwards compatibility breakage
 in my comments to Josef.   You haven't responded to me on that.

  P.P.P.S. I'm not sure exactly what you mean by default keyword. If
 layout
  overrules order and layout's default value is not None, you're still
  proposing a backwards compatibility break.

 I mean, that until the expiry of some agreed period 'P' - the
 docstring would read

 def ravel(self, order='C', **kwargs)

 where  kwargs can only contain 'layout',  and 'layout', 'order' cannot
 both be defined

 and after the expiry of 'P'

 def ravel(self, layout='C', **kwargs)

 where  kwargs can only contain 'order',  and 'layout', 'order' cannot
 both be defined

 At least that's my proposal, I'm happy to change it if there is a
 better solution.

 Cheers,

 Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-06 Thread Matthew Brett
Hi,

On Sat, Apr 6, 2013 at 1:35 PM, Ralf Gommers ralf.gomm...@gmail.com wrote:



 On Sat, Apr 6, 2013 at 7:22 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Sat, Apr 6, 2013 at 1:51 AM, Ralf Gommers ralf.gomm...@gmail.com
 wrote:
 
 
 
  On Sat, Apr 6, 2013 at 4:47 AM, Matthew Brett matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Fri, Apr 5, 2013 at 7:39 PM,  josef.p...@gmail.com wrote:
  
   It's not *any* cost, this goes deep and wide, it's one of the basic
   concepts of numpy that you want to rename.
 
  The proposal I last made was to change the default name to 'layout'
  after some period to be agreed - say - P - with suitable warning in
  the docstring up until that time, and after, and leave 'order' as an
  alias forever.
 
 
  The above paragraph is simply incorrect. Your last proposal also
  included
  deprecation warnings and a future backwards compatibility break by
  removing
  'order'.
 
  If you now say you're not proposing steps 3 and 4 anymore, then you're
  back
  to what I called option (2) - duplicate keywords forever. Which for me
  is
  undesirable, for reasons I already mentioned.

 You might not have read my follow-up proposing to drop steps 3 and 4
 if you felt they were unacceptable.

  P.S. being called short-sighted and damaging numpy by responding to a
  proposal you now say you didn't make is pretty damn annoying.

 No, I did make that proposal, and in the spirit of negotiation and
 consensus, I subsequently modified my proposal, as I hope you'd expect
 in this situation.


 You have had clear NOs to the various incarnations of your proposal from 3
 active developers of this community, not once but two or three times from
 each of those developers. Furthermore you have got only a couple of +0.5s,
 after 90 emails no one else seems to feel that this is a change we really
 have to have. Therefore I don't expect another modification of
 your proposal, I expect you to drop it.

OK - I think I have a better understanding of the 'model' now.

 As another poster said, this thread has run its course. The technical issues
 are clear, and apparently we're going to have to agree to disagree about the
 seriousness of the confusion. Please please go and fix the docs in the way
 you deem best, and leave it at that. And triple please not another
 governance thread.

The governance threads happen because of the lack of governance, as
this thread shows.  I don't agree that decisions should be taken like
this (+1, -1, No!, Yes!).  I think they should be taken by negotiation
and agreement.  You disagree, but on whose authority, I do not know,
and we have no way of resolving that, because there is - no governance
thread.

 I am honestly sorry that I offended you.


 Thank you. I apologize as well if my tone of the last message was too
 strong.

Thank you in turn, that is generous of you,

Best,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-05 Thread Matthew Brett
Hi,

On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg
sebast...@sipsolutions.net wrote:
 Hey

 On Thu, 2013-04-04 at 14:20 -0700, Matthew Brett wrote:
 Hi,

 On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith n...@pobox.com wrote:
 snip
  Maybe we should go through and rename "order" to something more descriptive
  in each case, so we'd have
    a.reshape(..., index_order="C")
    a.copy(memory_order="F")
  etc.?

 I'd like to propose this instead:

 a.reshape(..., order="C")
 a.copy(layout="F")


 I actually like this, makes the point clearer that it has to do with
 memory layout and implies contiguity, plus it is short and from the
 numpy perspective copy, etc. are the ones that add additional info to
 order and not reshape (because IMO memory order is something new users
 should not worry about at first). A and K orders will still have their
 quirks with np.array and copy=True/False, but for many functions they
 are esoteric anyway.

 It will be one hell of a deprecation though, but I am +0.5 for adding an
 alias for now (maybe someone knows an even better name?), but I think
 that in this case, it probably really is better to wait with actual
 deprecation warnings for a few versions, since it touches a *lot* of
 code. Plus I think at the point of starting deprecation warnings (and
 best earlier) numpy should provide an automatic fixer script...

 The only counter point that remains for me is the difficulty of
 deprecation, since I think the new name idea is very clean. And this is
 unfortunately even more invasive than the index_order proposal.

I completely agree that we'd have to be gentle with the change.  The
problem we'd want to avoid is people innocently using 'layout' and
finding to their annoyance that the code doesn't work with other
people's numpy.

How about:

Step 1: 'order' remains as named keyword, 'layout' added as alias,
comment on the lines of "layout will become the default keyword for
this option in later versions of numpy; please consider updating any
code that does not need to remain backwards compatible".

Step 2: default keyword becomes 'layout' with 'order' as alias,
comment like "order is an alias for 'layout' to maintain backwards
compatibility with numpy <= 1.7.1; please update any code that does
not need to maintain backwards compatibility with these numpy
versions".

Step 3: Add deprecation warning for 'order', "order will be removed as
an alias in future versions of numpy".

Step 4: (distant future) Remove alias

?
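
As a rough illustration of what Step 3 might look like in code, here is a small decorator sketch. It rests on assumptions: `deprecated_alias` and `copy_sketch` are hypothetical names, not anything that exists in numpy, and the warning text is only an example.

```python
import functools
import warnings


def deprecated_alias(old, new):
    """Decorator: accept keyword `old` as a deprecated alias for `new`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if old in kwargs:
                if new in kwargs:
                    raise TypeError(
                        "specify only one of %r and %r" % (old, new))
                warnings.warn(
                    "%r is deprecated; use %r instead" % (old, new),
                    DeprecationWarning,
                    stacklevel=2,   # point the warning at the caller
                )
                # Forward the value under the new keyword name.
                kwargs[new] = kwargs.pop(old)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@deprecated_alias('order', 'layout')
def copy_sketch(values, layout='C'):
    """Hypothetical stand-in for a copy-like function; not a NumPy API."""
    return list(values), layout
```

Calling `copy_sketch([1, 2], order='F')` still works but emits a DeprecationWarning, while `copy_sketch([1, 2], layout='F')` is silent. Steps 1 and 2 would be the same wrapper without the `warnings.warn` call.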

Cheers,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-05 Thread Ralf Gommers
On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett matthew.br...@gmail.com wrote:

 Hi,

 On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg
 sebast...@sipsolutions.net wrote:
  Hey
 
  On Thu, 2013-04-04 at 14:20 -0700, Matthew Brett wrote:
  Hi,
 
  On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith n...@pobox.com wrote:
  snip
   Maybe we should go through and rename "order" to something more descriptive
   in each case, so we'd have
     a.reshape(..., index_order="C")
     a.copy(memory_order="F")
   etc.?
 
  I'd like to propose this instead:
 
  a.reshape(..., order="C")
  a.copy(layout="F")
 
 
  I actually like this, makes the point clearer that it has to do with
  memory layout and implies contiguity, plus it is short and from the
  numpy perspective copy, etc. are the ones that add additional info to
  order and not reshape (because IMO memory order is something new users
  should not worry about at first). A and K orders will still have their
  quirks with np.array and copy=True/False, but for many functions they
  are esoteric anyway.
 
  It will be one hell of a deprecation though, but I am +0.5 for adding an
  alias for now (maybe someone knows an even better name?), but I think
  that in this case, it probably really is better to wait with actual
  deprecation warnings for a few versions, since it touches a *lot* of
  code. Plus I think at the point of starting deprecation warnings (and
  best earlier) numpy should provide an automatic fixer script...
 
  The only counter point that remains for me is the difficulty of
  deprecation, since I think the new name idea is very clean. And this is
  unfortunately even more invasive than the index_order proposal.

 I completely agree that we'd have to be gentle with the change.  The
 problem we'd want to avoid is people innocently using 'layout' and
 finding to their annoyance that the code doesn't work with other
 people's numpy.

 How about:

 Step 1: 'order' remains as named keyword, 'layout' added as alias,
 comment on the lines of "layout will become the default keyword for
 this option in later versions of numpy; please consider updating any
 code that does not need to remain backwards compatible".

 Step 2: default keyword becomes 'layout' with 'order' as alias,
 comment like "order is an alias for 'layout' to maintain backwards
 compatibility with numpy <= 1.7.1; please update any code that does
 not need to maintain backwards compatibility with these numpy
 versions".

 Step 3: Add deprecation warning for 'order', "order will be removed as
 an alias in future versions of numpy".

 Step 4: (distant future) Remove alias

 ?


A very strong -1 from me. Now we're talking about deprecation warnings and
a backwards compatibility break after all. I thought we agreed that this
was a very bad idea, so why are you proposing it now?

Here's how I see it: deprecation of order is a no go. Therefore we have
two choices here:
1. Simply document the current order keyword better and leave it at that.
2. Add a layout (or index_order) keyword, and live with both order
and layout keywords forever.

(2) is at least as confusing as (1), more work and poor design. Therefore I
propose to go with (1).

Ralf


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-05 Thread Matthew Brett
Hi,

On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers ralf.gomm...@gmail.com wrote:



 On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg
 sebast...@sipsolutions.net wrote:
  Hey
 
  On Thu, 2013-04-04 at 14:20 -0700, Matthew Brett wrote:
  Hi,
 
  On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith n...@pobox.com wrote:
  snip
   Maybe we should go through and rename "order" to something more descriptive
   in each case, so we'd have
     a.reshape(..., index_order="C")
     a.copy(memory_order="F")
   etc.?
 
  I'd like to propose this instead:
 
  a.reshape(..., order="C")
  a.copy(layout="F")
 
 
  I actually like this, makes the point clearer that it has to do with
  memory layout and implies contiguity, plus it is short and from the
  numpy perspective copy, etc. are the ones that add additional info to
  order and not reshape (because IMO memory order is something new users
  should not worry about at first). A and K orders will still have their
  quirks with np.array and copy=True/False, but for many functions they
  are esoteric anyway.
 
  It will be one hell of a deprecation though, but I am +0.5 for adding an
  alias for now (maybe someone knows an even better name?), but I think
  that in this case, it probably really is better to wait with actual
  deprecation warnings for a few versions, since it touches a *lot* of
  code. Plus I think at the point of starting deprecation warnings (and
  best earlier) numpy should provide an automatic fixer script...
 
  The only counter point that remains for me is the difficulty of
  deprecation, since I think the new name idea is very clean. And this is
  unfortunately even more invasive than the index_order proposal.

 I completely agree that we'd have to be gentle with the change.  The
 problem we'd want to avoid is people innocently using 'layout' and
 finding to their annoyance that the code doesn't work with other
 people's numpy.

 How about:

 Step 1: 'order' remains as named keyword, 'layout' added as alias,
 comment on the lines of "layout will become the default keyword for
 this option in later versions of numpy; please consider updating any
 code that does not need to remain backwards compatible".

 Step 2: default keyword becomes 'layout' with 'order' as alias,
 comment like "order is an alias for 'layout' to maintain backwards
 compatibility with numpy <= 1.7.1; please update any code that does
 not need to maintain backwards compatibility with these numpy
 versions".

 Step 3: Add deprecation warning for 'order', "order will be removed as
 an alias in future versions of numpy".

 Step 4: (distant future) Remove alias

 ?


 A very strong -1 from me. Now we're talking about deprecation warnings and a
 backwards compatibility break after all. I thought we agreed that this was a
 very bad idea, so why are you proposing it now?

 Here's how I see it: deprecation of order is a no go. Therefore we have
 two choices here:
 1. Simply document the current order keyword better and leave it at that.
 2. Add a layout (or index_order) keyword, and live with both order and
 layout keywords forever.

 (2) is at least as confusing as (1), more work and poor design. Therefore I
 propose to go with (1).

You are saying that deprecation of 'order' at any stage in the next 10
years of numpy's lifetime is a no go?

I think that is short-sighted and I think it will damage numpy.
Believe me, I have as much investment in backward compatibility as you
do.  All the three libraries that I spend a long time maintaining need
to test against old numpy versions - but - for heaven's sake - only
back to numpy 1.2 or numpy 1.3.  We don't support Python 2.5 any more,
and I don't think we need to maintain compatibility with Numeric
either.

If you are saying that we need to maintain compatibility for 10 years
at a stretch, then we will have to accept that numpy will gradually
decay into a legacy library, because it is certain that, if we stay
static, someone else with more ambition will do a better job.

There is a cost to being averse to any change at all, no matter how
gradually it is managed.

Best,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-05 Thread Ralf Gommers
On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett matthew.br...@gmail.com wrote:

 Hi,

 On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers ralf.gomm...@gmail.com
 wrote:
 
 
 
  On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg
  sebast...@sipsolutions.net wrote:
   Hey
  
   On Thu, 2013-04-04 at 14:20 -0700, Matthew Brett wrote:
   Hi,
  
   On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith n...@pobox.com
 wrote:
   snip
    Maybe we should go through and rename "order" to something more descriptive
    in each case, so we'd have
      a.reshape(..., index_order="C")
      a.copy(memory_order="F")
    etc.?
  
   I'd like to propose this instead:
  
   a.reshape(..., order="C")
   a.copy(layout="F")
  
  
   I actually like this, makes the point clearer that it has to do with
   memory layout and implies contiguity, plus it is short and from the
   numpy perspective copy, etc. are the ones that add additional info to
   order and not reshape (because IMO memory order is something new
 users
   should not worry about at first). A and K orders will still have their
   quirks with np.array and copy=True/False, but for many functions they
   are esoteric anyway.
  
   It will be one hell of a deprecation though, but I am +0.5 for adding
 an
   alias for now (maybe someone knows an even better name?), but I think
   that in this case, it probably really is better to wait with actual
   deprecation warnings for a few versions, since it touches a *lot* of
   code. Plus I think at the point of starting deprecation warnings (and
   best earlier) numpy should provide an automatic fixer script...
  
   The only counter point that remains for me is the difficulty of
   deprecation, since I think the new name idea is very clean. And this
 is
   unfortunately even more invasive than the index_order proposal.
 
  I completely agree that we'd have to be gentle with the change.  The
  problem we'd want to avoid is people innocently using 'layout' and
  finding to their annoyance that the code doesn't work with other
  people's numpy.
 
  How about:
 
  Step 1:  'order' remains as named keyword, layout added as alias,
  comment on the lines of layout will become the default keyword for
  this option in later versions of numpy; please consider updating any
  code that does not need to remain backwards compatible'.
 
  Step 2: default keyword becomes 'layout' with 'order' as alias,
  comment like order is an alias for 'layout' to maintain backwards
  compatibility with numpy <= 1.7.1', please update any code that does
  not need to maintain backwards compatibility with these numpy
  versions'
 
  Step 3: Add deprecation warning for 'order', order will be removed as
  an alias in future versions of numpy
 
  Step 4: (distant future) Remove alias
 
  ?
 
 
  A very strong -1 from me. Now we're talking about deprecation warnings
 and a
  backwards compatibility break after all. I thought we agreed that this
 was a
  very bad idea, so why are you proposing it now?
 
  Here's how I see it: deprecation of order is a no go. Therefore we have
  two choices here:
  1. Simply document the current order keyword better and leave it at
 that.
  2. Add a layout (or index_order) keyword, and live with both order
 and
  layout keywords forever.
 
  (2) is at least as confusing as (1), more work and poor design.
 Therefore I
  propose to go with (1).

 You are saying that deprecation of 'order' at any stage in the next 10
 years of numpy's lifetime is a no go?


For something like this? Yes.


 I think that is short-sighted and I think it will damage numpy.


It will damage numpy to be conservative and not change a name for a little
bit of clarity for some people that avoids reading the docs maybe a little
more carefully? There's a lot of things that can damage numpy, but this
isn't even close in my book. Too few developers, continuous backwards
compatibility issues, faster alternative libraries surpassing numpy -
that's the kind of thing that causes damage.


 Believe me, I have as much investment in backward compatibility as you
 do.  All the three libraries that I spend a long time maintaining need
 to test against old numpy versions - but - for heaven's sake - only
 back to numpy 1.2 or numpy 1.3.  We don't support Python 2.5 any more,
 and I don't think we need to maintain compatibility with Numeric
 either.


Really? This is from 3 months ago:
http://article.gmane.org/gmane.comp.python.numeric.general/52632. It's now
2013, we are probably dropping numarray compat in 1.8. Not exactly 10
years, but of the same order.


 If you are saying that we need to maintain compatibility for 10 years
 at a stretch, then we will have to accept that numpy will gradually
 decay into a legacy library, because it is certain that, if we stay
 static, someone else with more ambition will do a better job.

 There is a cost to being averse to any change at all, no matter how
 gradually it 

[Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-05 Thread Matthew Brett
Hi,

On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers ralf.gomm...@gmail.com wrote:



 On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers ralf.gomm...@gmail.com
 wrote:
 
 
 
  On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg
  sebast...@sipsolutions.net wrote:
   Hey
  
   On Thu, 2013-04-04 at 14:20 -0700, Matthew Brett wrote:
   Hi,
  
   On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith n...@pobox.com
   wrote:
   snip
Maybe we should go through and rename order to something more
descriptive
in each case, so we'd have
  a.reshape(..., index_order=C)
  a.copy(memory_order=F)
etc.?
  
   I'd like to propose this instead:
  
   a.reshape(..., order=C)
   a.copy(layout=F)
  
  
   I actually like this, makes the point clearer that it has to do with
   memory layout and implies contiguity, plus it is short and from the
   numpy perspective copy, etc. are the ones that add additional info to
   order and not reshape (because IMO memory order is something new
   users
   should not worry about at first). A and K orders will still have
   their
   quirks with np.array and copy=True/False, but for many functions they
   are esoteric anyway.
  
   It will be one hell of a deprecation though, but I am +0.5 for adding
   an
   alias for now (maybe someone knows an even better name?), but I think
   that in this case, it probably really is better to wait with actual
   deprecation warnings for a few versions, since it touches a *lot* of
   code. Plus I think at the point of starting deprecation warnings (and
   best earlier) numpy should provide an automatic fixer script...
  
   The only counter point that remains for me is the difficulty of
   deprecation, since I think the new name idea is very clean. And this
   is
   unfortunately even more invasive than the index_order proposal.
 
  I completely agree that we'd have to be gentle with the change.  The
  problem we'd want to avoid is people innocently using 'layout' and
  finding to their annoyance that the code doesn't work with other
  people's numpy.
 
  How about:
 
  Step 1:  'order' remains as named keyword, layout added as alias,
  comment on the lines of layout will become the default keyword for
  this option in later versions of numpy; please consider updating any
  code that does not need to remain backwards compatible'.
 
  Step 2: default keyword becomes 'layout' with 'order' as alias,
  comment like order is an alias for 'layout' to maintain backwards
  compatibility with numpy <= 1.7.1', please update any code that does
  not need to maintain backwards compatibility with these numpy
  versions'
 
  Step 3: Add deprecation warning for 'order', order will be removed as
  an alias in future versions of numpy
 
  Step 4: (distant future) Remove alias
 
  ?
 
 
  A very strong -1 from me. Now we're talking about deprecation warnings
  and a
  backwards compatibility break after all. I thought we agreed that this
  was a
  very bad idea, so why are you proposing it now?
 
  Here's how I see it: deprecation of order is a no go. Therefore we
  have
  two choices here:
  1. Simply document the current order keyword better and leave it at
  that.
  2. Add a layout (or index_order) keyword, and live with both order
  and
  layout keywords forever.
 
  (2) is at least as confusing as (1), more work and poor design.
  Therefore I
  propose to go with (1).

 You are saying that deprecation of 'order' at any stage in the next 10
 years of numpy's lifetime is a no go?


 For something like this? Yes.

You are saying I think that I am wrong in thinking this is an
important change that will make numpy easier to explain and use in the
long term.

You'd probably expect me to disagree, and I do.  I think I am right in
thinking the change is important - I've tried to make that case in
this thread, as well as I can.

 I think that is short-sighted and I think it will damage numpy.


 It will damage numpy to be conservative and not change a name for a little
 bit of clarity for some people that avoids reading the docs maybe a little
 more carefully? There's a lot of things that can damage numpy, but this
 isn't even close in my book. Too few developers, continuous backwards
 compatibility issues, faster alternative libraries surpassing numpy - that's
 the kind of thing that causes damage.

We've talked about consensus on this list.  Of course it can be very
hard to achieve.

 Believe me, I have as much investment in backward compatibility as you
 do.  All the three libraries that I spend a long time maintaining need
 to test against old numpy versions - but - for heaven's sake - only
 back to numpy 1.2 or numpy 1.3.  We don't support Python 2.5 any more,
 and I don't think we need to maintain compatibility with Numeric
 either.


 Really? This is from 3 months ago:
 

Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-05 Thread Matthew Brett
Hi,

On Fri, Apr 5, 2013 at 3:09 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers ralf.gomm...@gmail.com wrote:



 On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers ralf.gomm...@gmail.com
 wrote:
 
 
 
  On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg
  sebast...@sipsolutions.net wrote:
   Hey
  
   On Thu, 2013-04-04 at 14:20 -0700, Matthew Brett wrote:
   Hi,
  
   On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith n...@pobox.com
   wrote:
   snip
Maybe we should go through and rename order to something more
descriptive
in each case, so we'd have
  a.reshape(..., index_order=C)
  a.copy(memory_order=F)
etc.?
  
   I'd like to propose this instead:
  
   a.reshape(..., order=C)
   a.copy(layout=F)
  
  
   I actually like this, makes the point clearer that it has to do with
   memory layout and implies contiguity, plus it is short and from the
   numpy perspective copy, etc. are the ones that add additional info to
   order and not reshape (because IMO memory order is something new
   users
   should not worry about at first). A and K orders will still have
   their
   quirks with np.array and copy=True/False, but for many functions they
   are esoteric anyway.
  
   It will be one hell of a deprecation though, but I am +0.5 for adding
   an
   alias for now (maybe someone knows an even better name?), but I think
   that in this case, it probably really is better to wait with actual
   deprecation warnings for a few versions, since it touches a *lot* of
   code. Plus I think at the point of starting deprecation warnings (and
   best earlier) numpy should provide an automatic fixer script...
  
   The only counter point that remains for me is the difficulty of
   deprecation, since I think the new name idea is very clean. And this
   is
   unfortunately even more invasive than the index_order proposal.
 
  I completely agree that we'd have to be gentle with the change.  The
  problem we'd want to avoid is people innocently using 'layout' and
  finding to their annoyance that the code doesn't work with other
  people's numpy.
 
  How about:
 
  Step 1:  'order' remains as named keyword, layout added as alias,
  comment on the lines of layout will become the default keyword for
  this option in later versions of numpy; please consider updating any
  code that does not need to remain backwards compatible'.
 
  Step 2: default keyword becomes 'layout' with 'order' as alias,
  comment like order is an alias for 'layout' to maintain backwards
  compatibility with numpy <= 1.7.1', please update any code that does
  not need to maintain backwards compatibility with these numpy
  versions'
 
  Step 3: Add deprecation warning for 'order', order will be removed as
  an alias in future versions of numpy
 
  Step 4: (distant future) Remove alias
 
  ?
 
 
  A very strong -1 from me. Now we're talking about deprecation warnings
  and a
  backwards compatibility break after all. I thought we agreed that this
  was a
  very bad idea, so why are you proposing it now?
 
  Here's how I see it: deprecation of order is a no go. Therefore we
  have
  two choices here:
  1. Simply document the current order keyword better and leave it at
  that.
  2. Add a layout (or index_order) keyword, and live with both order
  and
  layout keywords forever.
 
  (2) is at least as confusing as (1), more work and poor design.
  Therefore I
  propose to go with (1).

 You are saying that deprecation of 'order' at any stage in the next 10
 years of numpy's lifetime is a no go?


 For something like this? Yes.

 You are saying I think that I am wrong in thinking this is an
 important change that will make numpy easier to explain and use in the
 long term.

 You'd probably expect me to disagree, and I do.  I think I am right in
 thinking the change is important - I've tried to make that case in
 this thread, as well as I can.

 I think that is short-sighted and I think it will damage numpy.


 It will damage numpy to be conservative and not change a name for a little
 bit of clarity for some people that avoids reading the docs maybe a little
 more carefully? There's a lot of things that can damage numpy, but this
 isn't even close in my book. Too few developers, continuous backwards
 compatibility issues, faster alternative libraries surpassing numpy - that's
 the kind of thing that causes damage.

 We've talked about consensus on this list.  Of course it can be very
 hard to achieve.

 Believe me, I have as much investment in backward compatibility as you
 do.  All the three libraries that I spend a long time maintaining need
 to test against old numpy versions - but - for heaven's sake - only
 back to numpy 1.2 or numpy 1.3.  We don't support Python 2.5 any more,
 and I don't think we need to maintain 

Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-05 Thread Matthew Brett
Hi,

On Fri, Apr 5, 2013 at 4:27 PM,  josef.p...@gmail.com wrote:
 On Fri, Apr 5, 2013 at 6:09 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers ralf.gomm...@gmail.com wrote:



 On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers ralf.gomm...@gmail.com
 wrote:
 
 
 
  On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg
  sebast...@sipsolutions.net wrote:
   Hey
snip
  I completely agree that we'd have to be gentle with the change.  The
  problem we'd want to avoid is people innocently using 'layout' and
  finding to their annoyance that the code doesn't work with other
  people's numpy.
 
  How about:
 
  Step 1:  'order' remains as named keyword, layout added as alias,
  comment on the lines of layout will become the default keyword for
  this option in later versions of numpy; please consider updating any
  code that does not need to remain backwards compatible'.
 
  Step 2: default keyword becomes 'layout' with 'order' as alias,
  comment like order is an alias for 'layout' to maintain backwards
  compatibility with numpy <= 1.7.1', please update any code that does
  not need to maintain backwards compatibility with these numpy
  versions'
 
  Step 3: Add deprecation warning for 'order', order will be removed as
  an alias in future versions of numpy
 
  Step 4: (distant future) Remove alias
 
  ?
 
 
  A very strong -1 from me. Now we're talking about deprecation warnings
  and a
  backwards compatibility break after all. I thought we agreed that this
  was a
  very bad idea, so why are you proposing it now?
 
  Here's how I see it: deprecation of order is a no go. Therefore we
  have
  two choices here:
  1. Simply document the current order keyword better and leave it at
  that.
  2. Add a layout (or index_order) keyword, and live with both order
  and
  layout keywords forever.
 
  (2) is at least as confusing as (1), more work and poor design.
  Therefore I
  propose to go with (1).

 You are saying that deprecation of 'order' at any stage in the next 10
 years of numpy's lifetime is a no go?


 For something like this? Yes.

 You are saying I think that I am wrong in thinking this is an
 important change that will make numpy easier to explain and use in the
 long term.

 You'd probably expect me to disagree, and I do.  I think I am right in
 thinking the change is important - I've tried to make that case in
 this thread, as well as I can.

 I think that is short-sighted and I think it will damage numpy.


 It will damage numpy to be conservative and not change a name for a little
 bit of clarity for some people that avoids reading the docs maybe a little
 more carefully? There's a lot of things that can damage numpy, but this
 isn't even close in my book. Too few developers, continuous backwards
 compatibility issues, faster alternative libraries surpassing numpy - that's
 the kind of thing that causes damage.

 We've talked about consensus on this list.  Of course it can be very
 hard to achieve.

 So far the consensus is that the documentation needs improvement.

The only thing all of the No camp agree with is documentation
improvement, I think that's fair.

 After that ???

Well I think we have:

Flat-no - the change is not important, almost any cost is too high

You
Ralf
Bradley

Mid-no - maybe something could work, but not sure we've seen it yet.

Chris

Middle - current situation can be confusing, maybe one of the proposed
solutions would be acceptable

Sebastian
Nathaniel

Mid-yes - previous apparent vote for argument name change

Éric Depagne
Andrew Jaffe   (sorry if I misrepresent you)

And then me.

I am trying to be balanced.  Unlike others, I think better names would
have a significant impact on how coherent numpy is to explain and use.
 It seems to me that a change would be beneficial in the long term,
and I'm confident we can agree on a schedule for that change that
would be acceptable.  But you know that.

So - as I understand our 'model' - our job is to try and come to some
shared agreement, if we possibly can.

It has been good and encouraging for me at least to see that we have
developed our ideas over the course of this thread.

Cheers,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-05 Thread josef . pktd
On Fri, Apr 5, 2013 at 9:50 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Fri, Apr 5, 2013 at 4:27 PM,  josef.p...@gmail.com wrote:
 On Fri, Apr 5, 2013 at 6:09 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers ralf.gomm...@gmail.com 
 wrote:



 On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers ralf.gomm...@gmail.com
 wrote:
 
 
 
  On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg
  sebast...@sipsolutions.net wrote:
   Hey
 snip
  I completely agree that we'd have to be gentle with the change.  The
  problem we'd want to avoid is people innocently using 'layout' and
  finding to their annoyance that the code doesn't work with other
  people's numpy.
 
  How about:
 
  Step 1:  'order' remains as named keyword, layout added as alias,
  comment on the lines of layout will become the default keyword for
  this option in later versions of numpy; please consider updating any
  code that does not need to remain backwards compatible'.
 
  Step 2: default keyword becomes 'layout' with 'order' as alias,
  comment like order is an alias for 'layout' to maintain backwards
  compatibility with numpy <= 1.7.1', please update any code that does
  not need to maintain backwards compatibility with these numpy
  versions'
 
  Step 3: Add deprecation warning for 'order', order will be removed as
  an alias in future versions of numpy
 
  Step 4: (distant future) Remove alias
 
  ?
 
 
  A very strong -1 from me. Now we're talking about deprecation warnings
  and a
  backwards compatibility break after all. I thought we agreed that this
  was a
  very bad idea, so why are you proposing it now?
 
  Here's how I see it: deprecation of order is a no go. Therefore we
  have
  two choices here:
  1. Simply document the current order keyword better and leave it at
  that.
  2. Add a layout (or index_order) keyword, and live with both order
  and
  layout keywords forever.
 
  (2) is at least as confusing as (1), more work and poor design.
  Therefore I
  propose to go with (1).

 You are saying that deprecation of 'order' at any stage in the next 10
 years of numpy's lifetime is a no go?


 For something like this? Yes.

 You are saying I think that I am wrong in thinking this is an
 important change that will make numpy easier to explain and use in the
 long term.

 You'd probably expect me to disagree, and I do.  I think I am right in
 thinking the change is important - I've tried to make that case in
 this thread, as well as I can.

 I think that is short-sighted and I think it will damage numpy.


 It will damage numpy to be conservative and not change a name for a little
 bit of clarity for some people that avoids reading the docs maybe a little
 more carefully? There's a lot of things that can damage numpy, but this
 isn't even close in my book. Too few developers, continuous backwards
 compatibility issues, faster alternative libraries surpassing numpy - 
 that's
 the kind of thing that causes damage.

 We've talked about consensus on this list.  Of course it can be very
 hard to achieve.

 So far the consensus is that the documentation needs improvement.

 The only thing all of the No camp agree with is documentation
 improvement, I think that's fair.

 After that ???

 Well I think we have:

 Flat-no - the change is not important, almost any cost is too high

It's not *any* cost, this goes deep and wide, it's one of the basic
concepts of numpy that you want to rename.

Note, I'm just a user of numpy
My main objection was to N and Z, which would have affected me
(and statsmodels developers)

I don't really care about the layout change. I have no or almost no
code depending on it. And, I don't have to implement it, nor do I have
to struggle with the low level numpy behavior that would be affected
by this. (And renaming doesn't change the concept.)

Josef



 You
 Ralf
 Bradley

 Mid-no - maybe something could work, but not sure we've seen it yet.

 Chris

 Middle - current situation can be confusing, maybe one of the proposed
 solutions would be acceptable

 Sebastian
 Nathaniel

 Mid-yes - previous apparent vote for argument name change

 Éric Depagne
 Andrew Jaffe   (sorry if I misrepresent you)

 And then me.

 I am trying to be balanced.  Unlike others, I think better names would
 have a significant impact on how coherent numpy is to explain and use.
  It seems to me that a change would be beneficial in the long term,
 and I'm confident we can agree on a schedule for that change that
 would be acceptable.  But you know that.

 So - as I understand our 'model' - our job is to try and come to some
 shared agreement, if we possibly can.

 It has been good and encouraging for me at least to see that we have
 developed our ideas over the course of this thread.

 Cheers,

 Matthew
 

Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-05 Thread Matthew Brett
Hi,

On Fri, Apr 5, 2013 at 7:39 PM,  josef.p...@gmail.com wrote:
 On Fri, Apr 5, 2013 at 9:50 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Fri, Apr 5, 2013 at 4:27 PM,  josef.p...@gmail.com wrote:
 On Fri, Apr 5, 2013 at 6:09 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers ralf.gomm...@gmail.com 
 wrote:



 On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers ralf.gomm...@gmail.com
 wrote:
 
 
 
  On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg
  sebast...@sipsolutions.net wrote:
   Hey
 snip
  I completely agree that we'd have to be gentle with the change.  The
  problem we'd want to avoid is people innocently using 'layout' and
  finding to their annoyance that the code doesn't work with other
  people's numpy.
 
  How about:
 
  Step 1:  'order' remains as named keyword, layout added as alias,
  comment on the lines of layout will become the default keyword for
  this option in later versions of numpy; please consider updating any
  code that does not need to remain backwards compatible'.
 
  Step 2: default keyword becomes 'layout' with 'order' as alias,
  comment like order is an alias for 'layout' to maintain backwards
  compatibility with numpy <= 1.7.1', please update any code that does
  not need to maintain backwards compatibility with these numpy
  versions'
 
  Step 3: Add deprecation warning for 'order', order will be removed as
  an alias in future versions of numpy
 
  Step 4: (distant future) Remove alias
 
  ?
 
 
  A very strong -1 from me. Now we're talking about deprecation warnings
  and a
  backwards compatibility break after all. I thought we agreed that this
  was a
  very bad idea, so why are you proposing it now?
 
  Here's how I see it: deprecation of order is a no go. Therefore we
  have
  two choices here:
  1. Simply document the current order keyword better and leave it at
  that.
  2. Add a layout (or index_order) keyword, and live with both 
  order
  and
  layout keywords forever.
 
  (2) is at least as confusing as (1), more work and poor design.
  Therefore I
  propose to go with (1).

 You are saying that deprecation of 'order' at any stage in the next 10
 years of numpy's lifetime is a no go?


 For something like this? Yes.

 You are saying I think that I am wrong in thinking this is an
 important change that will make numpy easier to explain and use in the
 long term.

 You'd probably expect me to disagree, and I do.  I think I am right in
 thinking the change is important - I've tried to make that case in
 this thread, as well as I can.

 I think that is short-sighted and I think it will damage numpy.


 It will damage numpy to be conservative and not change a name for a little
 bit of clarity for some people that avoids reading the docs maybe a little
 more carefully? There's a lot of things that can damage numpy, but this
 isn't even close in my book. Too few developers, continuous backwards
 compatibility issues, faster alternative libraries surpassing numpy - 
 that's
 the kind of thing that causes damage.

 We've talked about consensus on this list.  Of course it can be very
 hard to achieve.

 So far the consensus is that the documentation needs improvement.

 The only thing all of the No camp agree with is documentation
 improvement, I think that's fair.

 After that ???

 Well I think we have:

 Flat-no - the change is not important, almost any cost is too high

 It's not *any* cost, this goes deep and wide, it's one of the basic
 concepts of numpy that you want to rename.

The proposal I last made was to change the default name to 'layout'
after some period to be agreed - say - P - with suitable warning in
the docstring up until that time, and after, and leave 'order' as an
alias forever.

The only problem I can see with this, is that if someone, after period
P, does not read the docstring, and uses 'layout' instead of 'order',
then they will find that their code is not backwards compatible with
versions of numpy of greater age than P. They can fix this, forever,
by reverting to 'order'.  That's certainly not zero cost, but it's not
much cost either, and the cost will depend on P.
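
To make the 'alias forever' idea concrete, here is a minimal sketch in
plain Python. It is only an illustration: resolve_layout is a
hypothetical helper, not anything in numpy, and it assumes the two
spellings are meant to be exact synonyms with 'C' as the default.

```python
_UNSET = object()  # sentinel: tells "keyword not passed" apart from any real value


def resolve_layout(layout=_UNSET, order=_UNSET):
    """Return the memory-layout flag from either spelling of the keyword.

    Hypothetical helper, not part of numpy. 'layout' is the preferred
    name; 'order' stays accepted as an alias with no warning.
    """
    if layout is not _UNSET and order is not _UNSET:
        # Passing both spellings is ambiguous, so refuse rather than guess.
        raise TypeError("pass either 'layout' or 'order', not both")
    if layout is not _UNSET:
        return layout
    if order is not _UNSET:
        return order
    return "C"  # assumed default when neither keyword is given
```

Code written against the old spelling keeps working unchanged; only
code that uses 'layout' is tied to the newer numpy versions.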

 Note, I'm just a user of numpy
 My main objection was to N and Z, which would have affected me
 (and statsmodels developers)

Right.

 I don't really care about the layout change. I have no or almost no
 code depending on it. And, I don't have to implement it, nor do I have
 to struggle with the low level numpy behavior that would be affected
 by this. (And renaming doesn't change the concept.)

No, right, the renaming is to clarify and distinguish the concepts.
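
For anyone skimming the thread, the two concepts can be seen with the
current API, where both go under the name 'order'. This is a small
sketch using only existing numpy calls (ravel, copy, flags); the
variable names are just for illustration.

```python
import numpy as np

a = np.arange(6).reshape((2, 3))   # [[0, 1, 2], [3, 4, 5]], C-contiguous

# In ravel/reshape, 'order' picks the index order used to read elements.
c_read = a.ravel(order="C")        # last index varies fastest
f_read = a.ravel(order="F")        # first index varies fastest

# In copy, 'order' picks the memory layout of the result. The value
# found at each index is the same as in the original array.
f_copy = a.copy(order="F")

# Reading the Fortran-layout copy in C index order gives the same
# sequence as reading the original, since index order ignores layout.
c_read_of_f_copy = f_copy.ravel(order="C")
```

Here f_read differs from c_read, while f_copy compares equal to a
element by element and differs only in its contiguity flags.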

Cheers,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-05 Thread josef . pktd
On Fri, Apr 5, 2013 at 10:47 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Fri, Apr 5, 2013 at 7:39 PM,  josef.p...@gmail.com wrote:
 On Fri, Apr 5, 2013 at 9:50 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Fri, Apr 5, 2013 at 4:27 PM,  josef.p...@gmail.com wrote:
 On Fri, Apr 5, 2013 at 6:09 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers ralf.gomm...@gmail.com 
 wrote:



 On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers ralf.gomm...@gmail.com
 wrote:
 
 
 
  On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett 
  matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg
  sebast...@sipsolutions.net wrote:
   Hey
 snip
  I completely agree that we'd have to be gentle with the change.  The
  problem we'd want to avoid is people innocently using 'layout' and
  finding to their annoyance that the code doesn't work with other
  people's numpy.
 
  How about:
 
  Step 1:  'order' remains as named keyword, layout added as alias,
  comment on the lines of layout will become the default keyword for
  this option in later versions of numpy; please consider updating any
  code that does not need to remain backwards compatible'.
 
  Step 2: default keyword becomes 'layout' with 'order' as alias,
  comment like order is an alias for 'layout' to maintain backwards
  compatibility with numpy <= 1.7.1', please update any code that does
  not need to maintain backwards compatibility with these numpy
  versions'
 
  Step 3: Add deprecation warning for 'order', order will be removed 
  as
  an alias in future versions of numpy
 
  Step 4: (distant future) Remove alias
 
  ?
 
 
  A very strong -1 from me. Now we're talking about deprecation warnings
  and a
  backwards compatibility break after all. I thought we agreed that this
  was a
  very bad idea, so why are you proposing it now?
 
  Here's how I see it: deprecation of order is a no go. Therefore we
  have
  two choices here:
  1. Simply document the current order keyword better and leave it at
  that.
  2. Add a layout (or index_order) keyword, and live with both 
  order
  and
  layout keywords forever.
 
  (2) is at least as confusing as (1), more work and poor design.
  Therefore I
  propose to go with (1).

 You are saying that deprecation of 'order' at any stage in the next 10
 years of numpy's lifetime is a no go?


 For something like this? Yes.

 You are saying I think that I am wrong in thinking this is an
 important change that will make numpy easier to explain and use in the
 long term.

 You'd probably expect me to disagree, and I do.  I think I am right in
 thinking the change is important - I've tried to make that case in
 this thread, as well as I can.

 I think that is short-sighted and I think it will damage numpy.


 It will damage numpy to be conservative and not change a name for a 
 little
 bit of clarity for some people that avoids reading the docs maybe a 
 little
 more carefully? There's a lot of things that can damage numpy, but this
 isn't even close in my book. Too few developers, continuous backwards
 compatibility issues, faster alternative libraries surpassing numpy - 
 that's
 the kind of thing that causes damage.

 We've talked about consensus on this list.  Of course it can be very
 hard to achieve.

 So far the consensus is that the documentation needs improvement.

 The only thing all of the No camp agree with is documentation
 improvement, I think that's fair.

 After that ???

 Well I think we have:

 Flat-no - the change is not important, almost any cost is too high

 It's not *any* cost, this goes deep and wide, it's one of the basic
 concepts of numpy that you want to rename.

 The proposal I last made was to change the default name to 'layout'
 after some period to be agreed - say - P - with suitable warning in
 the docstring up until that time, and after, and leave 'order' as an
 alias forever.

 The only problem I can see with this, is that if someone, after period
 P, does not read the docstring, and uses 'layout' instead of 'order',
 then they will find that their code is not backwards compatible with
 versions of numpy of greater age than P. They can fix this, forever,
 by reverting to 'order'.  That's certainly not zero cost, but it's not
 much cost either, and the cost will depend on P.

You edit large parts of the numpy tutorial and explanations,
you add a second keyword to (rough guess) 10 functions and
a similar number of methods (even wilder guess), the methods
are in C, so you have to change it both on the c and the python
level.
Two keywords will confuse users for a long time
(and which one is in the tutorial documentation)

I'm just guessing and I have no idea about the c-level.

Josef



 Note, I'm just a user of numpy
 My main objection was to N and Z, which would have affected me
 (and statsmodels developers)

 Right.

 

Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-05 Thread Matthew Brett
Hi,

On Fri, Apr 5, 2013 at 8:31 PM,  josef.p...@gmail.com wrote:
 On Fri, Apr 5, 2013 at 10:47 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Fri, Apr 5, 2013 at 7:39 PM,  josef.p...@gmail.com wrote:
 On Fri, Apr 5, 2013 at 9:50 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Fri, Apr 5, 2013 at 4:27 PM,  josef.p...@gmail.com wrote:
 On Fri, Apr 5, 2013 at 6:09 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers ralf.gomm...@gmail.com 
 wrote:



 On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers ralf.gomm...@gmail.com
 wrote:
 
 
 
  On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett 
  matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg
  sebast...@sipsolutions.net wrote:
   Hey
 snip
  I completely agree that we'd have to be gentle with the change.  The
  problem we'd want to avoid is people innocently using 'layout' and
  finding to their annoyance that the code doesn't work with other
  people's numpy.
 
  How about:
 
  Step 1:  'order' remains as named keyword, 'layout' added as alias,
  comment on the lines of "layout will become the default keyword for
  this option in later versions of numpy; please consider updating any
  code that does not need to remain backwards compatible".
 
  Step 2: default keyword becomes 'layout' with 'order' as alias,
  comment like "order is an alias for 'layout' to maintain backwards
  compatibility with numpy <= 1.7.1; please update any code that does
  not need to maintain backwards compatibility with these numpy
  versions".
 
  Step 3: Add deprecation warning for 'order': "order will be removed
  as an alias in future versions of numpy".
 
  Step 4: (distant future) Remove alias
 
  ?
 
 
  A very strong -1 from me. Now we're talking about deprecation 
  warnings
  and a
  backwards compatibility break after all. I thought we agreed that 
  this
  was a
  very bad idea, so why are you proposing it now?
 
  Here's how I see it: deprecation of order is a no go. Therefore we
  have
  two choices here:
  1. Simply document the current order keyword better and leave it at
  that.
  2. Add a layout (or index_order) keyword, and live with both 
  order
  and
  layout keywords forever.
 
  (2) is at least as confusing as (1), more work and poor design.
  Therefore I
  propose to go with (1).

 You are saying that deprecation of 'order' at any stage in the next 10
 years of numpy's lifetime is a no go?


 For something like this? Yes.

 You are saying I think that I am wrong in thinking this is an
 important change that will make numpy easier to explain and use in the
 long term.

 You'd probably expect me to disagree, and I do.  I think I am right in
 thinking the change is important - I've tried to make that case in
 this thread, as well as I can.

 I think that is short-sighted and I think it will damage numpy.


 It will damage numpy to be conservative and not change a name for a 
 little
 bit of clarity for some people that avoids reading the docs maybe a 
 little
 more carefully? There's a lot of things that can damage numpy, but this
 isn't even close in my book. Too few developers, continuous backwards
 compatibility issues, faster alternative libraries surpassing numpy - 
 that's
 the kind of thing that causes damage.

 We've talked about consensus on this list.  Of course it can be very
 hard to achieve.

 So far the consensus is that the documentation needs improvement.

 The only thing all of the No camp agree with is documentation
 improvement, I think that's fair.

 After that ???

 Well I think we have:

 Flat-no - the change not important, almost any cost is too high

 It's not *any* cost, this goes deep and wide, it's one of the basic
 concepts of numpy that you want to rename.

 The proposal I last made was to change the default name to 'layout'
 after some period to be agreed - say - P - with suitable warning in
 the docstring up until that time, and after, and leave 'order' as an
 alias forever.

 The only problem I can see with this, is that if someone, after period
 P, does not read the docstring, and uses 'layout' instead of 'order',
 then they will find that their code is not backwards compatible with
 versions of numpy of greater age than P. They can fix this, forever,
 by reverting to 'order'.  That's certainly not zero cost, but it's not
 much cost either, and the cost will depend on P.

 You edit large parts of the numpy tutorial and explanations,

We agree that these concepts need to be clarified in the explanations.

For the docs, we would first add the keyword as an alias and note it so.

 you add a second keyword to (rough guess) 10 functions and
 a similar number of methods (even wilder guess), the methods
 are in C, so you have to change it both on the c and the python
 level.

I'm OK to do the code changes, I don't think that's a concern at the

Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-04 Thread Chris Barker - NOAA Federal
On Wed, Apr 3, 2013 at 6:13 PM, Matthew Brett matthew.br...@gmail.com wrote:
 We all agree that 'order' is used with two different and orthogonal
 meanings in numpy.

well, not entirely orthogonal -- they are the same concept, used in
different contexts, so there is some benefit to their having
similarity. So I'd advocate for using the same flag names in any case
-- i.e. C and F in both cases.

 I think we are now more or less agreeing that:

 np.reshape(a, (3, 4), index_order='F')

 is at least as clear as:

 np.reshape(a, (3, 4), order='F')

sure.

The trick is:

np.reshape(a, (3, 4), index_order='A')

which is mingling index_order and memory order.
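
A minimal sketch of why 'A' mixes the two ideas, assuming a small 2x3
integer array (the array names are only for illustration). The two inputs
compare equal element by element, yet ravel with order='A' returns the
values in a different sequence depending on how each input is laid out in
memory, while 'C' gives the same answer for both.

```python
import numpy as np

c_arr = np.arange(6).reshape(2, 3)   # C-contiguous memory
f_arr = np.asfortranarray(c_arr)     # same values, Fortran-contiguous memory

# Element by element the two arrays are identical.
same_values = np.array_equal(c_arr, f_arr)

# 'C' is a pure index order: the result ignores the memory layout.
c_from_c = np.ravel(c_arr, order='C')
c_from_f = np.ravel(f_arr, order='C')

# 'A' chooses the index order from the memory layout of the input.
a_from_c = np.ravel(c_arr, order='A')
a_from_f = np.ravel(f_arr, order='A')

print(a_from_c)
print(a_from_f)
```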

 I believe our job here is to come to some consensus.

yup.

 In that spirit, I think we do agree on these statements above.

with the caveats I just added...

 Now we have the cost / benefit.

 Benefit : Some people may find it easier to understand numpy when
 these constructs are separated.

 Cost : There might be some confusion because we have changed the
 default keywords.

 Benefit
 ---

 What proportion of people would find it easier to understand with the
 order constructs separated?

It's not just numbers -- it's depth of confusion -- if, once you get
it, you remember it for the rest of your numpy use, then it's not big
deal. However, if you need to re-think and test every time you
re-visit reshape or ravel, then there's a significant benefit.

We are talking about separating the concepts, but I think it takes
more than a keyword change to do that -- the 'A' and 'K' flags mingle
the concepts, and are going to be confusing with new keywords -- maybe
even more so (it says index_order, but the docstring talks about
memory order)

Does anyone think we should deprecate the 'A' and 'K' flags?

Before you answer that -- does anyone see a use case for the 'A' and
'K' flags that can't be reasonably easily accomplished with .view() or
asarray() or ???

if we get rid of the 'A' and 'K' flags, I think the docstring
will be clearer, and there may be less need for two names for the
different order concepts (though we could change the flags and the
keywords...)

 The ravel docstring would look something like this:

 index_order : {'C','F', 'A', 'K'}, optional
 ...   This keyword used to be called simply 'order', and you can
 also use the keyword 'order' to specify index_order (this parameter).

 The problem would then be that, for a while, there will be older code
 and docs using 'order' instead of 'index_order'.  I think this would
 not cause much trouble.  Reading the docstring will explain the
 change.  The old code will continue to work.

not a killer, I agree.
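
For illustration only, the alias idea could be sketched in pure Python as
below. The wrapper name ravel_compat is hypothetical and is not part of
numpy; the real change would have to be made in numpy's own (partly C)
argument parsing. The sketch only shows that old code using 'order' and
new code using 'index_order' would give the same result.

```python
import numpy as np

def ravel_compat(a, order=None, index_order=None):
    """Hypothetical wrapper: accept 'index_order' as an alias for 'order'."""
    if order is not None and index_order is not None:
        raise TypeError("pass only one of 'order' and 'index_order'")
    chosen = index_order if index_order is not None else order
    if chosen is None:
        chosen = 'C'  # numpy's current default for ravel
    return np.ravel(a, order=chosen)

a = np.arange(6).reshape(2, 3)
old_style = ravel_compat(a, order='F')        # existing code keeps working
new_style = ravel_compat(a, index_order='F')  # proposed spelling
print(old_style, new_style)
```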

-Chris

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR             (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-04 Thread Matthew Brett
Hi,

On Thu, Apr 4, 2013 at 11:45 AM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Thu, Apr 4, 2013 at 9:21 AM, Chris Barker - NOAA Federal
 chris.bar...@noaa.gov wrote:
 On Wed, Apr 3, 2013 at 6:13 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 We all agree that 'order' is used with two different and orthogonal
 meanings in numpy.

 Brief thank you for your helpful and thoughtful discussion.

 well, not entirely orthogonal -- they are the some concept, used in
 different contexts,

 Here's a further clarification, in the hope that it is helpful:

 Input and output index orderings are orthogonal - I can read the data
 with C index ordering and return an array that is index ordered
 any-old-how.

 F and C are used in the sense of F contiguous and C contiguous - where
 contiguous is not the same concept as index ordering.

 So I think it's hard to say these concepts are not orthogonal, simply
 in the technical sense that order='F' could mean:

 * read my data using F-style index ordering
 * return my data in an array using F-style index ordering
 * (related to above) return my data in F-contiguous memory layout

Sorry this is not well-put and should increase confusion rather than
decrease it.  I'll try again if I may.

What do we mean by 'Fortran' 'order'?

Two things :

* np.array(a, order='F') - Fortran contiguous : the array memory is
contiguous, the strides vector is strictly increasing
* np.ravel(a, order='F') - first-to-last index ordering used to
recover values from the array

They are related in the sense that Fortran contiguous layout in memory
means that returning the elements as stored in memory gives the same
answer as first to last index ordering.  They are different in the
sense that first-to-last index ordering applies to any memory layout -
is orthogonal to memory layout.   In particular 'contiguous' has no
meaning for first-to-last or last-to-first index ordering.
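
A small sketch of the two meanings, assuming a 2x3 integer array (variable
names are illustrative). np.array(a, order='F') changes only the memory
layout, so values and indexing stay the same and only the strides differ.
np.ravel(a, order='F') changes the sequence in which values are read out,
and gives the same answer whatever the memory layout is.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
item = a.itemsize

# Meaning 1: memory layout. Same values at the same indices, new strides.
f_layout = np.array(a, order='F')
print(a.strides, f_layout.strides)

# Meaning 2: index order used to read the values out.
first_to_last = np.ravel(a, order='F')   # first index changes fastest
last_to_first = np.ravel(a, order='C')   # last index changes fastest

# The index order does not depend on the memory layout of the input.
from_f_layout = np.ravel(f_layout, order='F')
print(first_to_last, last_to_first, from_f_layout)
```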

So - to restate in other words - this :

np.reshape(a, (3, 4), order='F')

could reasonably mean one of two orthogonal things

1) Retrieve data from the array using first-to-last indexing, return
any memory layout you like
2) Retrieve data from the array using the default last-to-first index
ordering, and return memory in F-contiguous layout
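
To make the two readings concrete, here is a sketch that assumes a 1-D
input of 12 integers. Reading 1 is what np.reshape does with order='F'.
Reading 2 is not something reshape offers; it is spelled out by hand with
np.asfortranarray purely to show that it would give different values at
the same indices.

```python
import numpy as np

a = np.arange(12)

# Reading 1 (numpy's behaviour): values are placed with the first index
# changing fastest.
reading_1 = np.reshape(a, (3, 4), order='F')

# Reading 2, written out by hand: default (last-index-fastest) placement,
# then force an F-contiguous memory layout.
reading_2 = np.asfortranarray(np.reshape(a, (3, 4)))

print(reading_1)
print(reading_2)
```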

Cheers,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-04 Thread josef . pktd
On Thu, Apr 4, 2013 at 3:40 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Thu, Apr 4, 2013 at 11:45 AM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Thu, Apr 4, 2013 at 9:21 AM, Chris Barker - NOAA Federal
 chris.bar...@noaa.gov wrote:
 On Wed, Apr 3, 2013 at 6:13 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 We all agree that 'order' is used with two different and orthogonal
 meanings in numpy.

 Brief thank you for your helpful and thoughtful discussion.

 well, not entirely orthogonal -- they are the some concept, used in
 different contexts,

 Here's a further clarification, in the hope that it is helpful:

 Input and output index orderings are orthogonal - I can read the data
 with C index ordering and return an array that is index ordered
 any-old-how.

 F and C are used in the sense of F contiguous and C contiguous - where
 contiguous is not the same concept as index ordering.

 So I think it's hard to say these concepts are not orthogonal, simply
 in the technical sense that order='F could mean:

 * read my data using F-style index ordering
 * return my data in an array using F-style index ordering
 * (related to above) return my data in F-contiguous memory layout

 Sorry this is not well-put and should increase confusion rather than
 decrease it.  I'll try again if I may.

 What do we mean by 'Fortran' 'order'.

 Two things :

 * np.array(a, order='F') - Fortran contiguous : the array memory is
 contiguous, the strides vector is strictly increasing
 * np.ravel(a, order='F') - first-to-last index ordering used to
 recover values from the array

 They are related in the sense that Fortran contiguous layout in memory
 means that returning the elements as stored in memory gives the same
 answer as first to last index ordering.  They are different in the
 sense that first-to-last index ordering applies to any memory layout -
 is orthogonal to memory layout.   In particular 'contiguous' has no
 meaning for first-to-last or last-to-first index ordering.

 So - to restate in other words - this :

 np.reshape(a, (3, 4), order='F')

 could reasonably mean one of two orthogonal things

 1) Retrieve data from the array using first-to-last indexing, return
 any memory layout you like
 2) Retrieve data from the array using the default last-to-first index
 ordering, and return memory in F-contiguous layout

no to interpretation 2)
reshape and ravel (in contrast to flatten) just return a view (if possible)
(possibly with some strange strides)

docstring:

numpy.reshape(a, newshape, order='C')
Gives a new shape to an array without changing its data


functions that return views versus functions that create new arrays
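
The view-versus-copy distinction can be checked directly. This is a
sketch on a contiguous 1-D array, where reshape and ravel can both return
views; flatten always returns a copy.

```python
import numpy as np

a = np.arange(6)
reshaped = a.reshape(2, 3)       # view on a
raveled = reshaped.ravel()       # view again, because the data is contiguous
flattened = reshaped.flatten()   # always a new array

# Writing through the view changes the original and the other view,
# but not the copy made by flatten.
reshaped[0, 0] = 99
print(a[0], raveled[0], flattened[0])
```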

Josef


 Cheers,

 Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-04 Thread Matthew Brett
Hi,

On Thu, Apr 4, 2013 at 12:54 PM,  josef.p...@gmail.com wrote:
 On Thu, Apr 4, 2013 at 3:40 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Thu, Apr 4, 2013 at 11:45 AM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Thu, Apr 4, 2013 at 9:21 AM, Chris Barker - NOAA Federal
 chris.bar...@noaa.gov wrote:
 On Wed, Apr 3, 2013 at 6:13 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 We all agree that 'order' is used with two different and orthogonal
 meanings in numpy.

 Brief thank you for your helpful and thoughtful discussion.

 well, not entirely orthogonal -- they are the some concept, used in
 different contexts,

 Here's a further clarification, in the hope that it is helpful:

 Input and output index orderings are orthogonal - I can read the data
 with C index ordering and return an array that is index ordered
 any-old-how.

 F and C are used in the sense of F contiguous and C contiguous - where
 contiguous is not the same concept as index ordering.

 So I think it's hard to say these concepts are not orthogonal, simply
 in the technical sense that order='F could mean:

 * read my data using F-style index ordering
 * return my data in an array using F-style index ordering
 * (related to above) return my data in F-contiguous memory layout

 Sorry this is not well-put and should increase confusion rather than
 decrease it.  I'll try again if I may.

 What do we mean by 'Fortran' 'order'.

 Two things :

 * np.array(a, order='F') - Fortran contiguous : the array memory is
 contiguous, the strides vector is strictly increasing
 * np.ravel(a, order='F') - first-to-last index ordering used to
 recover values from the array

 They are related in the sense that Fortran contiguous layout in memory
 means that returning the elements as stored in memory gives the same
 answer as first to last index ordering.  They are different in the
 sense that first-to-last index ordering applies to any memory layout -
 is orthogonal to memory layout.   In particular 'contiguous' has no
 meaning for first-to-last or last-to-first index ordering.

 So - to restate in other words - this :

 np.reshape(a, (3, 4), order='F')

 could reasonably mean one of two orthogonal things

 1) Retrieve data from the array using first-to-last indexing, return
 any memory layout you like
 2) Retrieve data from the array using the default last-to-first index
 ordering, and return memory in F-contiguous layout

 no to interpretation 2)
 reshape and ravel (in contrast to flatten) just return a view (if possible)
 (with possible some strange strides)

'No' meaning what?  That it is not possible that it could mean that?
Obviously we're not arguing about whether it does mean that, we're
arguing about whether such an interpretation would make sense.

Cheers,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-04 Thread josef . pktd
On Thu, Apr 4, 2013 at 4:02 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Thu, Apr 4, 2013 at 12:54 PM,  josef.p...@gmail.com wrote:
 On Thu, Apr 4, 2013 at 3:40 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Thu, Apr 4, 2013 at 11:45 AM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Thu, Apr 4, 2013 at 9:21 AM, Chris Barker - NOAA Federal
 chris.bar...@noaa.gov wrote:
 On Wed, Apr 3, 2013 at 6:13 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 We all agree that 'order' is used with two different and orthogonal
 meanings in numpy.

 Brief thank you for your helpful and thoughtful discussion.

 well, not entirely orthogonal -- they are the some concept, used in
 different contexts,

 Here's a further clarification, in the hope that it is helpful:

 Input and output index orderings are orthogonal - I can read the data
 with C index ordering and return an array that is index ordered
 any-old-how.

 F and C are used in the sense of F contiguous and C contiguous - where
 contiguous is not the same concept as index ordering.

 So I think it's hard to say these concepts are not orthogonal, simply
 in the technical sense that order='F could mean:

 * read my data using F-style index ordering
 * return my data in an array using F-style index ordering
 * (related to above) return my data in F-contiguous memory layout

 Sorry this is not well-put and should increase confusion rather than
 decrease it.  I'll try again if I may.

 What do we mean by 'Fortran' 'order'.

 Two things :

 * np.array(a, order='F') - Fortran contiguous : the array memory is
 contiguous, the strides vector is strictly increasing
 * np.ravel(a, order='F') - first-to-last index ordering used to
 recover values from the array

 They are related in the sense that Fortran contiguous layout in memory
 means that returning the elements as stored in memory gives the same
 answer as first to last index ordering.  They are different in the
 sense that first-to-last index ordering applies to any memory layout -
 is orthogonal to memory layout.   In particular 'contiguous' has no
 meaning for first-to-last or last-to-first index ordering.

 So - to restate in other words - this :

 np.reshape(a, (3, 4), order='F')

 could reasonably mean one of two orthogonal things

 1) Retrieve data from the array using first-to-last indexing, return
 any memory layout you like
 2) Retrieve data from the array using the default last-to-first index
 ordering, and return memory in F-contiguous layout

 no to interpretation 2)
 reshape and ravel (in contrast to flatten) just return a view (if possible)
 (with possible some strange strides)

 'No' meaning what?  That it is not possible that it could mean that?
 Obviously we're not arguing about whether it does mean that, we're
 arguing about whether such an interpretation would make sense.

'No' means: I don't think it makes sense given the current behavior of numpy
with respect to functions that are designed to return views
(and copy memory only if there is no way to make a view)

One objective of functions that create views is *not* to change the underlying
memory. So in most cases, requesting a specific contiguity (memory order)
for a new array, when you actually want a view with strides, doesn't
sound like an obvious explanation for order.

---
slightly more difficult:
order = I don't care (aka order='K') means: I want a view in whichever order
of the values, but please try harder not to copy any memory.
This also doesn't refer to the memory of a *new* array, if it is
really necessary to copy.
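
A sketch of what order='K' does, assuming a transposed 2x3 array as the
input. With 'C' the values come out in index order; with 'K' they come out
in the order they sit in memory. Whether each call shares memory with the
input is printed rather than asserted here.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
t = a.T   # 3x2 view on the same memory, Fortran-contiguous

in_index_order = t.ravel(order='C')    # last index of t changes fastest
in_memory_order = t.ravel(order='K')   # follows the layout in memory

print(in_index_order, np.shares_memory(t, in_index_order))
print(in_memory_order, np.shares_memory(t, in_memory_order))
```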

Josef


 Cheers,

 Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-04 Thread Matthew Brett
Hi,

On Thu, Apr 4, 2013 at 1:33 PM,  josef.p...@gmail.com wrote:
 On Thu, Apr 4, 2013 at 4:02 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Thu, Apr 4, 2013 at 12:54 PM,  josef.p...@gmail.com wrote:
 On Thu, Apr 4, 2013 at 3:40 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Thu, Apr 4, 2013 at 11:45 AM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Thu, Apr 4, 2013 at 9:21 AM, Chris Barker - NOAA Federal
 chris.bar...@noaa.gov wrote:
 On Wed, Apr 3, 2013 at 6:13 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 We all agree that 'order' is used with two different and orthogonal
 meanings in numpy.

 Brief thank you for your helpful and thoughtful discussion.

 well, not entirely orthogonal -- they are the some concept, used in
 different contexts,

 Here's a further clarification, in the hope that it is helpful:

 Input and output index orderings are orthogonal - I can read the data
 with C index ordering and return an array that is index ordered
 any-old-how.

 F and C are used in the sense of F contiguous and C contiguous - where
 contiguous is not the same concept as index ordering.

 So I think it's hard to say these concepts are not orthogonal, simply
 in the technical sense that order='F could mean:

 * read my data using F-style index ordering
 * return my data in an array using F-style index ordering
 * (related to above) return my data in F-contiguous memory layout

 Sorry this is not well-put and should increase confusion rather than
 decrease it.  I'll try again if I may.

 What do we mean by 'Fortran' 'order'.

 Two things :

 * np.array(a, order='F') - Fortran contiguous : the array memory is
 contiguous, the strides vector is strictly increasing
 * np.ravel(a, order='F') - first-to-last index ordering used to
 recover values from the array

 They are related in the sense that Fortran contiguous layout in memory
 means that returning the elements as stored in memory gives the same
 answer as first to last index ordering.  They are different in the
 sense that first-to-last index ordering applies to any memory layout -
 is orthogonal to memory layout.   In particular 'contiguous' has no
 meaning for first-to-last or last-to-first index ordering.

 So - to restate in other words - this :

 np.reshape(a, (3, 4), order='F')

 could reasonably mean one of two orthogonal things

 1) Retrieve data from the array using first-to-last indexing, return
 any memory layout you like
 2) Retrieve data from the array using the default last-to-first index
 ordering, and return memory in F-contiguous layout

 no to interpretation 2)
 reshape and ravel (in contrast to flatten) just return a view (if possible)
 (with possible some strange strides)

 'No' meaning what?  That it is not possible that it could mean that?
 Obviously we're not arguing about whether it does mean that, we're
 arguing about whether such an interpretation would make sense.

 'No' means: I don't think it makes sense given the current behavior of numpy
 with respect to functions that are designed to return views
 (and copy memory only if there is no way to make a view)

OK - so no-one is suggesting that it is a good option, only that the
concept makes sense.

As I was saying before - for most of us it is still possible to get
confused between two different meanings of the same word even if one
of the meanings would (for complicated reasons) be less likely than
the other.

Cheers,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-04 Thread Sebastian Berg
On Thu, 2013-04-04 at 12:40 -0700, Matthew Brett wrote:
 Hi,
 
snip
 
 So - to restate in other words - this :
 
 np.reshape(a, (3, 4), order='F')
 
 could reasonably mean one of two orthogonal things
 
 1) Retrieve data from the array using first-to-last indexing, return
 any memory layout you like
 2) Retrieve data from the array using the default last-to-first index
 ordering, and return memory in F-contiguous layout
 

Yes, it could mean both. I am simply not sure if it helps enough to
warrant the trouble. So if it still interests someone, I feel the docs
are more important, but I am neutral to changing this.
I don't quite see a big gain, so I am just worried that it bugs a lot of
people either because of changing or because of having to remember the
different name (you can argue that is good, but if it bugs most maybe it
does not help either).

As to being confused. Did anyone ever see a np.reshape(arr, ...,
order='F') and then continue assuming the result is F-contiguous (when
the original arr is not known to be contiguous)? If that actually created
a real bug somewhere, that might actually convince me that it is worth
it to walk through trouble and complaints. I guess I just don't believe
it really happens in the real world.
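
For code that really does need F-contiguous memory after such a reshape,
a defensive pattern might look like the sketch below. The helper name
reshape_f_contiguous is hypothetical; the point is only that the memory
layout is requested explicitly with np.asfortranarray instead of being
inferred from order='F'.

```python
import numpy as np

def reshape_f_contiguous(arr, shape):
    """Hypothetical helper: F index order reshape, then F-contiguous memory."""
    out = np.reshape(arr, shape, order='F')
    # asfortranarray copies only if the result is not already F-contiguous.
    return np.asfortranarray(out)

arr = np.arange(24).reshape(4, 6)[:, ::2]   # non-contiguous 4x3 view
out = reshape_f_contiguous(arr, (3, 4))
print(out.flags.f_contiguous)
```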

- Sebastian

 Cheers,
 
 Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-04 Thread Matthew Brett
Hi,

On Thu, Apr 4, 2013 at 1:53 PM, Sebastian Berg
sebast...@sipsolutions.net wrote:
 On Thu, 2013-04-04 at 12:40 -0700, Matthew Brett wrote:
 Hi,

 snip

 So - to restate in other words - this :

 np.reshape(a, (3, 4), order='F')

 could reasonably mean one of two orthogonal things

 1) Retrieve data from the array using first-to-last indexing, return
 any memory layout you like
 2) Retrieve data from the array using the default last-to-first index
 ordering, and return memory in F-contiguous layout


 Yes, it could mean both. I am simply not sure if it helps enough to
 warrant the trouble. So if it still interests someone, I feel the docs
 are more important, but I am neutral to changing this.

I don't think the docs enter the discussion, because we all agree that
changing the docs is a good idea.

 I don't quite see a big gain, so I am just worried that it bugs a lot of
 people either because of changing or because of having to remember the
 different name (you can argue that is good, but if it bugs most maybe it
 does not help either).

 As to being confused. Did anyone ever see a np.reshape(arr, ...,
 order='F') and then continuing assuming the result is F-contiguous (when
 the original arr is not known to be contiguous)? If that actually create
 a real bug somewhere, that might actually convince me that it is worth
 it to walk through trouble and complaints. I guess I just don't believe
 it really happens in the real world.

There are two aspects here:

1) Making numpy easier to understand and teach.
2) Avoiding bugs

I'm thinking primarily of the first.   I would hate to teach the thing
in the current state.   As I've said many times before, I found it
very confusing, others have said so too.  The more confusing it is,
the more likely people will make mistakes.

Cheers,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-04 Thread Matthew Brett
Hi,

On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith n...@pobox.com wrote:
snip
 Maybe we should go through and rename order to something more descriptive
 in each case, so we'd have
   a.reshape(..., index_order='C')
   a.copy(memory_order='F')
 etc.?

I'd like to propose this instead:

a.reshape(..., order='C')
a.copy(layout='F')

This fits well with the terms we've been using during the discussion.
It reduces the changes to only one of the two meanings.

Thinking about it, I feel that this would have been considerably
clearer to me as I learned numpy.

Cheers,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-04 Thread Chris Barker - NOAA Federal
On Thu, Apr 4, 2013 at 11:26 AM,  josef.p...@gmail.com wrote:
 Before you answer that -- does anyone see a use case for the 'A' and
 'K' flags that can't be reasonably easily accomplished with .view() or
 asarray() or ???

 What order does   a[a2]  use to create the returned 1-D array?
...
 However, I never needed to know and never cared
 a[a2] = 5
 a[a2] = b[a2]

 Now, after this thread, I know about K,

does that use case use ravel() or reshape() under the hood?

 and there might be cases
 where it would be appropriate to minimize copying memory,

hmm -- yes, that makes sense, and perhaps compelling enough to keep
them around (at least with perhaps better docs).

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-04 Thread josef . pktd
On Thu, Apr 4, 2013 at 5:54 PM, Chris Barker - NOAA Federal
chris.bar...@noaa.gov wrote:
 On Thu, Apr 4, 2013 at 11:26 AM,  josef.p...@gmail.com wrote:
 Before you answer that -- does anyone see a use case for the 'A' and
 'K' flags that can't be reasonably easily accomplished with .view() or
 asarray() or ???

 What order does   a[a2]  use to create the returned 1-D array?
 ...
 However, I never needed to know and never cared
 a[a2] = 5
 a[a2] = b[a2]

 Now, after this thread, I know about K,

 does that use case use ravel() or reshape() under the hood?

only ravel has 'K' as far as I saw in the current documentation.

An example for ravel('K') would be if axis=None in functions and we
only have elementwise or reduce operations.
All the code I've seen just uses ravel() in this case; ravel('K')
would have a better chance of avoiding an array copy:

if axis is None:
    x = x.ravel('K')
return ((x - x.mean(0))**2).sum(0)

but it's dangerous because, if there is a second array, it might not ravel('K')
the same way
x.ravel('K') - y.ravel('K') sounds fun

similar if x[mask] wouldn't select a fixed order, then
a[a2] = b[a2] would also be fun

fun := find the bug that I have hidden in this code
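
A minimal sketch of that hazard, assuming two arrays that hold the same values but differ in memory layout (the names x and y are only illustrative):

```python
import numpy as np

# Same values, different memory layouts.
x = np.arange(6.0).reshape(2, 3)   # C-contiguous
y = np.asfortranarray(x)           # F-contiguous copy, equal element-wise

# 'K' follows each array's own memory order, so the two flattened
# arrays no longer line up element for element.
diff_k = x.ravel('K') - y.ravel('K')

# A fixed index order gives matching positions regardless of layout.
diff_c = x.ravel('C') - y.ravel('C')

print(diff_k)
print(diff_c)
```

With a fixed index order the difference is zero everywhere; with 'K' it is not, even though x and y compare equal.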

The only reason to use reshape with A, I can think
of, is, if the array (matrix) is symmetric, or if it's a square
picture and we never care whether it's upright or sideways.

reshape(.., order='A') and ravel('A') should roundtrip, I guess.

Josef


 and there might be cases
 where it would be appropriate to minimize copying memory,

 hmm -- yes, that makes sense, and perhaps compelling enough to keep
 them around (at least with perhaps better docs).

 -Chris




Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-04 Thread josef . pktd
Catching up with numpy 1.6


 'No' means: I don't think it makes sense given the current behavior of numpy
 with respect to functions that are designed to return views
 (and copy memory only if there is no way to make a view)

 One objective of functions that create views is *not* to change the underlying
 memory. So in most cases, requesting a specific contiguity (memory order)
 for a new array, when you actually want a view with strides, doesn't
 sound like an obvious explanation for order.


why I'm baffled:

To me views are just a specific way of looking at an existing array, or
parts of it, similar to an iterator but with an n-dimensional shape.

ravel is just like calling list(iterator), the iterator determines how we read
the existing array.

So, asking about the output memory order made no sense to me. What's
the output of an iterator?

I (and statsmodels) are still on numpy 1.5 but not for much longer.
So I'm trying to read up

http://docs.scipy.org/doc/numpy/reference/arrays.nditer.html#single-array-iteration
explains the case for K :

for elementwise operations just run the fastest way through the array


The old flat and flatiter were always C-order.
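
A small sketch contrasting the two, assuming a Fortran-ordered copy of a small array (the variable names are illustrative only):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
b = np.array(a, order='F')      # same values, Fortran memory layout

# .flat walks the array in C index order, whatever the memory layout.
flat_c = np.array(list(b.flat))

# nditer with order='K' instead follows the memory layout.
flat_k = np.array([int(v) for v in np.nditer(b, order='K')])

print(flat_c)
print(flat_k)
```

So list(b.flat) matches b.ravel(), while the 'K' iteration visits the elements as they sit in memory.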


>>> a = np.arange(4*5).reshape(4,5)
>>> b = np.array(a, order='F')
>>> np.fromiter(np.nditer(b, order='K'), int)
array([ 0,  5, 10, 15,  1,  6, 11, 16,  2,  7, 12, 17,  3,  8, 13, 18,  4,
        9, 14, 19])
>>> np.fromiter(np.nditer(a, order='K'), int)
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])

Is ravel('K') good for anything ?

>>> def f(x):
...     '''A function that only works in 1d'''
...     if x.ndim > 1: raise ValueError
...     return np.round(np.piecewise(x, [x < 0, x >= 0],
...                     [lambda x: np.sqrt(-x), lambda x: np.sqrt(x)]))

>>> b = np.array(np.arange(4*5.).reshape(4,5), order='F')
>>> b
array([[  0.,   1.,   2.,   3.,   4.],
       [  5.,   6.,   7.,   8.,   9.],
       [ 10.,  11.,  12.,  13.,  14.],
       [ 15.,  16.,  17.,  18.,  19.]])

>>> f(b[:,:2])
Traceback (most recent call last):
  File "<pyshell#184>", line 1, in <module>
    f(b[:,:2])
  File "<pyshell#183>", line 2, in f
    if x.ndim > 1: raise ValueError
ValueError

ravel and reshape with 'K' don't roundtrip

>>> (b.ravel('K')).reshape(b.shape, order='K')
array([[  0.,   5.,  10.,  15.,   1.],
       [  6.,  11.,  16.,   2.,   7.],
       [ 12.,  17.,   3.,   8.,  13.],
       [ 18.,   4.,   9.,  14.,  19.]])


but we can do inplace transformations with it

>>> e = b[:,:2].ravel()
>>> e.flags.owndata
True
>>> e = b[:,:2].ravel('K')
>>> e.flags.owndata
False


>>> e[:] = f(e)
>>> b
array([[  0.,   1.,   2.,   3.,   4.],
       [  2.,   2.,   7.,   8.,   9.],
       [  3.,   3.,  12.,  13.,  14.],
       [  4.,   4.,  17.,  18.,  19.]])
>>> e[:] = f(e)
>>> b
array([[  0.,   1.,   2.,   3.,   4.],
       [  1.,   1.,   7.,   8.,   9.],
       [  2.,   2.,  12.,  13.,  14.],
       [  2.,   2.,  17.,  18.,  19.]])

(A few hours of experimenting is more than I wanted to know,
99.5% of my cases are order='C' or order='F')

nditer has also an interesting section on Iterator-Allocated Output Arrays

Josef
I found the scissors


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-03 Thread josef . pktd
On Tue, Apr 2, 2013 at 9:09 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Tue, Apr 2, 2013 at 7:09 PM,  josef.p...@gmail.com wrote:
 On Tue, Apr 2, 2013 at 5:52 PM, Nathaniel Smith n...@pobox.com wrote:
 On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 This is like observing that if I say go North then it's ambiguous
 about whether I want you to drive or walk, and concluding that we need
 new words for the directions depending on what sort of vehicle you
 use. So go North means drive North, go htuoS means walk North,
 etc. Totally silly. Makes much more sense to have one set of words for
 directions, and then make clear from context what the directions are
 used for -- drive North, walk North. Or iterate C-wards, store
 F-wards.

 C and Z mean exactly the same thing -- they describe a way of
 unraveling a cube into a straight line. The difference is what we do
 with the resulting straight line. That's why I'm suggesting that the
 distinction should be made in the name of the argument.

 Could you unpack that for the 'ravel' docstring?  Because these
 options all refer to the way of unraveling and not the memory layout
 that results.

 Z/C/column-major/whatever-you-want-to-call-it is a general strategy
 for converting between a 1-dim representation and a n-dim
 representation. In the case of memory storage, the 1-dim
 representation is the flat space of pointer arithmetic. In the case of
 ravel, the 1-dim representation is the flat space of a 1-dim indexed
 array. But the 1-dim-to-n-dim part is the same in both cases.

 I think that's why you're seeing people baffled by your proposal -- to
 them the C refers to this general strategy, and what's different is
 the context where it gets applied. So giving the same strategy two
 different names is silly; if anything it's the contexts that should
 have different names.

 And once we get into memory optimization (and avoiding copies and
 preserving contiguity), it is necessary to keep both orders in mind,
 is memory order in F and am I iterating/raveling in F order
 (or slicing columns).

 I think having two separate keywords give the impression we can
 choose two different things at the same time.

 I guess it could not make sense to do this:

 np.ravel(a, index_order='C', memory_order='F')

 It could make sense to do this:

 np.reshape(a, (3,4), index_order='F', memory_order='F')

 but that just points out the inherent confusion between the uses of
 'order', and in this case, the fact that you can only do:

 np.reshape(a, (3, 4), index_order='F')

 correctly distinguishes between the meanings.

So, if index_order and memory_order are never in the same function,
then the context should be enough. It was always enough for me.

np.reshape(a, (3,4), index_order='F', memory_order='F')
really hurts my head because you mix a function that operates on
views, indexing and shapes with memory creation, (or I have
no idea what memory_order should do in this case).

np.asarray(a.reshape(3, 4, order='F'), order='F')
or the example here
http://docs.scipy.org/doc/numpy/reference/generated/numpy.asfortranarray.html?highlight=asfortranarray#numpy.asfortranarray
http://docs.scipy.org/doc/numpy/reference/generated/numpy.asarray.html
keeps functions with index_order and functions with memory_order
nicely separated.
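
A hedged sketch of that separation, using only the existing 'order' keywords: the first call picks an index order, the second requests a memory layout (a 12-element example array is assumed here for illustration):

```python
import numpy as np

a = np.arange(12)

# Step 1: index order only. Fill the (3, 4) shape column by column.
r = a.reshape(3, 4, order='F')

# Step 2: memory order only. Values and indexing are unchanged;
# only the underlying layout is requested.
m = np.asarray(r, order='C')

print(r)
print(m.flags['C_CONTIGUOUS'])
```

The second step never changes what r[i, j] returns; it only affects how the data is stored.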

(It might be useful, but very confusing, to add memory_order to every function
that creates a view if possible and a copy if necessary: "If you have to make
a copy, then I want F memory order, otherwise give me a view."
But I cannot find a candidate function right now, except for ravel and reshape;
see the first notes in
docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html
)


a day later (haven't changed my mind):

isn't specifying index order in the Parameter section enough as an
explanation?

something like:

```
def ravel

Parameters

order :
   index order, i.e. how the array is stacked into a 1d array.
   'F' means we stack by columns (Fortran order, first index first),
   'C' means we stack by rows (C order, last index first)
```
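
To illustrate what such a docstring would describe, a short sketch on a small 2 x 3 array (the array itself is just an example):

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

by_rows = a.ravel(order='C')   # stack by rows: last index varies fastest
by_cols = a.ravel(order='F')   # stack by columns: first index varies fastest

# The result depends only on the index order asked for,
# not on how the input happens to be stored.
by_rows_from_f = np.asfortranarray(a).ravel(order='C')

print(by_rows)
print(by_cols)
```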

most array *creation* functions explicitly mention memory layout in
the docstring

Josef


 Best,

 Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-03 Thread Chris Barker - NOAA Federal
On Wed, Apr 3, 2013 at 6:24 AM, Sebastian Berg
sebast...@sipsolutions.net wrote:
 the context where it gets applied. So giving the same strategy two
 different names is silly; if anything it's the contexts that should
 have different names.


 Yup, thats how I think about it too...

me too...

 But I would really love if someone would try to make the documentation
 simpler!

yes, I think this is where the solution lies.

 There is also never a mention of contiguity, even though when
 we refer to memory order, then having a C/F contiguous array is often
 the reason why

good point -- in fact, I have no idea what would happen in many of
these cases for a discontiguous array (or one with arbitrarily weird
strides...)

  Also 'A' seems often explained not
 quite correctly (though that does not matter (except for reshape, where
 its explanation is fuzzy), it will matter more in the future -- even if
 I don't expect 'A' to be actually used).

I wonder about having a 'A' option in reshape at all -- what the heck
does it mean? why do we need it? Again, I come back to the fact that
memory order is kind-of orthogonal to index order. So for reshape (or
ravel, which is really just a special case of reshape...) the 'A' flag
and 'K' flag (huh?) is pretty dangerous, and prone to error. I think
of it this way:

Much of the beauty of numpy is that it presents a consistent interface
to various forms of strided data -- that way, folks can write code
that works the same way for any ndarray, while still being able to
have internal storage be efficient for the use at hand -- i.e. C order
for the common case, Fortran order for interaction with libraries that
expect that order (or for algorithms that are more efficient in that
order, though that's mostly external libs..), and non-contiguous data
so one can work on sub-parts of arrays without copying data around.

In most places, the numpy API hides the internal memory order -- this
is a good thing, most people have no need to think about it (or most
code, anyway), and you can write code that works (even if not
optimally) for any (strided) memory layout. All is good.

There are times when you really need to understand, or control or
manipulate the memory layout, to make sure your routines are
optimized, or the data is in the right form to pass off to an external
lib, or to make sense of raw data read from a file, or... That's what
we have .view() and friends for.

However, the 'A' and 'K' flags mix and match these concepts -- and I
think that's dangerous. It would be easy for a user to use the 'A'
flag, and have everything work fine and dandy with all their test
cases, only to have it blow up when someone passes in a
different-than-expected array. So really, they should only be used in
cases where the code has checked memory order beforehand, or in a
really well-defined interface where you know exactly what you're
getting. In those cases, it makes the code far more clear and less
error-prone to do your re-arranging of the memory in a separate step,
rather than built in to a ravel() or reshape() call.

[note] -- I wrote earlier that I wasn't confused by the ravel()
examples -- true for the 'C' and 'F' flags, but I'm still not at all
clear what 'A' and 'K' would give me -- particularly for 'A' and
reshape()

So I think the cause of the confusion here is not that we use "order"
in two different contexts, nor the fact that 'C' and 'F' may not mean
anything to some people, but that we are conflating two different
processes in one function, and with one flag.

My (maybe) proposal: we deprecate the 'A' and 'K' flags in ravel() and
reshape() (maybe even deprecate ravel() -- does it add anything to
reshape?). If not deprecate, at least encourage people in the docs not
to use them, and rather do their memory-structure manipulations with
.view or stride manipulation, or...

I'm still trying to figure out when you'd want the 'A' flag -- it
seems at the end of your operation you will want:

The resulting array to be a particular shape, with the elements in a
particular order

and

You _may_ want the in-memory layout a certain way.

but 'A' can't ensure both of those.
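
A small sketch of why, assuming two inputs that compare equal but are stored differently (the names a_c and a_f are illustrative):

```python
import numpy as np

a_c = np.arange(6).reshape(2, 3)   # C-contiguous
a_f = np.asfortranarray(a_c)       # equal values, F-contiguous

# With 'A' the index order used by reshape follows the input's layout,
# so equal inputs can give differently arranged outputs.
r_c = a_c.reshape(3, 2, order='A')
r_f = a_f.reshape(3, 2, order='A')

print(r_c)
print(r_f)
```

Here the 'A' result for the F-contiguous input matches what order='F' gives, and the one for the C-contiguous input matches order='C', so code that relies on a particular element arrangement cannot use 'A' safely without checking the layout first.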

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-03 Thread Matthew Brett
Hi,

On Wed, Apr 3, 2013 at 5:19 AM,  josef.p...@gmail.com wrote:
 On Tue, Apr 2, 2013 at 9:09 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Tue, Apr 2, 2013 at 7:09 PM,  josef.p...@gmail.com wrote:
 On Tue, Apr 2, 2013 at 5:52 PM, Nathaniel Smith n...@pobox.com wrote:
 On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 This is like observing that if I say go North then it's ambiguous
 about whether I want you to drive or walk, and concluding that we need
 new words for the directions depending on what sort of vehicle you
 use. So go North means drive North, go htuoS means walk North,
 etc. Totally silly. Makes much more sense to have one set of words for
 directions, and then make clear from context what the directions are
 used for -- drive North, walk North. Or iterate C-wards, store
 F-wards.

 C and Z mean exactly the same thing -- they describe a way of
 unraveling a cube into a straight line. The difference is what we do
 with the resulting straight line. That's why I'm suggesting that the
 distinction should be made in the name of the argument.

 Could you unpack that for the 'ravel' docstring?  Because these
 options all refer to the way of unraveling and not the memory layout
 that results.

 Z/C/column-major/whatever-you-want-to-call-it is a general strategy
 for converting between a 1-dim representation and a n-dim
 representation. In the case of memory storage, the 1-dim
 representation is the flat space of pointer arithmetic. In the case of
 ravel, the 1-dim representation is the flat space of a 1-dim indexed
 array. But the 1-dim-to-n-dim part is the same in both cases.

 I think that's why you're seeing people baffled by your proposal -- to
 them the C refers to this general strategy, and what's different is
 the context where it gets applied. So giving the same strategy two
 different names is silly; if anything it's the contexts that should
 have different names.

 And once we get into memory optimization (and avoiding copies and
 preserving contiguity), it is necessary to keep both orders in mind,
 is memory order in F and am I iterating/raveling in F order
 (or slicing columns).

 I think having two separate keywords give the impression we can
 choose two different things at the same time.

 I guess it could not make sense to do this:

 np.ravel(a, index_order='C', memory_order='F')

 It could make sense to do this:

 np.reshape(a, (3,4), index_order='F', memory_order='F')

 but that just points out the inherent confusion between the uses of
 'order', and in this case, the fact that you can only do:

 np.reshape(a, (3, 4), index_order='F')

 correctly distinguishes between the meanings.

 So, if index_order and memory_order are never in the same function,
 then the context should be enough. It was always enough for me.

It was not enough for me or the three others who will publicly admit
to the shame of finding it confusing without further thought.

Again, I just can't see a reason not to separate these ideas.  We are
not arguing about backwards compatibility here, only about clarity.  I
guess you do accept that some people, other than yourself, might be
less likely to get tripped up by:

np.reshape(a, (3, 4), index_order='F')

than

np.reshape(a, (3, 4), order='F')

?

 np.reshape(a, (3,4), index_order='F', memory_order='F')
 really hurts my head because you mix a function that operates on
 views, indexing and shapes with memory creation, (or I have
 no idea what memory_order should do in this case).

Right.   I think you may now be close to my own discomfort when faced
with working out (fast) what:

np.reshape(a, (3,4), order='F')

means, given 'order' means two different things, and both might be
relevant here.

Or are you saying that my brain should have quickly calculated that
that 'order' would be difficult to understand as memory layout and
therefore rejected that and seen immediately that index order was the
meaning?   Speaking as a psychologist,  I don't think that's the way
it works.

Cheers,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-03 Thread Matthew Brett
Hi,

On Wed, Apr 3, 2013 at 8:52 AM, Chris Barker - NOAA Federal
chris.bar...@noaa.gov wrote:
 On Wed, Apr 3, 2013 at 6:24 AM, Sebastian Berg
 sebast...@sipsolutions.net wrote:
 the context where it gets applied. So giving the same strategy two
 different names is silly; if anything it's the contexts that should
 have different names.


 Yup, thats how I think about it too...

 me too...

 But I would really love if someone would try to make the documentation
 simpler!

 yes, I think this is where the solution lies.

No question that better docs would be an improvement, let's all agree on that.

We all agree that 'order' is used with two different and orthogonal
meanings in numpy.

I think we are now more or less agreeing that:

np.reshape(a, (3, 4), index_order='F')

is at least as clear as:

np.reshape(a, (3, 4), order='F')

Do I have that right so far?

Cheers,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-03 Thread Chris Barker - NOAA Federal
On Wed, Apr 3, 2013 at 11:39 AM, Matthew Brett matthew.br...@gmail.com wrote:
 It was not enough for me or the three others who will publicly admit
 to the shame of finding it confusing without further thought.

I would submit that some of the confusion came from the fact that with
ravel(), and the 'A' and 'K' flags, you are forced to figure out BOTH
index_order and memory_order -- with one flag -- I know I'm still not
clear what I'd get in complex situations.

 Again, I just can't see a reason not to separate these ideas.

I agree, but I mean really separating them -- ideally having a given
function deal with only one or the other, not both at once.

  We are
 not arguing about backwards compatibility here, only about clarity.

while it could be changed while strictly maintaining backward
compatibility -- it is a change that would need to filter through the
docs, examples, random blog posts, Stack Overflow questions, etc.

Is that worth it? I'm not convinced

 Right.   I think you may now be close to my own discomfort when faced
 with working out (fast) what:

 np.reshape(a, (3,4), order='F')

I still think it's cause you know too much ;-)

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-03 Thread Ralf Gommers
On Wed, Apr 3, 2013 at 11:52 PM, Chris Barker - NOAA Federal 
chris.bar...@noaa.gov wrote:

 On Wed, Apr 3, 2013 at 11:39 AM, Matthew Brett matthew.br...@gmail.com
 wrote:
  It was not enough for me or the three others who will publicly admit
  to the shame of finding it confusing without further thought.

 I would submit that some of the confusion came from the fact that with
 ravel(), and the 'A' and 'K' flags, you are forced to figure out BOTH
 index_order and memory_order -- with one flag -- I know I'm still not
 clear what I'd get in complex situations.

  Again, I just can't see a reason not to separate these ideas.

 I agree, but really separating them -- but ideally having a given
 function only deal with one or the other, not both at once.

   We are
  not arguing about backwards compatibility here, only about clarity.

 while it could be changed while strictly maintaining backward
 compatibility -- it is a change that would need to filter through the
 docs, examples, random blog posts, Stack Overflow questions, etc.


Not only that, we would then also be in the situation of having `order`
*and* `xxx_order` keywords. This is also confusing, at least as much as the
current situation imho.

Ralf


 Is that worth it? I'm not convinced

  Right.   I think you may now be close to my own discomfort when faced
  with working out (fast) what:
 
  np.reshape(a, (3,4), order='F')

 I still think it's cause you know too much ;-)

 -Chris




Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-03 Thread josef . pktd
On Wed, Apr 3, 2013 at 9:13 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Wed, Apr 3, 2013 at 11:44 AM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Wed, Apr 3, 2013 at 8:52 AM, Chris Barker - NOAA Federal
 chris.bar...@noaa.gov wrote:
 On Wed, Apr 3, 2013 at 6:24 AM, Sebastian Berg
 sebast...@sipsolutions.net wrote:
 the context where it gets applied. So giving the same strategy two
 different names is silly; if anything it's the contexts that should
 have different names.


 Yup, thats how I think about it too...

 me too...

 But I would really love if someone would try to make the documentation
 simpler!

 yes, I think this is where the solution lies.

 No question that better docs would be an improvement, let's all agree on 
 that.

 We all agree that 'order' is used with two different and orthogonal
 meanings in numpy.

 I think we are now more or less agreeing that:

 np.reshape(a, (3, 4), index_order='F')

 is at least as clear as:

 np.reshape(a, (3, 4), order='F')

 I believe our job here is to come to some consensus.

 In that spirit, I think we do agree on these statements above.

 Now we have the cost / benefit.

 Benefit : Some people may find it easier to understand numpy when
 these constructs are separated.

 Cost : There might be some confusion because we have changed the
 default keywords.

 Benefit
 ---

 What proportion of people would find it easier to understand with the
 order constructs separated?   Clearly Chris and Josef and Sebastian -
 you estimate I think no change in your understanding, because your
 understanding was near complete already.

 At least I, Paul Ivanov, JB Poline found the current state strikingly
 confusing.   I think we have other votes for that position here.  It's
 difficult to estimate the proportions now because my original email
 and the subsequent discussion are based on the distinction already
 being made.  So, it is hard for us to be objective about whether a new
 user is likely to get confused.  At least it seems reasonable to say
 that some moderate proportion of users will get confused.

 In that situation, it seems to me the long-term benefit for separating
 these ideas is relatively high.   The benefit will continue over the
 long term.

 Cost
 ---

 The ravel docstring would look something like this:

 index_order : {'C','F', 'A', 'K'}, optional
 ...   This keyword used to be called simply 'order', and you can
 also use the keyword 'order' to specify index_order (this parameter).

 The problem would then be that, for a while, there will be older code
 and docs using 'order' instead of 'index_order'.  I think this would
 not cause much trouble.  Reading the docstring will explain the
 change.  The old code will continue to work.
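
A rough sketch of how such an alias could behave. Note that ravel_compat and its index_order parameter are hypothetical, written here only to illustrate the proposal; they are not part of NumPy:

```python
import numpy as np

def ravel_compat(a, index_order=None, order=None):
    """Hypothetical wrapper, not a NumPy function.

    Accepts the proposed 'index_order' keyword and keeps the
    existing 'order' keyword working as an alias.
    """
    if index_order is not None and order is not None:
        raise TypeError("pass either 'index_order' or 'order', not both")
    chosen = index_order if index_order is not None else order
    if chosen is None:
        chosen = 'C'   # same default as np.ravel
    return np.ravel(a, order=chosen)

arr = np.arange(6).reshape(2, 3)
new_style = ravel_compat(arr, index_order='F')
old_style = ravel_compat(arr, order='F')
print(new_style, old_style)
```

Old code calling with order= keeps working, and new code can use the more explicit name.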

 This cost will decrease to zero over time.

 So, if we are planning for the long-term for numpy, I believe the
 benefit to the change considerably outweighs the cost.

 I'm happy to do the code changes, so that's not an issue.

 Cheers,

 Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-02 Thread Éric Depagne
Hi all, 

Since we're mentioning obvious and non-obvious naming, 


 
 I think you agree that there is potential for confusion, and there
 doesn't seem any reason to continue with that confusion if we can come
 up with a clearer name.
 
 So here is a compromise proposal.
 
 How about:
 
 * Preferring the names 'c-style' and 'f-style' for the indexing order
 case (ravel, reshape, flatiter)

This naming scheme is obvious to those who have been coding for
a long time, but it tends not to speak to anyone else. Why not use names
that are a little bit more explicit (and of course, keep the legacy names
available), such as 'row-first' and 'column-first' (or anything else that may
be more explicit)?

Cheers, 

Éric.

 * Leaving 'C' and 'F' as functional shortcuts, so there is no possible
 backwards-compatibility problem.
 
 Would you object to that?
 
 Cheers,
 
 Matthew
Un clavier azerty en vaut deux
--
Éric Depagne    e...@depagne.org


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-02 Thread Nathaniel Smith
On Sat, Mar 30, 2013 at 2:08 AM, Matthew Brett matthew.br...@gmail.com
wrote:
 Hi,

 We were teaching today, and found ourselves getting very confused
 about ravel and shape in numpy.

 Summary
 --

 There are two separate ideas needed to understand ordering in ravel and
reshape:

 Idea 1): ravel / reshape can proceed from the last axis to the first,
 or the first to the last.  This is ravel index ordering
 Idea 2) The physical layout of the array (on disk or in memory) can be
 C or F contiguous or neither.
 This is memory ordering

 The index ordering is usually (but see below) orthogonal to the memory
ordering.

 The 'ravel' and 'reshape' commands use C and F in the sense of
 index ordering, and this mixes the two ideas and is confusing.

 What the current situation looks like
 

 Specifically, we've been rolling this around 4 experienced numpy users
 and we all predicted at least one of the results below wrongly.

 This was what we knew, or should have known:

 In [2]: import numpy as np

 In [3]: arr = np.arange(10).reshape((2, 5))

 In [5]: arr.ravel()
 Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 So, the 'ravel' operation unravels over the last axis (1) first,
 followed by axis 0.

 So far so good (even if the opposite to MATLAB, Octave).

 Then we found the 'order' flag to ravel:

 In [10]: arr.flags
 Out[10]:
   C_CONTIGUOUS : True
   F_CONTIGUOUS : False
   OWNDATA : False
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False

 In [11]: arr.ravel('C')
 Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 But we soon got confused.  How about this?

 In [12]: arr_F = np.array(arr, order='F')

 In [13]: arr_F.flags
 Out[13]:
   C_CONTIGUOUS : False
   F_CONTIGUOUS : True
   OWNDATA : True
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False

 In [16]: arr_F
 Out[16]:
 array([[0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9]])

 In [17]: arr_F.ravel('C')
 Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 Right - so the flag 'C' to ravel, has got nothing to do with *memory*
 ordering, but is to do with *index* ordering.

 And in fact, we can ask for memory ordering specifically:

 In [22]: arr.ravel('K')
 Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 In [23]: arr_F.ravel('K')
 Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

 In [24]: arr.ravel('A')
 Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 In [25]: arr_F.ravel('A')
 Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

 There are some confusions to get into with the 'order' flag to reshape
 as well, of the same type.

 Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering.

 This is very confusing.  We think the index ordering and memory
 ordering ideas need to be separated, and specifically, we should avoid
 using C and F to refer to index ordering.

 Proposal
 -

 * Deprecate the use of C and F meaning backwards and forwards
 index ordering for ravel, reshape
 * Prefer Z and N, being graphical representations of unraveling in
 2 dimensions, axis1 first and axis0 first respectively (excellent
 naming idea by Paul Ivanov)

 What do y'all think?

Surely it should be Z and ᴎ? ;-)

I knew what your examples would produce, but only because I've bumped into
this before. When you do reshapes of various sorts (ravel() ==
reshape((-1,))), then, like you say, there are two totally different sets
of coordinate mapping in play:

chunk of memory  <-1->  virtual array layout  <-2->  new array layout
  (C pointers)            (Python indexes)          (Python indexes)

Mapping (1) is determined by the array strides, and you have to think about
it when you interface with C code, but at the Python level it's pretty much
irrelevant; all operations are defined at the virtual array layout level.

Further confusing the issue is the fact that the vast majority of legal
memory-virtual array mappings are *neither* C- nor F-ordered. Strides are
very flexible.
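
A small sketch of that last point, assuming nothing more than a sliced array (the names `base` and `view` are only for illustration): a strided view can be neither C- nor F-contiguous, and Python-level indexing behaves exactly as usual anyway.

```python
import numpy as np

base = np.arange(20).reshape(4, 5)
view = base[::2, ::2]              # every other row and column, no copy made

# Neither contiguity flag is set for this view.
print(view.flags['C_CONTIGUOUS'], view.flags['F_CONTIGUOUS'])

# The strides differ from the parent array, but indexing is unaffected.
print(view.strides, base.strides)
print(view[1, 2] == base[2, 4])
```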

Further further confusing the issue is that mapping (2) actually consists
of two mappings: if you have an array with shape (3, 4, 5) and reshape it
to (4, 15), then the way you work out the overall mapping is by first
mapping the (3, 4, 5) onto a flat 1-d space with 60 elements, and then
mapping *that* to the (4, 15) space.
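
The two-step description above can be checked directly. This is only a sketch; it assumes the same index order is used for both the flattening and the refilling step, which is what a single `order=` argument to reshape implies.

```python
import numpy as np

a = np.arange(60).reshape(3, 4, 5)

for order in ('C', 'F'):
    direct = a.reshape((4, 15), order=order)
    # Step 1: flatten to 60 elements; step 2: refill the new shape,
    # using the same index order both times.
    two_step = a.ravel(order=order).reshape((4, 15), order=order)
    assert np.array_equal(direct, two_step)
```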

Anyway, I agree that this is very confusing; certainly it confused me. If
you bump into these two mappings just in passing, and separately, then it's
very easy to miss the fact that they have nothing to do with each other.
And I agree that using exactly the same terminology for both of them is
part of what causes this. I even kind of like the Z/N naming scheme (I
still have to look up what C/F actually mean every time, I'm ashamed to
say).

But I don't see how the proposed solution helps, because the problem isn't
that mapping (1) and (2) use different ordering schemes -- the
column-major/row-major distinction really does apply to both equally. Using
different names for those seems like it will 

Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-02 Thread Andrew Jaffe


 Proposal
 -

 * Deprecate the use of C and F meaning backwards and forwards
 index ordering for ravel, reshape
 * Prefer Z and N, being graphical representations of unraveling in
 2 dimensions, axis1 first and axis0 first respectively (excellent
 naming idea by Paul Ivanov)

 What do y'all think?


 Personally I think it is clear enough and that Z and N would confuse
 me just as much (though I am used to the other names). Also Z and N
 would seem more like aliases, which would also make sense in the memory
 order context.
 If anything, I would prefer renaming the arguments iteration_order and
 memory_order, but it seems overdoing it...
 Maybe the documentation could just be checked if it is always clear
 though. I.e. maybe it does not use iteration or memory order
 consistently (though I somewhat feel it is usually clear that it must be
 iteration order, since no numpy function cares about the input memory
 order as they will just do a copy if necessary).

I have been using both C and Fortran for 25 or so years. Despite that, I 
have to sit and think every time I need to know which way the arrays are 
stored, basically by remembering that in fortran you do (I,J,*) for an 
assumed-size array.

So I *love* the idea of 'Z' and 'N' which I understood immediately.

Andrew




___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-02 Thread Chris Barker - NOAA Federal
On Mon, Apr 1, 2013 at 10:15 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Thank you for the compliment, it's more enjoyable than other potential
 explanations of my confusion (sigh).

 But, I don't think that is the explanation.

well, the core explanation is these are difficult and intertwined
concepts...And yes, better names and better docs can help.

 Last, as soon as we came to the distinction between index order and
 memory layout, it was clear.

 We all agreed that this was an important distinction that would
 improve numpy if we made it.

yup.

 I think you agree that there is potential for confusion, and there
 doesn't seem any reason to continue with that confusion if we can come
 up with a clearer name.

well, changing an API is not to be taken lightly -- we are not
discussing how we'd do it if we were to start from fresh here. So any
change should make things enough better that it is worth dealing with
the process of the change.

 So here is a compromise proposal.

 * Preferring the names 'c-style' and 'f-style' for the indexing order
 case (ravel, reshape, flatiter)

 * Leaving 'C' and 'F' as functional shortcuts, so there is no possible
 backwards-compatibility problem.

seems reasonable enough -- though even with the backward
compatibility, users will be faced with many, many older examples and
docs that use 'C' and 'F', while the new ones refer to the new names
-- might this be cause for even more confusion (at least for a few
years...)

leaving me with an equivocal +0 on that 

another thought:


Definition: np.ravel(a, order='C')

A 1-D array, containing the elements of the input, is returned.  A copy is
made only if needed.

Parameters
----------
a : array_like
    Input array.  The elements in ``a`` are read in the order specified by
    `order`, and packed as a 1-D array.
order : {'C','F', 'A', 'K'}, optional
    The elements of ``a`` are read in this order. 'C' means to view
    the elements in C (row-major) order. 'F' means to view the elements
    in Fortran (column-major) order. 'A' means to view the elements
    in 'F' order if a is Fortran contiguous, 'C' order otherwise.
    'K' means to view the elements in the order they occur in memory,
    except for reversing the data when strides are negative.
    By default, 'C' order is used.


Does ravel need to support the 'A' and 'K' options? It's kind of an
advanced use, and really more suited to .view(), perhaps?

What I'm getting at is that this version of ravel() conflates the two
concepts: virtual ordering and memory ordering in one function --
maybe they should be considered as two different functions altogether
-- I think that would make for less confusion.
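
To illustrate the split being suggested, here is a short sketch reusing the arrays from Matthew's original example. It shows that 'C' and 'F' depend only on the indices, while 'A' and 'K' depend on how the array happens to be stored.

```python
import numpy as np

arr = np.arange(10).reshape((2, 5))      # C-contiguous
arr_F = np.array(arr, order='F')         # same values, Fortran layout in memory

# 'C' and 'F' depend only on the indices, so both arrays agree.
assert np.array_equal(arr.ravel('C'), arr_F.ravel('C'))
assert np.array_equal(arr.ravel('F'), arr_F.ravel('F'))

# 'K' and 'A' look at the memory layout, so the results differ.
print(arr.ravel('K'))    # [0 1 2 3 4 5 6 7 8 9]
print(arr_F.ravel('K'))  # [0 5 1 6 2 7 3 8 4 9]
print(arr_F.ravel('A'))  # [0 5 1 6 2 7 3 8 4 9]
```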

Éric Depagne wrote:
 'row-first' and 'column-first' (or anything else that may be more explicit) ?

I like more explicit, but 'row-first' and 'column-first' have two
issues: 1) what about higher dimension arrays?, and 2) the row and
column convention is only that -- a convention -- I guess it's the
way numpy prints, which gives it some meaning, but there are times
when arrays are ordered: (col, row), rather than (row, col) (PIL uses
that format for instance)

I like the Z and N, and  maybe even if they aren't used as flag names,
they could be used in the docstring -- nice and ascii safe

Nathaniel wrote:
 To see this, note that semantically it would be perfectly possible for
 .reshape() to take *two* order= arguments: one to specify the coordinate
 space mapping (2), and the other to specify the desired memory layout used
 by the result array (1). Of course we shouldn't actually do this, because
 in the unlikely event that someone actually wanted both of these they could
 just call asarray() on the output of reshape().

exactly -- my point about keeping the raveling with virtual order
separate from raveling with memory order -- it's really not critical
that you can do both with one function call.
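
A sketch of the two-call version, assuming the (3, 4, 5) example shape used earlier in the thread: reshape picks the index order, and a separate asarray() call picks the memory layout of the result.

```python
import numpy as np

a = np.arange(60).reshape(3, 4, 5)

# Mapping (2): choose the index order used to refill the new shape.
r = a.reshape((4, 15), order='F')

# Mapping (1): separately choose the memory layout of the result.
r_c = np.asarray(r, order='C')

assert np.array_equal(r, r_c)        # same values at every index
print(r_c.flags['C_CONTIGUOUS'])
```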

-Chris










-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-02 Thread Matthew Brett
Hi,

On Tue, Apr 2, 2013 at 7:32 AM, Nathaniel Smith n...@pobox.com wrote:
 On Sat, Mar 30, 2013 at 2:08 AM, Matthew Brett matthew.br...@gmail.com
 wrote:
 Hi,

 We were teaching today, and found ourselves getting very confused
 about ravel and shape in numpy.

 Summary
 --

 There are two separate ideas needed to understand ordering in ravel and
 reshape:

 Idea 1): ravel / reshape can proceed from the last axis to the first,
 or the first to the last.  This is ravel index ordering
 Idea 2) The physical layout of the array (on disk or in memory) can be
 C or F contiguous or neither.
 This is memory ordering

 The index ordering is usually (but see below) orthogonal to the memory
 ordering.

 The 'ravel' and 'reshape' commands use C and F in the sense of
 index ordering, and this mixes the two ideas and is confusing.

 What the current situation looks like
 

 Specifically, we've been rolling this around 4 experienced numpy users
 and we all predicted at least one of the results below wrongly.

 This was what we knew, or should have known:

 In [2]: import numpy as np

 In [3]: arr = np.arange(10).reshape((2, 5))

 In [5]: arr.ravel()
 Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 So, the 'ravel' operation unravels over the last axis (1) first,
 followed by axis 0.

 So far so good (even if the opposite to MATLAB, Octave).

 Then we found the 'order' flag to ravel:

 In [10]: arr.flags
 Out[10]:
   C_CONTIGUOUS : True
   F_CONTIGUOUS : False
   OWNDATA : False
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False

 In [11]: arr.ravel('C')
 Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 But we soon got confused.  How about this?

 In [12]: arr_F = np.array(arr, order='F')

 In [13]: arr_F.flags
 Out[13]:
   C_CONTIGUOUS : False
   F_CONTIGUOUS : True
   OWNDATA : True
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False

 In [16]: arr_F
 Out[16]:
 array([[0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9]])

 In [17]: arr_F.ravel('C')
 Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 Right - so the flag 'C' to ravel, has got nothing to do with *memory*
 ordering, but is to do with *index* ordering.

 And in fact, we can ask for memory ordering specifically:

 In [22]: arr.ravel('K')
 Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 In [23]: arr_F.ravel('K')
 Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

 In [24]: arr.ravel('A')
 Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 In [25]: arr_F.ravel('A')
 Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

 There are some confusions to get into with the 'order' flag to reshape
 as well, of the same type.

 Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering.

 This is very confusing.  We think the index ordering and memory
 ordering ideas need to be separated, and specifically, we should avoid
 using C and F to refer to index ordering.

 Proposal
 -

 * Deprecate the use of C and F meaning backwards and forwards
 index ordering for ravel, reshape
 * Prefer Z and N, being graphical representations of unraveling in
 2 dimensions, axis1 first and axis0 first respectively (excellent
 naming idea by Paul Ivanov)

 What do y'all think?

 Surely it should be Z and ᴎ? ;-)

 I knew what your examples would produce, but only because I've bumped into
 this before. When you do reshapes of various sorts (ravel() ==
 reshape((-1,))), then, like you say, there are two totally different sets of
 coordinate mapping in play:

 chunk of memory  <-1->  virtual array layout  <-2->  new array layout
   (C pointers)   ---(Python indexes)---  (Python indexes)

 Mapping (1) is determined by the array strides, and you have to think about
 it when you interface with C code, but at the Python level it's pretty much
 irrelevant; all operations are defined at the virtual array layout level.

 Further confusing the issue is the fact that the vast majority of legal
 memory-virtual array mappings are *neither* C- nor F-ordered. Strides are
 very flexible.

 Further further confusing the issue is that mapping (2) actually consists of
 two mappings: if you have an array with shape (3, 4, 5) and reshape it to
 (4, 15), then the way you work out the overall mapping is by first mapping
 the (3, 4, 5) onto a flat 1-d space with 60 elements, and then mapping
 *that* to the (4, 15) space.

 Anyway, I agree that this is very confusing; certainly it confused me. If
 you bump into these two mappings just in passing, and separately, then it's
 very easy to miss the fact that they have nothing to do with each other. And
 I agree that using exactly the same terminology for both of them is part of
 what causes this. I even kind of like the Z/N naming scheme (I still
 have to look up what C/F actually mean every time, I'm ashamed to say).

 But I don't see how the proposed solution helps, because the problem isn't
 that mapping (1) and (2) use different ordering schemes -- the
 

Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-02 Thread Matthew Brett
Hi,

On Tue, Apr 2, 2013 at 12:29 PM, Chris Barker - NOAA Federal
chris.bar...@noaa.gov wrote:
 On Mon, Apr 1, 2013 at 10:15 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Thank you for the compliment, it's more enjoyable than other potential
 explanations of my confusion (sigh).

 But, I don't think that is the explanation.

 well, the core explanation is these are difficult and intertwined
 concepts...And yes, better names and better docs can help.

 Last, as soon as we came to the distinction between index order and
 memory layout, it was clear.

 We all agreed that this was an important distinction that would
 improve numpy if we made it.

 yup.

 I think you agree that there is potential for confusion, and there
 doesn't seem any reason to continue with that confusion if we can come
 up with a clearer name.

 well, changing an API is not to be taken lightly -- we are not
 discussing how we'd do it if we were to start from fresh here. So any
 change should make things enough better that it is worth dealing with
 the process of the change.

Yes, for sure.  I was only trying to point out that we are not talking
about breaking backwards compatibility.

 So here is a compromise proposal.

 * Preferring the names 'c-style' and 'f-style' for the indexing order
 case (ravel, reshape, flatiter)

 * Leaving 'C' and 'F' as functional shortcuts, so there is no possible
 backwards-compatibility problem.

 seems reasonable enough -- though even with the backward
 compatibility, users will be faced with many, many older examples and
 docs that use 'C' and 'F', while the new ones refer to the new names
 -- might this be cause for even more confusion (at least for a few
 years...)

I doubt it would be 'even more' confusion.  They would only have to
read the docstrings to work out what is meant, and I believe, with
better names, they'd be less likely to fall into the traps I fell
into, at least.

 leaving me with an equivocal +0 on that 

 another thought:

 
 Definition: np.ravel(a, order='C')

 A 1-D array, containing the elements of the input, is returned.  A copy is
 made only if needed.

 Parameters
 --
 a : array_like
 Input array.  The elements in ``a`` are read in the order specified by
 `order`, and packed as a 1-D array.
 order : {'C','F', 'A', 'K'}, optional
 The elements of ``a`` are read in this order. 'C' means to view
 the elements in C (row-major) order. 'F' means to view the elements
 in Fortran (column-major) order. 'A' means to view the elements
 in 'F' order if a is Fortran contiguous, 'C' order otherwise.
 'K' means to view the elements in the order they occur in memory,
 except for reversing the data when strides are negative.
 By default, 'C' order is used.
 

 Does ravel need to support the 'A' and 'K' options? It's kind of an
 advanced use, and really more suited to .view(), perhaps?

 What I'm getting at is that this version of ravel() conflates the two
 concepts: virtual ordering and memory ordering in one function --
 maybe they should be considered as two different functions altogether
 -- I think that would make for less confusion.

I think it would conceal the confusion only.   If we don't have 'A'
and 'K' in there, it allows us to keep the dream of a world where 'C'
only refers to index ordering, but *only for this docstring*.   As
soon as somebody does ``np.array(arr, order='C')`` they will find
themselves in conceptual trouble again.
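
A minimal sketch of that trap, on a small example array chosen only for illustration: the same `order=` keyword changes the memory layout in one call and the sequence of values in the other.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# order= in np.array: the memory layout changes, the indexed values do not.
a_F = np.array(a, order='F')
assert np.array_equal(a, a_F)
assert a_F.flags['F_CONTIGUOUS'] and not a_F.flags['C_CONTIGUOUS']

# order= in np.ravel: the sequence of values in the result changes.
print(np.ravel(a, order='C'))   # [0 1 2 3 4 5]
print(np.ravel(a, order='F'))   # [0 3 1 4 2 5]
```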

Cheers,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-02 Thread josef . pktd
On Tue, Apr 2, 2013 at 2:04 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Tue, Apr 2, 2013 at 12:29 PM, Chris Barker - NOAA Federal
 chris.bar...@noaa.gov wrote:
 On Mon, Apr 1, 2013 at 10:15 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Thank you for the compliment, it's more enjoyable than other potential
 explanations of my confusion (sigh).

 But, I don't think that is the explanation.

 well, the core explanation is these are difficult and intertwined
 concepts...And yes, better names and better docs can help.

 Last, as soon as we came to the distinction between index order and
 memory layout, it was clear.

 We all agreed that this was an important distinction that would
 improve numpy if we made it.

 yup.

 I think you agree that there is potential for confusion, and there
 doesn't seem any reason to continue with that confusion if we can come
 up with a clearer name.

 well, changing an API is not to be taken lightly -- we are not
 discussing how we'd do it if we were to start from fresh here. So any
 change should make things enough better that it is worth dealing with
 the process of the change.

 Yes, for sure.  I was only trying to point out that we are not talking
 about breaking backwards compatibility.

 So here is a compromise proposal.

 * Preferring the names 'c-style' and 'f-style' for the indexing order
 case (ravel, reshape, flatiter)

 * Leaving 'C' and 'F' as functional shortcuts, so there is no possible
 backwards-compatibility problem.

 seems reasonable enough -- though even with the backward
 compatibility, users will be faced with many, many older examples and
 docs that use 'C' and 'F', while the new ones refer to the new names
 -- might this be cause for even more confusion (at least for a few
 years...)

 I doubt it would be 'even more' confusion.  They would only have to
 read the docstrings to work out what is meant, and I believe, with
 better names, they'd be less likely to fall into the traps I fell
 into, at least.

 leaving me with an equivocal +0 on that 

 another thought:

 
 Definition: np.ravel(a, order='C')

 A 1-D array, containing the elements of the input, is returned.  A copy is
 made only if needed.

 Parameters
 --
 a : array_like
 Input array.  The elements in ``a`` are read in the order specified by
 `order`, and packed as a 1-D array.
 order : {'C','F', 'A', 'K'}, optional
 The elements of ``a`` are read in this order. 'C' means to view
 the elements in C (row-major) order. 'F' means to view the elements
 in Fortran (column-major) order. 'A' means to view the elements
 in 'F' order if a is Fortran contiguous, 'C' order otherwise.
 'K' means to view the elements in the order they occur in memory,
 except for reversing the data when strides are negative.
 By default, 'C' order is used.
 

 Does ravel need to support the 'A' and 'K' options? It's kind of an
 advanced use, and really more suited to .view(), perhaps?

 What I'm getting at is that this version of ravel() conflates the two
 concepts: virtual ordering and memory ordering in one function --
 maybe they should be considered as two different functions altogether
 -- I think that would make for less confusion.

 I think it would conceal the confusion only.   If we don't have 'A'
 and 'K' in there, it allows us to keep the dream of a world where 'C'
 only refers to index ordering, but *only for this docstring*.   As
 soon as somebody does ``np.array(arr, order='C')`` they will find
 themselves in conceptual trouble again.

I still don't see why order is not a general concept, whether it
refers to memory or indexing/iterating.
The qualifier can be made clear in the docstrings (or from the context).

It's all over the documentation:
we can iterate in F-order over an array that is in C-order (*), or vice-versa
(*) or just some strides

http://docs.scipy.org/doc/numpy/reference/arrays.nditer.html
http://docs.scipy.org/doc/numpy/reference/generated/numpy.nditer.html#numpy.nditer
pure shape
http://docs.scipy.org/doc/numpy/reference/routines.array-manipulation.html#changing-array-shape
shape and copy
http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.flatten.html#numpy.ndarray.flatten
memory
http://docs.scipy.org/doc/numpy/reference/routines.array-manipulation.html#changing-kind-of-array
http://docs.scipy.org/doc/numpy/reference/routines.array-creation.html#from-existing-data
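
For example, a short sketch with np.nditer (the arrays are just for illustration): the iteration order is chosen independently of how the array is stored.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)          # stored in C order
a_F = np.asfortranarray(a)              # same values, stored in F order

# Iterate in F order over a C-ordered array, and vice-versa.
print([int(x) for x in np.nditer(a, order='F')])    # [0, 3, 1, 4, 2, 5]
print([int(x) for x in np.nditer(a_F, order='C')])  # [0, 1, 2, 3, 4, 5]
```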

Josef


 Cheers,

 Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-02 Thread Nathaniel Smith
On Tue, Apr 2, 2013 at 6:59 PM, Matthew Brett matthew.br...@gmail.com wrote:
 On Tue, Apr 2, 2013 at 7:32 AM, Nathaniel Smith n...@pobox.com wrote:
 Maybe we should go through and rename order to something more descriptive
 in each case, so we'd have
   a.reshape(..., index_order="C")
   a.copy(memory_order="F")
 etc.?

 That seems like a good idea.  If you are proposing it, I am +1.

Well, I'm just throwing it out there as an idea, but if people like
it, nothing better turns up, and someone implements it, then I'm not
going to say no...
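
Purely as a sketch of what that might look like: the two wrappers below are hypothetical (numpy has no `index_order` or `memory_order` keyword); they just forward to the existing `order=` argument under a more descriptive name.

```python
import numpy as np

def reshape_indexed(a, shape, index_order='C'):
    """Hypothetical wrapper: index_order is the order used to read and refill."""
    return np.reshape(a, shape, order=index_order)

def copy_laid_out(a, memory_order='C'):
    """Hypothetical wrapper: memory_order is the memory layout of the copy."""
    return np.copy(a, order=memory_order)

a = np.arange(6).reshape(2, 3)
print(reshape_indexed(a, (3, 2), index_order='F'))
print(copy_laid_out(a, memory_order='F').flags['F_CONTIGUOUS'])
```

The values in the copy are unchanged; only the reshape call alters which value sits at which index.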

 This way if you just bumped into these while reading code, it would still be
 immediately obvious that they were dealing with totally different concepts.
 Compare to reading along without the docs and seeing
   a.reshape(..., order="Z")
   a.copy(order="C")
 That'd just leave me even more baffled than the current system -- I'd start
 thinking that Z and C somehow were different options for the same order=
 option, so they must somehow mean ways of ordering elements?

 I don't think you'd be more baffled than the current system, which, as
 you say, conflates two orthogonal concepts.  Rather, I think it would
 cause the user to stop, as they should, and consider what concept
 order is using in this case.

 I don't find it difficult to explain this:

 There are two different but related concepts of 'order'

 1) The memory layout of the array
 2) The index ordering used to unravel the array

 If you see 'Z' or 'N' for 'order' - that refers to index ordering.
 If you see 'C' or 'F' for 'order' - that refers to memory layout.

Sure, you can write it down like this, but compare to this system:

If you see 'Z' or 'N' for 'order' - that refers to memory ordering.
If you see 'C' or 'F' for 'order' - that refers to index layout.

Now suppose I forget which system we actually use -- how do you
remember which system is which? It's totally arbitrary. Now I have
even more things to remember. And I'm certainly not going to work out
this distinction just from seeing these used once or twice in someone
else's code.

This is like observing that if I say "go North" then it's ambiguous
about whether I want you to drive or walk, and concluding that we need
new words for the directions depending on what sort of vehicle you
use. So "go North" means "drive North", "go htuoS" means "walk North",
etc. Totally silly. Makes much more sense to have one set of words for
directions, and then make clear from context what the directions are
used for -- "drive North", "walk North". Or "iterate C-wards", "store
F-wards".

C and Z mean exactly the same thing -- they describe a way of
unraveling a cube into a straight line. The difference is what we do
with the resulting straight line. That's why I'm suggesting that the
distinction should be made in the name of the argument.

-n


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-02 Thread Matthew Brett
Hi,

On Tue, Apr 2, 2013 at 4:07 PM, Chris Barker - NOAA Federal
chris.bar...@noaa.gov wrote:
 On Tue, Apr 2, 2013 at 11:37 AM,  josef.p...@gmail.com wrote:
 I still don't see why order is not a general concept, whether it
 refers to memory or indexing/iterating.

 I agree -- the ordering concept is the same, it's _what_ is being
 ordered that's different. So I say we stick with 'C' and 'F' -- numpy
 users will need to figure out what it means eventually in any case

I'm not quite sure what you are arguing.  I thought we all agreed that
the index ordering idea is *orthogonal* to the memory layout idea?
Not so?

Cheers,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-02 Thread Nathaniel Smith
On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett matthew.br...@gmail.com wrote:
 This is like observing that if I say go North then it's ambiguous
 about whether I want you to drive or walk, and concluding that we need
 new words for the directions depending on what sort of vehicle you
 use. So go North means drive North, go htuoS means walk North,
 etc. Totally silly. Makes much more sense to have one set of words for
 directions, and then make clear from context what the directions are
 used for -- drive North, walk North. Or iterate C-wards, store
 F-wards.

 C and Z mean exactly the same thing -- they describe a way of
 unraveling a cube into a straight line. The difference is what we do
 with the resulting straight line. That's why I'm suggesting that the
 distinction should be made in the name of the argument.

 Could you unpack that for the 'ravel' docstring?  Because these
 options all refer to the way of unraveling and not the memory layout
 that results.

Z/C/row-major/whatever-you-want-to-call-it is a general strategy
for converting between a 1-dim representation and a n-dim
representation. In the case of memory storage, the 1-dim
representation is the flat space of pointer arithmetic. In the case of
ravel, the 1-dim representation is the flat space of a 1-dim indexed
array. But the 1-dim-to-n-dim part is the same in both cases.
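
To make the "same strategy, two contexts" point concrete, here is a rough sketch. The helper `flat_index` is hypothetical (it is not a numpy function); it maps an n-dim index to a flat position, and the same number then turns up both as a position in the raveled array and as an element offset in memory.

```python
import numpy as np

def flat_index(idx, shape, order='C'):
    """Hypothetical helper: position of an n-dim index in a flat 1-dim space."""
    pairs = list(zip(idx, shape))
    if order == 'C':
        pairs.reverse()           # 'C': the last axis varies fastest
    pos, step = 0, 1
    for i, n in pairs:            # walk from fastest- to slowest-varying axis
        pos += i * step
        step *= n
    return pos

a = np.arange(60).reshape(3, 4, 5)      # C layout in memory
a_F = np.asfortranarray(a)              # same values, F layout in memory
idx = (1, 2, 3)

for order, arr in (('C', a), ('F', a_F)):
    pos = flat_index(idx, a.shape, order)
    # Context 1: position in the raveled (1-dim indexed) array.
    assert a.ravel(order=order)[pos] == a[idx]
    # Context 2: element offset in memory, computed from the strides.
    offset = sum(i * s for i, s in zip(idx, arr.strides)) // arr.itemsize
    assert offset == pos
```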

I think that's why you're seeing people baffled by your proposal -- to
them the C refers to this general strategy, and what's different is
the context where it gets applied. So giving the same strategy two
different names is silly; if anything it's the contexts that should
have different names.

-n


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-02 Thread josef . pktd
On Tue, Apr 2, 2013 at 5:52 PM, Nathaniel Smith n...@pobox.com wrote:
 On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 This is like observing that if I say go North then it's ambiguous
 about whether I want you to drive or walk, and concluding that we need
 new words for the directions depending on what sort of vehicle you
 use. So go North means drive North, go htuoS means walk North,
 etc. Totally silly. Makes much more sense to have one set of words for
 directions, and then make clear from context what the directions are
 used for -- drive North, walk North. Or iterate C-wards, store
 F-wards.

 C and Z mean exactly the same thing -- they describe a way of
 unraveling a cube into a straight line. The difference is what we do
 with the resulting straight line. That's why I'm suggesting that the
 distinction should be made in the name of the argument.

 Could you unpack that for the 'ravel' docstring?  Because these
 options all refer to the way of unraveling and not the memory layout
 that results.

 Z/C/row-major/whatever-you-want-to-call-it is a general strategy
 for converting between a 1-dim representation and a n-dim
 representation. In the case of memory storage, the 1-dim
 representation is the flat space of pointer arithmetic. In the case of
 ravel, the 1-dim representation is the flat space of a 1-dim indexed
 array. But the 1-dim-to-n-dim part is the same in both cases.

 I think that's why you're seeing people baffled by your proposal -- to
 them the C refers to this general strategy, and what's different is
 the context where it gets applied. So giving the same strategy two
 different names is silly; if anything it's the contexts that should
 have different names.

And once we get into memory optimization (and avoiding copies and
preserving contiguity), it is necessary to keep both orders in mind:
is the memory order F, and am I iterating/raveling in F order
(or slicing columns)?

I think having two separate keywords gives the impression we can
choose two different things at the same time.
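
A small sketch of why both orders matter for avoiding copies (the array is only an illustration): raveling in the order that matches the memory layout can return a view, while the other order has to copy.

```python
import numpy as np

a_F = np.asfortranarray(np.arange(10).reshape(2, 5))

same = a_F.ravel(order='F')    # index order matches memory order
other = a_F.ravel(order='C')   # index order differs from memory order

print(np.shares_memory(a_F, same))    # True: a view, no copy needed
print(np.shares_memory(a_F, other))   # False: a copy had to be made
```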

Josef



 -n


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-02 Thread josef . pktd
On Tue, Apr 2, 2013 at 7:09 PM,  josef.p...@gmail.com wrote:
 On Tue, Apr 2, 2013 at 5:52 PM, Nathaniel Smith n...@pobox.com wrote:
 On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 This is like observing that if I say go North then it's ambiguous
 about whether I want you to drive or walk, and concluding that we need
 new words for the directions depending on what sort of vehicle you
 use. So go North means drive North, go htuoS means walk North,
 etc. Totally silly. Makes much more sense to have one set of words for
 directions, and then make clear from context what the directions are
 used for -- drive North, walk North. Or iterate C-wards, store
 F-wards.

 C and Z mean exactly the same thing -- they describe a way of
 unraveling a cube into a straight line. The difference is what we do
 with the resulting straight line. That's why I'm suggesting that the
 distinction should be made in the name of the argument.

 Could you unpack that for the 'ravel' docstring?  Because these
 options all refer to the way of unraveling and not the memory layout
 that results.

 Z/C/row-major/whatever-you-want-to-call-it is a general strategy
 for converting between a 1-dim representation and a n-dim
 representation. In the case of memory storage, the 1-dim
 representation is the flat space of pointer arithmetic. In the case of
 ravel, the 1-dim representation is the flat space of a 1-dim indexed
 array. But the 1-dim-to-n-dim part is the same in both cases.

 I think that's why you're seeing people baffled by your proposal -- to
 them the C refers to this general strategy, and what's different is
 the context where it gets applied. So giving the same strategy two
 different names is silly; if anything it's the contexts that should
 have different names.

 And once we get into memory optimization (and avoiding copies and
 preserving contiguity), it is necessary to keep both orders in mind:
 is the memory order F, and am I iterating/raveling in F order
 (or slicing columns)?

 I think having two separate keywords gives the impression we can
 choose two different things at the same time.

As an aside (math):
numpy.flatten made it into the Wikipedia page
http://en.wikipedia.org/wiki/Vectorization_%28mathematics%29#Programming_language
(and how it's different from R and Matlab/Octave,
but it doesn't mention: use order='F' to get the same behavior as math
and the others)

and the corresponding code in statsmodels (tools for vector
autoregressive models by Wes)
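
A quick sketch of that point, on a 2x2 matrix picked for illustration: the mathematical vec() operator stacks columns, which corresponds to order='F' rather than numpy's default.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])

print(A.flatten(order='F'))   # [1 3 2 4], columns stacked, as in vec(A)
print(A.flatten())            # [1 2 3 4], numpy's default row-by-row order
```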

Josef
baffled?


 Josef



 -n


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-02 Thread Matthew Brett
Hi,

On Tue, Apr 2, 2013 at 5:52 PM, Nathaniel Smith n...@pobox.com wrote:
 On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 This is like observing that if I say go North then it's ambiguous
 about whether I want you to drive or walk, and concluding that we need
 new words for the directions depending on what sort of vehicle you
 use. So go North means drive North, go htuoS means walk North,
 etc. Totally silly. Makes much more sense to have one set of words for
 directions, and then make clear from context what the directions are
 used for -- drive North, walk North. Or iterate C-wards, store
 F-wards.

 C and Z mean exactly the same thing -- they describe a way of
 unraveling a cube into a straight line. The difference is what we do
 with the resulting straight line. That's why I'm suggesting that the
 distinction should be made in the name of the argument.

 Could you unpack that for the 'ravel' docstring?  Because these
 options all refer to the way of unraveling and not the memory layout
 that results.

 Z/C/row-major/whatever-you-want-to-call-it is a general strategy
 for converting between a 1-dim representation and a n-dim
 representation. In the case of memory storage, the 1-dim
 representation is the flat space of pointer arithmetic. In the case of
 ravel, the 1-dim representation is the flat space of a 1-dim indexed
 array. But the 1-dim-to-n-dim part is the same in both cases.

 I think that's why you're seeing people baffled by your proposal -- to
 them the C refers to this general strategy, and what's different is
 the context where it gets applied. So giving the same strategy two
 different names is silly; if anything it's the contexts that should
 have different names.
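
To make the quoted argument concrete, here is a rough sketch (the shape
and the indices are arbitrary choices for illustration) of the same
1-dim-to-n-dim rule showing up in both contexts: once as the position
of an element in a raveled array, and once as the offset of that
element in the memory buffer.

```python
import numpy as np

shape = (3, 4)
i, j = 1, 2

# Context 1: index ordering. Position of element (i, j) in the raveled array.
pos_c = np.ravel_multi_index((i, j), shape, order='C')   # i * 4 + j
pos_f = np.ravel_multi_index((i, j), shape, order='F')   # i + j * 3

# Context 2: memory layout. Offset of (i, j), counted in elements from the
# start of the data buffer, computed from the array strides.
a_c = np.zeros(shape, order='C')
a_f = np.zeros(shape, order='F')
off_c = (i * a_c.strides[0] + j * a_c.strides[1]) // a_c.itemsize
off_f = (i * a_f.strides[0] + j * a_f.strides[1]) // a_f.itemsize

print(pos_c, off_c)  # 6 6
print(pos_f, off_f)  # 7 7
```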

Thanks - but I guess we all agree that

np.array(a, order='C')

and

np.ravel(a, order='F')

are using the term 'order' in two different and orthogonal senses, and
the discussion is about whether it is possible to get confused about
these two senses and, if so, what we should do about it.
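
A minimal sketch of the two senses, using a small example array of my
own choosing. The 'order' passed to np.array changes only how the data
sits in memory, while the 'order' passed to np.ravel changes which
index is walked first, whatever the memory layout happens to be.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# Sense 1: memory layout. Same values at the same indices, different layout.
c_mem = np.array(a, order='C')
f_mem = np.array(a, order='F')
print(np.array_equal(c_mem, f_mem))                        # True
print(c_mem.flags.c_contiguous, f_mem.flags.f_contiguous)  # True True

# Sense 2: index order used when raveling. Independent of memory layout.
print(np.ravel(c_mem, order='C'))  # [0 1 2 3 4 5]
print(np.ravel(f_mem, order='C'))  # [0 1 2 3 4 5]
print(np.ravel(c_mem, order='F'))  # [0 3 1 4 2 5]
print(np.ravel(f_mem, order='F'))  # [0 3 1 4 2 5]
```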

Just to repeat what you're suggesting

np.array(a, memory_order='C')
np.ravel(a, index_order='C')
np.ravel(a, index_order='K')

That makes sense to me.  I guess we'd have to do something like:

def ravel(a, index_order='C', **kwargs):

Where kwargs must be empty if the second arg is specified, otherwise
it can contain only one key, 'order', as an alias for 'index_order'.  Thus:

np.ravel(a, index_order='C')

will work for the foreseeable future.
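
For what it's worth, a rough sketch of how such a wrapper might look.
This is not numpy code: the name ravel_compat and the _UNSET sentinel
are made up for illustration, and 'index_order' is the proposed keyword,
not an existing numpy argument. The sentinel is only there so the
function can tell an explicit index_order='C' from no argument at all.

```python
import numpy as np

_UNSET = object()  # sentinel to tell "not passed" apart from an explicit 'C'

def ravel_compat(a, index_order=_UNSET, **kwargs):
    """Hypothetical ravel taking 'index_order', with 'order' as a legacy alias."""
    if index_order is not _UNSET:
        # New-style call: no other keywords are allowed
        if kwargs:
            raise TypeError("unexpected keyword arguments: %s" % sorted(kwargs))
        order = index_order
    else:
        # Old-style call: the only key allowed in kwargs is 'order'
        order = kwargs.pop('order', 'C')
        if kwargs:
            raise TypeError("unexpected keyword arguments: %s" % sorted(kwargs))
    return np.ravel(a, order=order)

a = np.arange(6).reshape(2, 3)
print(ravel_compat(a, index_order='F'))  # [0 3 1 4 2 5]
print(ravel_compat(a, order='F'))        # [0 3 1 4 2 5]
```

Passing both index_order and order raises TypeError in this sketch,
which matches the "kwargs must be empty" rule above.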

Cheers,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-02 Thread Matthew Brett
Hi,

On Tue, Apr 2, 2013 at 7:09 PM,  josef.p...@gmail.com wrote:
 On Tue, Apr 2, 2013 at 5:52 PM, Nathaniel Smith n...@pobox.com wrote:
 On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 This is like observing that if I say go North then it's ambiguous
 about whether I want you to drive or walk, and concluding that we need
 new words for the directions depending on what sort of vehicle you
 use. So go North means drive North, go htuoS means walk North,
 etc. Totally silly. Makes much more sense to have one set of words for
 directions, and then make clear from context what the directions are
 used for -- drive North, walk North. Or iterate C-wards, store
 F-wards.

 C and Z mean exactly the same thing -- they describe a way of
 unraveling a cube into a straight line. The difference is what we do
 with the resulting straight line. That's why I'm suggesting that the
 distinction should be made in the name of the argument.

 Could you unpack that for the 'ravel' docstring?  Because these
 options all refer to the way of unraveling and not the memory layout
 that results.

 Z/C/row-major/whatever-you-want-to-call-it is a general strategy
 for converting between a 1-dim representation and a n-dim
 representation. In the case of memory storage, the 1-dim
 representation is the flat space of pointer arithmetic. In the case of
 ravel, the 1-dim representation is the flat space of a 1-dim indexed
 array. But the 1-dim-to-n-dim part is the same in both cases.

 I think that's why you're seeing people baffled by your proposal -- to
 them the C refers to this general strategy, and what's different is
 the context where it gets applied. So giving the same strategy two
 different names is silly; if anything it's the contexts that should
 have different names.

 And once we get into memory optimization (and avoiding copies and
 preserving contiguity), it is necessary to keep both orders in mind:
 is the memory order F, and am I iterating/raveling in F order
 (or slicing columns)?

 I think having two separate keywords gives the impression we can
 choose two different things at the same time.

I guess it would not make sense to do this:

np.ravel(a, index_order='C', memory_order='F')

It could make sense to do this:

np.reshape(a, (3, 4), index_order='F', memory_order='F')

but that just points out the inherent confusion between the uses of
'order', and in this case, the fact that you can only do:

np.reshape(a, (3, 4), index_order='F')

correctly distinguishes between the meanings.
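
With the keyword as it exists today ('order'), a short sketch of what
reshape's argument actually selects. The 12-element input is just an
example; the argument decides which index gets filled first and makes
no statement about copying into a particular memory layout.

```python
import numpy as np

a = np.arange(12)

r_c = np.reshape(a, (3, 4), order='C')  # fill the last axis first
r_f = np.reshape(a, (3, 4), order='F')  # fill the first axis first

print(r_c[0, :])   # [0 1 2 3]
print(r_f[:, 0])   # [0 1 2]
print(r_f[0, :])   # [0 3 6 9]

# Raveling with the same index order undoes the reshape
print(np.array_equal(r_f.ravel(order='F'), a))  # True
```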

Best,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-01 Thread Sebastian Berg
On Sun, 2013-03-31 at 14:04 -0700, Matthew Brett wrote:
 Hi,
 
 On Sun, Mar 31, 2013 at 1:43 PM,  josef.p...@gmail.com wrote:
  On Sun, Mar 31, 2013 at 3:54 PM, Matthew Brett matthew.br...@gmail.com 
  wrote:
  Hi,
 
  On Sat, Mar 30, 2013 at 10:38 PM,  josef.p...@gmail.com wrote:
  On Sun, Mar 31, 2013 at 12:50 AM, Matthew Brett matthew.br...@gmail.com 
  wrote:
  Hi,
 
  On Sat, Mar 30, 2013 at 9:37 PM,  josef.p...@gmail.com wrote:
  On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett 
  matthew.br...@gmail.com wrote:
  Hi,
 
  On Sat, Mar 30, 2013 at 7:02 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett 
  matthew.br...@gmail.com wrote:
  Hi,
 
  On Sat, Mar 30, 2013 at 7:50 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle
  brad.froe...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett 
  matthew.br...@gmail.com
  wrote:
 
  On Sat, Mar 30, 2013 at 2:20 PM,  josef.p...@gmail.com wrote:
   On Sat, Mar 30, 2013 at 4:57 PM,  josef.p...@gmail.com wrote:
   On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett
   matthew.br...@gmail.com wrote:
   On Sat, Mar 30, 2013 at 4:14 AM,  josef.p...@gmail.com 
   wrote:
   On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett
   matthew.br...@gmail.com wrote:
  
   Ravel and reshape use the terms 'C' and 'F' in the sense of
   index
   ordering.
  
   This is very confusing.  We think the index ordering and 
   memory
   ordering ideas need to be separated, and specifically, we 
   should
   avoid
   using C and F to refer to index ordering.
  
   Proposal
   -
  
   * Deprecate the use of C and F meaning backwards and 
   forwards
   index ordering for ravel, reshape
   * Prefer Z and N, being graphical representations of 
   unraveling
   in
   2 dimensions, axis1 first and axis0 first respectively 
   (excellent
   naming idea by Paul Ivanov)
  
   What do y'all think?
  
   I always thought F and C are easy to understand, I 
   always thought
   about
   the content and never about the memory when using it.
  
   changing the names doesn't make it easier to understand.
   I think the confusion is because the new A and K refer to 
   existing
   memory
  
 
  I disagree, I think it's confusing, but I have evidence, and that 
  is
  that four out of four of us tested ourselves and got it wrong.
 
  Perhaps we are particularly dumb or poorly informed, but I think 
  it's
  rash to assert there is no problem here.
 
  I think you are overcomplicating things or phrased it as a trick 
  question
 
  I don't know what you mean by trick question - was there something
  over-complicated in the example?  I deliberately didn't include
  various much more confusing examples in reshape.
 
  I meant making the candidates think about memory instead of just
  column versus row stacking.
 
  To be specific, we were teaching about reshaping a (I, J, K, N) 4D
  array, it was an image, with time as the 4th dimension (N time
  points).   Raveling and reshaping 3D and 4D arrays is a common thing
  to do in neuroimaging, as you can imagine.
 
  A student asked what he would get back from raveling this array, a
  concatenated time series, or something spatial?
 
  We showed (I'd worked it out by this time) that the first N values
  were the time series given by [0, 0, 0, :].
 
  He said - 'Oh - I see - so the data is stored as a whole lot of time
  series one by one, I thought it would be stored as a series of
  images'.
 
  Ironically, this was a Fortran-ordered array in memory, and he was 
  wrong.
 
  So, I think the idea of memory ordering and index ordering is very
  easy to confuse, and comes up naturally.
 
  I would like, as a teacher, to be able to say something like:
 
  This is what C memory layout is (it's the memory layout  that gives
  arr.flags.C_CONTIGUOUS=True)
  This is what F memory layout is (it's the memory layout  that gives
  arr.flags.F_CONTIGUOUS=True)
  It's rather easy to get something that is neither C nor F memory layout
  Numpy does many memory layouts.
  Ravel and reshape and numpy in general do not care (normally) about C
  or F layouts, they only care about index ordering.
 
  My point, that I'm repeating, is that my job is made harder by
  'arr.ravel('F')'.
 
  But once you know that ravel and reshape don't care about memory, the
  ravel is easy to predict (maybe not easy to visualize in 4-D):
 
  But this assumes that you already know that there's such a thing as
  memory layout, and there's such a thing as index ordering, and that
  'C' and 'F' in ravel refer to index ordering.  Once you have that,
  you're golden.  I'm arguing it's markedly harder to get this
  distinction, and keep it in mind, and teach it, if we are using the
  'C' and 'F' names for both things.
 
  No, I think you are still missing my point.
  I think explaining ravel and reshape F and C is easy (kind of) because the
  students don't need to know at that stage about memory layouts.
 
  All they need to 

Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-01 Thread Matthew Brett
Hi,

On Mon, Apr 1, 2013 at 10:23 AM, Sebastian Berg
sebast...@sipsolutions.net wrote:
 On Sun, 2013-03-31 at 14:04 -0700, Matthew Brett wrote:
 Hi,

 On Sun, Mar 31, 2013 at 1:43 PM,  josef.p...@gmail.com wrote:
  On Sun, Mar 31, 2013 at 3:54 PM, Matthew Brett matthew.br...@gmail.com 
  wrote:
  Hi,
 
  On Sat, Mar 30, 2013 at 10:38 PM,  josef.p...@gmail.com wrote:
  On Sun, Mar 31, 2013 at 12:50 AM, Matthew Brett 
  matthew.br...@gmail.com wrote:
  Hi,
 
  On Sat, Mar 30, 2013 at 9:37 PM,  josef.p...@gmail.com wrote:
  On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett 
  matthew.br...@gmail.com wrote:
  Hi,
 
  On Sat, Mar 30, 2013 at 7:02 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett 
  matthew.br...@gmail.com wrote:
  Hi,
 
  On Sat, Mar 30, 2013 at 7:50 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle
  brad.froe...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett 
  matthew.br...@gmail.com
  wrote:
 
  On Sat, Mar 30, 2013 at 2:20 PM,  josef.p...@gmail.com wrote:
   On Sat, Mar 30, 2013 at 4:57 PM,  josef.p...@gmail.com wrote:
   On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett
   matthew.br...@gmail.com wrote:
   On Sat, Mar 30, 2013 at 4:14 AM,  josef.p...@gmail.com 
   wrote:
   On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett
   matthew.br...@gmail.com wrote:
  
   Ravel and reshape use the terms 'C' and 'F' in the sense of
   index
   ordering.
  
   This is very confusing.  We think the index ordering and 
   memory
   ordering ideas need to be separated, and specifically, we 
   should
   avoid
   using C and F to refer to index ordering.
  
   Proposal
   -
  
   * Deprecate the use of C and F meaning backwards and 
   forwards
   index ordering for ravel, reshape
   * Prefer Z and N, being graphical representations of 
   unraveling
   in
   2 dimensions, axis1 first and axis0 first respectively 
   (excellent
   naming idea by Paul Ivanov)
  
   What do y'all think?
  
   I always thought F and C are easy to understand, I 
   always thought
   about
   the content and never about the memory when using it.
  
   changing the names doesn't make it easier to understand.
   I think the confusion is because the new A and K refer to 
   existing
   memory
  
 
  I disagree, I think it's confusing, but I have evidence, and 
  that is
  that four out of four of us tested ourselves and got it wrong.
 
  Perhaps we are particularly dumb or poorly informed, but I think 
  it's
  rash to assert there is no problem here.
 
  I think you are overcomplicating things or phrased it as a trick 
  question
 
  I don't know what you mean by trick question - was there something
  over-complicated in the example?  I deliberately didn't include
  various much more confusing examples in reshape.
 
  I meant making the candidates think about memory instead of just
  column versus row stacking.
 
  To be specific, we were teaching about reshaping a (I, J, K, N) 4D
  array, it was an image, with time as the 4th dimension (N time
  points).   Raveling and reshaping 3D and 4D arrays is a common thing
  to do in neuroimaging, as you can imagine.
 
  A student asked what he would get back from raveling this array, a
  concatenated time series, or something spatial?
 
  We showed (I'd worked it out by this time) that the first N values
  were the time series given by [0, 0, 0, :].
 
  He said - 'Oh - I see - so the data is stored as a whole lot of time
  series one by one, I thought it would be stored as a series of
  images'.
 
  Ironically, this was a Fortran-ordered array in memory, and he was 
  wrong.
 
  So, I think the idea of memory ordering and index ordering is very
  easy to confuse, and comes up naturally.
 
  I would like, as a teacher, to be able to say something like:
 
  This is what C memory layout is (it's the memory layout  that gives
  arr.flags.C_CONTIGUOUS=True)
  This is what F memory layout is (it's the memory layout  that gives
  arr.flags.F_CONTIGUOUS=True)
  It's rather easy to get something that is neither C nor F memory layout
  Numpy does many memory layouts.
  Ravel and reshape and numpy in general do not care (normally) about C
  or F layouts, they only care about index ordering.
 
  My point, that I'm repeating, is that my job is made harder by
  'arr.ravel('F')'.
 
  But once you know that ravel and reshape don't care about memory, the
  ravel is easy to predict (maybe not easy to visualize in 4-D):
 
  But this assumes that you already know that there's such a thing as
  memory layout, and there's such a thing as index ordering, and that
  'C' and 'F' in ravel refer to index ordering.  Once you have that,
  you're golden.  I'm arguing it's markedly harder to get this
  distinction, and keep it in mind, and teach it, if we are using the
  'C' and 'F' names for both things.
 
  No, I think you are still missing my point.
  I think explaining ravel and reshape F and C is easy (kind of) because 
  

Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-01 Thread josef . pktd
On Mon, Apr 1, 2013 at 3:10 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Mon, Apr 1, 2013 at 10:23 AM, Sebastian Berg
 sebast...@sipsolutions.net wrote:
 On Sun, 2013-03-31 at 14:04 -0700, Matthew Brett wrote:
 Hi,

 On Sun, Mar 31, 2013 at 1:43 PM,  josef.p...@gmail.com wrote:
  On Sun, Mar 31, 2013 at 3:54 PM, Matthew Brett matthew.br...@gmail.com 
  wrote:
  Hi,
 
  On Sat, Mar 30, 2013 at 10:38 PM,  josef.p...@gmail.com wrote:
  On Sun, Mar 31, 2013 at 12:50 AM, Matthew Brett 
  matthew.br...@gmail.com wrote:
  Hi,
 
  On Sat, Mar 30, 2013 at 9:37 PM,  josef.p...@gmail.com wrote:
  On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett 
  matthew.br...@gmail.com wrote:
  Hi,
 
  On Sat, Mar 30, 2013 at 7:02 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett 
  matthew.br...@gmail.com wrote:
  Hi,
 
  On Sat, Mar 30, 2013 at 7:50 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle
  brad.froe...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett 
  matthew.br...@gmail.com
  wrote:
 
  On Sat, Mar 30, 2013 at 2:20 PM,  josef.p...@gmail.com wrote:
   On Sat, Mar 30, 2013 at 4:57 PM,  josef.p...@gmail.com 
   wrote:
   On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett
   matthew.br...@gmail.com wrote:
   On Sat, Mar 30, 2013 at 4:14 AM,  josef.p...@gmail.com 
   wrote:
   On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett
   matthew.br...@gmail.com wrote:
  
   Ravel and reshape use the terms 'C' and 'F' in the sense
   of index
   ordering.
  
   This is very confusing.  We think the index ordering and 
   memory
   ordering ideas need to be separated, and specifically, we 
   should
   avoid
   using C and F to refer to index ordering.
  
   Proposal
   -
  
   * Deprecate the use of C and F meaning backwards and 
   forwards
   index ordering for ravel, reshape
   * Prefer Z and N, being graphical representations of 
   unraveling
   in
   2 dimensions, axis1 first and axis0 first respectively 
   (excellent
   naming idea by Paul Ivanov)
  
   What do y'all think?
  
   I always thought F and C are easy to understand, I 
   always thought
   about
   the content and never about the memory when using it.
  
   changing the names doesn't make it easier to understand.
   I think the confusion is because the new A and K refer to 
   existing
   memory
  
 
  I disagree, I think it's confusing, but I have evidence, and 
  that is
  that four out of four of us tested ourselves and got it wrong.
 
  Perhaps we are particularly dumb or poorly informed, but I 
  think it's
  rash to assert there is no problem here.
 
  I think you are overcomplicating things or phrased it as a trick 
  question
 
  I don't know what you mean by trick question - was there something
  over-complicated in the example?  I deliberately didn't include
  various much more confusing examples in reshape.
 
  I meant making the candidates think about memory instead of just
  column versus row stacking.
 
  To be specific, we were teaching about reshaping a (I, J, K, N) 4D
  array, it was an image, with time as the 4th dimension (N time
  points).   Raveling and reshaping 3D and 4D arrays is a common thing
  to do in neuroimaging, as you can imagine.
 
  A student asked what he would get back from raveling this array, a
  concatenated time series, or something spatial?
 
  We showed (I'd worked it out by this time) that the first N values
  were the time series given by [0, 0, 0, :].
 
  He said - 'Oh - I see - so the data is stored as a whole lot of time
  series one by one, I thought it would be stored as a series of
  images'.
 
  Ironically, this was a Fortran-ordered array in memory, and he was 
  wrong.
 
  So, I think the idea of memory ordering and index ordering is very
  easy to confuse, and comes up naturally.
 
  I would like, as a teacher, to be able to say something like:
 
  This is what C memory layout is (it's the memory layout  that gives
  arr.flags.C_CONTIGUOUS=True)
  This is what F memory layout is (it's the memory layout  that gives
  arr.flags.F_CONTIGUOUS=True)
  It's rather easy to get something that is neither C nor F memory
  layout
  Numpy does many memory layouts.
  Ravel and reshape and numpy in general do not care (normally) about C
  or F layouts, they only care about index ordering.
 
  My point, that I'm repeating, is that my job is made harder by
  'arr.ravel('F')'.
 
  But once you know that ravel and reshape don't care about memory, the
  ravel is easy to predict (maybe not easy to visualize in 4-D):
 
  But this assumes that you already know that there's such a thing as
  memory layout, and there's such a thing as index ordering, and that
  'C' and 'F' in ravel refer to index ordering.  Once you have that,
  you're golden.  I'm arguing it's markedly harder to get this
  distinction, and keep it in mind, and teach it, if we are using the
  'C' and 'F' names for both things.
 
  No, I think you are still missing 

Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-01 Thread Matthew Brett
Hi,

On Mon, Apr 1, 2013 at 4:51 PM, Chris Barker - NOAA Federal
chris.bar...@noaa.gov wrote:
 Hi folks,

 I've been teaching Python lately, have taught numpy a couple times
 (formally), and am preparing a lecture about it over the next couple
 weeks -- so I'm taking an interest here.

 I've been a regular numpy user for a long time, though as it happens,
 rarely use ravel() (side note, what's always confused me the most is
 that it seems to me that ravel() _unravels_ the array - but that's a
 side note...)

 So I ignored the first post, then fired up iPython, read the
 docstring, and played with ravel a bit -- it behaved EXACTLY like I
 expected -- at least for 2-d.

 Matthew, I expect your group may have gotten tied up by the fact that
 you know too much! kind of like how I have a hard time getting my
 iphone to work, and my computer-illiterate wife has no problem at all.

Thank you for the compliment, it's more enjoyable than other potential
explanations of my confusion (sigh).

But, I don't think that is the explanation.

First, there were three of us with different levels of experience
getting confused on this.

Second, I think we all agree that:

 So: yes, I do think it's a bit confusing and unfortunate that the
 order parameter has two somewhat different meanings,

- so there is a good reason that we could get confused.

Last, as soon as we came to the distinction between index order and
memory layout, it was clear.

We all agreed that this was an important distinction that would
improve numpy if we made it.
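
A small sketch of that distinction (the arrays here are arbitrary
examples of mine). The flags describe memory layout, and it is easy to
build an array that is neither C nor F contiguous, yet raveling in 'C'
index order gives the same answer regardless of layout.

```python
import numpy as np

base = np.arange(24).reshape(4, 6)
c_arr = base                       # C memory layout
f_arr = np.asfortranarray(base)    # F memory layout
strided = base[:, ::2]             # a view skipping every other column

for name, arr in [('c_arr', c_arr), ('f_arr', f_arr), ('strided', strided)]:
    print(name, arr.flags.c_contiguous, arr.flags.f_contiguous)
# c_arr True False
# f_arr False True
# strided False False

# ravel's 'C' here is about index order, so memory layout does not matter
print(np.array_equal(c_arr.ravel(order='C'), f_arr.ravel(order='C')))  # True
print(strided.ravel(order='C')[:4])  # [0 2 4 6]
```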

Before I sent the email I did wonder aloud whether people would read
the email, understand the distinction, and then fail to see the
problem.  It is hard to imagine yourself before you understood
something.

  but they are in
 fact, used fairly similarly. And while the idea of fortran or C
 ordering of arrays may be a foreign concept to folks that have not
 used fortran or C (or most critically, tried to interface the two...)
 it's a common enough concept that it's a reasonable shorthand.

 As for 'should we teach memory order at all to newbies?'

 I usually do teach memory order early on, partly that's because I
 really like to emphasize that numpy arrays are both a really nice
 Python data structure and set of functions, but also a wrapper around
 a block of data -- for the latter, you need to talk about order. Also,
 even with pure-python, knowing a bit about whether arrays are
 contiguous or not is important (and views, and...). You can do a lot
 with numpy without thinking about memory order at all, but to really
 make it dance, you need to know about it.

 In short -- I don't think the situation is too bad, and not bad enough
 to change any names or flags, but if someone wants to add a bit to the
 ravel docstring to clarify it, I'm all for it.

I think you agree that there is potential for confusion, and there
doesn't seem any reason to continue with that confusion if we can come
up with a clearer name.

So here is a compromise proposal.

How about:

* Preferring the names 'c-style' and 'f-style' for the indexing order
case (ravel, reshape, flatiter)
* Leaving 'C' and 'F' as functional shortcuts, so there is no possible
backwards-compatibility problem.

Would you object to that?
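
To show how small the change could be, here is a rough sketch of the
compromise. Nothing below exists in numpy: the alias table and the
ravel_styled helper are hypothetical names, and the real change would
presumably live inside ravel, reshape and friends themselves.

```python
import numpy as np

# Hypothetical alias table: long names preferred, 'C'/'F' kept as shortcuts
_INDEX_ORDER_NAMES = {
    'c-style': 'C',
    'f-style': 'F',
    'C': 'C',
    'F': 'F',
}

def ravel_styled(a, order='c-style'):
    """Hypothetical ravel accepting 'c-style'/'f-style' as well as 'C'/'F'."""
    try:
        np_order = _INDEX_ORDER_NAMES[order]
    except KeyError:
        raise ValueError("unknown index order: %r" % (order,))
    return np.ravel(a, order=np_order)

a = np.arange(6).reshape(2, 3)
print(ravel_styled(a, 'f-style'))  # [0 3 1 4 2 5]
print(ravel_styled(a, 'F'))        # [0 3 1 4 2 5]
```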

Cheers,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-31 Thread Matthew Brett
Hi,

On Sat, Mar 30, 2013 at 10:38 PM,  josef.p...@gmail.com wrote:
 On Sun, Mar 31, 2013 at 12:50 AM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Sat, Mar 30, 2013 at 9:37 PM,  josef.p...@gmail.com wrote:
 On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Sat, Mar 30, 2013 at 7:02 PM,  josef.p...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Sat, Mar 30, 2013 at 7:50 PM,  josef.p...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle
 brad.froe...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett 
 matthew.br...@gmail.com
 wrote:

 On Sat, Mar 30, 2013 at 2:20 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:57 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:14 AM,  josef.p...@gmail.com wrote:
  On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
 
  Ravel and reshape use the terms 'C' and 'F' in the sense of index
  ordering.
 
  This is very confusing.  We think the index ordering and memory
  ordering ideas need to be separated, and specifically, we should
  avoid
  using C and F to refer to index ordering.
 
  Proposal
  -
 
  * Deprecate the use of C and F meaning backwards and 
  forwards
  index ordering for ravel, reshape
  * Prefer Z and N, being graphical representations of 
  unraveling
  in
  2 dimensions, axis1 first and axis0 first respectively 
  (excellent
  naming idea by Paul Ivanov)
 
  What do y'all think?
 
  I always thought F and C are easy to understand, I always 
  thought
  about
  the content and never about the memory when using it.
 
  changing the names doesn't make it easier to understand.
  I think the confusion is because the new A and K refer to existing
  memory
 

 I disagree, I think it's confusing, but I have evidence, and that is
 that four out of four of us tested ourselves and got it wrong.

 Perhaps we are particularly dumb or poorly informed, but I think it's
 rash to assert there is no problem here.

 I think you are overcomplicating things or phrased it as a trick 
 question

 I don't know what you mean by trick question - was there something
 over-complicated in the example?  I deliberately didn't include
 various much more confusing examples in reshape.

 I meant making the candidates think about memory instead of just
 column versus row stacking.

 To be specific, we were teaching about reshaping a (I, J, K, N) 4D
 array, it was an image, with time as the 4th dimension (N time
 points).   Raveling and reshaping 3D and 4D arrays is a common thing
 to do in neuroimaging, as you can imagine.

 A student asked what he would get back from raveling this array, a
 concatenated time series, or something spatial?

 We showed (I'd worked it out by this time) that the first N values
 were the time series given by [0, 0, 0, :].

 He said - 'Oh - I see - so the data is stored as a whole lot of time
 series one by one, I thought it would be stored as a series of
 images'.

 Ironically, this was a Fortran-ordered array in memory, and he was wrong.

 So, I think the idea of memory ordering and index ordering is very
 easy to confuse, and comes up naturally.

 I would like, as a teacher, to be able to say something like:

 This is what C memory layout is (it's the memory layout  that gives
 arr.flags.C_CONTIGUOUS=True)
 This is what F memory layout is (it's the memory layout  that gives
 arr.flags.F_CONTIGUOUS=True)
 It's rather easy to get something that is neither C nor F memory layout
 Numpy does many memory layouts.
 Ravel and reshape and numpy in general do not care (normally) about C
 or F layouts, they only care about index ordering.

 My point, that I'm repeating, is that my job is made harder by
 'arr.ravel('F')'.

 But once you know that ravel and reshape don't care about memory, the
 ravel is easy to predict (maybe not easy to visualize in 4-D):

 But this assumes that you already know that there's such a thing as
 memory layout, and there's such a thing as index ordering, and that
 'C' and 'F' in ravel refer to index ordering.  Once you have that,
 you're golden.  I'm arguing it's markedly harder to get this
 distinction, and keep it in mind, and teach it, if we are using the
 'C' and 'F' names for both things.

 No, I think you are still missing my point.
 I think explaining ravel and reshape F and C is easy (kind of) because the
 students don't need to know at that stage about memory layouts.

 All they need to know is that we look at n-dimensional objects in
 C-order or in  F-order
 (whichever index runs fastest)

Would you accept that it may or may not be true that it is desirable
or practical not to mention memory layouts when teaching numpy?

You believe it is desirable, I believe that it is not - that teaching
numpy naturally involves some discussion of memory layout.
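
For reference, the classroom example quoted above can be reproduced
with something like the sketch below. The sizes are made up (the real
image was much larger); the point is that ravel() walks the last index
fastest even though the array sits in Fortran order in memory.

```python
import numpy as np

I, J, K, N = 2, 3, 4, 5
# Image-by-time array held in Fortran memory layout
arr = np.asfortranarray(np.arange(I * J * K * N).reshape(I, J, K, N))
print(arr.flags.f_contiguous)   # True

flat = arr.ravel()   # default order='C': last index (time) varies fastest
print(np.array_equal(flat[:N], arr[0, 0, 0, :]))   # True

# Memory order is different: 'K' reads elements as they sit in memory,
# so for this array the first axis varies fastest
in_memory = arr.ravel(order='K')
print(np.array_equal(in_memory[:I], arr[:, 0, 0, 0]))   # True
```

So the student's first N values really are a time series, but that
says nothing about how the data is stored.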


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-31 Thread josef . pktd
On Sun, Mar 31, 2013 at 3:54 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Sat, Mar 30, 2013 at 10:38 PM,  josef.p...@gmail.com wrote:
 On Sun, Mar 31, 2013 at 12:50 AM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Sat, Mar 30, 2013 at 9:37 PM,  josef.p...@gmail.com wrote:
 On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Sat, Mar 30, 2013 at 7:02 PM,  josef.p...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Sat, Mar 30, 2013 at 7:50 PM,  josef.p...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle
 brad.froe...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett 
 matthew.br...@gmail.com
 wrote:

 On Sat, Mar 30, 2013 at 2:20 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:57 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:14 AM,  josef.p...@gmail.com wrote:
  On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
 
  Ravel and reshape use the terms 'C' and 'F' in the sense of
  index
  ordering.
 
  This is very confusing.  We think the index ordering and memory
  ordering ideas need to be separated, and specifically, we 
  should
  avoid
  using C and F to refer to index ordering.
 
  Proposal
  -
 
  * Deprecate the use of C and F meaning backwards and 
  forwards
  index ordering for ravel, reshape
  * Prefer Z and N, being graphical representations of 
  unraveling
  in
  2 dimensions, axis1 first and axis0 first respectively 
  (excellent
  naming idea by Paul Ivanov)
 
  What do y'all think?
 
  I always thought F and C are easy to understand, I always 
  thought
  about
  the content and never about the memory when using it.
 
  changing the names doesn't make it easier to understand.
  I think the confusion is because the new A and K refer to existing
  memory
 

 I disagree, I think it's confusing, but I have evidence, and that is
 that four out of four of us tested ourselves and got it wrong.

 Perhaps we are particularly dumb or poorly informed, but I think it's
 rash to assert there is no problem here.

 I think you are overcomplicating things or phrased it as a trick 
 question

 I don't know what you mean by trick question - was there something
 over-complicated in the example?  I deliberately didn't include
 various much more confusing examples in reshape.

 I meant making the candidates think about memory instead of just
 column versus row stacking.

 To be specific, we were teaching about reshaping a (I, J, K, N) 4D
 array, it was an image, with time as the 4th dimension (N time
 points).   Raveling and reshaping 3D and 4D arrays is a common thing
 to do in neuroimaging, as you can imagine.

 A student asked what he would get back from raveling this array, a
 concatenated time series, or something spatial?

 We showed (I'd worked it out by this time) that the first N values
 were the time series given by [0, 0, 0, :].

 He said - 'Oh - I see - so the data is stored as a whole lot of time
 series one by one, I thought it would be stored as a series of
 images'.

 Ironically, this was a Fortran-ordered array in memory, and he was wrong.

 So, I think the idea of memory ordering and index ordering is very
 easy to confuse, and comes up naturally.
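
A small sketch of the classroom example above, with made-up sizes for I, J, K and N (the thread does not give the real image dimensions). It assumes a reasonably recent NumPy and only illustrates that the default ravel follows index order even when the array is Fortran-ordered in memory:

```python
import numpy as np

# Hypothetical sizes; the thread does not give the real image dimensions.
I, J, K, N = 2, 3, 4, 5

# Same values as a C-ordered array, but stored Fortran-ordered in memory.
img = np.asfortranarray(np.arange(I * J * K * N).reshape((I, J, K, N)))
print(img.flags.f_contiguous)   # layout in memory

# Default ravel uses 'C' index order: the last axis (time) varies fastest,
# regardless of how the data is laid out in memory.
flat = img.ravel()
first_series_matches = np.array_equal(flat[:N], img[0, 0, 0, :])

# Reading in memory order instead ('K') starts along the first axis.
mem = img.ravel(order='K')
memory_starts_along_axis0 = np.array_equal(mem[:I], img[:, 0, 0, 0])

print(first_series_matches, memory_starts_along_axis0)
```

So the student's observation about the raveled output was right, but it says nothing about how the data is stored.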

 I would like, as a teacher, to be able to say something like:

 This is what C memory layout is (it's the memory layout  that gives
 arr.flags.C_CONTIGUOUS=True)
 This is what F memory layout is (it's the memory layout  that gives
 arr.flags.F_CONTIGUOUS=True)
 It's rather easy to get something that is neither C nor F memory layout
 Numpy does many memory layouts.
 Ravel and reshape and numpy in general do not care (normally) about C
 or F layouts, they only care about index ordering.
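
To make the "neither C nor F" point concrete, here is a minimal sketch (the array and the slice are arbitrary choices for illustration). A strided view is neither C- nor F-contiguous, yet ravel still returns its values in index order:

```python
import numpy as np

arr = np.arange(20).reshape((4, 5))   # C-contiguous
sub = arr[:, ::2]                     # every other column, still a view

# Neither contiguity flag is set for this view.
print(sub.flags.c_contiguous, sub.flags.f_contiguous)

# ravel follows index order (last axis fastest by default); because the
# memory is not contiguous, it has to return a copy here.
flat = sub.ravel()
print(flat)
```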

 My point, that I'm repeating, is that my job is made harder by
 'arr.ravel('F')'.

 But once you know that ravel and reshape don't care about memory, the
 ravel is easy to predict (maybe not easy to visualize in 4-D):

 But this assumes that you already know that there's such a thing as
 memory layout, and there's such a thing as index ordering, and that
 'C' and 'F' in ravel refer to index ordering.  Once you have that,
 you're golden.  I'm arguing it's markedly harder to get this
 distinction, and keep it in mind, and teach it, if we are using the
 'C' and 'F' names for both things.

 No, I think you are still missing my point.
 I think explaining ravel and reshape F and C is easy (kind of) because the
 students don't need to know at that stage about memory layouts.

 All they need to know is that we look at n-dimensional objects in
 C-order or in  F-order
 (whichever index runs fastest)
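
One way to show "whichever index runs fastest" without mentioning memory at all is to list the indices in the order they are visited. This is only a sketch; `visit_order` is a helper made up for this illustration, built on `np.unravel_index`:

```python
import numpy as np

shape = (2, 3)

def visit_order(shape, order):
    """List the indices of an array of this shape in 'C' or 'F' index order."""
    size = int(np.prod(shape))
    return [tuple(int(i) for i in np.unravel_index(k, shape, order=order))
            for k in range(size)]

c_visits = visit_order(shape, 'C')   # last index runs fastest
f_visits = visit_order(shape, 'F')   # first index runs fastest
print(c_visits)
print(f_visits)
```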

 Would you accept that it may or may not be true that it is desirable
 or practical not to mention memory layouts when teaching numpy?

I think they should be in two different 

Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-31 Thread Ralf Gommers
On Sun, Mar 31, 2013 at 10:43 PM, josef.p...@gmail.com wrote:

 On Sun, Mar 31, 2013 at 3:54 PM, Matthew Brett matthew.br...@gmail.com
 wrote:
  Hi,
 
  On Sat, Mar 30, 2013 at 10:38 PM,  josef.p...@gmail.com wrote:
  On Sun, Mar 31, 2013 at 12:50 AM, Matthew Brett 
 matthew.br...@gmail.com wrote:
  Hi,
 
  On Sat, Mar 30, 2013 at 9:37 PM,  josef.p...@gmail.com wrote:
  On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett 
 matthew.br...@gmail.com wrote:
  Hi,
 
  On Sat, Mar 30, 2013 at 7:02 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett 
 matthew.br...@gmail.com wrote:
  Hi,
 
  On Sat, Mar 30, 2013 at 7:50 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle
  brad.froe...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett 
 matthew.br...@gmail.com
  wrote:
 
  On Sat, Mar 30, 2013 at 2:20 PM,  josef.p...@gmail.com wrote:
   On Sat, Mar 30, 2013 at 4:57 PM,  josef.p...@gmail.com
 wrote:
   On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett
   matthew.br...@gmail.com wrote:
   On Sat, Mar 30, 2013 at 4:14 AM,  josef.p...@gmail.com
 wrote:
   On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett
   matthew.br...@gmail.com wrote:
  
   Ravel and reshape use the terms 'C' and 'F' in the sense
 of index
   ordering.
  
   This is very confusing.  We think the index ordering and
 memory
   ordering ideas need to be separated, and specifically, we
 should
   avoid
   using C and F to refer to index ordering.
  
   Proposal
   -
  
   * Deprecate the use of C and F meaning backwards and
 forwards
   index ordering for ravel, reshape
   * Prefer Z and N, being graphical representations of
 unraveling
   in
   2 dimensions, axis1 first and axis0 first respectively
 (excellent
   naming idea by Paul Ivanov)
  
   What do y'all think?
  
   I always thought F and C are easy to understand, I
 always thought
   about
   the content and never about the memory when using it.
  
   changing the names doesn't make it easier to understand.
   I think the confusion is because the new A and K refer to
 existing
   memory
  
 
  I disagree, I think it's confusing, but I have evidence, and
 that is
  that four out of four of us tested ourselves and got it wrong.
 
  Perhaps we are particularly dumb or poorly informed, but I
 think it's
  rash to assert there is no problem here.
 
  I think you are overcomplicating things or phrased it as a trick
 question
 
  I don't know what you mean by trick question - was there something
  over-complicated in the example?  I deliberately didn't include
  various much more confusing examples in reshape.
 
  I meant making the candidates think about memory instead of just
  column versus row stacking.
 
  To be specific, we were teaching about reshaping a (I, J, K, N) 4D
  array, it was an image, with time as the 4th dimension (N time
  points).   Raveling and reshaping 3D and 4D arrays is a common thing
  to do in neuroimaging, as you can imagine.
 
  A student asked what he would get back from raveling this array, a
  concatenated time series, or something spatial?
 
  We showed (I'd worked it out by this time) that the first N values
  were the time series given by [0, 0, 0, :].
 
  He said - 'Oh - I see - so the data is stored as a whole lot of time
  series one by one, I thought it would be stored as a series of
  images'.
 
  Ironically, this was a Fortran-ordered array in memory, and he was
 wrong.
 
  So, I think the idea of memory ordering and index ordering is very
  easy to confuse, and comes up naturally.
 
  I would like, as a teacher, to be able to say something like:
 
  This is what C memory layout is (it's the memory layout  that gives
  arr.flags.C_CONTIGUOUS=True)
  This is what F memory layout is (it's the memory layout  that gives
  arr.flags.F_CONTIGUOUS=True)
  It's rather easy to get something that is neither C nor F memory
 layout
  Numpy does many memory layouts.
  Ravel and reshape and numpy in general do not care (normally) about C
  or F layouts, they only care about index ordering.
 
  My point, that I'm repeating, is that my job is made harder by
  'arr.ravel('F')'.
 
  But once you know that ravel and reshape don't care about memory, the
  ravel is easy to predict (maybe not easy to visualize in 4-D):
 
  But this assumes that you already know that there's such a thing as
  memory layout, and there's such a thing as index ordering, and that
  'C' and 'F' in ravel refer to index ordering.  Once you have that,
  you're golden.  I'm arguing it's markedly harder to get this
  distinction, and keep it in mind, and teach it, if we are using the
  'C' and 'F' names for both things.
 
  No, I think you are still missing my point.
  I think explaining ravel and reshape F and C is easy (kind of) because
 the
  students don't need to know at that stage about memory layouts.
 
  All they need to know is that we look at n-dimensional objects in
  C-order or in  F-order
  (whichever index runs 

Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-31 Thread Matthew Brett
Hi,

On Sun, Mar 31, 2013 at 1:43 PM,  josef.p...@gmail.com wrote:
 On Sun, Mar 31, 2013 at 3:54 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Sat, Mar 30, 2013 at 10:38 PM,  josef.p...@gmail.com wrote:
 On Sun, Mar 31, 2013 at 12:50 AM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Sat, Mar 30, 2013 at 9:37 PM,  josef.p...@gmail.com wrote:
 On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Sat, Mar 30, 2013 at 7:02 PM,  josef.p...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett 
 matthew.br...@gmail.com wrote:
 Hi,

 On Sat, Mar 30, 2013 at 7:50 PM,  josef.p...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle
 brad.froe...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett 
 matthew.br...@gmail.com
 wrote:

 On Sat, Mar 30, 2013 at 2:20 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:57 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:14 AM,  josef.p...@gmail.com wrote:
  On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
 
  Ravel and reshape use the terms 'C' and 'F' in the sense of
  index
  ordering.
 
  This is very confusing.  We think the index ordering and 
  memory
  ordering ideas need to be separated, and specifically, we 
  should
  avoid
  using C and F to refer to index ordering.
 
  Proposal
  -
 
  * Deprecate the use of C and F meaning backwards and 
  forwards
  index ordering for ravel, reshape
  * Prefer Z and N, being graphical representations of 
  unraveling
  in
  2 dimensions, axis1 first and axis0 first respectively 
  (excellent
  naming idea by Paul Ivanov)
 
  What do y'all think?
 
  I always thought F and C are easy to understand, I always 
  thought
  about
  the content and never about the memory when using it.
 
  changing the names doesn't make it easier to understand.
  I think the confusion is because the new A and K refer to 
  existing
  memory
 

 I disagree, I think it's confusing, but I have evidence, and that is
 that four out of four of us tested ourselves and got it wrong.

 Perhaps we are particularly dumb or poorly informed, but I think 
 it's
 rash to assert there is no problem here.

 I think you are overcomplicating things or phrased it as a trick 
 question

 I don't know what you mean by trick question - was there something
 over-complicated in the example?  I deliberately didn't include
 various much more confusing examples in reshape.

 I meant making the candidates think about memory instead of just
 column versus row stacking.

 To be specific, we were teaching about reshaping a (I, J, K, N) 4D
 array, it was an image, with time as the 4th dimension (N time
 points).   Raveling and reshaping 3D and 4D arrays is a common thing
 to do in neuroimaging, as you can imagine.

 A student asked what he would get back from raveling this array, a
 concatenated time series, or something spatial?

 We showed (I'd worked it out by this time) that the first N values
 were the time series given by [0, 0, 0, :].

 He said - 'Oh - I see - so the data is stored as a whole lot of time
 series one by one, I thought it would be stored as a series of
 images'.

 Ironically, this was a Fortran-ordered array in memory, and he was wrong.

 So, I think the idea of memory ordering and index ordering is very
 easy to confuse, and comes up naturally.

 I would like, as a teacher, to be able to say something like:

 This is what C memory layout is (it's the memory layout  that gives
 arr.flags.C_CONTIGUOUS=True)
 This is what F memory layout is (it's the memory layout  that gives
 arr.flags.F_CONTIGUOUS=True)
 It's rather easy to get something that is neither C nor F memory layout
 Numpy does many memory layouts.
 Ravel and reshape and numpy in general do not care (normally) about C
 or F layouts, they only care about index ordering.

 My point, that I'm repeating, is that my job is made harder by
 'arr.ravel('F')'.

 But once you know that ravel and reshape don't care about memory, the
 ravel is easy to predict (maybe not easy to visualize in 4-D):

 But this assumes that you already know that there's such a thing as
 memory layout, and there's such a thing as index ordering, and that
 'C' and 'F' in ravel refer to index ordering.  Once you have that,
 you're golden.  I'm arguing it's markedly harder to get this
 distinction, and keep it in mind, and teach it, if we are using the
 'C' and 'F' names for both things.

 No, I think you are still missing my point.
 I think explaining ravel and reshape F and C is easy (kind of) because the
 students don't need to know at that stage about memory layouts.

 All they need to know is that we look at n-dimensional objects in
 C-order or in  F-order
 (whichever index runs fastest)

 Would you accept that it may or may not be true that it is desirable
 or practical not to mention 

Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-30 Thread josef . pktd
On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com wrote:

 Hi,

 We were teaching today, and found ourselves getting very confused
 about ravel and reshape in numpy.

 Summary
 --

 There are two separate ideas needed to understand ordering in ravel and 
 reshape:

 Idea 1): ravel / reshape can proceed from the last axis to the first,
 or the first to the last.  This is ravel index ordering
 Idea 2) The physical layout of the array (on disk or in memory) can be
 C or F contiguous or neither.
 This is memory ordering

 The index ordering is usually (but see below) orthogonal to the memory 
 ordering.
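
A short sketch of the two ideas side by side, assuming only standard NumPy calls. The two arrays hold the same values in different memory layouts, and the result of ravel depends on the index order asked for, not on the layout:

```python
import numpy as np

# Two arrays with identical values but different memory layouts.
arr_c = np.arange(10).reshape((2, 5))   # C-contiguous in memory
arr_f = np.asfortranarray(arr_c)        # F-contiguous copy, same values

print(arr_c.flags.c_contiguous, arr_f.flags.f_contiguous)

# Index ordering: 'C' walks the last axis fastest, 'F' the first axis fastest.
# The output depends on the values and the index order, not on the layout.
for a in (arr_c, arr_f):
    print(a.ravel(order='C'))
    print(a.ravel(order='F'))
```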

 The 'ravel' and 'reshape' commands use C and F in the sense of
 index ordering, and this mixes the two ideas and is confusing.

 What the current situation looks like
 

 Specifically, we've been rolling this around 4 experienced numpy users
 and we all predicted at least one of the results below wrongly.

 This was what we knew, or should have known:

 In [2]: import numpy as np

 In [3]: arr = np.arange(10).reshape((2, 5))

 In [5]: arr.ravel()
 Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 So, the 'ravel' operation unravels over the last axis (1) first,
 followed by axis 0.

 So far so good (even if the opposite to MATLAB, Octave).

 Then we found the 'order' flag to ravel:

 In [10]: arr.flags
 Out[10]:
   C_CONTIGUOUS : True
   F_CONTIGUOUS : False
   OWNDATA : False
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False

 In [11]: arr.ravel('C')
 Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 But we soon got confused.  How about this?

 In [12]: arr_F = np.array(arr, order='F')

 In [13]: arr_F.flags
 Out[13]:
   C_CONTIGUOUS : False
   F_CONTIGUOUS : True
   OWNDATA : True
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False

 In [16]: arr_F
 Out[16]:
 array([[0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9]])

 In [17]: arr_F.ravel('C')
 Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 Right - so the flag 'C' to ravel, has got nothing to do with *memory*
 ordering, but is to do with *index* ordering.

 And in fact, we can ask for memory ordering specifically:

 In [22]: arr.ravel('K')
 Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 In [23]: arr_F.ravel('K')
 Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

 In [24]: arr.ravel('A')
 Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 In [25]: arr_F.ravel('A')
 Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])
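
The same contrast can be seen on a transposed view, which shares memory with the original but is F-contiguous. This is a sketch assuming a NumPy version in which ravel accepts 'K' (the thread notes that 1.5.1 does not):

```python
import numpy as np

arr = np.arange(10).reshape((2, 5))   # C-contiguous
arr_t = arr.T                         # a view: same memory, F-contiguous

by_index = arr_t.ravel(order='C')     # index order, last axis fastest
by_memory = arr_t.ravel(order='K')    # order the elements occur in memory
by_any = arr_t.ravel(order='A')       # 'F' index order, as arr_t is F-contiguous

print(by_index)
print(by_memory)
print(by_any)
```

Only 'C' and 'F' are independent of the layout; 'K' and 'A' consult it.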

 There are some confusions to get into with the 'order' flag to reshape
 as well, of the same type.

 Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering.

 This is very confusing.  We think the index ordering and memory
 ordering ideas need to be separated, and specifically, we should avoid
 using C and F to refer to index ordering.

 Proposal
 -

 * Deprecate the use of C and F meaning backwards and forwards
 index ordering for ravel, reshape
 * Prefer Z and N, being graphical representations of unraveling in
 2 dimensions, axis1 first and axis0 first respectively (excellent
 naming idea by Paul Ivanov)
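
For readers trying to picture the proposal, here is a purely hypothetical sketch of what the suggested names would mean. Nothing below exists in NumPy: the mapping and the helper `ravel_by_index_order` are made up for illustration, and they simply translate 'Z' and 'N' to the existing 'C' and 'F':

```python
import numpy as np

# Hypothetical mapping from the proposed names to the existing ones.
# 'Z': axis 1 first in 2D (today's 'C'); 'N': axis 0 first (today's 'F').
_INDEX_ORDER = {'Z': 'C', 'N': 'F'}

def ravel_by_index_order(arr, index_order='Z'):
    """Hypothetical helper, not part of NumPy: ravel using the proposed names."""
    return np.ravel(arr, order=_INDEX_ORDER[index_order])

arr = np.arange(10).reshape((2, 5))
print(ravel_by_index_order(arr, 'Z'))   # traces a 'Z' across the rows
print(ravel_by_index_order(arr, 'N'))   # traces an 'N' down the columns
```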

 What do y'all think?

 Cheers,

 Matthew
 Paul Ivanov
 JB Poline



I always thought F and C are easy to understand, I always thought about
the content and never about the memory when using it.

In my numpy htmlhelp for version 1.5, I don't have a K or A option

>>> np.__version__
'1.5.1'

>>> np.arange(5).ravel("K")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: order not understood

>>> np.arange(5).ravel("A")
array([0, 1, 2, 3, 4])


the C, F in ravel have their twins in reshape

>>> arr = np.arange(10).reshape(2,5, order="C").copy()
>>> arr
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
>>> arr.ravel()
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> arr = np.arange(10).reshape(2,5, order="F").copy()
>>> arr
array([[0, 2, 4, 6, 8],
       [1, 3, 5, 7, 9]])
>>> arr.ravel("F")
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

For example we use it when we get raveled arrays from R,
and F for column order and C for row order indexing are pretty
obvious names when coming from another package (Matlab, R, Gauss)
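
A minimal sketch of that use case, with made-up data standing in for a vector exported column by column from a column-major package such as R or Matlab:

```python
import numpy as np

# Hypothetical flat data, written out column by column from the 2 x 3 matrix
# [[1, 2, 3],
#  [4, 5, 6]]
flat_column_major = np.array([1, 4, 2, 5, 3, 6])

# order='F' fills the first axis fastest, recovering the original matrix.
mat = flat_column_major.reshape((2, 3), order='F')
print(mat)

# Raveling with the same index order round-trips to the flat vector.
back = mat.ravel(order='F')
print(back)
```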

Josef


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-30 Thread josef . pktd
On Sat, Mar 30, 2013 at 7:14 AM,  josef.p...@gmail.com wrote:
 On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:

 Hi,

 We were teaching today, and found ourselves getting very confused
 about ravel and reshape in numpy.

 Summary
 --

 There are two separate ideas needed to understand ordering in ravel and 
 reshape:

 Idea 1): ravel / reshape can proceed from the last axis to the first,
 or the first to the last.  This is ravel index ordering
 Idea 2) The physical layout of the array (on disk or in memory) can be
 C or F contiguous or neither.
 This is memory ordering

 The index ordering is usually (but see below) orthogonal to the memory 
 ordering.

 The 'ravel' and 'reshape' commands use C and F in the sense of
 index ordering, and this mixes the two ideas and is confusing.

 What the current situation looks like
 

 Specifically, we've been rolling this around 4 experienced numpy users
 and we all predicted at least one of the results below wrongly.

 This was what we knew, or should have known:

 In [2]: import numpy as np

 In [3]: arr = np.arange(10).reshape((2, 5))

 In [5]: arr.ravel()
 Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 So, the 'ravel' operation unravels over the last axis (1) first,
 followed by axis 0.

 So far so good (even if the opposite to MATLAB, Octave).

 Then we found the 'order' flag to ravel:

 In [10]: arr.flags
 Out[10]:
   C_CONTIGUOUS : True
   F_CONTIGUOUS : False
   OWNDATA : False
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False

 In [11]: arr.ravel('C')
 Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 But we soon got confused.  How about this?

 In [12]: arr_F = np.array(arr, order='F')

 In [13]: arr_F.flags
 Out[13]:
   C_CONTIGUOUS : False
   F_CONTIGUOUS : True
   OWNDATA : True
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False

 In [16]: arr_F
 Out[16]:
 array([[0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9]])

 In [17]: arr_F.ravel('C')
 Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 Right - so the flag 'C' to ravel, has got nothing to do with *memory*
 ordering, but is to do with *index* ordering.

 And in fact, we can ask for memory ordering specifically:

 In [22]: arr.ravel('K')
 Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 In [23]: arr_F.ravel('K')
 Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

 In [24]: arr.ravel('A')
 Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 In [25]: arr_F.ravel('A')
 Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

 There are some confusions to get into with the 'order' flag to reshape
 as well, of the same type.

 Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering.

 This is very confusing.  We think the index ordering and memory
 ordering ideas need to be separated, and specifically, we should avoid
 using C and F to refer to index ordering.

 Proposal
 -

 * Deprecate the use of C and F meaning backwards and forwards
 index ordering for ravel, reshape
 * Prefer Z and N, being graphical representations of unraveling in
 2 dimensions, axis1 first and axis0 first respectively (excellent
 naming idea by Paul Ivanov)

 What do y'all think?

 Cheers,

 Matthew
 Paul Ivanov
 JB Poline



 I always thought F and C are easy to understand, I always thought about
 the content and never about the memory when using it.

 In my numpy htmlhelp for version 1.5, I don't have a K or A option

 >>> np.__version__
 '1.5.1'

 >>> np.arange(5).ravel("K")
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 TypeError: order not understood

 >>> np.arange(5).ravel("A")
 array([0, 1, 2, 3, 4])


 the C, F in ravel have their twins in reshape

 >>> arr = np.arange(10).reshape(2,5, order="C").copy()
 >>> arr
 array([[0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9]])
 >>> arr.ravel()
 array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
 >>> arr = np.arange(10).reshape(2,5, order="F").copy()
 >>> arr
 array([[0, 2, 4, 6, 8],
        [1, 3, 5, 7, 9]])
 >>> arr.ravel("F")
 array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 For example we use it when we get raveled arrays from R,
 and F for column order and C for row order indexing are pretty
 obvious names when coming from another package (Matlab, R, Gauss)

just a quick search to get an idea

in statsmodels
19 out of 135 ravel are ravel('F')
50 out of 270 reshapes specify: reshape.*order='F' (regular expression)

Josef


 Josef


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-30 Thread Sebastian Berg
On Fri, 2013-03-29 at 19:08 -0700, Matthew Brett wrote:
 Hi,
 
 We were teaching today, and found ourselves getting very confused
 about ravel and reshape in numpy.
 
 Summary
 --
 
 There are two separate ideas needed to understand ordering in ravel and 
 reshape:
 
 Idea 1): ravel / reshape can proceed from the last axis to the first,
 or the first to the last.  This is ravel index ordering
 Idea 2) The physical layout of the array (on disk or in memory) can be
 C or F contiguous or neither.
 This is memory ordering
 
 The index ordering is usually (but see below) orthogonal to the memory 
 ordering.
 
 The 'ravel' and 'reshape' commands use C and F in the sense of
 index ordering, and this mixes the two ideas and is confusing.
 
 What the current situation looks like
 
 
 Specifically, we've been rolling this around 4 experienced numpy users
 and we all predicted at least one of the results below wrongly.
 
 This was what we knew, or should have known:
 
 In [2]: import numpy as np
 
 In [3]: arr = np.arange(10).reshape((2, 5))
 
 In [5]: arr.ravel()
 Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
 
 So, the 'ravel' operation unravels over the last axis (1) first,
 followed by axis 0.
 
 So far so good (even if the opposite to MATLAB, Octave).
 
 Then we found the 'order' flag to ravel:
 
 In [10]: arr.flags
 Out[10]:
   C_CONTIGUOUS : True
   F_CONTIGUOUS : False
   OWNDATA : False
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False
 
 In [11]: arr.ravel('C')
 Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
 
 But we soon got confused.  How about this?
 
 In [12]: arr_F = np.array(arr, order='F')
 
 In [13]: arr_F.flags
 Out[13]:
   C_CONTIGUOUS : False
   F_CONTIGUOUS : True
   OWNDATA : True
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False
 
 In [16]: arr_F
 Out[16]:
 array([[0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9]])
 
 In [17]: arr_F.ravel('C')
 Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
 
 Right - so the flag 'C' to ravel, has got nothing to do with *memory*
 ordering, but is to do with *index* ordering.
 
 And in fact, we can ask for memory ordering specifically:
 
 In [22]: arr.ravel('K')
 Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
 
 In [23]: arr_F.ravel('K')
 Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])
 
 In [24]: arr.ravel('A')
 Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
 
 In [25]: arr_F.ravel('A')
 Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])
 
 There are some confusions to get into with the 'order' flag to reshape
 as well, of the same type.
 
 Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering.
 
 This is very confusing.  We think the index ordering and memory
 ordering ideas need to be separated, and specifically, we should avoid
 using C and F to refer to index ordering.
 
 Proposal
 -
 
 * Deprecate the use of C and F meaning backwards and forwards
 index ordering for ravel, reshape
 * Prefer Z and N, being graphical representations of unraveling in
 2 dimensions, axis1 first and axis0 first respectively (excellent
 naming idea by Paul Ivanov)
 
 What do y'all think?
 

Personally I think it is clear enough and that Z and N would confuse
me just as much (though I am used to the other names). Also Z and N
would seem more like aliases, which would also make sense in the memory
order context.
If anything, I would prefer renaming the arguments iteration_order and
memory_order, but it seems overdoing it...
Maybe the documentation could just be checked if it is always clear
though. I.e. maybe it does not use iteration or memory order
consistently (though I somewhat feel it is usually clear that it must be
iteration order, since no numpy function cares about the input memory
order as they will just do a copy if necessary).
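
The "copy if necessary" behaviour can be checked directly. A sketch, assuming a NumPy recent enough to provide `np.shares_memory`: the raveled values are the same for both layouts, and only whether a copy was needed differs.

```python
import numpy as np

arr_c = np.arange(10).reshape((2, 5))   # C-contiguous
arr_f = np.asfortranarray(arr_c)        # same values, F-contiguous

flat_c = arr_c.ravel(order='C')   # memory already matches: a view suffices
flat_f = arr_f.ravel(order='C')   # memory does not match: a copy is made

print(np.shares_memory(arr_c, flat_c))
print(np.shares_memory(arr_f, flat_f))
print(np.array_equal(flat_c, flat_f))
```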

Regards,

Sebastian 

 Cheers,
 
 Matthew
 Paul Ivanov
 JB Poline
 




Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-30 Thread Matthew Brett
Hi,

On Sat, Mar 30, 2013 at 11:55 AM, Sebastian Berg
sebast...@sipsolutions.net wrote:
 On Fri, 2013-03-29 at 19:08 -0700, Matthew Brett wrote:
 Hi,

 We were teaching today, and found ourselves getting very confused
 about ravel and reshape in numpy.

 Summary
 --

 There are two separate ideas needed to understand ordering in ravel and 
 reshape:

 Idea 1): ravel / reshape can proceed from the last axis to the first,
 or the first to the last.  This is ravel index ordering
 Idea 2) The physical layout of the array (on disk or in memory) can be
 C or F contiguous or neither.
 This is memory ordering

 The index ordering is usually (but see below) orthogonal to the memory 
 ordering.

 The 'ravel' and 'reshape' commands use C and F in the sense of
 index ordering, and this mixes the two ideas and is confusing.

 What the current situation looks like
 

 Specifically, we've been rolling this around 4 experienced numpy users
 and we all predicted at least one of the results below wrongly.

 This was what we knew, or should have known:

 In [2]: import numpy as np

 In [3]: arr = np.arange(10).reshape((2, 5))

 In [5]: arr.ravel()
 Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 So, the 'ravel' operation unravels over the last axis (1) first,
 followed by axis 0.

 So far so good (even if the opposite to MATLAB, Octave).

 Then we found the 'order' flag to ravel:

 In [10]: arr.flags
 Out[10]:
   C_CONTIGUOUS : True
   F_CONTIGUOUS : False
   OWNDATA : False
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False

 In [11]: arr.ravel('C')
 Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 But we soon got confused.  How about this?

 In [12]: arr_F = np.array(arr, order='F')

 In [13]: arr_F.flags
 Out[13]:
   C_CONTIGUOUS : False
   F_CONTIGUOUS : True
   OWNDATA : True
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False

 In [16]: arr_F
 Out[16]:
 array([[0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9]])

 In [17]: arr_F.ravel('C')
 Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 Right - so the flag 'C' to ravel, has got nothing to do with *memory*
 ordering, but is to do with *index* ordering.

 And in fact, we can ask for memory ordering specifically:

 In [22]: arr.ravel('K')
 Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 In [23]: arr_F.ravel('K')
 Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

 In [24]: arr.ravel('A')
 Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 In [25]: arr_F.ravel('A')
 Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

 There are some confusions to get into with the 'order' flag to reshape
 as well, of the same type.

 Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering.

 This is very confusing.  We think the index ordering and memory
 ordering ideas need to be separated, and specifically, we should avoid
 using C and F to refer to index ordering.

 Proposal
 -

 * Deprecate the use of C and F meaning backwards and forwards
 index ordering for ravel, reshape
 * Prefer Z and N, being graphical representations of unraveling in
 2 dimensions, axis1 first and axis0 first respectively (excellent
 naming idea by Paul Ivanov)

 What do y'all think?


 Personally I think it is clear enough and that Z and N would confuse
 me just as much (though I am used to the other names). Also Z and N
 would seem more like aliases, which would also make sense in the memory
 order context.
 If anything, I would prefer renaming the arguments iteration_order and
 memory_order, but it seems overdoing it...

I am not sure what you mean - at the moment  there is one argument
called 'order' that can refer to iteration order or memory order.  Are
you proposing two arguments?

 Maybe the documentation could just be checked if it is always clear
 though. I.e. maybe it does not use iteration or memory order
 consistently (though I somewhat feel it is usually clear that it must be
 iteration order, since no numpy function cares about the input memory
 order as they will just do a copy if necessary).

Do you really mean this?  Numpy is full of 'order=' flags that refer to memory.
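
A few examples of 'order=' meaning memory layout, next to ravel where the same keyword means index order. This is a sketch using only standard NumPy constructors:

```python
import numpy as np

# Here 'order' chooses the memory layout of the new array ...
z = np.zeros((2, 3), order='F')
a = np.array([[1, 2, 3], [4, 5, 6]], order='F')
c = a.copy(order='C')

print(z.flags.f_contiguous, a.flags.f_contiguous, c.flags.c_contiguous)

# ... and the values seen through indexing are the same either way.
print(np.array_equal(a, c))

# In ravel, the same keyword chooses the index order of the result instead.
print(a.ravel(order='C'))
print(a.ravel(order='F'))
```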

Cheers,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-30 Thread Matthew Brett
Hi,

On Sat, Mar 30, 2013 at 4:14 AM,  josef.p...@gmail.com wrote:
 On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:

 Hi,

 We were teaching today, and found ourselves getting very confused
 about ravel and reshape in numpy.

 Summary
 --

 There are two separate ideas needed to understand ordering in ravel and 
 reshape:

 Idea 1): ravel / reshape can proceed from the last axis to the first,
 or the first to the last.  This is ravel index ordering
 Idea 2) The physical layout of the array (on disk or in memory) can be
 C or F contiguous or neither.
 This is memory ordering

 The index ordering is usually (but see below) orthogonal to the memory 
 ordering.

 The 'ravel' and 'reshape' commands use C and F in the sense of
 index ordering, and this mixes the two ideas and is confusing.

 What the current situation looks like
 

 Specifically, we've been rolling this around 4 experienced numpy users
 and we all predicted at least one of the results below wrongly.

 This was what we knew, or should have known:

 In [2]: import numpy as np

 In [3]: arr = np.arange(10).reshape((2, 5))

 In [5]: arr.ravel()
 Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 So, the 'ravel' operation unravels over the last axis (1) first,
 followed by axis 0.

 So far so good (even if the opposite to MATLAB, Octave).

 Then we found the 'order' flag to ravel:

 In [10]: arr.flags
 Out[10]:
   C_CONTIGUOUS : True
   F_CONTIGUOUS : False
   OWNDATA : False
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False

 In [11]: arr.ravel('C')
 Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 But we soon got confused.  How about this?

 In [12]: arr_F = np.array(arr, order='F')

 In [13]: arr_F.flags
 Out[13]:
   C_CONTIGUOUS : False
   F_CONTIGUOUS : True
   OWNDATA : True
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False

 In [16]: arr_F
 Out[16]:
 array([[0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9]])

 In [17]: arr_F.ravel('C')
 Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 Right - so the flag 'C' to ravel, has got nothing to do with *memory*
 ordering, but is to do with *index* ordering.

 And in fact, we can ask for memory ordering specifically:

 In [22]: arr.ravel('K')
 Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 In [23]: arr_F.ravel('K')
 Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

 In [24]: arr.ravel('A')
 Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 In [25]: arr_F.ravel('A')
 Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

 There are some confusions to get into with the 'order' flag to reshape
 as well, of the same type.

 Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering.

 This is very confusing.  We think the index ordering and memory
 ordering ideas need to be separated, and specifically, we should avoid
 using C and F to refer to index ordering.

 Proposal
 --------

 * Deprecate the use of C and F meaning backwards and forwards
 index ordering for ravel, reshape
 * Prefer Z and N, being graphical representations of unraveling in
 2 dimensions, axis1 first and axis0 first respectively (excellent
 naming idea by Paul Ivanov)

 What do y'all think?

 Cheers,

 Matthew
 Paul Ivanov
 JB Poline



 I always thought F and C are easy to understand, I always thought about
 the content and never about the memory when using it.

I can only say that 4 out of 4 experienced numpy developers found
themselves unable to predict the behavior of these functions before
they saw the output.

The problem is always that explaining something makes it clearer for a
moment, but, for those who do not have the explanation or who have
forgotten it, at least among us here, the outputs were generating
groans and / or high fives as we incorrectly or correctly guessed what
was going to happen.

I think the only way to find out whether this really is confusing or
not, is to put someone in front of these functions without any
explanation and ask them to predict what is going to come out of the
various inputs and flags.   Or to try and teach it, which was the
problem we were having.

Cheers,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-30 Thread josef . pktd
On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Sat, Mar 30, 2013 at 4:14 AM,  josef.p...@gmail.com wrote:
 On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:

snip



 I always thought F and C are easy to understand, I always thought about
 the content and never about the memory when using it.

 I can only say that 4 out of 4 experienced numpy developers found
 themselves unable to predict the behavior of these functions before
 they saw the output.

 The problem is always that explaining something makes it clearer for a
 moment, but, for those who do not have the explanation or who have
 forgotten it, at least among us here, the outputs were generating
 groans and / or high fives as we incorrectly or correctly guessed what
 was going to happen.

 I think the only way to find out whether this really is confusing or
 not, is to put someone in front of these functions without any
 explanation and ask them to predict what is going to come out of the
 various inputs and flags.   Or to try and teach it, which was the
 problem we were having.

changing the names doesn't make it easier to understand.
I think the confusion is because the new A and K refer to existing memory


``ravel`` is just stacking columns ('F') or stacking rows ('C'), I
don't remember having seen any weird cases.


I always thought of order in array creation as the way we want to
have the memory layout of the *target* array; it has nothing to do
with existing memory layout (creating view or copy as needed).
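
A minimal sketch of that reading of order= in array creation, reusing the arr / arr_F names from the original post; the other variable names are illustrative:

```python
import numpy as np

arr = np.arange(10).reshape((2, 5))   # C-contiguous
arr_F = np.array(arr, order='F')      # request Fortran layout for the target

# Values seen through indexing are unchanged; only the layout differs.
same_values = np.array_equal(arr, arr_F)
target_is_fortran = arr_F.flags['F_CONTIGUOUS'] and not arr_F.flags['C_CONTIGUOUS']

# The layout change needed a copy here ...
copied = not np.shares_memory(arr, arr_F)

# ... whereas asking for the layout the array already has needs none.
no_copy_needed = np.shares_memory(arr, np.asarray(arr, order='C'))

print(same_values, target_is_fortran, copied, no_copy_needed)  # True True True True
```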

reshape, and ravel are *views* if possible, memory might just be some
weird strides
(and can be ignored unless you want to do 

Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-30 Thread josef . pktd
On Sat, Mar 30, 2013 at 4:57 PM,  josef.p...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Sat, Mar 30, 2013 at 4:14 AM,  josef.p...@gmail.com wrote:
 On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:

snip



 I always thought F and C are easy to understand, I always thought about
 the content and never about the memory when using it.

 I can only say that 4 out of 4 experienced numpy developers found
 themselves unable to predict the behavior of these functions before
 they saw the output.

 The problem is always that explaining something makes it clearer for a
 moment, but, for those who do not have the explanation or who have
 forgotten it, at least among us here, the outputs were generating
 groans and / or high fives as we incorrectly or correctly guessed what
 was going to happen.

 I think the only way to find out whether this really is confusing or
 not, is to put someone in front of these functions without any
 explanation and ask them to predict what is going to come out of the
 various inputs and flags.   Or to try and teach it, which was the
 problem we were having.

 changing the names doesn't make it easier to understand.
 I think the confusion is because the new A and K refer to existing memory


 ``ravel`` is just stacking columns ('F') or stacking rows ('C'), I
 don't remember having seen any weird cases.

example from our statistics use:
rows are observations/time periods, columns are variables/individuals

using F or C, we can stack either by time-periods (observations)
or individuals (cross-section units)
that's easy to understand.
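
A small sketch of this use case, with a made-up panel (3 time periods by 2 individuals); the values are chosen only so that the stacking order is easy to read:

```python
import numpy as np

# Hypothetical panel: rows are time periods, columns are individuals.
# Each entry is 10 * period + individual.
panel = np.array([[0, 1],
                  [10, 11],
                  [20, 21]])

by_period = panel.ravel(order='C')      # stack rows: period after period
by_individual = panel.ravel(order='F')  # stack columns: individual after individual

print(by_period.tolist())      # [0, 1, 10, 11, 20, 21]
print(by_individual.tolist())  # [0, 10, 20, 1, 11, 21]
```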


A and K  are pretty useless for 

Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-30 Thread Matthew Brett
Hi,

On Sat, Mar 30, 2013 at 1:57 PM,  josef.p...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Sat, Mar 30, 2013 at 4:14 AM,  josef.p...@gmail.com wrote:
 On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:

snip



 I always thought F and C are easy to understand, I always thought about
 the content and never about the memory when using it.

 I can only say that 4 out of 4 experienced numpy developers found
 themselves unable to predict the behavior of these functions before
 they saw the output.

 The problem is always that explaining something makes it clearer for a
 moment, but, for those who do not have the explanation or who have
 forgotten it, at least among us here, the outputs were generating
 groans and / or high fives as we incorrectly or correctly guessed what
 was going to happen.

 I think the only way to find out whether this really is confusing or
 not, is to put someone in front of these functions without any
 explanation and ask them to predict what is going to come out of the
 various inputs and flags.   Or to try and teach it, which was the
 problem we were having.

 changing the names doesn't make it easier to understand.
 I think the confusion is because the new A and K refer to existing memory


 ``ravel`` is just stacking columns ('F') or stacking rows ('C'), I
 don't remember having seen any weird cases.
 

 I always thought of order in array creation is the way we want to
 have the memory layout of the *target* array and has nothing to do
 with existing memory layout (creating view or copy as needed).

In the case of ravel of course F and C in memory 

Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-30 Thread Matthew Brett
Hi,

On Sat, Mar 30, 2013 at 2:20 PM,  josef.p...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 4:57 PM,  josef.p...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Sat, Mar 30, 2013 at 4:14 AM,  josef.p...@gmail.com wrote:
 On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:

snip



 I always thought F and C are easy to understand, I always thought about
 the content and never about the memory when using it.

 I can only say that 4 out of 4 experienced numpy developers found
 themselves unable to predict the behavior of these functions before
 they saw the output.

 The problem is always that explaining something makes it clearer for a
 moment, but, for those who do not have the explanation or who have
 forgotten it, at least among us here, the outputs were generating
 groans and / or high fives as we incorrectly or correctly guessed what
 was going to happen.

 I think the only way to find out whether this really is confusing or
 not, is to put someone in front of these functions without any
 explanation and ask them to predict what is going to come out of the
 various inputs and flags.   Or to try and teach it, which was the
 problem we were having.

 changing the names doesn't make it easier to understand.
 I think the confusion is because the new A and K refer to existing memory


 ``ravel`` is just stacking columns ('F') or stacking rows ('C'), I
 don't remember having seen any weird cases.

 example from our statistics use:
 rows are observations/time periods, columns are variables/individuals

 using F or C, we can stack either by time-periods (observations)
 or individuals 

Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-30 Thread Sebastian Berg
On Sat, 2013-03-30 at 12:45 -0700, Matthew Brett wrote:
 Hi,
 
 On Sat, Mar 30, 2013 at 11:55 AM, Sebastian Berg
 sebast...@sipsolutions.net wrote:
  On Fri, 2013-03-29 at 19:08 -0700, Matthew Brett wrote:
  Hi,
 
  We were teaching today, and found ourselves getting very confused
  about ravel and shape in numpy.
 

snip

 
  What do y'all think?
 
 
  Personally I think it is clear enough and that Z and N would confuse
  me just as much (though I am used to the other names). Also Z and N
  would seem more like aliases, which would also make sense in the memory
  order context.
  If anything, I would prefer renaming the arguments iteration_order and
  memory_order, but it seems overdoing it...
 
 I am not sure what you mean - at the moment  there is one argument
 called 'order' that can refer to iteration order or memory order.  Are
 you proposing two arguments?
 

Yes that is what I meant. The reason that it is not convincing to me is
that if I write `np.reshape(arr, ..., order='Z')`, I may be tempted to
also write `np.copy(arr, order='Z')`. I don't see anything against
allowing 'Z' as a more memorable 'C' (I also used to forget which was
which), but I don't really see enforcing a different _value_ on the same
named argument making it clearer.
Renaming the argument itself would seem more sensible to me right now,
but I cannot think of a decent name, so I would prefer trying to clarify
the documentation if necessary.
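
To make the contrast concrete, here is a short sketch of how the same order= values act in np.reshape and in np.copy. It uses only 'C' and 'F', since 'Z' is a proposed name and not an accepted value; the variable names are illustrative:

```python
import numpy as np

arr = np.arange(6).reshape((2, 3))   # [[0, 1, 2], [3, 4, 5]]

# In reshape, order= decides which element ends up at which index.
r_c = np.reshape(arr, (3, 2), order='C')
r_f = np.reshape(arr, (3, 2), order='F')
print(r_c.tolist())  # [[0, 1], [2, 3], [4, 5]]
print(r_f.tolist())  # [[0, 4], [3, 2], [1, 5]]

# In copy, order= only decides the memory layout; indexing gives the same values.
c_c = np.copy(arr, order='C')
c_f = np.copy(arr, order='F')
print(np.array_equal(c_c, c_f))                                  # True
print(c_c.flags['C_CONTIGUOUS'], c_f.flags['F_CONTIGUOUS'])      # True True
```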

  Maybe the documentation could just be checked if it is always clear
  though. I.e. maybe it does not use iteration or memory order
  consistently (though I somewhat feel it is usually clear that it must be
  iteration order, since no numpy function cares about the input memory
  order as they will just do a copy if necessary).
 
 Do you really mean this?  Numpy is full of 'order=' flags that refer to 
 memory.
 

I somewhat imagined there were more iteration order flags, and I
count empty/ones/.../copy as basically one array creation
monster...

 Cheers,
 
 Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-30 Thread Bradley M. Froehle
On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com wrote:

 On Sat, Mar 30, 2013 at 2:20 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:57 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett matthew.br...@gmail.com
 wrote:
  On Sat, Mar 30, 2013 at 4:14 AM,  josef.p...@gmail.com wrote:
  On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett 
 matthew.br...@gmail.com wrote:
 
  Ravel and reshape use the terms 'C' and 'F' in the sense of index
  ordering.
 
  This is very confusing.  We think the index ordering and memory
  ordering ideas need to be separated, and specifically, we should
  avoid using C and F to refer to index ordering.
 
  Proposal
  --------
 
  * Deprecate the use of C and F meaning backwards and forwards
  index ordering for ravel, reshape
  * Prefer Z and N, being graphical representations of unraveling
  in 2 dimensions, axis1 first and axis0 first respectively (excellent
  naming idea by Paul Ivanov)
 
  What do y'all think?
 
  I always thought F and C are easy to understand, I always thought
  about the content and never about the memory when using it.
 
  changing the names doesn't make it easier to understand.
  I think the confusion is because the new A and K refer to existing
  memory
 

 I disagree, I think it's confusing, but I have evidence, and that is
 that four out of four of us tested ourselves and got it wrong.

 Perhaps we are particularly dumb or poorly informed, but I think it's
 rash to assert there is no problem here.


I got all four correct.  I think the concept --- at least for ravel --- is
pretty simple: would you like to read the data off in C ordering or Fortran
ordering?  Since the output array is one-dimensional, its ordering is
irrelevant.
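
A quick check of that last point, as a sketch: a contiguous one-dimensional array satisfies both contiguity flags, so there is no C-versus-F layout to choose for the result of ravel. The names flat_c and flat_f are illustrative:

```python
import numpy as np

arr = np.arange(10).reshape((2, 5))

flat_c = arr.ravel(order='C')   # read the elements off in C index order
flat_f = arr.ravel(order='F')   # read the elements off in Fortran index order

# The contents differ, because the read-off order differs ...
print(flat_c.tolist())  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(flat_f.tolist())  # [0, 5, 1, 6, 2, 7, 3, 8, 4, 9]

# ... but each 1-D result is both C- and F-contiguous.
both_flags = [(f.flags['C_CONTIGUOUS'], f.flags['F_CONTIGUOUS'])
              for f in (flat_c, flat_f)]
print(both_flags)  # [(True, True), (True, True)]
```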

I don't understand the 'Z' / 'N' suggestion at all.  Are they part of some
mnemonic?

I'd STRONGLY advise against deprecating the 'F' and 'C' options.  NumPy
already suffers from too much bikeshedding with names --- I rarely am able
to pull out a script I wrote using NumPy even a few years ago and have
it immediately work.

Cheers,
Brad


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-30 Thread Matthew Brett
Hi,

On Sat, Mar 30, 2013 at 4:31 PM, Bradley M. Froehle
brad.froe...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 On Sat, Mar 30, 2013 at 2:20 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:57 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:14 AM,  josef.p...@gmail.com wrote:
  On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
 
snip

 I disagree, I think it's confusing, but I have evidence, and that is
 that four out of four of us tested ourselves and got it wrong.

 Perhaps we are particularly dumb or poorly informed, but I think it's
 rash to assert there is no problem here.


 I got all four correct.

Then you are smarter and / or better informed than we were.  I hope you
didn't read my explanation before you tested yourself.

Of course if you did read my email first I'd expect you and I to get
the answer right first time.

If you didn't read my email first, and didn't think too hard about it,
and still got all the examples right, and you'd also get other, more
confusing examples that use reshape right, then I'd add you as a data
point on the other side to the four data points we got yesterday.
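
One such reshape case, as a sketch rather than the examples used on the day (arr and arr_F are as in the original post; the other names are illustrative). With order='C' or 'F' the result ignores the memory layout of the input, but with order='A' it depends on it:

```python
import numpy as np

arr = np.arange(10).reshape((2, 5))   # C-contiguous
arr_F = np.array(arr, order='F')      # same values, Fortran layout

# 'C' and 'F' are index orders: the input's memory layout makes no difference.
same_for_c = np.array_equal(arr.reshape((5, 2), order='C'),
                            arr_F.reshape((5, 2), order='C'))
same_for_f = np.array_equal(arr.reshape((5, 2), order='F'),
                            arr_F.reshape((5, 2), order='F'))
print(same_for_c, same_for_f)  # True True

# 'A' picks the index order from the memory layout, so the results differ.
a_from_c = arr.reshape((5, 2), order='A')
a_from_f = arr_F.reshape((5, 2), order='A')
print(a_from_c.tolist())  # [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
print(a_from_f.tolist())  # [[0, 7], [5, 3], [1, 8], [6, 4], [2, 9]]
```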

 I think the concept --- at least for ravel --- is
 pretty simple: would you like to read the data off in C ordering or Fortran
 ordering.  Since the output array is one-dimensional, its ordering is
 irrelevant.

Right - hence my confidence that Josef's way of thinking of 'C' and 'F'
as describing the target array output is not a good way to think of it
in this case.  It is in the case of arr.tostring() though.
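
A sketch of the tostring() case. It uses tobytes(), the current name for the same method (tostring() is deprecated in later NumPy releases), and assumes int64 elements; the variable names are illustrative:

```python
import numpy as np

arr = np.arange(10, dtype=np.int64).reshape((2, 5))
arr_F = np.array(arr, order='F')

# Here order= describes the layout of the *output* bytes.
raw_c = arr.tobytes(order='C')
raw_f = arr.tobytes(order='F')

# Decode the bytes again to see the element order that was written.
print(np.frombuffer(raw_c, dtype=np.int64).tolist())  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(np.frombuffer(raw_f, dtype=np.int64).tolist())  # [0, 5, 1, 6, 2, 7, 3, 8, 4, 9]

# The memory layout of the input does not change the requested output.
print(arr_F.tobytes(order='C') == raw_c)  # True
```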

 I don't understand the 'Z' / 'N' suggestion at all.  Are they part of some
 mnemonic?

If you think of the way you'd read off the elements using reverse
(last-first) index order for a 2D array, you might imagine something
like a Z.
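
The path can be traced with np.unravel_index, which also takes order='C' or 'F'. A sketch on a 2 x 2 array, with illustrative names z_path and n_path:

```python
import numpy as np

shape = (2, 2)
flat_positions = np.arange(4)

# (row, column) visited at each step when reading in each index order.
rows_c, cols_c = np.unravel_index(flat_positions, shape, order='C')
rows_f, cols_f = np.unravel_index(flat_positions, shape, order='F')

z_path = list(zip(rows_c.tolist(), cols_c.tolist()))
n_path = list(zip(rows_f.tolist(), cols_f.tolist()))

print(z_path)  # [(0, 0), (0, 1), (1, 0), (1, 1)]  across, then down: a Z
print(n_path)  # [(0, 0), (1, 0), (0, 1), (1, 1)]  down, then across: an N
```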

 I'd STRONGLY advise against deprecating the 'F' and 'C' options.  NumPy
 already suffers from too much bikeshedding with names --- I rarely am able
 to pull out a script I wrote using NumPy even a few years ago and have it
 immediately work.

I wish we could drop bike-shedding - it's a completely useless word
because one person's bike-shedding is another person's necessary
clarification.  You think this clarification isn't necessary and you
think this discussion is bike-shedding.

I'm not suggesting dropping the 'F' and 'C', obviously - can I call
that a 'straw man'?

I am suggesting changing the name to something much clearer, leaving
that name clearly explained in the docs, and leaving 'C' and 'F' as
functional synonyms for a very long time.

Cheers,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-30 Thread josef . pktd
On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle
brad.froe...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 On Sat, Mar 30, 2013 at 2:20 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:57 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:14 AM,  josef.p...@gmail.com wrote:
  On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
 
  Ravel and reshape use the terms 'C' and 'F' in the sense of index
  ordering.
 
  This is very confusing.  We think the index ordering and memory
  ordering ideas need to be separated, and specifically, we should
  avoid
  using C and F to refer to index ordering.
 
  Proposal
  --------
 
  * Deprecate the use of C and F meaning backwards and forwards
  index ordering for ravel, reshape
  * Prefer Z and N, being graphical representations of unraveling
  in
  2 dimensions, axis1 first and axis0 first respectively (excellent
  naming idea by Paul Ivanov)
 
  What do y'all think?
 
  I always thought F and C are easy to understand, I always thought
  about
  the content and never about the memory when using it.
 
  changing the names doesn't make it easier to understand.
  I think the confusion is because the new A and K refer to existing
  memory
 

 I disagree, I think it's confusing, but I have evidence, and that is
 that four out of four of us tested ourselves and got it wrong.

 Perhaps we are particularly dumb or poorly informed, but I think it's
 rash to assert there is no problem here.

I think you are overcomplicating things or phrased it as a trick question

ravel F and C have *nothing* to do with memory layout.
I think it's not confusing for beginners that have no idea and never think
about memory layout.
I've never seen any problems with it in statsmodels and I have seen
many developers (GSOC) that are pretty new to python and numpy.
(I didn't check the repo history to verify, so IIRC)

Even if N, Z were clearer in this case (which I don't think it is and which
I have no idea what it should stand for), you would have to go through every
use of ``order`` in numpy to check whether it should be N or F or Z or C,
and then users would have to check which order name convention is
used in a specific function.

Josef



 I got all four correct.  I think the concept --- at least for ravel --- is
 pretty simple: would you like to read the data off in C ordering or Fortran
 ordering.  Since the output array is one-dimensional, its ordering is
 irrelevant.

 I don't understand the 'Z' / 'N' suggestion at all.  Are they part of some
 mnemonic?

 I'd STRONGLY advise against deprecating the 'F' and 'C' options.  NumPy
 already suffers from too much bikeshedding with names --- I rarely am able
 to pull out a script I wrote using NumPy even a few years ago and have it
 immediately work.

 Cheers,
 Brad





Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-30 Thread Matthew Brett
Hi,

On Sat, Mar 30, 2013 at 7:50 PM,  josef.p...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle
 brad.froe...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 On Sat, Mar 30, 2013 at 2:20 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:57 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:14 AM,  josef.p...@gmail.com wrote:
  On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
 
  Ravel and reshape use the terms 'C' and 'F' in the sense of index
  ordering.
 
  This is very confusing.  We think the index ordering and memory
  ordering ideas need to be separated, and specifically, we should
  avoid
  using C and F to refer to index ordering.
 
  Proposal
  --------
 
  * Deprecate the use of C and F meaning backwards and forwards
  index ordering for ravel, reshape
  * Prefer Z and N, being graphical representations of unraveling
  in
  2 dimensions, axis1 first and axis0 first respectively (excellent
  naming idea by Paul Ivanov)
 
  What do y'all think?
 
  I always thought F and C are easy to understand, I always thought
  about
  the content and never about the memory when using it.
 
  changing the names doesn't make it easier to understand.
  I think the confusion is because the new A and K refer to existing
  memory
 

 I disagree, I think it's confusing, but I have evidence, and that is
 that four out of four of us tested ourselves and got it wrong.

 Perhaps we are particularly dumb or poorly informed, but I think it's
 rash to assert there is no problem here.

 I think you are overcomplicating things or phrased it as a trick question

I don't know what you mean by trick question - was there something
over-complicated in the example?  I deliberately didn't include
various much more confusing examples in reshape.

 ravel F and C have *nothing* to do with memory layout.

We do agree on this of course - but you said in an earlier mail that
you thought of 'C' and 'F' as referring to target memory layout (which
they don't in this case) so I think we also agree that C and F do
often refer to memory layout elsewhere in numpy.

 I think it's not confusing for beginners that have no idea and never think
 about memory layout.
 I've never seen any problems with it in statsmodels and I have seen
 many developers (GSOC) that are pretty new to python and numpy.
 (I didn't check the repo history to verify, so IIRC)

Usually you don't need to know what reshape or ravel did because you
are likely to reshape again and that will use the same algorithm.

For example, I didn't know that ravel worked in reverse index
order, started explaining it wrong, and had to check. I use ravel and
reshape a lot, and have not run into this problem because either a) I
didn't test my code properly or b) I did reshape after ravel / reshape
and it reversed what I did first time.  So, I don't think "we
haven't noticed any problems" is a good argument in the face of
"several experienced developers got it wrong when trying to guess what
it did".

 Even if N, Z were clearer in this case (which I don't think it is and which
 I have no idea what it should stand for), you would have to go through every
 use of ``order`` in numpy to check whether it should be N or F or Z or C,
 and then users would have to check which order name convention is
 used in a specific function.

Right - and this would be silly if and only if it made sense to
conflate memory layout and index ordering.

Cheers,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-30 Thread josef . pktd
On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Sat, Mar 30, 2013 at 7:50 PM,  josef.p...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle
 brad.froe...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 On Sat, Mar 30, 2013 at 2:20 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:57 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:14 AM,  josef.p...@gmail.com wrote:
  On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
 
  Ravel and reshape use the terms 'C' and 'F' in the sense of index
  ordering.
 
  This is very confusing.  We think the index ordering and memory
  ordering ideas need to be separated, and specifically, we should
  avoid
  using C and F to refer to index ordering.
 
  Proposal
  --------
 
  * Deprecate the use of C and F meaning backwards and forwards
  index ordering for ravel, reshape
  * Prefer Z and N, being graphical representations of unraveling
  in
  2 dimensions, axis1 first and axis0 first respectively (excellent
  naming idea by Paul Ivanov)
 
  What do y'all think?
 
  I always thought F and C are easy to understand, I always thought
  about
  the content and never about the memory when using it.
 
  changing the names doesn't make it easier to understand.
  I think the confusion is because the new A and K refer to existing
  memory
 

 I disagree, I think it's confusing, but I have evidence, and that is
 that four out of four of us tested ourselves and got it wrong.

 Perhaps we are particularly dumb or poorly informed, but I think it's
 rash to assert there is no problem here.

 I think you are overcomplicating things or phrased it as a trick question

 I don't know what you mean by trick question - was there something
 over-complicated in the example?  I deliberately didn't include
 various much more confusing examples in reshape.

I meant making the candidates think about memory instead of just
column versus row stacking.
I don't think I ever get confused about reshape F in 2d.
But when I work with 3d or larger ndim nd-arrays, I always have to
try an example to check my intuition (in general not just reshape).


 ravel F and C have *nothing* to do with memory layout.

 We do agree on this of course - but you said in an earlier mail that
 you thought of 'C' and 'F' as referring to target memory layout (which
 they don't in this case) so I think we also agree that C and F do
 often refer to memory layout elsewhere in numpy.

I guess that wasn't so helpful.
(emphasis on *target*, There are very few places where an order
keyword refers to *existing* memory layout.
So I'm not tempted to think about existing memory layout when I see
``order``.

Also my examples might have confused the issue:
ravel and reshape, with C and F are easy to understand without
ever looking at memory issues.

memory only comes into play when we want to know whether we
get a view or copy. The examples were only for the cases when I
do care about this.
)
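
The view-versus-copy point can be checked directly. The following is a minimal sketch, assuming a NumPy recent enough to provide np.shares_memory (it postdates this thread); the array names are made up for illustration.

```python
import numpy as np

c_arr = np.arange(6).reshape((2, 3))   # C-contiguous in memory
f_arr = np.asfortranarray(c_arr)       # same values, F-contiguous in memory

# The values returned by ravel('C') are the same for both arrays ...
same_values = np.array_equal(c_arr.ravel('C'), f_arr.ravel('C'))

# ... memory layout only decides whether the result can be a view.
c_from_c_is_view = np.shares_memory(c_arr.ravel('C'), c_arr)
c_from_f_is_view = np.shares_memory(f_arr.ravel('C'), f_arr)
f_from_f_is_view = np.shares_memory(f_arr.ravel('F'), f_arr)

print(same_values, c_from_c_is_view, c_from_f_is_view, f_from_f_is_view)
```

So the order argument fixes which values come out in which position, and the existing layout only affects whether NumPy has to copy to deliver them.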


 I think it's not confusing for beginners that have no idea and never think
 about memory layout.
 I've never seen any problems with it in statsmodels and I have seen
 many developers (GSOC) that are pretty new to python and numpy.
 (I didn't check the repo history to verify, so IIRC)

 Usually you don't need to know what reshape or ravel did because you
 are likely to reshape again and that will use the same algorithm.

 For example, I didn't know that ravel worked in reverse index
 order, started explaining it wrong, and had to check. I use ravel and
 reshape a lot, and have not run into this problem because either a) I
 didn't test my code properly or b) I did reshape after ravel / reshape
 and it reversed what I did first time.  So, I don't think "we
 haven't noticed any problems" is a good argument in the face of
 "several experienced developers got it wrong when trying to guess what
 it did".

What's reverse index order?

In the case of statsmodels, we do care about the stacking order. When
we use reshape(..., order='F') or ravel('F'), it's only because we
want to have a specific array (not memory) layout (and/or because the
raveled array came from R)

(aside:  2 cases
- for 2d parameter vectors, we ravel and reshape often, and we changed
our convention to Fortran order, (parameter in rows, equations in columns, IIRC)
The interpretation of the results depends on which way we ravel or reshape.

- for panel data (time versus individuals), we need to build matching
Kronecker product arrays which are block-diagonal if the stacking/``order``
is the right way.

None of the cases cares about memory layout, it's just:
Do we stack by columns or by rows, i.e. fortran- or c-order?
Do we want this in rows or in columns?
)
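
A small sketch of the "stack by columns or by rows" distinction, using a made-up 2 x 3 parameter array (the name params and its values are illustrative only, not taken from statsmodels):

```python
import numpy as np

params = np.array([[1, 2, 3],
                   [4, 5, 6]])

by_rows = params.ravel('C')   # stack row after row
by_cols = params.ravel('F')   # stack column after column

# Reshaping with the same order recovers the original array ...
round_trip = by_cols.reshape(params.shape, order='F')
# ... reshaping with the other order silently scrambles it.
mixed = by_cols.reshape(params.shape, order='C')

print(by_rows)   # [1 2 3 4 5 6]
print(by_cols)   # [1 4 2 5 3 6]
print(mixed)     # [[1 4 2]
                 #  [5 3 6]]
```

This is why the interpretation of results depends on using the same order for the ravel and the later reshape.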



 Even if N, Z were clearer in this case (which I don't think it is and which

Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-30 Thread Matthew Brett
Hi,

On Sat, Mar 30, 2013 at 7:02 PM,  josef.p...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Sat, Mar 30, 2013 at 7:50 PM,  josef.p...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle
 brad.froe...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 On Sat, Mar 30, 2013 at 2:20 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:57 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:14 AM,  josef.p...@gmail.com wrote:
  On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
 
  Ravel and reshape use the terms 'C' and 'F' in the sense of index
  ordering.
 
  This is very confusing.  We think the index ordering and memory
  ordering ideas need to be separated, and specifically, we should
  avoid
  using C and F to refer to index ordering.
 
  Proposal
  --------
 
  * Deprecate the use of C and F meaning backwards and forwards
  index ordering for ravel, reshape
  * Prefer Z and N, being graphical representations of unraveling
  in
  2 dimensions, axis1 first and axis0 first respectively (excellent
  naming idea by Paul Ivanov)
 
  What do y'all think?
 
  I always thought F and C are easy to understand, I always thought
  about
  the content and never about the memory when using it.
 
  changing the names doesn't make it easier to understand.
  I think the confusion is because the new A and K refer to existing
  memory
 

 I disagree, I think it's confusing, but I have evidence, and that is
 that four out of four of us tested ourselves and got it wrong.

 Perhaps we are particularly dumb or poorly informed, but I think it's
 rash to assert there is no problem here.

 I think you are overcomplicating things or phrased it as a trick question

 I don't know what you mean by trick question - was there something
 over-complicated in the example?  I deliberately didn't include
 various much more confusing examples in reshape.

 I meant making the candidates think about memory instead of just
 column versus row stacking.
 I don't think I ever get confused about reshape F in 2d.
 But when I work with 3d or larger ndim nd-arrays, I always have to
 try an example to check my intuition (in general not just reshape).


 ravel F and C have *nothing* to do with memory layout.

 We do agree on this of course - but you said in an earlier mail that
 you thought of 'C' and 'F' as referring to target memory layout (which
 they don't in this case) so I think we also agree that C and F do
 often refer to memory layout elsewhere in numpy.

 I guess that wasn't so helpful.
 (emphasis on *target*, There are very few places where an order
 keyword refers to *existing* memory layout.

It is helpful because it shows how easy it is to get confused between
memory order and index order.

 What's reverse index order?

I am not being clear, sorry about that:

import numpy as np

def ravel_iter_last_fastest(arr):
    res = []
    for i in range(arr.shape[0]):
        for j in range(arr.shape[1]):
            for k in range(arr.shape[2]):
                # Iterating over last dimension fastest
                res.append(arr[i, j, k])
    return np.array(res)


def ravel_iter_first_fastest(arr):
    res = []
    for k in range(arr.shape[2]):
        for j in range(arr.shape[1]):
            for i in range(arr.shape[0]):
                # Iterating over first dimension fastest
                res.append(arr[i, j, k])
    return np.array(res)


a = np.arange(24).reshape((2, 3, 4))

print(np.all(a.ravel('C') == ravel_iter_last_fastest(a)))
print(np.all(a.ravel('F') == ravel_iter_first_fastest(a)))

By 'reverse index ordering' I mean 'ravel_iter_last_fastest' above.  I
guess one could argue that this was not 'reverse' but 'forward' index
ordering, but I am not arguing about which is better, or those names,
only that it's the order of indices that differs, not the memory
layout, and that these ideas need to be kept separate.
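
As a further check that it is the order of indices and not the memory layout that matters, here is a short sketch (the names a_c and a_f are made up for illustration) comparing a C-contiguous array with an F-contiguous copy holding the same values:

```python
import numpy as np

a_c = np.arange(24).reshape((2, 3, 4))   # C-contiguous in memory
a_f = np.asfortranarray(a_c)             # same values, F-contiguous in memory

# The same index order gives the same result, whatever the memory layout.
same_for_c = np.array_equal(a_c.ravel('C'), a_f.ravel('C'))
same_for_f = np.array_equal(a_c.ravel('F'), a_f.ravel('F'))

print(a_c.flags.c_contiguous, a_f.flags.f_contiguous)  # True True
print(same_for_c, same_for_f)                          # True True
```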

Cheers,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-30 Thread Matthew Brett
Hi,

On Sat, Mar 30, 2013 at 7:02 PM,  josef.p...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Sat, Mar 30, 2013 at 7:50 PM,  josef.p...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle
 brad.froe...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 On Sat, Mar 30, 2013 at 2:20 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:57 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:14 AM,  josef.p...@gmail.com wrote:
  On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
 
  Ravel and reshape use the terms 'C' and 'F' in the sense of index
  ordering.
 
  This is very confusing.  We think the index ordering and memory
  ordering ideas need to be separated, and specifically, we should
  avoid
  using C and F to refer to index ordering.
 
  Proposal
  --------
 
  * Deprecate the use of C and F meaning backwards and forwards
  index ordering for ravel, reshape
  * Prefer Z and N, being graphical representations of unraveling
  in
  2 dimensions, axis1 first and axis0 first respectively (excellent
  naming idea by Paul Ivanov)
 
  What do y'all think?
 
  I always thought F and C are easy to understand, I always thought
  about
  the content and never about the memory when using it.
 
  changing the names doesn't make it easier to understand.
  I think the confusion is because the new A and K refer to existing
  memory
 

 I disagree, I think it's confusing, but I have evidence, and that is
 that four out of four of us tested ourselves and got it wrong.

 Perhaps we are particularly dumb or poorly informed, but I think it's
 rash to assert there is no problem here.

 I think you are overcomplicating things or phrased it as a trick question

 I don't know what you mean by trick question - was there something
 over-complicated in the example?  I deliberately didn't include
 various much more confusing examples in reshape.

 I meant making the candidates think about memory instead of just
 column versus row stacking.

To be specific, we were teaching about reshaping a (I, J, K, N) 4D
array, it was an image, with time as the 4th dimension (N time
points).   Raveling and reshaping 3D and 4D arrays is a common thing
to do in neuroimaging, as you can imagine.

A student asked what he would get back from raveling this array, a
concatenated time series, or something spatial?

We showed (I'd worked it out by this time) that the first N values
were the time series given by [0, 0, 0, :].

He said - 'Oh - I see - so the data is stored as a whole lot of time
series one by one, I thought it would be stored as a series of
images'.

Ironically, this was a Fortran-ordered array in memory, and he was wrong.

So, I think the idea of memory ordering and index ordering is very
easy to confuse, and comes up naturally.
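
The classroom situation can be reproduced on a toy array. This is a sketch under assumptions: the sizes I, J, K, N are tiny stand-ins for a real image, and the values are just a running counter.

```python
import numpy as np

I, J, K, N = 2, 3, 4, 5   # hypothetical sizes: (I, J, K) voxels, N time points
img = np.asfortranarray(np.arange(I * J * K * N).reshape((I, J, K, N)))

flat = img.ravel()           # default 'C' index order: last axis varies fastest
in_memory = img.ravel('K')   # elements in the order they sit in memory

# The first N raveled values are the time series of the first voxel ...
starts_with_time_series = np.array_equal(flat[:N], img[0, 0, 0, :])
# ... although in memory the first axis, not time, varies fastest.
memory_runs_along_first_axis = np.array_equal(in_memory[:I], img[:, 0, 0, 0])

print(img.flags.f_contiguous, starts_with_time_series,
      memory_runs_along_first_axis)
```

The raveled output says nothing about how the data is stored, which is exactly the inference the student drew.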

I would like, as a teacher, to be able to say something like:

* This is what C memory layout is (it's the memory layout that gives
  arr.flags.C_CONTIGUOUS=True)
* This is what F memory layout is (it's the memory layout that gives
  arr.flags.F_CONTIGUOUS=True)
* It's rather easy to get something that is neither C nor F memory layout
* Numpy does many memory layouts.
* Ravel and reshape and numpy in general do not care (normally) about C
  or F layouts, they only care about index ordering.

My point, that I'm repeating, is that my job is made harder by
'arr.ravel('F')'.
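
The teaching statements above can be demonstrated in a few lines. This is a sketch with made-up array names; the "neither" case uses a strided view that takes every other column.

```python
import numpy as np

base = np.arange(24).reshape((4, 6))
c_arr = np.ascontiguousarray(base)   # C memory layout
f_arr = np.asfortranarray(base)      # F memory layout
neither = base[:, ::2]               # strided view: neither C nor F contiguous

for name, x in [('c_arr', c_arr), ('f_arr', f_arr), ('neither', neither)]:
    print(name, x.flags.c_contiguous, x.flags.f_contiguous)

# ravel('C') ignores the memory layout: c_arr and f_arr hold the same
# values, so they ravel to the same sequence.
layout_ignored = np.array_equal(c_arr.ravel('C'), f_arr.ravel('C'))
# The strided view also ravels in index order, last axis fastest.
strided_ok = neither.ravel('C').tolist() == [0, 2, 4, 6, 8, 10,
                                             12, 14, 16, 18, 20, 22]
print(layout_ignored, strided_ok)
```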

Cheers,

Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-30 Thread josef . pktd
On Sat, Mar 30, 2013 at 11:43 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Sat, Mar 30, 2013 at 7:02 PM,  josef.p...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Sat, Mar 30, 2013 at 7:50 PM,  josef.p...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle
 brad.froe...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 On Sat, Mar 30, 2013 at 2:20 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:57 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:14 AM,  josef.p...@gmail.com wrote:
  On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
 
  Ravel and reshape use the terms 'C' and 'F' in the sense of index
  ordering.
 
  This is very confusing.  We think the index ordering and memory
  ordering ideas need to be separated, and specifically, we should
  avoid
  using C and F to refer to index ordering.
 
  Proposal
  --------
 
  * Deprecate the use of C and F meaning backwards and forwards
  index ordering for ravel, reshape
  * Prefer Z and N, being graphical representations of unraveling
  in
  2 dimensions, axis1 first and axis0 first respectively (excellent
  naming idea by Paul Ivanov)
 
  What do y'all think?
 
  I always thought F and C are easy to understand, I always 
  thought
  about
  the content and never about the memory when using it.
 
  changing the names doesn't make it easier to understand.
  I think the confusion is because the new A and K refer to existing
  memory
 

 I disagree, I think it's confusing, but I have evidence, and that is
 that four out of four of us tested ourselves and got it wrong.

 Perhaps we are particularly dumb or poorly informed, but I think it's
 rash to assert there is no problem here.

 I think you are overcomplicating things or phrased it as a trick question

 I don't know what you mean by trick question - was there something
 over-complicated in the example?  I deliberately didn't include
 various much more confusing examples in reshape.

 I meant making the candidates think about memory instead of just
 column versus row stacking.
 I don't think I ever get confused about reshape F in 2d.
 But when I work with 3d or larger ndim nd-arrays, I always have to
 try an example to check my intuition (in general not just reshape).


 ravel F and C have *nothing* to do with memory layout.

 We do agree on this of course - but you said in an earlier mail that
 you thought of 'C' and 'F' as referring to target memory layout (which
 they don't in this case) so I think we also agree that C and F do
 often refer to memory layout elsewhere in numpy.

 I guess that wasn't so helpful.
 (emphasis on *target*, There are very few places where an order
 keyword refers to *existing* memory layout.

 It is helpful because it shows how easy it is to get confused between
 memory order and index order.

 What's reverse index order?

 I am not being clear, sorry about that:

 import numpy as np

 def ravel_iter_last_fastest(arr):
     res = []
     for i in range(arr.shape[0]):
         for j in range(arr.shape[1]):
             for k in range(arr.shape[2]):
                 # Iterating over last dimension fastest
                 res.append(arr[i, j, k])
     return np.array(res)


 def ravel_iter_first_fastest(arr):
     res = []
     for k in range(arr.shape[2]):
         for j in range(arr.shape[1]):
             for i in range(arr.shape[0]):
                 # Iterating over first dimension fastest
                 res.append(arr[i, j, k])
     return np.array(res)

good example

that's just C and F order in the terminology of numpy
http://docs.scipy.org/doc/numpy/reference/arrays.nditer.html#controlling-iteration-order
(independent of memory)
http://docs.scipy.org/doc/numpy/reference/generated/numpy.flatiter.html#numpy.flatiter
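
For reference, a short sketch of what those iterators do on an F-contiguous array (the names here are made up for illustration). It uses np.nditer with its order argument and the .flat attribute:

```python
import numpy as np

a = np.arange(6).reshape((2, 3))
a_f = np.asfortranarray(a)    # F-contiguous copy holding the same values

# nditer's order argument selects the index order of the iteration.
c_vals = [int(x) for x in np.nditer(a_f, order='C')]
f_vals = [int(x) for x in np.nditer(a_f, order='F')]
# .flat iterates with the last index fastest, whatever the memory layout.
flat_vals = [int(x) for x in a_f.flat]

print(c_vals)     # [0, 1, 2, 3, 4, 5]
print(f_vals)     # [0, 3, 1, 4, 2, 5]
print(flat_vals)  # [0, 1, 2, 3, 4, 5]
```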

I don't think we want to rename a large part of the basic terminology of numpy


Josef




 a = np.arange(24).reshape((2, 3, 4))

 print(np.all(a.ravel('C') == ravel_iter_last_fastest(a)))
 print(np.all(a.ravel('F') == ravel_iter_first_fastest(a)))

 By 'reverse index ordering' I mean 'ravel_iter_last_fastest' above.  I
 guess one could argue that this was not 'reverse' but 'forward' index
 ordering, but I am not arguing about which is better, or those names,
 only that it's the order of indices that differs, not the memory
 layout, and that these ideas need to be kept separate.

 Cheers,

 Matthew


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-30 Thread josef . pktd
On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Sat, Mar 30, 2013 at 7:02 PM,  josef.p...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Sat, Mar 30, 2013 at 7:50 PM,  josef.p...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle
 brad.froe...@gmail.com wrote:
 On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 On Sat, Mar 30, 2013 at 2:20 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:57 PM,  josef.p...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
  On Sat, Mar 30, 2013 at 4:14 AM,  josef.p...@gmail.com wrote:
  On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett
  matthew.br...@gmail.com wrote:
 
  Ravel and reshape use the terms 'C' and 'F' in the sense of index
  ordering.
 
  This is very confusing.  We think the index ordering and memory
  ordering ideas need to be separated, and specifically, we should
  avoid
  using C and F to refer to index ordering.
 
  Proposal
  --------
 
  * Deprecate the use of C and F meaning backwards and forwards
  index ordering for ravel, reshape
  * Prefer Z and N, being graphical representations of unraveling
  in
  2 dimensions, axis1 first and axis0 first respectively (excellent
  naming idea by Paul Ivanov)
 
  What do y'all think?
 
  I always thought F and C are easy to understand, I always 
  thought
  about
  the content and never about the memory when using it.
 
  changing the names doesn't make it easier to understand.
  I think the confusion is because the new A and K refer to existing
  memory
 

 I disagree, I think it's confusing, but I have evidence, and that is
 that four out of four of us tested ourselves and got it wrong.

 Perhaps we are particularly dumb or poorly informed, but I think it's
 rash to assert there is no problem here.

 I think you are overcomplicating things or phrased it as a trick question

 I don't know what you mean by trick question - was there something
 over-complicated in the example?  I deliberately didn't include
 various much more confusing examples in reshape.

 I meant making the candidates think about memory instead of just
 column versus row stacking.

 To be specific, we were teaching about reshaping a (I, J, K, N) 4D
 array, it was an image, with time as the 4th dimension (N time
 points).   Raveling and reshaping 3D and 4D arrays is a common thing
 to do in neuroimaging, as you can imagine.

 A student asked what he would get back from raveling this array, a
 concatenated time series, or something spatial?

 We showed (I'd worked it out by this time) that the first N values
 were the time series given by [0, 0, 0, :].

 He said - 'Oh - I see - so the data is stored as a whole lot of time
 series one by one, I thought it would be stored as a series of
 images'.

 Ironically, this was a Fortran-ordered array in memory, and he was wrong.

 So, I think the idea of memory ordering and index ordering is very
 easy to confuse, and comes up naturally.

 I would like, as a teacher, to be able to say something like:

 * This is what C memory layout is (it's the memory layout that gives
   arr.flags.C_CONTIGUOUS=True)
 * This is what F memory layout is (it's the memory layout that gives
   arr.flags.F_CONTIGUOUS=True)
 * It's rather easy to get something that is neither C nor F memory layout
 * Numpy does many memory layouts.
 * Ravel and reshape and numpy in general do not care (normally) about C
   or F layouts, they only care about index ordering.

 My point, that I'm repeating, is that my job is made harder by
 'arr.ravel('F')'.

But once you know that ravel and reshape don't care about memory, the
ravel is easy to predict (maybe not easy to visualize in 4-D):

order=C: stack the last dimension, N, the time series of one 3d pixel,
then stack the time series of the next pixel...
process pixels by depth and then row by row (like old TVs)

I assume you did this because your underlying array is C contiguous.
so your ravel('C') is a c-contiguous view (instead of some weird
strides or a copy)

I usually prefer time in the first dimension, and stack order=F, then
I can start at the front, stack all time periods of the first pixel,
keep going and work pixels down the columns, first page, next page,
...
(and I hope I have a F-contiguous array, so my raveled array is also
F-contiguous.)

(note: I'm bringing memory back in as optimization, but not to predict
the stacking)
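
A sketch of the time-first, order='F' convention described above, with made-up sizes (T time periods, R rows, C columns) and counter values:

```python
import numpy as np

T, R, C = 4, 2, 3   # hypothetical sizes: time periods, rows, columns
data = np.arange(T * R * C).reshape((T, R, C))

flat_f = data.ravel('F')   # first index (time) varies fastest

# All time periods of the first pixel come first ...
first_pixel = np.array_equal(flat_f[:T], data[:, 0, 0])
# ... followed by the next pixel down the same column.
next_pixel_down = np.array_equal(flat_f[T:2 * T], data[:, 1, 0])

print(flat_f[:T])        # [ 0  6 12 18]
print(flat_f[T:2 * T])   # [ 3  9 15 21]
```

The prediction needs only the index order. Whether data is C- or F-contiguous changes whether the result is a view or a copy, not its contents.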

Josef
(I think brains are designed for Fortran order and C-ordering in numpy
is an accident,
except, reading a Western language book is neither)



 Cheers,

 Matthew

[Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-03-29 Thread Matthew Brett
Hi,

We were teaching today, and found ourselves getting very confused
about ravel and reshape in numpy.

Summary
-------

There are two separate ideas needed to understand ordering in ravel and reshape:

Idea 1) ravel / reshape can proceed from the last axis to the first,
or the first to the last.  This is ravel index ordering.
Idea 2) The physical layout of the array (on disk or in memory) can be
C or F contiguous or neither.  This is memory ordering.

The index ordering is usually (but see below) orthogonal to the memory ordering.

The 'ravel' and 'reshape' commands use C and F in the sense of
index ordering, and this mixes the two ideas and is confusing.

What the current situation looks like
-------------------------------------

Specifically, we've been rolling this around among 4 experienced numpy
users and we all predicted at least one of the results below wrongly.

This was what we knew, or should have known:

In [2]: import numpy as np

In [3]: arr = np.arange(10).reshape((2, 5))

In [5]: arr.ravel()
Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

So, the 'ravel' operation unravels over the last axis (1) first,
followed by axis 0.

So far so good (even if the opposite to MATLAB, Octave).

Then we found the 'order' flag to ravel:

In [10]: arr.flags
Out[10]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

In [11]: arr.ravel('C')
Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

But we soon got confused.  How about this?

In [12]: arr_F = np.array(arr, order='F')

In [13]: arr_F.flags
Out[13]:
  C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

In [16]: arr_F
Out[16]:
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [17]: arr_F.ravel('C')
Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Right - so the flag 'C' to ravel, has got nothing to do with *memory*
ordering, but is to do with *index* ordering.

And in fact, we can ask for memory ordering specifically:

In [22]: arr.ravel('K')
Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [23]: arr_F.ravel('K')
Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

In [24]: arr.ravel('A')
Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [25]: arr_F.ravel('A')
Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

There are some confusions to get into with the 'order' flag to reshape
as well, of the same type.
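
As one illustration of the reshape case (a sketch, not one of the examples we tried in class), the order argument to reshape again selects index ordering and gives the same answer for both memory layouts:

```python
import numpy as np

arr = np.arange(10).reshape((2, 5))   # C-contiguous
arr_F = np.array(arr, order='F')      # same values, F-contiguous

# order='C': read and write with the last index fastest.
r_c = arr_F.reshape((5, 2), order='C')
# order='F': read and write with the first index fastest.
r_f = arr_F.reshape((5, 2), order='F')

# The memory layout of the input makes no difference to the values.
same_c = np.array_equal(r_c, arr.reshape((5, 2), order='C'))
same_f = np.array_equal(r_f, arr.reshape((5, 2), order='F'))

print(r_c[:, 0])   # [0 2 4 6 8]
print(r_f[:, 0])   # [0 5 1 6 2]
```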

Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering.

This is very confusing.  We think the index ordering and memory
ordering ideas need to be separated, and specifically, we should avoid
using C and F to refer to index ordering.

Proposal
--------

* Deprecate the use of C and F meaning backwards and forwards
index ordering for ravel, reshape
* Prefer Z and N, being graphical representations of unraveling in
2 dimensions, axis1 first and axis0 first respectively (excellent
naming idea by Paul Ivanov)

What do y'all think?

Cheers,

Matthew
Paul Ivanov
JB Poline