Re: [Numpy-discussion] Making NumPy accessible to everyone (or no-one) (was Numpy-discussion Digest, Vol 19, Issue 44)

2008-04-11 Thread Jarrod Millman
On Thu, Apr 10, 2008 at 10:17 AM, Lou Pecora [EMAIL PROTECTED] wrote:
  Yes, I use np= number of points, too.  But you all
  might want to use something else.  That's the point of
  the flexibility of import ... as

I would recommend against using np as a variable name.  Variable names
should be short and informative.  I would much rather see something
like num_points.

  Trying to lock in namespaces as np or N or whatever is
  a BAD idea.  Allow the flexibility.  You can admonish
  against from ... import * for newbies and then tell
  them to use from ... import actual function names (as
  mentioned above).  But locking people into a standard,
  even an informal one is, as someone else said, acting
  a bit too much like accountants.  Stop, please!

Coding standards and conventions in a large, collaborative codebase
are essential.  If you want to contribute to NumPy or SciPy you will
have to conform to these conventions.  In your own private code or
private project do whatever you want.

-- 
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Making NumPy accessible to everyone (or no-one) (was Numpy-discussion Digest, Vol 19, Issue 44)

2008-04-10 Thread Stéfan van der Walt
Hi Joe, all

On 10/04/2008, Joe Harrington [EMAIL PROTECTED] wrote:
  Absolutely.  Let's please standardize on:
   import numpy as np
   import scipy as sp

  I hope we do NOT standardize on these abbreviations.  While a few may
  have discussed it at a sprint, it hasn't seen broad discussion and
  there are reasons to prefer the other practice (numpy as N, scipy as
  S, pylab as P).

N is a very unfortunate choice of abbreviation, given that so many
algorithms use it to indicate the number of elements in things.  np
is much safer and, like Jarrod mentioned, also only takes two keys to
type.  Sebastian, a simple regexp replace should fix your problem
(investment in hundreds of lines of N.* usage).
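
The "simple regexp replace" mentioned above could look roughly like the sketch below. The helper name rename_alias and the sample source string are hypothetical; the pattern is one plausible choice, not a tested migration tool. It only looks at text, so it would also rewrite "N." inside strings or comments, and a tokenizer-based tool would be safer for a large codebase.

```python
import re

# Match the alias "N" only when it is a standalone name followed by a dot and
# an identifier, so "MN.x" or a plain counter such as "N_pts" is left alone.
ALIAS_PATTERN = re.compile(r"(?<![\w.])N\.(?=[A-Za-z_])")


def rename_alias(source: str) -> str:
    """Rewrite 'N.' attribute access as 'np.' and update the import line."""
    source = re.sub(
        r"^([ \t]*)import numpy as N[ \t]*$",
        r"\1import numpy as np",
        source,
        flags=re.MULTILINE,
    )
    return ALIAS_PATTERN.sub("np.", source)


# Hypothetical input: the code is only treated as text, it is never executed.
old = "import numpy as N\nN_pts = 10\nx = N.zeros(N_pts)\ny = MN.x + N.sin(x)\n"
new = rename_alias(old)
print(new)
```

Reviewing the resulting diff by hand is still advisable, since a bare variable named N is deliberately left untouched.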

  My reasons for saying this go back to my reasons for
  disliking lots of hierarchical namespaces at all: if we must have
  namespaces, let's minimize the visual and typing impact by making them
  short and visually distinct from the function names (by capitalizing
  them).

The Python Style Guide (PEP 8) recommends that we stick to lowercase,
underscore-separated names.  We'd do our users a real disservice by
not following community-defined standards.

Namespaces throttle the amount of information with which the user is
presented, and well thought through design leads to logical, intuitive
segmentation of functionality.

Searchable documentation and indices then become essential in guiding
the user to the right place.  On the other hand, when I was a
freshman, we had a course on MATLAB; I remember spending countless
hours using lookfor (I think that's what it is called?).  That was
one of the effects of a flat namespace.
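
A searchable index of the kind described above can be approximated with a small docstring search. This is a minimal sketch: the function name search_docs is hypothetical, and the standard-library math module stands in for any package so that the example runs without extra dependencies.

```python
import inspect
import math


def search_docs(module, keyword):
    """Return the public callables in `module` whose name or docstring
    mentions `keyword` (case-insensitive)."""
    keyword = keyword.lower()
    hits = []
    for name, obj in inspect.getmembers(module, callable):
        if name.startswith("_"):
            continue  # skip private helpers
        doc = inspect.getdoc(obj) or ""
        if keyword in name.lower() or keyword in doc.lower():
            hits.append(name)
    return sorted(hits)


# Find everything in the stand-in module that mentions hyperbolic functions.
print(search_docs(math, "hyperbolic"))
```

The same function can be pointed at a namespaced package or subpackage, which is where a search like this becomes more useful than scanning one long flat listing.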

For interactive work, a flat namespace may be ideal (and I have no
problem with us providing that as well), but otherwise, for file based
code, I'd much prefer to have a (relatively shallow) namespace
structure.

  What concerns me about the discussion is that we are still not
  thinking like communications and thought-process experts, we are
  thinking like categorizers and accountants.  The arguments we are
  raising don't have to do, positively or negatively, with the difficult
  acts of communicating with a computer and with other readers of our
  code.  Those are the sole purposes of computer languages.

Isn't it easier to explain how to use a well-structured, organised
library, rather than some functions-all-over-the-floor mess?  If an
accountant can import numpy.finance and do his work, how is that more
difficult than importing every possible function included, and then
sifting through them?

  Namespaces add characters to code that have a high redundancy factor.
  This means they pollute code, make it slow and inaccurate to read, and
  make learning harder.  Lines get longer and may wrap if they contain
  several calls.  It is harder while visually scanning code to
  distinguish the function name if it's adjacent to a bunch of other
  text, particularly if that text appears commonly in the nearby code.

Python provides very good machinery for dealing with this verbosity:

import very_foo_module as f
from very.deeply.nested.namespace import func

and even

def foo(args):
    c = commonly_used_func  # short local alias for a long function name
    result = c(3) + c(4) + 2*c(5)
    return result

At the moment, everyone warns against using '*' with numpy, but with
proper namespaces, the * can be quite handy:

from numpy.math import *

a = sin(theta) + 3*cos(theta**2)

(the example above already works in current numpy)

  It therefore becomes harder to spot bugs.  Mathematical code becomes
  less and less like the math expressions we write on paper when doing
  derivations, making it harder to interpret and verify.  You have to
  memorize which subpackage each function is in, which is hard to do for
  those functions that could naturally go in two subpackages.

If you need to use a subset of functions defined over different
namespaces, it is very easy to create a custom module, say
my_field_of_study.py:

from numpy.math import cosh, sinh
from numpy.linalg import inv

etc.

Then a simple "from my_field_of_study import *" provides you with
everything you need.  This needs to be done once in your life, and can
be advocated as a Cookbook recipe.  Memorisation be gone (but who
needs to memorise with TAB-completion anyway?).

  While
  many math function names are obvious, subpackage names are not.  Is it
  .stat or .stats or .statistics?  .rand or .random?  .fin or
  .financial?  Some functions have this problem, but *every* namespace
  name has it in spades.

Introspection is such a joy with IPython, or with the SAGE notebook,
and many editors even provide similar functionality.  Stuffing
domain-specific functions into a flat namespace sounds like the ideal
way of confusing a new user.

 There is simply
  no reduction in readability, writeability, or debugability if you
  don't have namespace prefixes on everything, and knowing you know
  everything is easily accomplished now with the online categorized
  function list.

You're right, we read code 

Re: [Numpy-discussion] Making NumPy accessible to everyone (or no-one) (was Numpy-discussion Digest, Vol 19, Issue 44)

2008-04-10 Thread Alexander Michael
On Thu, Apr 10, 2008 at 6:55 AM, Stéfan van der Walt [EMAIL PROTECTED] wrote:
 Hi Joe, all

  On 10/04/2008, Joe Harrington [EMAIL PROTECTED] wrote:
Absolutely.  Let's please standardize on:
 import numpy as np
 import scipy as sp
  
I hope we do NOT standardize on these abbreviations.  While a few may
have discussed it at a sprint, it hasn't seen broad discussion and
there are reasons to prefer the other practice (numpy as N, scipy as
S, pylab as P).

  N is a very unfortunate choice of abbreviation, given that so many
  algorithms use it to indicate the number of elements in things.  np
  is much safer and, like Jarrod mentioned, also only takes two keys to
  type.  Sebastian, a simple regexp replace should fix your problem
  (investment in hundreds of lines of N.* usage).

Hey! I use np *all the time* as an abbreviation for number of points. I don't
really see what the problem is with using numpy.whatever in library code and
published scripts and whatever you want in one-off throw-away scripts. It's easy
to set up a shortcut key in almost any editor to alleviate the typing burden, if
that is the main objection. If you have a section of an algorithm that you are
trying to make look as much like text-book pseudocode as possible, then you
can't do better than from numpy import whatever both for clarity and Python
coding convention. You can also say d = numpy.dot in the local scope at the
top of your algorithm so you can write d(x,y) in the algorithm itself for very
pithy code that doesn't require a FAQ to understand.
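
The local-alias idea can be sketched as follows. The function name gram_schmidt_step and the sample vectors are hypothetical, chosen only to show d = numpy.dot used inside one function.

```python
import numpy


def gram_schmidt_step(v, u):
    """Remove from v its component along u (one Gram-Schmidt step)."""
    d = numpy.dot  # local alias, visible only inside this function
    return v - (d(v, u) / d(u, u)) * u


u = numpy.array([1.0, 0.0])
v = numpy.array([3.0, 4.0])
w = gram_schmidt_step(v, u)
print(w)  # the result is orthogonal to u
```

Because the alias lives in the function's local scope, it cannot collide with a variable named d elsewhere in the file.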


Re: [Numpy-discussion] Making NumPy accessible to everyone (or no-one) (was Numpy-discussion Digest, Vol 19, Issue 44)

2008-04-10 Thread Bruce Southey
Alexander Michael wrote:
 On Thu, Apr 10, 2008 at 6:55 AM, Stéfan van der Walt [EMAIL PROTECTED] wrote:
   
 Hi Joe, all

  On 10/04/2008, Joe Harrington [EMAIL PROTECTED] wrote:
Absolutely.  Let's please standardize on:
 import numpy as np
 import scipy as sp
  
I hope we do NOT standardize on these abbreviations.  While a few may
have discussed it at a sprint, it hasn't seen broad discussion and
there are reasons to prefer the other practice (numpy as N, scipy as
S, pylab as P).

  N is a very unfortunate choice of abbreviation, given that so many
  algorithms use it to indicate the number of elements in things.  np
  is much safer and, like Jarrod mentioned, also only takes two keys to
  type.  Sebastian, a simple regexp replace should fix your problem
  (investment in hundreds of lines of N.* usage).
 

 Hey! I use np *all the time* as an abbreviation for number of points. I don't
 really see what the problem is with using numpy.whatever in library code and
 published scripts and whatever you want in one-off throw-away scripts. It's easy
 to set up a shortcut key in almost any editor to alleviate the typing burden, if
 that is the main objection. If you have a section of an algorithm that you are
 trying to make look as much like text-book pseudocode as possible, then you
 can't do better than from numpy import whatever both for clarity and Python
 coding convention. You can also say d = numpy.dot in the local scope at the
 top of your algorithm so you can write d(x,y) in the algorithm itself for very
 pithy code that doesn't require a FAQ to understand.

   
Hi,
I would prefer that the user has the choice, and we have to remember that
Python is dynamically typed. It is one thing to address experienced users
but it takes time to get experienced. Also, some people may be just
using some other code without knowing numpy is being used. That means
users can 'import numpy as np' either directly or indirectly and
somewhere further in their code enter 'np=1'. Python will be quite happy
until the code wants a numpy function - a harsh but important lesson to
learn.
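
The failure mode described above can be reproduced in a few lines. This is only an illustrative sketch; the variable names values and error_message are hypothetical.

```python
import numpy as np

values = np.arange(3)  # np refers to the numpy module here

np = 1  # rebinding: np now names an integer, and Python does not complain

try:
    np.arange(3)  # the error only appears when a numpy function is needed
except AttributeError as exc:
    error_message = str(exc)
    print("AttributeError:", error_message)
```

Nothing is reported at the assignment itself, which is why the mistake can surface far from the line that caused it.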

Readability (and debugging) is another situation where numpy is more 
informative than np (which is better than N) especially if someone is 
not familiar with numpy.

Regards
Bruce



Re: [Numpy-discussion] Making NumPy accessible to everyone (or no-one) (was Numpy-discussion Digest, Vol 19, Issue 44)

2008-04-10 Thread Stéfan van der Walt
On 10/04/2008, Alexander Michael [EMAIL PROTECTED] wrote:
 On Thu, Apr 10, 2008 at 6:55 AM, Stéfan van der Walt [EMAIL PROTECTED] wrote:
   Hi Joe, all
  
On 10/04/2008, Joe Harrington [EMAIL PROTECTED] wrote:
  Absolutely.  Let's please standardize on:
   import numpy as np
   import scipy as sp

  I hope we do NOT standardize on these abbreviations.  While a few may
  have discussed it at a sprint, it hasn't seen broad discussion and
  there are reasons to prefer the other practice (numpy as N, scipy as
  S, pylab as P).
  
N is a very unfortunate choice of abbreviation, given that so many
algorithms use it to indicate the number of elements in things.  np
is much safer and, like Jarrod mentioned, also only takes two keys to
type.  Sebastian, a simple regexp replace should fix your problem
(investment in hundreds of lines of N.* usage).


 Hey! I use np *all the time* as an abbreviation for number of points.

Is that the de-facto abbreviation used for number of pts in your field
(i.e. in literature)?  If so, I'd recommend you stick to N :)

Cheers
Stéfan


Re: [Numpy-discussion] Making NumPy accessible to everyone (or no-one) (was Numpy-discussion Digest, Vol 19, Issue 44)

2008-04-10 Thread Neil Crighton
Thanks Joe for the excellent post. It mirrors my experience with
Python and Numpy very eloquently, and I think it presents a good
argument against the excessive use of namespaces. I'm not so worried
about N. vs np. though - I use the same method Matthew Brett suggests.
If I'm going to use, say, sin and cos a lot in a script such that all
the np. prefixes would make the code hard to read, I'll use:

import numpy as np
from numpy import sin, cos

To those people who have invoked 'Namespaces are one honking great idea
-- let's do more of those!', I'll cancel that with 'Flat is better than
nested' :)  I certainly wouldn't argue that using namespaces to
separate categories of functions is always a bad thing, but I think it
should only be done as a last resort.

Neil

On 10/04/2008, Joe Harrington [EMAIL PROTECTED] wrote:
   Absolutely.  Let's please standardize on:
   import numpy as np
   import scipy as sp

  I hope we do NOT standardize on these abbreviations.  While a few may
  have discussed it at a sprint, it hasn't seen broad discussion and
  there are reasons to prefer the other practice (numpy as N, scipy as
  S, pylab as P).  My reasons for saying this go back to my reasons for
  disliking lots of hierarchical namespaces at all: if we must have
  namespaces, let's minimize the visual and typing impact by making them
  short and visually distinct from the function names (by capitalizing
  them).

  What concerns me about the discussion is that we are still not
  thinking like communications and thought-process experts, we are
  thinking like categorizers and accountants.  The arguments we are
  raising don't have to do, positively or negatively, with the difficult
  acts of communicating with a computer and with other readers of our
  code.  Those are the sole purposes of computer languages.

  Namespaces add characters to code that have a high redundancy factor.
  This means they pollute code, make it slow and inaccurate to read, and
  make learning harder.  Lines get longer and may wrap if they contain
  several calls.  It is harder while visually scanning code to
  distinguish the function name if it's adjacent to a bunch of other
  text, particularly if that text appears commonly in the nearby code.
  It therefore becomes harder to spot bugs.  Mathematical code becomes
  less and less like the math expressions we write on paper when doing
  derivations, making it harder to interpret and verify.  You have to
  memorize which subpackage each function is in, which is hard to do for
  those functions that could naturally go in two subpackages.  While
  many math function names are obvious, subpackage names are not.  Is it
  .stat or .stats or .statistics?  .rand or .random?  .fin or
  .financial?  Some functions have this problem, but *every* namespace
  name has it in spades.

  The arguments people are raising are arguments related to how
  emotionally satisfying it is to have a place for everything and
  everything in its place, and to know you know everything there is to
  know.  While we like both those things, as scientists, engineers, and
  mathematicians, they are almost irrelevant to coding.  There is simply
  no reduction in readability, writeability, or debugability if you
  don't have namespace prefixes on everything, and knowing you know
  everything is easily accomplished now with the online categorized
  function list.  We can incorporate that functionality into the doc
  reading apparatus (help, currently) by using keywords in ReST
  comments in the docstrings and providing a way for help and its
  friends to list the keywords and what functions are connected to them.

  What nobody has said is "if we have lots of namespaces, my code will
  look prettier" or "if we have lots of namespaces, normal people will
  learn faster" or "if we have lots of namespaces, my code will be
  easier to verify and debug."  I don't believe any of these statements
  to be true.  Do you?

  Similarly, nobody has said, "if we have lots of namespaces, I'll be a
  faster coder."  There is a *very* high obnoxiousness factor in typing
  redundant stuff at an interpreter.  It's already annoying to type
  N.sin instead of sin, but N.T.sin?  Or worse, np.tg.sin?  Now the
  prefix has twice the characters of the function itself!  Most IDL
  users *hate* that you have to type print,  in order to inspect the
  contents of a variable.  Yet, with multiple layers of namespaces we'd
  have lots more than seven extra characters on most lines of code, and
  unlike the IDL mess you'd have to *think* to recall what the right
  extra characters were for each function call, unlike just telling your
  hands to run the print,  finger macro once again.

  The reasons we all like Python relate to how quick and easy it is to
  emit code from our fingertips that is similar to what we are thinking
  in our brains, compared to other languages.  The brain doesn't declare
  variables, nor run loops over arrays.  Neither does Python.  When we
  

Re: [Numpy-discussion] Making NumPy accessible to everyone (or no-one) (was Numpy-discussion Digest, Vol 19, Issue 44)

2008-04-10 Thread Lou Pecora

--- Alexander Michael [EMAIL PROTECTED] wrote:

 Hey! I use np *all the time* as an abbreviation for
 number of points. I don't
 really see what the problem is with using
 numpy.whatever in library code and
 published scripts and whatever you want in one-off
 throw-away scripts. It's easy
 to set up a shortcut key in almost any editor to
 alleviate the typing burden, if
 that is the main objection. If you have a section of
 an algorithm that you are
 trying to make look as much like text-book
 pseudocode as possible, then you
 can't do better than from numpy import whatever
 both for clarity and python
 coding convention. You can also say d = numpy.dot
 in the local scope at the
 top of your algorithm so you can write d(x,y) in
 the algorithm itself for very
 pithy code that doesn't require a FAQ to understand.

Yes, I use np= number of points, too.  But you all
might want to use something else.  That's the point of
the flexibility of import ... as 

Trying to lock in namespaces as np or N or whatever is
a BAD idea.  Allow the flexibility.  You can admonish
against from ... import * for newbies and then tell
them to use from ... import actual function names (as
mentioned above).  But locking people into a standard,
even an informal one is, as someone else said, acting
a bit too much like accountants.  Stop, please!



-- Lou Pecora,   my views are my own.
