Re: [Numpy-discussion] Type specific sorts: objects, structured arrays, and all that.
On Tue, Jul 10, 2012 at 3:37 AM, Robert Kern robert.k...@gmail.com wrote: On Tue, Jul 10, 2012 at 4:32 AM, Charles R Harris charlesr.har...@gmail.com wrote: Hi All, I've been adding type specific sorts for object and structured arrays. It seems that datetime64 and timedelta64 are also not supported. Is there any reason why those types should not be sorted as int64? You need special handling for NaTs to be consistent with how we deal with NaNs in floats. Not sure if this is an issue or not, but different datetime64 objects can be set for different units: http://docs.scipy.org/doc/numpy/reference/arrays.datetime.html#datetime-units. A straight-out comparison of the values as int64 would likely drop the units, correct? On second thought, though, I guess all datetime64's in a numpy array would all have the same units, so it shouldn't matter, right? Just thinking aloud. Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
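Ben's thinking-aloud point is right: the unit is carried by the dtype, so every element of a given datetime64 array shares it, and (NaT aside) sorting the underlying int64 values preserves datetime order. A quick illustration, assuming a NaT-free array:

```python
import numpy as np

a = np.array(['2012-07-10', '2012-01-01', '2012-03-05'], dtype='datetime64[D]')

# The unit lives in the dtype, not in the individual elements:
assert a.dtype == np.dtype('datetime64[D]')

# Sorting the raw int64 view gives the same order as sorting the dates
# (this is exactly why NaT, stored as a sentinel int64, needs special
# handling, just like NaN does for floats):
by_int = np.sort(a.view(np.int64)).view(a.dtype)
assert (by_int == np.sort(a)).all()
```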
Re: [Numpy-discussion] Looking for the most important bugs, documentation needs, etc.
On Tue, Jul 10, 2012 at 6:07 AM, Ralf Gommers ralf.gomm...@googlemail.comwrote: On Tue, Jul 10, 2012 at 11:36 AM, Ralf Gommers ralf.gomm...@googlemail.com wrote: On Tue, Jul 10, 2012 at 4:20 AM, Six Silberman silberman@gmail.comwrote: Hi all, Some colleagues and I are interested in contributing to numpy. We have a range of backgrounds -- I for example am new to contributing to open source software but have a (small) bit of background in scientific computation, while others have extensive experience contributing to open source projects. We've looked at the issue tracker and submitted a couple patches today but we would be interested to hear what active contributors to the project consider the most pressing, important, and/or interesting needs at the moment. I personally am quite interested in hearing about the most pressing documentation needs (including example code). As for important issues, I think many of them are related to the core of numpy. But there's some more isolated ones, which is probably better to get started. Here are some that are high on my list of things to fix/improve: - Numpy doesn't work well (or at all) on OS X 10.7 when built with llvm-gcc, which is the default compiler on that platform. With Clang it seems to work fine. Same for Scipy. http://projects.scipy.org/numpy/ticket/1951 - We don't have binary installers for Python 3.x on OS X yet. This requires adapting the installer build scripts that work for 2.x. See pavement.py in the base dir of the repo. - Something that's more straightforward: improving test coverage. It's lacking in a number of places; one of the things that comes to mind is that all functions should be tested for correct behavior with empty input. Normally the expected behavior is empty in -- empty out. When that's not tested, we get things like http://projects.scipy.org/numpy/ticket/2078. 
Ticket for empty test coverage: http://projects.scipy.org/numpy/ticket/2007 - There's a large amount of normal bugs, working on any of those would be very helpful too. Hard to say here which ones out of the several hundred are important. It is safe to say though I think that the ones requiring touching the C code are more in need of attention than the pure Python ones. I see a patch for f2py already, and a second ticket opened. This is of course useful, but not too many devs are working on it. Unless Pearu has time to respond this week, it may be hard to get feedback on that topic quickly. Here are some relatively straightforward issues which only require touching Python code: http://projects.scipy.org/numpy/ticket/808 http://projects.scipy.org/numpy/ticket/1968 http://projects.scipy.org/numpy/ticket/1976 http://projects.scipy.org/numpy/ticket/1989 And a Cython one (numpy.random): http://projects.scipy.org/numpy/ticket/1492 I ran into one more patch that I assume one of you just attached: http://projects.scipy.org/numpy/ticket/2074. It's important to understand a little of how our infrastructure works. We changed to git + github last year; submitting patches as pull requests on Github has the lowest overhead for us, and we get notifications. For patches on Trac, we have to manually download and apply them. Plus we don't get notifications, which is quite unhelpful unfortunately. Therefore I suggest using git, and if you can't or you feel that the overhead / learning curve is too large, please ping this mailing list about patches you submit on Trac. Cheers, Ralf By the way, for those who are looking to learn how to use git and github: https://github.com/blog/1183-try-git-in-your-browser Cheers! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] build numpy 1.6.2
On Tue, Jul 10, 2012 at 2:45 PM, Prakash Joshi pjo...@numenta.com wrote: Hi All, I built numpy 1.6.2 on linux 64 bit and installed numpy in site-packages, It pass all the test cases of numpy, but I am not sure if this is good build; As I did not specified any fortran compiler while setup, also I do not have fortran compiler on my machine. Thanks Prakash NumPy does not need Fortran for its build. SciPy, however, does. Cheers! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] build numpy 1.6.2
Prakash,

On Tue, Jul 10, 2012 at 3:26 PM, Prakash Joshi pjo...@numenta.com wrote:

Thanks Ben. Also, I did not specify any of the BLAS, LAPACK, ATLAS libraries; do we need these libraries for numpy?

Need, no, you do not need them in the sense that NumPy does not require them to work. NumPy will work just fine without those libraries. However, if you want them, then that is where the choice of Fortran compiler comes in. Look at the INSTALL.txt file for more detailed instructions.

I simply used the following commands to build:

python setup.py build
python setup.py install --prefix=/usr/local

If the above commands are sufficient, then I hope the same steps to build will work on Mac OSX?

That entirely depends on your development setup on your Mac. I will leave that discussion up to others on the list to answer.

Cheers!
Ben Root
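For checking after the fact whether a finished build picked up BLAS/LAPACK/ATLAS, numpy ships a helper:

```python
import numpy as np

# Prints the BLAS/LAPACK configuration detected at build time; empty or
# "NOT AVAILABLE" sections mean NumPy fell back to its own bundled,
# unoptimized lapack_lite routines. Output varies by version and platform.
np.show_config()
```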
Re: [Numpy-discussion] Remove current 1.7 branch?
On Thursday, July 12, 2012, Thouis (Ray) Jones wrote:

On Thu, Jul 12, 2012 at 1:28 AM, Charles R Harris charlesr.har...@gmail.com wrote:

Hi All, Travis and I agree that it would be appropriate to remove the current 1.7.x branch and branch again after a code freeze. That way we can avoid the pain and potential errors of backports. It is considered bad form to mess with public repositories that way, so another option would be to rename the branch, although I'm not sure how well that would work. Suggestions?

I might be mistaken, but if the branch is merged into master (even if that merge makes no changes), I think it's safe to delete it at that point (and recreate it at a later date with the same name) with regards to remote repositories. It should be fairly easy to test. Ray Jones

No, that is not the case. We had a situation occur awhile back where one of the public branches of mpl got completely messed up. You can't even rename it, since the rename doesn't occur in the pulls and merges. What we ended up doing was creating a brand new branch, v1.0.x-maint, and making sure all the devs knew to switch over to that. You might even go a step further and make a final commit to the bad branch that makes the build fail with a big note explaining what to do.

Ben Root
Re: [Numpy-discussion] Remove current 1.7 branch?
On Thursday, July 12, 2012, Nathaniel Smith wrote:

On Thu, Jul 12, 2012 at 12:48 PM, Benjamin Root ben.r...@ou.edu wrote: On Thursday, July 12, 2012, Thouis (Ray) Jones wrote: On Thu, Jul 12, 2012 at 1:28 AM, Charles R Harris charlesr.har...@gmail.com wrote: Hi All, Travis and I agree that it would be appropriate to remove the current 1.7.x branch and branch again after a code freeze. That way we can avoid the pain and potential errors of backports. It is considered bad form to mess with public repositories that way, so another option would be to rename the branch, although I'm not sure how well that would work. Suggestions? I might be mistaken, but if the branch is merged into master (even if that merge makes no changes), I think it's safe to delete it at that point (and recreate it at a later date with the same name) with regards to remote repositories. It should be fairly easy to test. Ray Jones No, that is not the case. We had a situation occur awhile back where one of the public branches of mpl got completely messed up. You can't even rename it since the rename doesn't occur in the pulls and merges. What we ended up doing was creating a brand new branch v1.0.x-maint and making sure all the devs knew to switch over to that. You might even go a step further and make a final commit to the bad branch that makes the build fail with a big note explaining what to do.

The branch isn't bad, it's just out of date. So long as the new version of the branch has the current version of the branch in its ancestry, then everything will be fine.

Option 1:

git checkout master
git merge maint1.7.x
git checkout maint1.7.x
git merge master   # will be a fast-forward

Option 2:

git checkout master
git merge maint1.7.x
git branch -d maint1.7.x     # delete the branch
git checkout -b maint1.7.x   # recreate it

In git terms these two options are literally identical; they result in the exact same repo state... -N

Ah, I misunderstood. Then yes, I think this is correct.
Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] use slicing as argument values?
On Thu, Jul 12, 2012 at 3:38 PM, Chao YUE chaoyue...@gmail.com wrote:

Dear all, I want to create a function, and I would like one of the arguments of the function to determine what slicing of a numpy array I want to use. A simple example:

a = np.arange(100).reshape(10,10)

Suppose I want an imaging function to show an image of part of this data:

def show_part_of_data(m, n):
    plt.imshow(a[m, n])

I would like to give m=3:5, n=2:7, so that when I call show_part_of_data(3:5, 2:7), it does plt.imshow(a[3:5, 2:7]). The above example doesn't work in reality, but it illustrates something similar to what I desire: that I can specify what slicing of a numpy array I want by giving values to function arguments. thanks a lot, Chao

What you want to do is create slice objects. a[3:5] is equivalent to:

sl = slice(3, 5)
a[sl]

and a[3:5, 5:14] is equivalent to:

sl = (slice(3, 5), slice(5, 14))
a[sl]

Furthermore, notation such as ::-1 is equivalent to slice(None, None, -1).

I hope this helps!
Ben Root
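The slice-object recipe works on any Python sequence, not just numpy arrays; a minimal sketch (the tuple form in the last comment applies to numpy arrays only):

```python
a = list(range(10))

sl = slice(3, 5)               # same as a[3:5]
assert a[sl] == [3, 4]

rev = slice(None, None, -1)    # same as a[::-1]
assert a[rev] == list(range(9, -1, -1))

# For multidimensional indexing, pack slices into a tuple (numpy arrays only):
idx = (slice(3, 5), slice(2, 7))   # same as arr[3:5, 2:7]
```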
Re: [Numpy-discussion] use slicing as argument values?
On Thu, Jul 12, 2012 at 4:46 PM, Chao YUE chaoyue...@gmail.com wrote:

Hi Ben, it helps a lot. I am nearly finishing a function in a way I think is pythonic. Just one more question, I have:

In [24]: b = np.arange(1,11)

In [25]: b
Out[25]: array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [26]: b[slice(1)]
Out[26]: array([1])

In [27]: b[slice(4)]
Out[27]: array([1, 2, 3, 4])

In [28]: b[slice(None,4)]
Out[28]: array([1, 2, 3, 4])

so slice(4) is actually slice(None, 4); how can I retrieve exactly a[4] using a slice object? thanks again! Chao

Tricky question. Note the difference between a[4] and a[4:5]. The first returns a scalar, while the second returns an array. The first, though, is not a slice, just an integer. Also, note that the arguments for slice() behave very similarly to the arguments for range() (with some exceptions/differences).

Cheers!
Ben Root

--
Chao YUE
Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
UMR 1572 CEA-CNRS-UVSQ
Batiment 712 - Pe 119
91191 GIF Sur YVETTE Cedex
Tel: (33) 01 69 08 29 02; Fax: 01.69.08.77.16
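The integer-index-versus-length-one-slice distinction in Ben's answer, as a quick sketch (a plain list behaves the same way):

```python
b = list(range(1, 11))

# slice(4) means slice(None, 4), i.e. the first four elements:
assert b[slice(4)] == b[slice(None, 4)] == [1, 2, 3, 4]

# Retrieving the single element at position 4 uses an integer, not a slice:
assert b[4] == 5

# The nearest slice equivalent returns a length-one sequence instead:
assert b[slice(4, 5)] == [5]
```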
Re: [Numpy-discussion] use slicing as argument values?
On Thursday, July 12, 2012, Chao YUE wrote:

Thanks all for the discussion. Actually I am trying to use something like numpy ndarray indexing in the function. Like when I call func(a, '1:3,:,2:4'), it knows I want to retrieve a[1:3,:,2:4], and func(a, '1:3,:,4') for a[1:3,:,4], etc. I am very close now.

# so this function changes the string to a list of slice objects
def convert_string_to_slice(slice_string):
    """provide slice_string as '2:3,:', it will return
    [slice(2, 3, None), slice(None, None, None)]"""
    slice_list = []
    split_slice_string_list = slice_string.split(',')
    for sub_slice_string in split_slice_string_list:
        split_sub = sub_slice_string.split(':')
        if len(split_sub) == 1:
            sub_slice = slice(int(split_sub[0]))
        else:
            if split_sub[0] == '':
                sub1 = None
            else:
                sub1 = int(split_sub[0])
            if split_sub[1] == '':
                sub2 = None
            else:
                sub2 = int(split_sub[1])
            sub_slice = slice(sub1, sub2)
        slice_list.append(sub_slice)
    return slice_list

In [119]: a = np.arange(3*4*5).reshape(3,4,5)

For this it works fine:

In [120]: convert_string_to_slice('1:3,:,2:4')
Out[120]: [slice(1, 3, None), slice(None, None, None), slice(2, 4, None)]

In [121]: a[slice(1, 3, None), slice(None, None, None), slice(2, 4, None)] == a[1:3,:,2:4]
Out[121]:
array([[[ True,  True],
        [ True,  True],
        [ True,  True],
        [ True,  True]],
       [[ True,  True],
        [ True,  True],
        [ True,  True],
        [ True,  True]]], dtype=bool)

And a problem happens when I want to retrieve a single number along a given dimension, because it treats '1:3,:,4' as '1:3,:,:4', as shown below:

In [122]: convert_string_to_slice('1:3,:,4')
Out[122]: [slice(1, 3, None), slice(None, None, None), slice(None, 4, None)]

In [123]: a[1:3,:,4]
Out[123]:
array([[24, 29, 34, 39],
       [44, 49, 54, 59]])

In [124]: a[slice(1, 3, None), slice(None, None, None), slice(None, 4, None)]
Out[124]:
array([[[20, 21, 22, 23],
        [25, 26, 27, 28],
        [30, 31, 32, 33],
        [35, 36, 37, 38]],
       [[40, 41, 42, 43],
        [45, 46, 47, 48],
        [50, 51, 52, 53],
        [55, 56, 57, 58]]])

Then I have a function:

# this function retrieves data from ndarray a by specifying slice_string:
def retrieve_data(a, slice_string):
    slice_list = convert_string_to_slice(slice_string)
    return a[*slice_list]

In the last line of the function retrieve_data I have a problem; I get an invalid syntax error:

    return a[*slice_list]
             ^
SyntaxError: invalid syntax

I hope it's not too long, please comment as you like. Thanks a lot, Chao

I won't comment on the wisdom of your approach, but for your very last part, don't try unpacking the slice list. Also, I think it has to be a tuple, but I could be wrong on that.

Ben Root
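One possible fix for both problems — treating a bare number as an integer index rather than slice(n), and returning a tuple so that no unpacking is needed — sketched below. This is an illustration along the lines of Ben's advice, not Chao's final code:

```python
import numpy as np

def convert_string_to_slice(slice_string):
    """Turn '1:3,:,4' into (slice(1, 3, None), slice(None, None, None), 4)."""
    parts = []
    for sub in slice_string.split(','):
        pieces = sub.split(':')
        if len(pieces) == 1:
            parts.append(int(pieces[0]))        # bare integer index, not slice(n)
        else:
            start = int(pieces[0]) if pieces[0] else None
            stop = int(pieces[1]) if pieces[1] else None
            step = int(pieces[2]) if len(pieces) > 2 and pieces[2] else None
            parts.append(slice(start, stop, step))
    return tuple(parts)

def retrieve_data(a, slice_string):
    # A tuple index needs no unpacking: a[(sl1, sl2, ...)] == a[sl1, sl2, ...]
    return a[convert_string_to_slice(slice_string)]

a = np.arange(3 * 4 * 5).reshape(3, 4, 5)
assert (retrieve_data(a, '1:3,:,2:4') == a[1:3, :, 2:4]).all()
assert (retrieve_data(a, '1:3,:,4') == a[1:3, :, 4]).all()
```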
Re: [Numpy-discussion] numpy.complex
On Monday, July 23, 2012, OC wrote: It's unPythonic just in the sense that it is unlike every other type constructor in Python. int(x) returns an int, list(x) returns a list, but np.complex64(x) sometimes returns a np.complex64, and sometimes it returns a np.ndarray, depending on what 'x' is. This object factory design pattern adds useful and natural functionality. I can see an argument for deprecating this behaviour altogether and referring people to the np.asarray(x, dtype=complex) form; that would be cleaner and reduce confusion. Don't know if it's worth it, but that's the only cleanup that I can see even being considered for these constructors. From my experience in teaching, I can tell that even beginners have no problem with the fact that complex128(1) returns a scalar and that complex128(r_[1]) returns an array. It seems to be pretty natural. Also, from the duck-typing point of view, both returned values are complex, i.e. provide 'real' and 'imag' attributes and 'conjugate()' method. On the contrary a real confusion is with numpy.complex acting differently than the other numpy.complex*. People do write from numpy import * Yeah, that's what I do very often in interactive ipython sessions. Other than this, people are warned often enough that this shouldn't be used in real programs. Don't be so sure of that. The pylab mode from matplotlib has been both a blessing and a curse. This mode is very popular and for many, it is all they need/want to know. While it has made the transition from other languages easier for many, the polluted namespace comes at a small cost. And it is only going to get worse when moving over to py3k where just about everything is a generator. __builtin__.any can handle generators, but np.any does not. Same goes for several other functions. Note, I do agree with you that the discrepancy needs to be fixed, I just am not sure which way. 
Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
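The any/all point is easy to demonstrate with plain Python; the numpy-version-dependent behavior of np.any on generators is left as a comment rather than asserted:

```python
import builtins

data = [1, 2, 3]

# The builtin consumes a generator lazily, element by element:
assert builtins.any(x > 2 for x in data)
assert not builtins.any(x > 5 for x in data)

# After "from numpy import *" (or pylab mode), the name `any` is rebound
# to numpy's any(), which coerces its argument to an array first; a
# generator does not become an element-wise array that way, so code that
# relied on the builtin's semantics can silently misbehave.
```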
Re: [Numpy-discussion] Synonym standards
On Thu, Jul 26, 2012 at 4:45 PM, Colin J. Williams fn...@ncf.ca wrote: It seems that these standards have been adopted, which is good: The following import conventions are used throughout the NumPy source and documentation: import numpy as np import matplotlib as mpl import matplotlib.pyplot as plt Source: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt Is there some similar standard for PyLab? Thanks, Colin W. Colin, Typically, with pylab mode of matplotlib, you do: from pylab import * This is essentially equivalent to: from numpy import * from matplotlib.pyplot import * Note that the pylab module is actually a part of matplotlib and is a shortcut to provide an environment that is very familiar to Matlab users. Converts are then encouraged to use the imports you mentioned in order to properly utilize python namespaces. I hope that helps! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Synonym standards
On Thu, Jul 26, 2012 at 7:12 PM, Robert Kern robert.k...@gmail.com wrote: On Fri, Jul 27, 2012 at 12:05 AM, Colin J. Williams cjwilliam...@gmail.com wrote: On 26/07/2012 4:57 PM, Benjamin Root wrote: On Thu, Jul 26, 2012 at 4:45 PM, Colin J. Williams fn...@ncf.ca wrote: It seems that these standards have been adopted, which is good: The following import conventions are used throughout the NumPy source and documentation: import numpy as np import matplotlib as mpl import matplotlib.pyplot as plt Source: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt Is there some similar standard for PyLab? Thanks, Colin W. Colin, Typically, with pylab mode of matplotlib, you do: from pylab import * This is essentially equivalent to: from numpy import * from matplotlib.pyplot import * Note that the pylab module is actually a part of matplotlib and is a shortcut to provide an environment that is very familiar to Matlab users. Converts are then encouraged to use the imports you mentioned in order to properly utilize python namespaces. I hope that helps! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion Thanks Ben, I would prefer not to use: from xxx import *, because of the name pollution. The name convention that I copied above facilitates avoiding the pollution. In the same spirit, I've used: import pylab as plb But in that same spirit, using np and plt separately is preferred. Namespaces are one honking great idea -- let's do more of those! from http://www.python.org/dev/peps/pep-0020/ Absolutely correct. The namespace pollution is exactly why we encourage converts to move over from the pylab mode to separating out the numpy and pyplot namespaces. There are very subtle issues that arise when doing from pylab import * such as overriding the built-in any and all. 
The only real advantage of the pylab mode over separating out numpy and pyplot is conciseness, which many matlab users expect at first. Cheers! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] bug in numpy.where?
On Thu, Jul 26, 2012 at 2:33 PM, Phil Hodge ho...@stsci.edu wrote:

On a Linux machine:

uname -srvop
Linux 2.6.18-308.8.2.el5 #1 SMP Tue May 29 11:54:17 EDT 2012 x86_64 GNU/Linux

this example shows an apparent problem with the where function:

Python 2.7.1 (r271:86832, Dec 21 2010, 11:19:43)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> print np.__version__
1.5.1
>>> net = np.zeros(3, dtype='f4')
>>> net[1] = 0.00458849
>>> net[2] = 0.605202
>>> max_net = net.max()
>>> test = np.where(net <= 0., max_net, net)
>>> print test
[ -2.23910537e-35   4.58848989e-03   6.05202019e-01]

When I specified the dtype for net as 'f8', test[0] was 3.46244974e+68. It worked as expected (i.e. test[0] should be 0.605202) when I specified float(max_net) as the second argument to np.where.

Phil

Confirmed with version 1.7.0.dev-470c857 on a CentOS6 64-bit machine. Strange indeed. Breaking it down further:

>>> res = (net <= 0.)
>>> print res
[ True False False]
>>> np.where(res, max_net, net)
array([ -2.23910537e-35,   4.58848989e-03,   6.05202019e-01], dtype=float32)

Very strange...

Ben Root
Re: [Numpy-discussion] bug in numpy.where?
On Fri, Jul 27, 2012 at 3:58 PM, Andreas Mueller amuel...@ais.uni-bonn.dewrote: Hi Everybody. The bug is that no error is raised, right? The docs say where(condition, [x, y]) x, y : array_like, optional Values from which to choose. `x` and `y` need to have the same shape as `condition` In the example you gave, x was a scalar. Cheers, Andy Hmm, that is incorrect, I believe. I have used a scalar before. Maybe it works because a scalar is broadcastable to the same shape as any other N-dim array? If so, then the wording of that docstring needs to be fixed. No, I think Christopher hit it on the head. For whatever reason, the endian-ness somewhere is not being respected and causes a byte-swapped version to show up. How that happens, though, is beyond me. Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ANN: NumPy 1.7.0b1 release
On Tue, Aug 21, 2012 at 12:24 PM, Ondřej Čertík ondrej.cer...@gmail.com wrote:

Hi, I'm pleased to announce the availability of the first beta release of NumPy, 1.7.0b1. Sources and binary installers can be found at https://sourceforge.net/projects/numpy/files/NumPy/1.7.0b1/

Please test this release and report any issues on the numpy-discussion mailing list. The following problems are known and we'll work on fixing them before the final release:

http://projects.scipy.org/numpy/ticket/2187
http://projects.scipy.org/numpy/ticket/2185
http://projects.scipy.org/numpy/ticket/2066
http://projects.scipy.org/numpy/ticket/1588
http://projects.scipy.org/numpy/ticket/2076
http://projects.scipy.org/numpy/ticket/2101
http://projects.scipy.org/numpy/ticket/2108
http://projects.scipy.org/numpy/ticket/2150
http://projects.scipy.org/numpy/ticket/2189

I would like to thank Ralf for a lot of help with creating binaries and other help for this release. Cheers, Ondrej

At http://docs.scipy.org/doc/numpy/contents.html, it looks like the TOC tree is a bit messed up. For example, I see that masked arrays are listed multiple times, and I think some of the sub-entries for masked arrays show up multiple times within an entry for masked arrays. Some of the bullets render as the wrong glyph instead of dots. Don't know what version that page is generated from, but we might want to double-check that 1.7.0's docs don't have the same problem.

Cheers!
Ben Root
Re: [Numpy-discussion] broadcasting question
On Thursday, August 30, 2012, Neal Becker wrote: I think this should be simple, but I'm drawing a blank I have 2 2d matrixes Matrix A has indexes (i, symbol) Matrix B has indexes (state, symbol) I combined them into a 3d matrix: C = A[:,newaxis,:] + B[newaxis,:,:] where C has indexes (i, state, symbol) That works fine. Now suppose I want to omit B (for debug), like: C = A[:,newaxis,:] In other words, all I want is to add a dimension into A and force it to broadcast along that axis. How do I do that? np.tile would help you there, I think. Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
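A sketch of the np.tile suggestion, with made-up shapes; the broadcast_to variant arrived in later NumPy versions (1.10+) and produces the same result without copying:

```python
import numpy as np

A = np.arange(6).reshape(2, 3)   # indexes (i, symbol)
nstates = 4

# Insert a state axis of length 1, then repeat A along it:
C = np.tile(A[:, np.newaxis, :], (1, nstates, 1))
assert C.shape == (2, nstates, 3)
assert (C[:, 0, :] == A).all() and (C[:, nstates - 1, :] == A).all()

# Later NumPy versions can build a read-only broadcast view instead:
D = np.broadcast_to(A[:, np.newaxis, :], (2, nstates, 3))
assert (D == C).all()
```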
[Numpy-discussion] numpy.ma.MaskedArray.min() makes a copy?
An issue just reported on the matplotlib-users list involved a user who ran out of memory while attempting to do an imshow() on a large array. While this wouldn't be totally unexpected, the user's traceback shows that they ran out of memory before any actual building of the image occurred. Memory usage sky-rocketed when imshow() attempted to determine the min and max of the image.

The input data was a masked array, and it appears that the implementation of min() for masked arrays goes something like this (paraphrasing here):

obj.filled(inf).min()

The idea is that any masked element is set to the largest possible value for its dtype in a copied array of itself, and then a min() is performed on that copied array. I am assuming that max() does the same thing.

Can this be done differently/more efficiently? If the filled approach has to be done, maybe it would be a good idea to make the copy in chunks instead of all at once? Ideally, it would be nice to avoid the copying altogether and utilize some of the special iterators that Mark Wiebe created last year.

Cheers!
Ben Root
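A sketch of the chunked idea mentioned above — a hypothetical helper, not the actual np.ma implementation — that never fills a full-size copy:

```python
import numpy as np

def masked_min_chunked(marr, chunksize=1 << 16):
    """Min of the unmasked elements, touching only one chunk at a time."""
    data = np.ma.getdata(marr).ravel()
    mask = np.ma.getmaskarray(marr).ravel()
    best = None
    for start in range(0, data.size, chunksize):
        d = data[start:start + chunksize]
        m = mask[start:start + chunksize]
        if not m.all():                      # skip fully masked chunks
            lo = d[~m].min()
            best = lo if best is None else min(best, lo)
    return best                              # None if everything was masked

arr = np.ma.masked_greater(np.array([5.0, -2.0, 7.0, -9.0]), 4.0)
assert masked_min_chunked(arr, chunksize=2) == arr.min() == -9.0
```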
[Numpy-discussion] Regression: in-place operations (possibly intentional)
Consider the following code:

import numpy as np
a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
a *= float(255) / 15

In v1.6.x, this yields: array([17, 34, 51, 68, 85], dtype=int16)

But in master, this throws an exception about failing to cast via same_kind. Note that numpy was smart about this operation before; consider:

a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
a *= float(128) / 256

yields: array([0, 1, 1, 2, 2], dtype=int16)

Of course, this is different than if one does it in a non-in-place manner:

np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5

which yields an array with floating point dtype in both versions.

I can appreciate the arguments for preventing this kind of implicit casting between non-same_kind dtypes, but I argue that because the operation is in-place, then I (as the programmer) am explicitly stating that I desire to utilize the current array to store the results of the operation, dtype and all. Obviously, we can't completely turn off this rule (for example, an in-place addition between an integer array and a datetime64 makes no sense), but surely there is some sort of happy medium that would allow these sorts of operations to take place?

Lastly, if it is determined that it is desirable to allow in-place operations to continue working like they have before, I would like to see such a fix in v1.7, because if it isn't in 1.7, then other libraries (such as matplotlib, where this issue was first found) would have to change their code anyway just to be compatible with numpy.

Cheers!
Ben Root
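For reference, the same_kind rule being discussed can be probed directly with np.can_cast:

```python
import numpy as np

# float64 -> int16 crosses kinds, so 'same_kind' forbids it; this is
# exactly what the in-place multiply now trips over:
assert not np.can_cast(np.float64, np.int16, casting='same_kind')
assert np.can_cast(np.float64, np.int16, casting='unsafe')

# Within a kind, lowering precision is still allowed under 'same_kind':
assert np.can_cast(np.float64, np.float32, casting='same_kind')
```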
Re: [Numpy-discussion] Regression: in-place operations (possibly intentional)
On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant tra...@continuum.iowrote: On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: Consider the following code: import numpy as np a = np.array([1, 2, 3, 4, 5], dtype=np.int16) a *= float(255) / 15 In v1.6.x, this yields: array([17, 34, 51, 68, 85], dtype=int16) But in master, this throws an exception about failing to cast via same_kind. Note that numpy was smart about this operation before, consider: a = np.array([1, 2, 3, 4, 5], dtype=np.int16) a *= float(128) / 256 yields: array([0, 1, 1, 2, 2], dtype=int16) Of course, this is different than if one does it in a non-in-place manner: np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 which yields an array with floating point dtype in both versions. I can appreciate the arguments for preventing this kind of implicit casting between non-same_kind dtypes, but I argue that because the operation is in-place, then I (as the programmer) am explicitly stating that I desire to utilize the current array to store the results of the operation, dtype and all. Obviously, we can't completely turn off this rule (for example, an in-place addition between integer array and a datetime64 makes no sense), but surely there is some sort of happy medium that would allow these sort of operations to take place? Lastly, if it is determined that it is desirable to allow in-place operations to continue working like they have before, I would like to see such a fix in v1.7 because if it isn't in 1.7, then other libraries (such as matplotlib, where this issue was first found) would have to change their code anyway just to be compatible with numpy. I agree that in-place operations should allow different casting rules. There are different opinions on this, of course, but generally this is how NumPy has worked in the past. 
We did decide to change the default casting rule to same_kind but making an exception for in-place seems reasonable. I think that in these cases same_kind will flag what are most likely programming errors and sloppy code. It is easy to be explicit and doing so will make the code more readable because it will be immediately obvious what the multiplicand is without the need to recall what the numpy casting rules are in this exceptional case. IISTR several mentions of this before (Gael?), and in some of those cases it turned out that bugs were being turned up. Catching bugs with minimal effort is a good thing. Chuck True, it is quite likely to be a programming error, but then again, there are many cases where it isn't. Is the problem strictly that we are trying to downcast the float to an int, or is it that we are trying to downcast to a lower precision? Is there a way for one to explicitly relax the same_kind restriction? Thanks, Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
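On the closing question: yes — spelling the in-place operation as an explicit ufunc call lets you pass casting='unsafe', which relaxes the same_kind restriction for just that one operation (a sketch using the example from the original post):

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5], dtype=np.int16)

# Equivalent of `a *= 255.0 / 15`, with the same_kind check explicitly relaxed:
np.multiply(a, 255.0 / 15, out=a, casting='unsafe')

assert a.dtype == np.int16            # still stored in the original array
assert a.tolist() == [17, 34, 51, 68, 85]
```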
Re: [Numpy-discussion] numpy.ma.MaskedArray.min() makes a copy?
On Fri, Sep 7, 2012 at 12:05 PM, Nathaniel Smith n...@pobox.com wrote: On 7 Sep 2012 14:38, Benjamin Root ben.r...@ou.edu wrote: An issue just reported on the matplotlib-users list involved a user who ran out of memory while attempting to do an imshow() on a large array. While this wouldn't be totally unexpected, the user's traceback shows that they ran out of memory before any actual building of the image occurred. Memory usage sky-rocketed when imshow() attempted to determine the min and max of the image. The input data was a masked array, and it appears that the implementation of min() for masked arrays goes something like this (paraphrasing here): obj.filled(inf).min() The idea is that any masked element is set to the largest possible value for its dtype in a copied array of itself, and then a min() is performed on that copied array. I am assuming that max() does the same thing. Can this be done differently/more efficiently? If the filled approach has to be done, maybe it would be a good idea to make the copy in chunks instead of all at once? Ideally, it would be nice to avoid the copying altogether and utilize some of the special iterators that Mark Wiebe created last year. I think what you're looking for is where= support for ufunc.reduce. This isn't implemented yet but at least it's straightforward in principle... otherwise I don't know anything better than reimplementing .min() by hand. -n Yes, it was the where= support that I was thinking of. I take it that it was pulled out of the 1.7 branch with the rest of the NA stuff? Ben Root
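Pending where= support in ufunc.reduce, the chunked-copy idea Ben floats can be sketched as follows. `chunked_masked_min` is a hypothetical helper, not a numpy.ma API; the point is that only one chunk-sized temporary is alive at a time instead of a full-size filled() copy:

```python
import numpy as np
import numpy.ma as ma

def chunked_masked_min(marr, chunk=4096):
    """Minimum of a masked array without filling a full-size copy.

    Hypothetical helper: walks the flattened data in fixed-size chunks,
    so peak extra memory is O(chunk) rather than O(marr.size).
    """
    data = marr.data.ravel()
    mask = ma.getmaskarray(marr).ravel()
    best = None
    for start in range(0, data.size, chunk):
        m = mask[start:start + chunk]
        if m.all():  # chunk is fully masked: nothing to contribute
            continue
        d = data[start:start + chunk][~m]  # small, at most chunk-sized temporary
        cmin = d.min()
        best = cmin if best is None else min(best, cmin)
    return ma.masked if best is None else best

x = ma.masked_greater(np.arange(10), 5)
print(chunked_masked_min(x))  # 0
```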
Re: [Numpy-discussion] Regression: in-place operations (possibly intentional)
On Tue, Sep 18, 2012 at 3:19 PM, Ralf Gommers ralf.gomm...@gmail.comwrote: On Tue, Sep 18, 2012 at 9:13 PM, Benjamin Root ben.r...@ou.edu wrote: On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root ben.r...@ou.edu wrote: On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant tra...@continuum.iowrote: On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: Consider the following code: import numpy as np a = np.array([1, 2, 3, 4, 5], dtype=np.int16) a *= float(255) / 15 In v1.6.x, this yields: array([17, 34, 51, 68, 85], dtype=int16) But in master, this throws an exception about failing to cast via same_kind. Note that numpy was smart about this operation before, consider: a = np.array([1, 2, 3, 4, 5], dtype=np.int16) a *= float(128) / 256 yields: array([0, 1, 1, 2, 2], dtype=int16) Of course, this is different than if one does it in a non-in-place manner: np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 which yields an array with floating point dtype in both versions. I can appreciate the arguments for preventing this kind of implicit casting between non-same_kind dtypes, but I argue that because the operation is in-place, then I (as the programmer) am explicitly stating that I desire to utilize the current array to store the results of the operation, dtype and all. Obviously, we can't completely turn off this rule (for example, an in-place addition between integer array and a datetime64 makes no sense), but surely there is some sort of happy medium that would allow these sort of operations to take place? 
Lastly, if it is determined that it is desirable to allow in-place operations to continue working like they have before, I would like to see such a fix in v1.7 because if it isn't in 1.7, then other libraries (such as matplotlib, where this issue was first found) would have to change their code anyway just to be compatible with numpy. I agree that in-place operations should allow different casting rules. There are different opinions on this, of course, but generally this is how NumPy has worked in the past. We did decide to change the default casting rule to same_kind but making an exception for in-place seems reasonable. I think that in these cases same_kind will flag what are most likely programming errors and sloppy code. It is easy to be explicit and doing so will make the code more readable because it will be immediately obvious what the multiplicand is without the need to recall what the numpy casting rules are in this exceptional case. IISTR several mentions of this before (Gael?), and in some of those cases it turned out that bugs were being turned up. Catching bugs with minimal effort is a good thing. Chuck True, it is quite likely to be a programming error, but then again, there are many cases where it isn't. Is the problem strictly that we are trying to downcast the float to an int, or is it that we are trying to downcast to a lower precision? Is there a way for one to explicitly relax the same_kind restriction? I think the problem is down casting across kinds, with the result that floats are truncated and the imaginary parts of imaginaries might be discarded. That is, the value, not just the precision, of the rhs changes. So I'd favor an explicit cast in code like this, i.e., cast the rhs to an integer. It is true that this forces downstream to code up to a higher standard, but I don't see that as a bad thing, especially if it exposes bugs. And it isn't difficult to fix. 
Chuck Mind you, in my case, casting the rhs as an integer before doing the multiplication would be a bug, since our value for the rhs is usually between zero and one. Multiplying first by the integer numerator before dividing by the integer denominator would likely cause issues with overflowing the 16 bit integer. Then you'd have to do a = np.array([1, 2, 3, 4, 5], dtype=np.int16) np.multiply(a, 0.5, out=a, casting='unsafe') array([0, 1, 1, 2, 2], dtype=int16) Ralf That is exactly what I am looking for! When did the casting kwarg come about? I am unfamiliar with it. Ben Root
Re: [Numpy-discussion] Regression: in-place operations (possibly intentional)
On Tue, Sep 18, 2012 at 3:25 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Tue, Sep 18, 2012 at 1:13 PM, Benjamin Root ben.r...@ou.edu wrote: On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root ben.r...@ou.edu wrote: On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant tra...@continuum.iowrote: On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: Consider the following code: import numpy as np a = np.array([1, 2, 3, 4, 5], dtype=np.int16) a *= float(255) / 15 In v1.6.x, this yields: array([17, 34, 51, 68, 85], dtype=int16) But in master, this throws an exception about failing to cast via same_kind. Note that numpy was smart about this operation before, consider: a = np.array([1, 2, 3, 4, 5], dtype=np.int16) a *= float(128) / 256 yields: array([0, 1, 1, 2, 2], dtype=int16) Of course, this is different than if one does it in a non-in-place manner: np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 which yields an array with floating point dtype in both versions. I can appreciate the arguments for preventing this kind of implicit casting between non-same_kind dtypes, but I argue that because the operation is in-place, then I (as the programmer) am explicitly stating that I desire to utilize the current array to store the results of the operation, dtype and all. Obviously, we can't completely turn off this rule (for example, an in-place addition between integer array and a datetime64 makes no sense), but surely there is some sort of happy medium that would allow these sort of operations to take place? 
Lastly, if it is determined that it is desirable to allow in-place operations to continue working like they have before, I would like to see such a fix in v1.7 because if it isn't in 1.7, then other libraries (such as matplotlib, where this issue was first found) would have to change their code anyway just to be compatible with numpy. I agree that in-place operations should allow different casting rules. There are different opinions on this, of course, but generally this is how NumPy has worked in the past. We did decide to change the default casting rule to same_kind but making an exception for in-place seems reasonable. I think that in these cases same_kind will flag what are most likely programming errors and sloppy code. It is easy to be explicit and doing so will make the code more readable because it will be immediately obvious what the multiplicand is without the need to recall what the numpy casting rules are in this exceptional case. IISTR several mentions of this before (Gael?), and in some of those cases it turned out that bugs were being turned up. Catching bugs with minimal effort is a good thing. Chuck True, it is quite likely to be a programming error, but then again, there are many cases where it isn't. Is the problem strictly that we are trying to downcast the float to an int, or is it that we are trying to downcast to a lower precision? Is there a way for one to explicitly relax the same_kind restriction? I think the problem is down casting across kinds, with the result that floats are truncated and the imaginary parts of imaginaries might be discarded. That is, the value, not just the precision, of the rhs changes. So I'd favor an explicit cast in code like this, i.e., cast the rhs to an integer. It is true that this forces downstream to code up to a higher standard, but I don't see that as a bad thing, especially if it exposes bugs. And it isn't difficult to fix. 
Chuck Mind you, in my case, casting the rhs as an integer before doing the multiplication would be a bug, since our value for the rhs is usually between zero and one. Multiplying first by the integer numerator before dividing by the integer denominator would likely cause issues with overflowing the 16 bit integer. For the case in point I'd do In [1]: a = np.array([1, 2, 3, 4, 5], dtype=np.int16) In [2]: a //= 2 In [3]: a Out[3]: array([0, 1, 1, 2, 2], dtype=int16) Although I expect you would want something different in practice. But the current code already looks fragile to me and I think it is a good thing you are taking a closer look at it. If you really intend going through a float, then it should be something like a = (a*(float(128)/256)).astype(int16) Chuck And thereby losing the memory benefit of an in-place multiplication? That is sort of the point of all this. We are using 16 bit integers because we wanted to be as efficient as possible and didn't need anything larger. Note, that is what we changed the code to, I am just wondering if we are being too cautious. The casting kwarg looks to be what I might want, though it isn't as clean as just writing an *= statement. Ben Root
Re: [Numpy-discussion] Regression: in-place operations (possibly intentional)
On Tue, Sep 18, 2012 at 4:42 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Tue, Sep 18, 2012 at 2:33 PM, Travis Oliphant tra...@continuum.iowrote: On Sep 18, 2012, at 2:44 PM, Charles R Harris wrote: On Tue, Sep 18, 2012 at 1:35 PM, Benjamin Root ben.r...@ou.edu wrote: On Tue, Sep 18, 2012 at 3:25 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Tue, Sep 18, 2012 at 1:13 PM, Benjamin Root ben.r...@ou.edu wrote: On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root ben.r...@ou.eduwrote: On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant tra...@continuum.io wrote: On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: Consider the following code: import numpy as np a = np.array([1, 2, 3, 4, 5], dtype=np.int16) a *= float(255) / 15 In v1.6.x, this yields: array([17, 34, 51, 68, 85], dtype=int16) But in master, this throws an exception about failing to cast via same_kind. Note that numpy was smart about this operation before, consider: a = np.array([1, 2, 3, 4, 5], dtype=np.int16) a *= float(128) / 256 yields: array([0, 1, 1, 2, 2], dtype=int16) Of course, this is different than if one does it in a non-in-place manner: np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 which yields an array with floating point dtype in both versions. I can appreciate the arguments for preventing this kind of implicit casting between non-same_kind dtypes, but I argue that because the operation is in-place, then I (as the programmer) am explicitly stating that I desire to utilize the current array to store the results of the operation, dtype and all. Obviously, we can't completely turn off this rule (for example, an in-place addition between integer array and a datetime64 makes no sense), but surely there is some sort of happy medium that would allow these sort of operations to take place? 
Lastly, if it is determined that it is desirable to allow in-place operations to continue working like they have before, I would like to see such a fix in v1.7 because if it isn't in 1.7, then other libraries (such as matplotlib, where this issue was first found) would have to change their code anyway just to be compatible with numpy. I agree that in-place operations should allow different casting rules. There are different opinions on this, of course, but generally this is how NumPy has worked in the past. We did decide to change the default casting rule to same_kind but making an exception for in-place seems reasonable. I think that in these cases same_kind will flag what are most likely programming errors and sloppy code. It is easy to be explicit and doing so will make the code more readable because it will be immediately obvious what the multiplicand is without the need to recall what the numpy casting rules are in this exceptional case. IISTR several mentions of this before (Gael?), and in some of those cases it turned out that bugs were being turned up. Catching bugs with minimal effort is a good thing. Chuck True, it is quite likely to be a programming error, but then again, there are many cases where it isn't. Is the problem strictly that we are trying to downcast the float to an int, or is it that we are trying to downcast to a lower precision? Is there a way for one to explicitly relax the same_kind restriction? I think the problem is down casting across kinds, with the result that floats are truncated and the imaginary parts of imaginaries might be discarded. That is, the value, not just the precision, of the rhs changes. So I'd favor an explicit cast in code like this, i.e., cast the rhs to an integer. It is true that this forces downstream to code up to a higher standard, but I don't see that as a bad thing, especially if it exposes bugs. And it isn't difficult to fix. 
Chuck Mind you, in my case, casting the rhs as an integer before doing the multiplication would be a bug, since our value for the rhs is usually between zero and one. Multiplying first by the integer numerator before dividing by the integer denominator would likely cause issues with overflowing the 16 bit integer. For the case in point I'd do In [1]: a = np.array([1, 2, 3, 4, 5], dtype=np.int16) In [2]: a //= 2 In [3]: a Out[3]: array([0, 1, 1, 2, 2], dtype=int16) Although I expect you would want something different in practice. But the current code already looks fragile to me and I think it is a good thing you are taking a closer look at it. If you really intend going through a float, then it should be something like a = (a*(float(128)/256)).astype(int16) Chuck And thereby losing the memory benefit of an in-place multiplication? What makes you think you are getting that? I'd have
Re: [Numpy-discussion] specifying numpy as dependency in your project, install_requires
On Fri, Sep 21, 2012 at 4:19 PM, Travis Oliphant tra...@continuum.io wrote: On Sep 21, 2012, at 3:13 PM, Ralf Gommers wrote: Hi, An issue I keep running into is that packages use: install_requires = ["numpy"] or install_requires = ['numpy >= 1.6'] in their setup.py. This simply doesn't work a lot of the time. I actually filed a bug against patsy for that ( https://github.com/pydata/patsy/issues/5), but Nathaniel is right that it would be better to bring it up on this list. The problem is that if you use pip, it doesn't detect numpy (may work better if you had installed numpy with setuptools) and tries to automatically install or upgrade numpy. That won't work if users don't have the right compiler. Just as bad would be that it does work, and the user didn't want to upgrade for whatever reason. This isn't just my problem; at Wes' pandas tutorial at EuroScipy I saw other people have the exact same problem. My recommendation would be to not use install_requires for numpy, but simply do something like this in setup.py: try: import numpy except ImportError: raise ImportError(my_package requires numpy) or try: from numpy.version import short_version as npversion except ImportError: raise ImportError(my_package requires numpy) if npversion < '1.6': raise ImportError(Numpy version is %s; required is version >= 1.6 % npversion) Any objections, better ideas? Is there a good place to put it in the numpy docs somewhere? I agree. I would recommend against using install requires. -Travis Why? I have personally never had an issue with this. The only way I could imagine that this wouldn't work is if numpy was installed via some other means and there wasn't an entry in the easy-install.pth (or whatever equivalent pip uses). If pip is having a problem detecting numpy, then that is a bug that needs fixing somewhere. 
As for packages getting updated unintentionally, easy_install and pip both require an argument to upgrade any existing packages (I think -U), so I am not sure how you are running into such a situation. I have found install_requires to be a powerful feature in my setup.py scripts, and I have seen no reason to discourage it. Perhaps I am the only one? Ben Root
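Ralf's runtime check can be made slightly more robust than a plain string comparison, which would sort '1.10' before '1.6'. A sketch, where `version_tuple` is an illustrative helper and the message strings are placeholders:

```python
# Sketch of the runtime check suggested for setup.py. A plain string
# comparison of version numbers is subtly wrong ('1.10' < '1.6'
# lexicographically), so compare numeric components instead.
def version_tuple(v):
    """'1.6.2' -> (1, 6, 2); a trailing suffix like 'rc1' is ignored."""
    parts = []
    for p in v.split('.'):
        digits = ''
        for c in p:
            if c.isdigit():
                digits += c
            else:
                break  # stop at the first non-numeric character
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

try:
    from numpy.version import short_version as npversion
except ImportError:
    raise ImportError("my_package requires numpy")

if version_tuple(npversion) < (1, 6):
    raise ImportError("numpy version is %s; version >= 1.6 is required"
                      % npversion)
```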
Re: [Numpy-discussion] how to pipe into numpy arrays?
On Wed, Oct 24, 2012 at 3:00 PM, Michael Aye kmichael@gmail.com wrote: As numpy.fromfile seems to require full file object functionalities like seek, I can not use it with the sys.stdin pipe. So how could I stream a binary pipe directly into numpy? I can imagine storing the data in a string and use StringIO but the files are 3.6 GB large, just the binary, and that will most likely be much more as a string object. Reading binary files on disk is NOT the problem, I would like to avoid the temporary file if possible. I haven't tried this myself, but there is a numpy.frombuffer() function as well. Maybe that could be used here? Cheers! Ben Root
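For what it's worth, np.frombuffer() can indeed be combined with chunked reads to stream a non-seekable pipe without ever holding the whole file as one string. A rough sketch: `from_stream` is a hypothetical helper, io.BytesIO stands in for sys.stdin.buffer, and the leftover handling exists because a real pipe may return a partial item:

```python
import io
import numpy as np

def from_stream(stream, dtype, chunk_items=1024):
    """Read a non-seekable binary stream into a 1-D array, chunk by chunk."""
    dtype = np.dtype(dtype)
    chunks = []
    leftover = b""
    while True:
        buf = stream.read(chunk_items * dtype.itemsize)
        if not buf:
            break
        buf = leftover + buf
        # A pipe may deliver a partial item; carry the remainder over.
        usable = len(buf) - (len(buf) % dtype.itemsize)
        chunks.append(np.frombuffer(buf[:usable], dtype=dtype))
        leftover = buf[usable:]
    if not chunks:
        return np.empty(0, dtype=dtype)
    return np.concatenate(chunks)

# Stand-in for sys.stdin.buffer: ten float64 values serialized to bytes.
pipe = io.BytesIO(np.arange(10.0).tobytes())
arr = from_stream(pipe, np.float64, chunk_items=4)
print(arr)  # [0. 1. 2. ... 9.]
```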
[Numpy-discussion] Regression in mpl: AttributeError: incompatible shape for a non-contiguous array
This error started showing up in the test suite for mpl when using numpy master. AttributeError: incompatible shape for a non-contiguous array The tracebacks all point back to various code points where we are trying to set the shape of an array, e.g., offsets.shape = (-1, 2) Those lines haven't changed in a couple of years, and were intended to be done this way to raise an error when reshaping would result in a copy (since we needed to use the original in those places). I don't know how these arrays have become non-contiguous, so I am wondering if there was some sort of attribute that got screwed up somewhere (maybe with views?) Cheers! Ben Root
Re: [Numpy-discussion] Regression in mpl: AttributeError: incompatible shape for a non-contiguous array
On Mon, Oct 29, 2012 at 10:33 AM, Sebastian Berg sebast...@sipsolutions.net wrote: Hey, On Mon, 2012-10-29 at 09:54 -0400, Benjamin Root wrote: This error started showing up in the test suite for mpl when using numpy master. AttributeError: incompatible shape for a non-contiguous array The tracebacks all point back to various code points where we are trying to set the shape of an array, e.g., offsets.shape = (-1, 2) Could you give a hint what this array's history (how it was created) and maybe .shape/.strides is? Sounds like the array is not contiguous when it is expected to be, or the attribute setting itself fails in some corner cases on master? Regards, Sebastian The original reporter of the bug dug into the commit list and suspects it was this one: https://github.com/numpy/numpy/commit/02ebf8b3e7674a6b8a06636feaa6c761fcdf4e2d However, it might be earlier than that (he is currently doing a clean rebuild to make sure). As for the history: offsets = np.asanyarray(offsets) offsets.shape = (-1, 2) # Make it Nx2 Where offsets comes in from (possibly) user-supplied data. Nothing really all that special. I will see if I can get stride information. Ben Root
Re: [Numpy-discussion] Regression in mpl: AttributeError: incompatible shape for a non-contiguous array
On Mon, Oct 29, 2012 at 11:04 AM, Patrick Marsh patrickmars...@gmail.comwrote: Turns out it isn't the commit I thought it was. I'm currently going through a git bisect to track down the actual commit that introduced this bug. I'll post back when I've found it. PTM --- Patrick Marsh Ph.D. Candidate / Liaison to the HWT School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com On Mon, Oct 29, 2012 at 9:43 AM, Benjamin Root ben.r...@ou.edu wrote: On Mon, Oct 29, 2012 at 10:33 AM, Sebastian Berg sebast...@sipsolutions.net wrote: Hey, On Mon, 2012-10-29 at 09:54 -0400, Benjamin Root wrote: This error started showing up in the test suite for mpl when using numpy master. AttributeError: incompatible shape for a non-contiguous array The tracebacks all point back to various code points where we are trying to set the shape of an array, e.g., offsets.shape = (-1, 2) Could you give a hint what these arrays history (how it was created) and maybe .shape/.strides is? Sounds like the array is not contiguous when it is expected to be, or the attribute setting itself fails in some corner cases on master? Regards, Sebastian The original reporter of the bug dug into the commit list and suspects it was this one: https://github.com/numpy/numpy/commit/02ebf8b3e7674a6b8a06636feaa6c761fcdf4e2d However, it might be earlier than that (he is currently doing a clean rebuild to make sure). As for the history: offsets = np.asanyarray(offsets) offsets.shape = (-1, 2) # Make it Nx2 Where offsets comes in from (possibly) user-supplied data. Nothing really all that special. I will see if I can get stride information. Ben Root Further digging reveals that the code fails when the array is originally 1-D. I had an array with shape (2,) and stride (8,). The reshaping should result in a shape of (1, 2). 
Ben Root
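The distinction at issue, that assigning to .shape must never copy while reshape() silently will, can be demonstrated directly (the exact AttributeError wording varies across NumPy versions):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
v = a.T  # a transposed view: cannot be flattened in place without a copy
try:
    v.shape = (6,)  # in-place reshape would require a copy, so it raises
except AttributeError as exc:
    print("refused:", exc)

flat = v.reshape(6)  # reshape() makes the copy silently instead
print(flat)          # [0 3 1 4 2 5]

# The mpl idiom from the bug report is fine on a contiguous 1-D array:
offsets = np.asanyarray([1.0, 2.0])
offsets.shape = (-1, 2)  # Make it Nx2
print(offsets.shape)     # (1, 2)
```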
Re: [Numpy-discussion] Simple question about scatter plot graph
On Wednesday, October 31, 2012, wrote: On Wed, Oct 31, 2012 at 8:59 PM, klo uo klo...@gmail.com wrote: Thanks for your reply I suppose, variable length signals are split on equal parts and dominant harmonic is extracted. Then scatter plot shows this pattern, which has some low correlation, but I can't abstract what could be concluded from grid pattern, as I lack statistical knowledge. Maybe it's saying that data is quantized, which can't be easily seen from single sample bar chart, but perhaps scatter plot suggests that? That's only my wild guess http://pandasplotting.blogspot.ca/2012/06/lag-plot.html In general you would see a lag autocorrelation structure in the plot. My guess is that even if there is a pattern in your data we might not see it because we don't see plots that are plotted on top of each other. We only see the support of the y_t, y_{t+1} transition (points that are at least once in the sample), but not the frequencies (or conditional distribution). If that's the case, then reduce alpha level so many points on top of each other are darker, or colorcode the histogram for each y_t: bincount for each y_t and normalize, or use np.histogram directly for each y_t, then assign to each point a colorscale depending on its frequency. Did you calculate the correlation? (But maybe linear correlation won't show much.) Josef The answer is hexbin() in matplotlib when you have many points lying on or near each other. Cheers! Ben Root
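Staying within numpy, the frequency information hidden by overplotting can also be recovered with np.histogram2d before reaching for hexbin(). A small sketch on synthetic quantized data, where coincident (y_t, y_{t+1}) pairs accumulate weight instead of hiding each other:

```python
import numpy as np

rng = np.random.default_rng(0)
# Quantized data: many (y_t, y_{t+1}) pairs land on exactly the same spot.
y = rng.integers(0, 5, size=1000)
y_t, y_t1 = y[:-1], y[1:]

# Count how often each lag-1 transition occurs.
counts, xedges, yedges = np.histogram2d(y_t, y_t1, bins=5)
print(counts)        # 5x5 table of transition frequencies
print(counts.sum())  # equals the number of (y_t, y_{t+1}) pairs: 999
```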
Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?
On Monday, November 12, 2012, Olivier Delalleau wrote: 2012/11/12 Nathaniel Smith n...@pobox.com On Mon, Nov 12, 2012 at 8:54 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, I wanted to check that everyone knows about and is happy with the scalar casting changes from 1.6.0. Specifically, the rules for (array, scalar) casting have changed such that the resulting dtype depends on the _value_ of the scalar. Mark W has documented these changes here: http://docs.scipy.org/doc/numpy/reference/ufuncs.html#casting-rules http://docs.scipy.org/doc/numpy/reference/generated/numpy.result_type.html http://docs.scipy.org/doc/numpy/reference/generated/numpy.promote_types.html Specifically, as of 1.6.0: In [19]: arr = np.array([1.], dtype=np.float32) In [20]: (arr + (2**16-1)).dtype Out[20]: dtype('float32') In [21]: (arr + (2**16)).dtype Out[21]: dtype('float64') In [25]: arr = np.array([1.], dtype=np.int8) In [26]: (arr + 127).dtype Out[26]: dtype('int8') In [27]: (arr + 128).dtype Out[27]: dtype('int16') There's discussion about the changes here: http://mail.scipy.org/pipermail/numpy-discussion/2011-September/058563.html http://mail.scipy.org/pipermail/numpy-discussion/2011-March/055156.html http://mail.scipy.org/pipermail/numpy-discussion/2012-February/060381.html It seems to me that this change is hard to explain, and does what you want only some of the time, making it a false friend. The old behaviour was that in these cases, the scalar was always cast to the type of the array, right? So np.array([1], dtype=np.int8) + 256 returned 1? Is that the behaviour you prefer? I agree that the 1.6 behaviour is surprising and somewhat inconsistent. There are many places where you can get an overflow in numpy, and in all the other cases we just let the overflow happen. And in fact you can still get an overflow with arr + scalar operations, so this doesn't really fix anything. 
I find the specific handling of unsigned -> signed and float32 -> float64 upcasting confusing as well. (Sure, 2**16 isn't exactly representable as a float32, but it doesn't *overflow*, it just gives you 2.0**16... if I'm using float32 then I presumably don't care that much about exact representability, so it's surprising that numpy is working to enforce it, and definitely a separate decision from what to do about overflow.) None of those threads seem to really get into the question of what the best behaviour here *is*, though. Possibly the most defensible choice is to treat ufunc(arr, scalar) operations as performing an implicit cast of the scalar to arr's dtype, and using the standard implicit casting rules -- which I think means, raising an error if !can_cast(scalar, arr.dtype, casting='safe') I like this suggestion. It may break some existing code, but I think it'd be for the best. The current behavior can be very confusing. -=- Olivier break some existing code I really should set up an email filter for this phrase and have it send back an email automatically: Are you nuts?! We just resolved an issue where the safe casting rule unexpectedly broke existing code with regards to unplaced operations. The solution was to warn about the change in the upcoming release and to throw errors in a later release. Playing around with fundamental things like this needs to be done methodically and carefully. Cheers! Ben Root
Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?
On Monday, November 12, 2012, Benjamin Root wrote: [quoted text from the previous message snipped] ... We just resolved an issue where the safe casting rule unexpectedly broke existing code with regards to unplaced operations. ... Cheers! Ben Root Stupid autocorrect: unplaced -- inplace
Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?
On Monday, November 12, 2012, Matthew Brett wrote: Hi, On Mon, Nov 12, 2012 at 8:15 PM, Benjamin Root ben.r...@ou.edu wrote: On Monday, November 12, 2012, Olivier Delalleau wrote: 2012/11/12 Nathaniel Smith n...@pobox.com On Mon, Nov 12, 2012 at 8:54 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, I wanted to check that everyone knows about and is happy with the scalar casting changes from 1.6.0. Specifically, the rules for (array, scalar) casting have changed such that the resulting dtype depends on the _value_ of the scalar. Mark W has documented these changes here: http://docs.scipy.org/doc/numpy/reference/ufuncs.html#casting-rules http://docs.scipy.org/doc/numpy/reference/generated/numpy.result_type.html http://docs.scipy.org/doc/numpy/reference/generated/numpy.promote_types.html Specifically, as of 1.6.0: In [19]: arr = np.array([1.], dtype=np.float32) In [20]: (arr + (2**16-1)).dtype Out[20]: dtype('float32') In [21]: (arr + (2**16)).dtype Out[21]: dtype('float64') In [25]: arr = np.array([1.], dtype=np.int8) In [26]: (arr + 127).dtype Out[26]: dtype('int8') In [27]: (arr + 128).dtype Out[27]: dtype('int16') There's discussion about the changes here: http://mail.scipy.org/pipermail/numpy-discussion/2011-September/058563.html http://mail.scipy.org/pipermail/numpy-discussion/2011-March/055156.html http://mail.scipy.org/pipermail/numpy-discussion/2012-February/060381.html It seems to me that this change is hard to explain, and does what you want only some of the time, making it a false friend. The old behaviour was that in these cases, the scalar was always cast to the type of the array, right? So np.array([1], dtype=np.int8) + 256 returned 1? Is that the behaviour you prefer? I agree that the 1.6 behaviour is surprising and somewhat inconsistent. There are many places where you can get an overflow in numpy, and in all the other cases we just let the overflow happen. 
And in fact you can still get an overflow with arr + scalar operations, so this doesn't really fix anything. I find the specific handling of unsigned -> signed and float32 -> float64 upcasting confusing as well. (Sure, 2**16 isn't exactly representable as a float32, but it doesn't *overflow*, it just gives you 2.0**16... if I'm using float32 then I presumably don't care that much about exact representability, so it's surprising that numpy is working to enforce it, and definitely a separate decision from what to do about overflow.) None of those threads seem to really get into the question of what the best behaviour here *is*, though. Possibly the mo[...] Well, hold on though, I was asking earlier in the thread what we thought the behavior should be in 2.0 or maybe better put, sometime in the future. If we know what we think the best answer is, and we think the best answer is worth shooting for, then we can try to think of sensible ways of getting there. I guess that's what Nathaniel and Olivier were thinking of but they can correct me if I'm wrong... Cheers, Matthew I am fine with migrating to better solutions (I have yet to decide on this current situation, though), but whatever change is adopted must go through a deprecation process, which was my point. Outright breaking of code as a first step is the wrong choice, and I was merely nipping it in the bud. Cheers! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] the fast way to loop over ndarray elements?
On Saturday, November 17, 2012, Chao YUE wrote: Dear all, I need to make a linear contrast of the 2D numpy array data from an interval to another, the approach is: I have another two lists: base & target, then I check for each ndarray element data[i,j], if base[m] <= data[i,j] <= base[m+1], then it will be linearly converted to be in the interval of (target[m], target[m+1]), using another function called lintrans. #The way I do is to loop each row and column of the 2D array, and finally loop the intervals constituted by base list: for row in range(data.shape[0]): for col in range(data.shape[1]): for i in range(len(base)-1): if data[row,col]>=base[i] and data[row,col]<=base[i+1]: data[row,col]=lintrans(data[row,col],(base[i],base[i+1]),(target[i],target[i+1])) break #use break to jump out of loop as the data have to be ONLY transferred ONCE. Now the profiling result shows that most of the time has been used in this loop over the array (plot_array_transg), and less time in calling the linear transformation function lintrans: ncalls tottime percall cumtime percall filename:lineno(function) 18047 0.110 0.000 0.110 0.000 mathex.py:132(lintrans) 1 12.495 12.495 19.061 19.061 mathex.py:196(plot_array_transg) so is there any way I can speed up this loop? Thanks for any suggestions!! best, Chao If the values in base are ascending, you can use searchsorted() to find out where values from data can be placed into base while maintaining order. Don't know if it is faster, but it would certainly be easier to read. Cheers! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
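Ben's searchsorted() suggestion can be taken one step further: when base is ascending and the mapping is piecewise linear, np.interp performs the whole conversion in one vectorized call, with no Python loops at all. A sketch with hypothetical stand-ins for Chao's base/target/data:

```python
import numpy as np

# Hypothetical stand-ins for the variables in the question; 'base' must
# be ascending for both searchsorted() and interp() to apply.
base = np.array([0.0, 1.0, 2.0, 4.0])
target = np.array([0.0, 10.0, 20.0, 40.0])
data = np.array([[0.5, 1.5],
                 [3.0, 2.0]])

# searchsorted() locates the interval an element falls into...
idx = np.searchsorted(base, 1.5)   # 1.5 lies between base[1] and base[2]

# ...but np.interp already does the full piecewise-linear mapping from
# the 'base' breakpoints onto the 'target' breakpoints, element-wise:
out = np.interp(data, base, target)
```

With these stand-in breakpoints the mapping is just multiplication by 10, so `out` is `[[5., 15.], [30., 20.]]`; any monotonic base/target pair works the same way.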
Re: [Numpy-discussion] float32 to float64 casting
On Saturday, November 17, 2012, Charles R Harris wrote: On Sat, Nov 17, 2012 at 1:00 PM, Olivier Delalleau sh...@keba.be wrote: 2012/11/17 Gökhan Sever gokhanse...@gmail.com On Sat, Nov 17, 2012 at 9:47 AM, Nathaniel Smith n...@pobox.com wrote: On Fri, Nov 16, 2012 at 9:53 PM, Gökhan Sever gokhanse...@gmail.com wrote: Thanks for the explanations. For either case, I was expecting to get float32 as a resulting data type. Since, float32 is large enough to contain the result. I am wondering if changing casting rule this way, requires a lot of modification in the NumPy code. Maybe as an alternative to the current casting mechanism? I like the way that NumPy can convert to float64. As if these data-types are continuation of each other. But just the conversion might happen too early --at least in my opinion, as demonstrated in my example. For instance comparing this example to IDL surprises me: I16 np.float32()*5e38 O16 2.77749998e+42 I17 (np.float32()*5e38).dtype O17 dtype('float64') In this case, what's going on is that 5e38 is a Python float object, and Python float objects have double-precision, i.e., they're equivalent to np.float64's. So you're multiplying a float32 and a float64. I think most people will agree that in this situation it's better to use float64 for the output? -n ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion OK, I see your point. Python numeric data objects and NumPy data objects mixed operations require more attention. The following causes float32 overflow --rather than casting to float64 as in the case for Python float multiplication, and behaves like in IDL.
I3 (np.float32()*np.float32(5e38)) O3 inf However, these two still surprises me: I5 (np.float32()*1).dtype O5 dtype('float64') I6 (np.float32()*np.int32(1)).dtype O6 dtype('float64') That's because the current way of finding out the result's dtype is based on input dtypes only (not on numeric values), and numpy.can_cast('int32', 'float32') is False, while numpy.can_cast('int32', 'float64') is True (and same for int64). Thus it decides to cast to float64. It might be nice to revisit all the casting rules at some point, but current experience suggests that any changes will lead to cries of pain and outrage ;) Chuck Can we at least put these examples into the tests? Also, I think the bigger issue was that, unlike deprecation of a function, it is much harder to grep for particular operations, especially in a dynamic language like python. What were intended as minor bugfixes ended up becoming much larger. Has the casting table been added to the tests? I think that will bring much more confidence and assurances for future changes going forward. Cheers! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
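The dtype-only rules Olivier describes can be written as direct checks; the float32 operands below are placeholder values, since the original session elided them:

```python
import numpy as np

# int32 does not safely cast to float32, but does to float64 -- which is
# why the int-times-float32 results above come back as float64:
assert not np.can_cast(np.int32, np.float32)
assert np.can_cast(np.int32, np.float64)

# Two float32 scalars stay float32, so a large product overflows to inf
# instead of upcasting (placeholder values):
with np.errstate(over='ignore'):
    prod = np.float32(1e38) * np.float32(5e38)
assert prod.dtype == np.float32
assert np.isinf(prod)
```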
Re: [Numpy-discussion] Allowing 0-d arrays in np.take
On Tue, Dec 4, 2012 at 8:57 AM, Sebastian Berg sebast...@sipsolutions.netwrote: Hey, Maybe someone has an opinion about this (since in fact it is new behavior, so it is undefined). `np.take` used to not allow 0-d/scalar input but did allow any other dimensions for the indices. Thinking about changing this, meaning that: np.take(np.arange(5), 0) works. I was wondering if anyone has feelings about whether this should return a scalar or a 0-d array. Typically numpy prefers scalars for these cases (indexing would return a scalar too) for good reasons, so I guess that is correct. But since I noticed this wondering if maybe it returns a 0-d array, I thought I would ask here. Regards, Sebastian At first, I was thinking that the output type should be based on what the input type is. So, if a scalar index was used, then a scalar value should be returned. But this wouldn't be true if the array had other dimensions. So, perhaps it should always be an array. The only other option is to mimic the behavior of the array indexing, which wouldn't be a bad choice. Cheers! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
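For reference, the behavior current NumPy releases settled on matches the scalar preference Sebastian describes, while still mirroring array indexing for non-scalar index input:

```python
import numpy as np

a = np.arange(5) * 10

# A scalar index gives a 0-d (scalar) result, mirroring plain indexing...
s = np.take(a, 0)
assert np.ndim(s) == 0 and s == 0
assert a[0] == 0

# ...while a list/array of indices preserves the index shape:
assert np.take(a, [1, 3]).tolist() == [10, 30]
assert np.take(a, [[1], [3]]).shape == (2, 1)
```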
Re: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8
As a point of reference, python 2.4 is on RH5/CentOS5. While RH6 is the current version, there are still enterprises that are using version 5. Of course, at this point, one really should be working on a migration plan and shouldn't be doing new development on those machines... Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Support for python 2.4 dropped. Should we drop 2.5 also?
On Thu, Dec 13, 2012 at 12:38 PM, Charles R Harris charlesr.har...@gmail.com wrote: The previous proposal to drop python 2.4 support garnered no opposition. How about dropping support for python 2.5 also? Chuck matplotlib 1.2 supports py2.5. I haven't seen any plan to move off of that for 1.3. Is there a compelling reason for dropping 2.5? Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Support for python 2.4 dropped. Should we drop 2.5 also?
My apologies... we support 2.6 and above. +1 on dropping 2.5 support. Ben On Thu, Dec 13, 2012 at 1:12 PM, Benjamin Root ben.r...@ou.edu wrote: On Thu, Dec 13, 2012 at 12:38 PM, Charles R Harris charlesr.har...@gmail.com wrote: The previous proposal to drop python 2.4 support garnered no opposition. How about dropping support for python 2.5 also? Chuck matplotlib 1.2 supports py2.5. I haven't seen any plan to move off of that for 1.3. Is there a compelling reason for dropping 2.5? Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Insights / lessons learned from NumPy design
On Wed, Jan 9, 2013 at 9:58 AM, Nathaniel Smith n...@pobox.com wrote: On Wed, Jan 9, 2013 at 2:53 PM, Alan G Isaac alan.is...@gmail.com wrote: I'm just a Python+NumPy user and not a CS type. May I ask a naive question on this thread? Given the work that has (as I understand it) gone into making NumPy usable as a C library, why is the discussion not going in a direction like the following: What changes to the NumPy code base would be required for it to provide useful ndarray functionality in a C extension to Clojure? Is this simply incompatible with the goal that Clojure compile to JVM byte code? IIUC that work was done on a fork of numpy which has since been abandoned by its authors, so... yeah, numpy itself doesn't have much to offer in this area right now. It could in principle with a bunch of refactoring (ideally not on a fork, since we saw how well that went), but I don't think most happy current numpy users are wishing they could switch to writing Lisp on the JVM or vice-versa, so I don't think it's surprising that no-one's jumped up to do this work. If I could just point out that the attempt to fork numpy for the .NET work was done back in the subversion days, and there was little-to-no effort to incrementally merge back changes to master, and vice-versa. With git as our repository now, such work may be more feasible. Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] New numpy functions: filled, filled_like
On Mon, Jan 14, 2013 at 7:38 AM, Pierre Haessig pierre.haes...@crans.orgwrote: Hi, Le 14/01/2013 00:39, Nathaniel Smith a écrit : (The nice thing about np.filled() is that it makes np.zeros() and np.ones() feel like clutter, rather than the reverse... not that I'm suggesting ever getting rid of them, but it makes the API conceptually feel smaller, not larger.) Coming from the Matlab syntax, I feel that np.zeros and np.ones are in numpy for Matlab (and maybe others ?) compatibilty and are useful for that. Now that I've been enlightened by Python, I think that those functions (especially np.ones) are indeed clutter. Therefore I favor the introduction of these two new functions. However, I think Eric's remark about masked array API compatibility is important. I don't know what other names are possible ? np.const ? Or maybe np.tile is also useful for that same purpose ? In that case adding a dtype argument to np.tile would be useful. best, Pierre I am also +1 on the idea of having a filled() and filled_like() function (I learned a long time ago to just do a = np.empty() and a.fill() rather than the multiplication trick I learned from Matlab). However, the collision with the masked array API is a non-starter for me. np.const() and np.const_like() probably make the most sense, but I would prefer a verb over a noun. Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] New numpy functions: filled, filled_like
On Mon, Jan 14, 2013 at 12:27 PM, Eric Firing efir...@hawaii.edu wrote: On 2013/01/14 6:15 AM, Olivier Delalleau wrote: - I agree the name collision with np.ma.filled is a problem. I have no better suggestion though at this point. How about initialized()? A verb! +1 from me! For those wondering, I have a personal rule that because functions *do* something, they really should have verbs for their names. I have to learn to read functions like ones and empty like give me ones or give me an empty array. Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] New numpy functions: filled, filled_like
On Mon, Jan 14, 2013 at 1:56 PM, David Warde-Farley d.warde.far...@gmail.com wrote: On Mon, Jan 14, 2013 at 1:12 PM, Pierre Haessig pierre.haes...@crans.org wrote: In [8]: tile(nan, (3,3)) # (it's a verb ! ) tile, in my opinion, is useful in some cases (for people who think in terms of repmat()) but not very NumPy-ish. What I'd like is a function that takes - an initial array_like a - a shape s - optionally, a dtype (otherwise inherit from a) and broadcasts a to the shape s. In the case of scalars this is just a fill. In the case of, say, a (5,) vector and a (10, 5) shape, this broadcasts across rows, etc. I don't think it's worth special-casing scalar fills (except perhaps as an implementation detail) when you have rich broadcasting semantics that are already a fundamental part of NumPy, allowing for a much handier primitive. I have similar problems with tile. I learned it for a particular use in numpy, and it would be hard for me to see it for another (contextually) different use. I do like the way you are thinking in terms of the broadcasting semantics, but I wonder if that is a bit awkward. What I mean is, if one were to use broadcasting semantics for creating an array, wouldn't one have just simply used broadcasting anyway? The point of broadcasting is to _avoid_ the creation of unneeded arrays. But maybe I can be convinced with some examples. Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
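For what it's worth, the broadcast-an-array-to-a-shape primitive David sketches later appeared in NumPy (1.10) as np.broadcast_to, which returns a read-only view rather than allocating a filled array:

```python
import numpy as np

# Broadcast an initial array to a target shape, without copying:
row = np.arange(5)
b = np.broadcast_to(row, (10, 5))   # read-only view; every row is 'row'
assert b.shape == (10, 5)
assert (b[3] == row).all()

# The scalar case degenerates to a constant "fill":
f = np.broadcast_to(np.nan, (3, 3))
assert f.shape == (3, 3) and np.isnan(f).all()
```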
Re: [Numpy-discussion] Shouldn't all in-place operations simply return self?
On Thu, Jan 17, 2013 at 8:54 AM, Jim Vickroy jim.vick...@noaa.gov wrote: On 1/16/2013 11:41 PM, Nathaniel Smith wrote: On 16 Jan 2013 17:54, josef.p...@gmail.com wrote: a = np.random.random_integers(0, 5, size=5) b = a.sort() b a array([0, 1, 2, 5, 5]) b = np.random.shuffle(a) b b = np.random.permutation(a) b array([0, 5, 5, 2, 1]) How do I remember if shuffle shuffles or permutes ? Do we have a list of functions that are inplace? I rather like the convention used elsewhere in Python of naming in-place operations with present tense imperative verbs, and out-of-place operations with past participles. So you have sort/sorted, reverse/reversed, etc. Here this would suggest we name these two operations as either shuffle() and shuffled(), or permute() and permuted(). I like this (tense) suggestion. It seems easy to remember. --jv And another score for functions as verbs! :-P Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
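The sort/sorted tense convention translates directly to the NumPy pairs under discussion; a quick illustration of which ones mutate and which return copies:

```python
import numpy as np

a = np.array([3, 1, 2])

# Past participle / function -> out of place: a new array comes back.
assert np.sort(a).tolist() == [1, 2, 3]
assert a.tolist() == [3, 1, 2]          # original untouched
p = np.random.permutation(a)            # also out of place
assert p is not a and a.tolist() == [3, 1, 2]

# Imperative verb / method -> in place: mutates and returns None,
# exactly like list.sort().
assert a.sort() is None
assert a.tolist() == [1, 2, 3]
assert np.random.shuffle(a) is None
```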
Re: [Numpy-discussion] New numpy functions: filled, filled_like
On Thu, Jan 17, 2013 at 5:04 PM, Eric Firing efir...@hawaii.edu wrote: On 2013/01/17 4:13 AM, Pierre Haessig wrote: Hi, Le 14/01/2013 20:05, Benjamin Root a écrit : I do like the way you are thinking in terms of the broadcasting semantics, but I wonder if that is a bit awkward. What I mean is, if one were to use broadcasting semantics for creating an array, wouldn't one have just simply used broadcasting anyway? The point of broadcasting is to _avoid_ the creation of unneeded arrays. But maybe I can be convinced with some examples. I feel that one of the point of the discussion is : although a new (or not so new...) function to create a filled array would be more elegant than the existing pair of functions np.zeros and np.ones, there are maybe not so many usecases for filled arrays *other than zeros values*. I can remember having initialized a non-zero array *some months ago*. For the anecdote it was a vector of discretized vehicule speed values which I wanted to be initialized with a predefined mean speed value prior to some optimization. In that usecase, I really didn't care about the performance of this initialization step. So my overall feeling after this thread is - *yes* a single dedicated fill/init/someverb function would give a slightly better API, - but *no* it's not important because np.empty and np.zeros covers 95 % usecases ! I agree with your summary and conclusion. Eric Can we at least have a np.nans() and np.infs() functions? This should cover an additional 4% of use-cases. Ben Root P.S. - I know they aren't verbs... ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
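As a historical footnote for readers of the archive: the function discussed in this thread eventually landed as np.full / np.full_like in NumPy 1.8, which also covers the np.nans()/np.infs() requests:

```python
import numpy as np

a = np.full((2, 3), np.nan)
assert a.shape == (2, 3) and np.isnan(a).all()

b = np.full_like(a, np.inf)
assert np.isinf(b).all()

# Equivalent to the empty-then-fill idiom Ben mentions earlier:
c = np.empty((2, 3))
c.fill(np.nan)
assert np.isnan(c).all()
```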
Re: [Numpy-discussion] New numpy functions: filled, filled_like
On Fri, Jan 18, 2013 at 3:44 AM, Daniele Nicolodi dani...@grinta.net wrote: On 17/01/2013 23:27, Mark Wiebe wrote: Would it be too weird or clumsy to extend the empty and empty_like functions to do the filling? np.empty((10, 10), fill=np.nan) np.empty_like(my_arr, fill=np.nan) Wouldn't it be more natural to extend the ndarray constructor? np.ndarray((10, 10), fill=np.nan) It looks more natural to me. In this way it is not possible to have the _like extension, but I don't see it as a major drawback. Cheers, Daniele This isn't a bad idea. Although, I would wager that most people, like myself, use np.array() and np.array_like() instead of np.ndarray(). We should also double-check and see how well that would fit in with the other constructors like masked arrays and matrix objects. Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] New numpy functions: filled, filled_like
On Fri, Jan 18, 2013 at 11:36 AM, Daniele Nicolodi dani...@grinta.netwrote: On 18/01/2013 15:19, Benjamin Root wrote: On Fri, Jan 18, 2013 at 3:44 AM, Daniele Nicolodi dani...@grinta.net mailto:dani...@grinta.net wrote: On 17/01/2013 23:27, Mark Wiebe wrote: Would it be too weird or clumsy to extend the empty and empty_like functions to do the filling? np.empty((10, 10), fill=np.nan) np.empty_like(my_arr, fill=np.nan) Wouldn't it be more natural to extend the ndarray constructor? np.ndarray((10, 10), fill=np.nan) It looks more natural to me. In this way it is not possible to have the _like extension, but I don't see it as a major drawback. Cheers, Daniele This isn't a bad idea. Although, I would wager that most people, like myself, use np.array() and np.array_like() instead of np.ndarray(). We should also double-check and see how well that would fit in with the other contructors like masked arrays and matrix objects. Hello Ben, I don't really get what you mean with this. np.array() construct a numpy array from an array-like object, np.ndarray() accepts a dimensions tuple as first parameter, I don't see any np.array_like in the current numpy release. Cheers, Daniele My bad, I had a brain-fart and got mixed up. I was thinking of np.empty(). In fact, I never use np.ndarray(), I use np.empty(). Besides np.ndarray() being the actual constructor, what is the difference between them? Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
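To answer the closing question concretely: both allocate without initializing the contents; np.empty is the documented high-level interface, while np.ndarray is the low-level constructor (also usable for advanced buffer/strides construction). A quick check:

```python
import numpy as np

a = np.ndarray((2, 2))   # low-level constructor, rarely called directly
b = np.empty((2, 2))     # recommended spelling for the same allocation

assert a.shape == b.shape == (2, 2)
assert a.dtype == b.dtype == np.float64   # same default dtype
```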
Re: [Numpy-discussion] np.where: x and y need to have the same shape as condition ?
On Tue, Jan 29, 2013 at 6:16 AM, denis denis-bz...@t-online.de wrote: Folks, the doc for `where` says x and y need to have the same shape as condition http://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.where.html But surely where is equivalent to: [xv if c else yv for (c,xv,yv) in zip(condition,x,y)] holds as long as len(condition) == len(x) == len(y) ? And `condition` can be broadcast ? n = 3 all01 = np.array([ t for t in np.ndindex( n * (2,) )]) # 000 001 ... x = np.zeros(n) y = np.ones(n) w = np.where( all01, y, x ) # 2^n x n Can anyone please help me understand `where` / extend where is equivalent to ... ? Thanks, cheers -- denis Do keep in mind the difference between len() and shape (they aren't the same for 2 and greater dimension arrays). But, ultimately, yes, the arrays have to have the same shape, or use scalars. I haven't checked broadcast-ability though. Perhaps a note should be added into the documentation to explicitly say whether the arrays can be broadcastable. Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
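To settle the broadcast-ability question raised above: np.where does broadcast all three arguments against each other, as a quick experiment shows:

```python
import numpy as np

cond = np.array([[True], [False]])   # shape (2, 1)
x = np.zeros(3)                      # shape (3,)
y = np.ones(3)                       # shape (3,)

out = np.where(cond, x, y)           # all three broadcast to (2, 3)
assert out.shape == (2, 3)
assert out.tolist() == [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
```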
Re: [Numpy-discussion] Issues to fix for 1.7.0rc2.
On Wed, Feb 6, 2013 at 4:18 AM, Dag Sverre Seljebotn d.s.seljeb...@astro.uio.no wrote: On 02/06/2013 08:41 AM, Charles R Harris wrote: On Tue, Feb 5, 2013 at 11:50 PM, Jason Grout jason-s...@creativetrax.com mailto:jason-s...@creativetrax.com wrote: On 2/6/13 12:46 AM, Charles R Harris wrote: if we decide to do so I should mention that we don't really depend on either behavior (we probably should have a better doctest testing for an array of None values anyway), but we noticed the oddity and thought we ought to mention it. So it doesn't matter to us which way the decision goes. More Python craziness In [6]: print None or 0 0 In [7]: print 0 or None None To me this seems natural and is just how Python works? I think the rule for or is simply evaluate __nonzero__ of left operand, if it is False, return right operand. The reason is so that you can use it like this: x = get_foo() or get_bar() # if get_foo() returns None # use result of get_bar or def f(x=None): x = x or create_default_x() ... And what if the user passes in a zero or an empty string or an empty list, or if the return value from get_foo() is a perfectly valid zero? This is one of the very few things I have disagreed with PEP8, and Python in general about. I can understand implicit casting of numbers to booleans in order to attract the C/C++ crowd (but I don't have to like it), but what was so hard about x is not None or len(x) == 0? I like my languages explicit. Less magic, more WYSIWYM. Cheers! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
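The trap Ben is pointing at is easy to demonstrate: `or` substitutes the default for *any* falsy value, not just None:

```python
def f_truthy(x=None):
    # replaces the argument whenever it is falsy
    return x or [1, 2, 3]

def f_explicit(x=None):
    # replaces the argument only when it is actually None
    return [1, 2, 3] if x is None else x

assert f_truthy([]) == [1, 2, 3]     # caller's empty list silently replaced
assert f_truthy(0) == [1, 2, 3]      # a perfectly valid zero is lost too
assert f_explicit([]) == []          # preserved, as the caller intended
assert f_explicit(None) == [1, 2, 3]
```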
Re: [Numpy-discussion] Dealing with the mode argument in qr.
On Tue, Feb 5, 2013 at 4:23 PM, Charles R Harris charlesr.har...@gmail.com wrote: Hi All, This post is to bring the discussion of PR #2965 (https://github.com/numpy/numpy/pull/2965) to the attention of the list. There are at least three issues in play here. 1) The PR adds modes 'big' and 'thin' to the current modes 'full', 'r', 'economic' for qr factorization. The problem is that the current 'full' is actually 'thin' and 'big' should be 'full'. The solution here was to raise a FutureWarning on use of 'full', alias it to 'thin' for the time being, and at some distant time change 'full' to alias 'big'. 2) The 'economic' mode serves little purpose. I propose to deprecate it and add a 'qrf' mode instead, corresponding to scipy's 'raw' mode. We can't use 'raw' itself as traditionally the mode may be specified using the first letter only and that leads to a conflict with 'r'. 3) As suggested in 2, the use of single letter abbreviations can constrain the options in choosing mode names and they are not as informative as the full name. A possibility here is to deprecate the use of the abbreviations in favor of the full names. A longer term problem is the divergence between the numpy and scipy versions of qr. The divergence is enough that I don't see any easy way to come to a common interface, but that is something that would be desirable if possible. Thoughts? Chuck I would definitely be in favor of deprecating abbreviations. And while we are on the topic of mode names, scipy.ndimage.filters.percentile_filter() has modes of 'mirror' and 'reflect', and I don't see any documentation stating if they are the same, or what are different about them. I just came across this yesterday. Cheers! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
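For readers of the archive: the renaming eventually shipped in NumPy 1.8 as 'reduced' (the old 'full'/'thin' behavior, and the default) and 'complete' (the proposed 'big'), alongside 'r' and 'raw'. The output shapes make the distinction concrete:

```python
import numpy as np

a = np.arange(15.0).reshape(5, 3)    # a tall (5, 3) matrix

q, r = np.linalg.qr(a, mode='reduced')     # "thin": Q has the input's shape
assert q.shape == (5, 3) and r.shape == (3, 3)

q2, r2 = np.linalg.qr(a, mode='complete')  # "big": Q is square
assert q2.shape == (5, 5) and r2.shape == (5, 3)

r_only = np.linalg.qr(a, mode='r')         # just the triangular factor
assert r_only.shape == (3, 3)
```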
Re: [Numpy-discussion] Where's that function?
On Wed, Feb 6, 2013 at 1:08 PM, josef.p...@gmail.com wrote: I'm convinced that I saw a while ago a function that uses a list of interval boundaries to index into an array, either to iterate or to take. I thought that's very useful, but didn't make a note. Now, I have no idea where I saw this (I thought numpy), and I cannot find it anywhere. any clues? Some possibilities: np.array_split() np.split() np.ndindex() np.nditer() np.nested_iters() np.ravel_multi_index() Your description reminded me of a function I came across once, but I can't remember if one of these was it or if it was another one. IHTH, Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
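Two candidates fit Josef's "list of interval boundaries" description especially well; np.add.reduceat (not on Ben's list, but close in spirit) is another plausible match if the goal was aggregation over the intervals:

```python
import numpy as np

a = np.arange(10)
bounds = [3, 7]

# np.split takes boundaries (not sizes): a[:3], a[3:7], a[7:]
parts = np.split(a, bounds)
assert [p.tolist() for p in parts] == [[0, 1, 2], [3, 4, 5, 6], [7, 8, 9]]

# np.add.reduceat aggregates over the same start-index list in one call:
sums = np.add.reduceat(a, [0] + bounds)
assert sums.tolist() == [3, 18, 24]
```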
Re: [Numpy-discussion] Array accumulation in numpy
On Tue, Feb 19, 2013 at 10:00 AM, Tony Ladd tl...@che.ufl.edu wrote: I want to accumulate elements of a vector (x) to an array (f) based on an index list (ind). For example: x=[1,2,3,4,5,6] ind=[1,3,9,3,4,1] f=np.zeros(10) What I want would be produced by the loop for i in range(6): f[ind[i]]=f[ind[i]]+x[i] The answer is f=array([ 0., 7., 0., 6., 5., 0., 0., 0., 0., 3.]) When I try to use implicit arguments f[ind]=f[ind]+x I get f=array([ 0., 6., 0., 4., 5., 0., 0., 0., 0., 3.]) So it takes the last value of x that is pointed to by ind and adds it to f, but it's the wrong answer when there are repeats of the same entry in ind (e.g. 3 or 1) I realize my code is incorrect, but is there a way to make numpy accumulate without using loops? I would have thought so but I cannot find anything in the documentation. Would much appreciate any help - probably a really simple question. Thanks Tony I believe you are looking for the equivalent of accumarray in Matlab? Try this: http://www.scipy.org/Cookbook/AccumarrayLike It is a bit touchy about lists and 1-D numpy arrays, but it does the job. Also, I think somebody posted an optimized version for simple sums recently to this list. Cheers! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
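Besides the cookbook recipe, two builtins handle Tony's exact example: np.bincount with weights for plain sums, and np.add.at (NumPy 1.8) for the general unbuffered case:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
ind = np.array([1, 3, 9, 3, 4, 1])

# bincount sums the weights that share an index -- exactly the
# accumulation that the fancy-indexing assignment fails to do:
f = np.bincount(ind, weights=x, minlength=10)
assert f.tolist() == [0.0, 7.0, 0.0, 6.0, 5.0, 0.0, 0.0, 0.0, 0.0, 3.0]

# np.add.at is the general unbuffered in-place form (any ufunc works):
g = np.zeros(10)
np.add.at(g, ind, x)
assert (g == f).all()
```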
Re: [Numpy-discussion] Adding .abs() method to the array object
On Sat, Feb 23, 2013 at 8:20 PM, josef.p...@gmail.com wrote: On Sat, Feb 23, 2013 at 3:33 PM, Robert Kern robert.k...@gmail.com wrote: On Sat, Feb 23, 2013 at 7:25 PM, Nathaniel Smith n...@pobox.com wrote: On Sat, Feb 23, 2013 at 3:38 PM, Till Stensitzki mail.t...@gmx.de wrote: Hello, i know that the array object is already crowded, but i would like to see the abs method added, especially doing work on the console. Considering that many much less used functions are also implemented as a method, i don't think adding one more would be problematic. My gut feeling is that we have too many methods on ndarray, not too few, but in any case, can you elaborate? What's the rationale for why np.abs(a) is so much harder than a.abs(), and why this function and not other unary functions? Or even abs(a). my reason is that I often use arr.max() but then decide I want to us abs and need np.max(np.abs(arr)) instead of arr.abs().max() (and often I write that first to see the error message) I don't like np.abs(arr).max() because I have to concentrate to much on the braces, especially if arr is a calculation I wrote several times def maxabs(arr): return np.max(np.abs(arr)) silly, but I use it often and np.is_close is not useful (doesn't show how close) Just a small annoyance, but I think it's the method that I miss most often. Josef My issue is having to remember which ones are methods and which ones are functions. There doesn't seem to be a rhyme or reason for the choices, and I would rather like to see that a line is drawn, but I am not picky as to where it is drawn. Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
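Worth noting for this thread: the builtin abs() already dispatches to ndarray.__abs__, so a compact method-like spelling exists without adding .abs():

```python
import numpy as np

arr = np.array([-3.0, 2.0, -0.5])

# abs() works on arrays via __abs__:
assert abs(arr).tolist() == [3.0, 2.0, 0.5]

# Josef's maxabs, as one chained expression without nested parentheses:
assert abs(arr).max() == 3.0
```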
Re: [Numpy-discussion] a question about freeze on numpy 1.7.0
On Sun, Feb 24, 2013 at 8:16 PM, Ondřej Čertík ondrej.cer...@gmail.comwrote: Hi Gelin, On Sun, Feb 24, 2013 at 12:08 AM, Gelin Yan dynami...@gmail.com wrote: Hi All When I used numpy 1.7.0 with cx_freeze 4.3.1 on windows, I quickly found out even a simple import numpy may lead to program failed with following exception: AttributeError: 'module' object has no attribute 'sys' After a poking around some codes I noticed /numpy/core/__init__.py has a line 'del sys' at the bottom. After I commented this line, and repacked the whole program, It ran fine. I also noticed this 'del sys' didn't exist on numpy 1.6.2 I am curious why this 'del sys' should be here and whether it is safe to omit it. Thanks. The del sys line was introduced in the commit: https://github.com/numpy/numpy/commit/4c0576fe9947ef2af8351405e0990cebd83ccbb6 and it seems to me that it is needed so that the numpy.core namespace is not cluttered by it. Can you post the full stacktrace of your program (and preferably some instructions how to reproduce the problem)? It should become clear where the problem is. Thanks, Ondrej I have run into issues with doing del sys before, but usually with respect to my pythonrc file. Because the import of sys has already happened, python won't let you import the module again in the same namespace (in my case, the runtime environment). I don't know how the frozen binaries work, but maybe something along those lines is happening? Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] reshaping arrays
On Sat, Mar 2, 2013 at 11:35 PM, Sudheer Joseph sudheer.jos...@yahoo.comwrote: Hi Brad, I am not getting the attribute reshape for the array, are you having a different version of numpy than mine? I have In [55]: np.__version__ Out[55]: '1.7.0' and detail of the shape details of variable In [57]: ssh?? Type: NetCDFVariable String Form:NetCDFVariable object at 0x492d3d8 Namespace: Interactive Length: 75 Docstring: NetCDF Variable In [58]: ssh.shape Out[58]: (75, 140, 180) ssh?? Type: NetCDFVariable String Form:NetCDFVariable object at 0x492d3d8 Namespace: Interactive Length: 75 Docstring: NetCDF Variable In [66]: ssh.shape Out[66]: (75, 140, 180) In [67]: ssh.reshape(75,140*180) --- AttributeErrorTraceback (most recent call last) /home/sjo/RAMA_20120807/adcp/ipython-input-67-1a21dae1d18d in module() 1 ssh.reshape(75,140*180) AttributeError: reshape Ah, you have a NetCDF variable, which in many ways purposefully looks like a NumPy array, but isn't. Just keep in mind that a NetCDF variable is merely a way to have the data available without actually reading it in until you need it. If you do: ssh_data = ssh[:] Then the NetCDF variable will read all the data in the file and return it as a numpy array that can be manipulated as you wish. I hope that helps! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
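A minimal sketch of the distinction Ben describes: a NetCDF variable defers reading until you index it, and once you pull the data out with ssh[:] you have an ordinary ndarray that supports reshape. The array here is a plain-numpy stand-in with the shapes from the post; there is no actual NetCDF file involved.

```python
import numpy as np

# Stand-in for what ssh[:] would return from the NetCDF variable:
# a real ndarray with dimensions (time, y, x).
ssh_data = np.zeros((75, 140, 180))

# reshape works on the ndarray, where it failed on the NetCDFVariable
flat = ssh_data.reshape(75, 140 * 180)
assert flat.shape == (75, 25200)
```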
Re: [Numpy-discussion] Implementing a find first style function
On Tue, Mar 5, 2013 at 9:15 AM, Phil Elson pelson@gmail.com wrote: The ticket https://github.com/numpy/numpy/issues/2269 discusses the possibility of implementing a find first style function which can optimise the process of finding the first value(s) which match a predicate in a given 1D array. For example: a = np.sin(np.linspace(0, np.pi, 200)) print find_first(a, lambda a: a > 0.9) ((71, ), 0.900479032457) This has been discussed in several locations: https://github.com/numpy/numpy/issues/2269 https://github.com/numpy/numpy/issues/2333 http://stackoverflow.com/questions/7632963/numpy-array-how-to-find-index-of-first-occurrence-of-item *Rationale* For small arrays there is no real reason to avoid doing: a = np.sin(np.linspace(0, np.pi, 200)) ind = (a > 0.9).nonzero()[0][0] print (ind, ), a[ind] (71,) 0.900479032457 But for larger arrays, this can lead to massive amounts of work even if the result is one of the first to be computed. Example: a = np.arange(1e8) print (a == 5).nonzero()[0][0] 5 So a function which terminates when the first matching value is found is desirable. As mentioned in #2269, it is possible to define a consistent ordering which allows this functionality for 1D arrays, but IMHO it overcomplicates the problem and was not a case that I personally needed, so I've limited the scope to 1D arrays only. *Implementation* My initial assumption was that to get any kind of performance I would need to write the *find* function in C, however after prototyping with some array chunking it became apparent that a trivial python function would be quick enough for my needs. The approach I've implemented in the code found in #2269 simply breaks the array into sub-arrays of maximum length *chunk_size* (2048 by default, though there is no real science to this number), applies the given predicating function, and yields the results from *nonzero()*. The given function should be a python function which operates on the whole of the sub-array element-wise (i.e. 
the function should be vectorized). Returning a generator also has the benefit of allowing users to get the first *n* matching values/indices. *Results* I timed the implementation of *find* found in my comment at https://github.com/numpy/numpy/issues/2269#issuecomment-14436725 with an obvious test: In [1]: from np_utils import find In [2]: import numpy as np In [3]: import numpy.random In [4]: np.random.seed(1) In [5]: a = np.random.randn(1e8) In [6]: a.min(), a.max() Out[6]: (-6.1194900990552776, 5.9632246301166321) In [7]: next(find(a, lambda a: np.abs(a) > 6)) Out[7]: ((33105441,), -6.1194900990552776) In [8]: (np.abs(a) > 6).nonzero() Out[8]: (array([33105441]),) In [9]: %timeit (np.abs(a) > 6).nonzero() 1 loops, best of 3: 1.51 s per loop In [10]: %timeit next(find(a, lambda a: np.abs(a) > 6)) 1 loops, best of 3: 912 ms per loop In [11]: %timeit next(find(a, lambda a: np.abs(a) > 6, chunk_size=10)) 1 loops, best of 3: 470 ms per loop In [12]: %timeit next(find(a, lambda a: np.abs(a) > 6, chunk_size=100)) 1 loops, best of 3: 483 ms per loop This shows that picking a sensible *chunk_size* can yield massive speed-ups (nonzero is x3 slower in one case). A similar example with a much smaller 1D array shows similar promise: In [41]: a = np.random.randn(1e4) In [42]: %timeit next(find(a, lambda a: np.abs(a) > 3)) 1 loops, best of 3: 35.8 us per loop In [43]: %timeit (np.abs(a) > 3).nonzero() 1 loops, best of 3: 148 us per loop As I commented on the issue tracker, if you think this function is worth taking forward, I'd be happy to open up a pull request. Feedback gratefully received. Cheers, Phil In the interest of generalizing code and such, could such approaches be used for functions like np.any() and np.all() for short-circuiting if True or False (respectively) are found? I wonder what other sort of functions in NumPy might benefit from this? Ben Root
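A minimal sketch of the chunked approach Phil describes. This is an illustrative reconstruction, not the actual code attached to #2269; the generator name and chunking details follow the post's description but should be treated as assumptions.

```python
import numpy as np

def find(a, predicate, chunk_size=2048):
    """Yield (index-tuple, value) pairs for elements of 1-D `a` that
    satisfy `predicate`, scanning chunk by chunk so the search can
    stop as soon as the caller has seen enough matches."""
    for start in range(0, len(a), chunk_size):
        chunk = a[start:start + chunk_size]
        # predicate is vectorized: it operates on the whole sub-array
        for idx in predicate(chunk).nonzero()[0]:
            yield (start + idx,), a[start + idx]

a = np.arange(1000000.0)
ind, val = next(find(a, lambda c: c == 5))
assert ind == (5,) and val == 5.0
```

Because it is a generator, `next()` pulls only the first match, and iterating further yields the first *n* matches without scanning the rest of the array.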
Re: [Numpy-discussion] feature tracking in numpy/scipy
On Sat, Mar 2, 2013 at 5:32 PM, Scott Collis scollis.a...@gmail.com wrote: Good afternoon list, I am looking at feature tracking in a 2D numpy array, along the lines of Dixon and Wiener 1993 (for tracking precipitating storms) Identifying features based on threshold is quite trivial using ndimage.label b_fld=np.zeros(mygrid.fields['rain_rate_A']['data'].shape) rr=10 b_fld[mygrid.fields['rain_rate_A']['data'] > rr]=1.0 labels, numobjects = ndimage.label(b_fld[0,0,:,:]) (note mygrid.fields['rain_rate_A']['data'] is dimensions time, height, y, x) using the matplotlib contouring and fetching the vertices I can get a nice list of polygons of rain rate above a certain threshold… Now from here I can just go and implement the Dixon and Wiener methodology but I thought I would check here first to see if anyone knows of an object/feature tracking algorithm in numpy/scipy or using numpy arrays (it just seems like something people would want to do!).. i.e. something that looks back and forward in time and identifies polygon movement and identifies objects with temporal persistence.. Cheers! Scott Dixon, M., and G. Wiener, 1993: TITAN: Thunderstorm Identification, Tracking, Analysis, and Nowcasting—A Radar-based Methodology. *Journal of Atmospheric and Oceanic Technology*, *10*, 785–797, doi:10.1175/1520-0426(1993)010<0785:TTITAA>2.0.CO;2. http://journals.ametsoc.org/doi/abs/10.1175/1520-0426%281993%29010%3C0785%3ATTITAA%3E2.0.CO%3B2 Say hello to my PhD project: https://github.com/WeatherGod/ZigZag In it, I have the centroid-tracking portion of the TITAN code, along with SCIT, and hooks into MHT. Several of the dependencies are also available in my repositories. Cheers! Ben P.S. - I have personally met Dr. Dixon on multiple occasions and he is a great guy to work with. Feel free to email him or myself with questions about TITAN.
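A self-contained sketch of the threshold-and-label step from Scott's post, with a small synthetic rain-rate field standing in for the real `mygrid` data (the field values and shapes here are made up for illustration):

```python
import numpy as np
from scipy import ndimage

# Hypothetical 2D rain-rate slice with two separated rainy regions
rain = np.zeros((10, 10))
rain[2:4, 2:4] = 20.0   # feature one
rain[7:9, 6:9] = 15.0   # feature two

# Threshold at rr, then label the connected regions above it
rr = 10
b_fld = np.zeros(rain.shape)
b_fld[rain > rr] = 1.0
labels, numobjects = ndimage.label(b_fld)
assert numobjects == 2
```

Each distinct connected region gets its own integer label in `labels`, which is the starting point for the frame-to-frame matching that TITAN-style tracking performs.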
Re: [Numpy-discussion] timezones and datetime64
On Wed, Apr 3, 2013 at 7:52 PM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: Personally, I never need finer resolution than seconds, nor more than a century, so it's no big deal to me, but just wondering A use case for finer resolution than seconds (in our field, no less!) is lightning data. At the last SciPy conference, a fellow meteorologist mentioned how difficult it was to plot out lightning data at resolutions finer than microseconds (which is the resolution of the python datetime objects). Matplotlib has not supported the datetime64 object yet (John passed before he could write up that patch). Cheers! Ben By the way, my 12th Rule of Programming is Never roll your own datetime ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] datetime64 1970 issue
On Tue, Apr 16, 2013 at 7:45 PM, Ondřej Čertík ondrej.cer...@gmail.comwrote: On Tue, Apr 16, 2013 at 4:55 PM, Bob Nnamtrop bob.nnamt...@gmail.com wrote: I am curious if others have noticed an issue with datetime64 at the beginning of 1970. First: In [144]: (np.datetime64('1970-01-01') - np.datetime64('1969-12-31')) Out[144]: numpy.timedelta64(1,'D') OK this look fine, they are one day apart. But look at this: In [145]: (np.datetime64('1970-01-01 00') - np.datetime64('1969-12-31 00')) Out[145]: numpy.timedelta64(31,'h') Hmmm, seems like there are 7 extra hours? Am I missing something? I don't see this at any other year. This discontinuity makes it hard to use the datetime64 object without special adjustment in ones code. I assume this a bug? Indeed, this looks like a bug, I can reproduce it on linux as well: In [1]: import numpy as np In [2]: np.datetime64('1970-01-01') - np.datetime64('1969-12-31') Out[2]: numpy.timedelta64(1,'D') In [3]: np.datetime64('1970-01-01 00') - np.datetime64('1969-12-31 00') Out[3]: numpy.timedelta64(31,'h') Maybe, maybe not... were you alive then? For all we know, Charles and co. were partying an extra 7 hours every day back then? Just sayin' Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] datetime64 1970 issue
On Wed, Apr 17, 2013 at 7:10 PM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: On Wed, Apr 17, 2013 at 1:09 PM, Bob Nnamtrop bob.nnamt...@gmail.com wrote: It would seem that before 1970 the dates do not include the time zone adjustment while after 1970 they do. This is the source of the extra 7 hours. In [21]: np.datetime64('1970-01-01 00') Out[21]: numpy.datetime64('1970-01-01T00:00-0700','h') In [22]: np.datetime64('1969-12-31 00') Out[22]: numpy.datetime64('1969-12-31T00:00Z','h') In [111]: np.datetime64('1970-01-01 00').view(np.int64) Out[111]: 8 indicates that it is doing the input translation differently, as the underlying value is wrong for one. (another weird note -- I'm in pacific time, which is -7 now, with DST, so why the 8?) Aren't we on standard time at Jan 1st? So, at that date, you would have been -8. Ben Root
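For reference, in later NumPy releases the local-timezone parsing behavior discussed here was removed and datetime64 became timezone-naive, so the subtraction is uniform across the epoch boundary. A small check, assuming a modern NumPy:

```python
import numpy as np

# With timezone-naive datetime64, the pre/post-epoch discontinuity
# described in this thread does not appear: exactly 24 hours apart.
delta = np.datetime64('1970-01-01T00') - np.datetime64('1969-12-31T00')
assert delta == np.timedelta64(24, 'h')
```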
Re: [Numpy-discussion] datetime64 1970 issue
On Thu, Apr 18, 2013 at 2:27 AM, Joris Van den Bossche jorisvandenboss...@gmail.com wrote: ANyone tested this on Windows? On Windows 7, numpy 1.7.0 (Anaconda 1.4.0 64 bit), I don't even get a wrong answer, but an error: In [3]: np.datetime64('1969-12-31 00') Out[3]: numpy.datetime64('1969-12-31T00:00Z','h') In [4]: np.datetime64('1970-01-01 00') --- OSError Traceback (most recent call last) ipython-input-4-ebf323268a4e in module() 1 np.datetime64('1970-01-01 00') OSError: Failed to use 'mktime' to convert local time to UTC Have you tried np.test()? I know some of the tests I added awhile back utilized pre-epoch dates to test sorting and min/max finding. Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] type conversion question
On Thu, Apr 18, 2013 at 7:31 PM, K.-Michael Aye kmichael@gmail.comwrote: I don't understand why sometimes a direct assignment of a new dtype is possible (but messes up the values), and why at other times a seemingly harmless upcast (in my potentially ignorant point of view) is not possible. So, maybe a direct assignment of a new dtype is actually never a good idea? (I'm asking), and one should always go the route of newarray= array(oldarray, dtype=newdtype), but why then sometimes the upcast provides an error and forbids it and sometimes not? Examples: In [140]: slope.read_center_window() In [141]: slope.data.dtype Out[141]: dtype('float32') In [142]: slope.data[1,1] Out[142]: 10.044398 In [143]: val = slope.data[1,1] In [144]: slope.data.dtype='float64' In [145]: slope.data[1,1] Out[145]: 586.98938070189865 #- #Here, the value of data[1,1] has completely changed (and so has the rest of the array), and no error was given. # But then... # In [146]: val.dtype Out[146]: dtype('float32') In [147]: val Out[147]: 10.044398 In [148]: val.dtype='float64' --- AttributeErrorTraceback (most recent call last) ipython-input-148-52a373a41cac in module() 1 val.dtype='float64' AttributeError: attribute 'dtype' of 'numpy.generic' objects is not writable === end of code So why is there an error in the 2nd case, but no error in the first case? Is there a logic to it? When you change a dtype like that in the first one, you aren't really upcasting anything. You are changing how numpy interprets the underlying bits. Because you went from a 32-bit element size to a 64-bit element size, you are actually seeing the double-precision representation of 2 of your original data points together. The correct way to cast is to do something like a = slope.data.astype('float64'). That makes a copy and does the casting as safely as possible. As for the second one, you have what is called a numpy scalar. These aren't quite the same thing as a numpy array, and can be a bit more restrictive. 
Can you imagine what sort of issues that would pose if one could start viewing and modifying neighboring chunks of memory without ever having to mess around with pointers? It would be a hacker's dream! I hope that clears things up. Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
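The difference Ben describes can be shown with the supported spellings: .view() reinterprets the same bytes (which is effectively what the in-place dtype assignment did), while .astype() copies and converts the values safely. A minimal illustration:

```python
import numpy as np

data = np.arange(4, dtype='float32')   # 4 elements, 16 bytes

# Reinterpreting the bytes: two float32 values per float64 slot,
# so the element count halves and the values become garbage.
reinterpreted = data.view('float64')
assert reinterpreted.shape == (2,)

# A real cast: copies the buffer and converts each value.
cast = data.astype('float64')
assert cast.dtype == np.float64
assert cast[1] == 1.0
```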
Re: [Numpy-discussion] 1.8 release
On Thu, Apr 25, 2013 at 11:16 AM, Charles R Harris charlesr.har...@gmail.com wrote: Hi All, I think it is time to start the runup to the 1.8 release. I don't know of any outstanding blockers but if anyone has a PR/issue that they feel needs to be in the next Numpy release now is the time to make it known. Chuck Has a np.minmax() function been added yet? I know it keeps getting +1's whenever suggested, but I haven't seen it done yet. Another annoyance is the lack of a np.nanmean() and np.nanstd() function. Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] nanmean(), nanstd() and other missing functions for 1.8
Currently, I am in the process of migrating some co-workers from Matlab and IDL, and the number one complaint I get is that numpy has nansum() but no nanmean() and nanstd(). While we do have an alternative in the form of masked arrays, most of these people are busy enough trying to port their existing code over to python that this sort of stumbling block becomes difficult to explain away. Given how relatively simple these functions are, I cannot think of any reason not to include these functions in v1.8. Of course, the documentation for these functions should certainly include mention of masked arrays. There is one other non-trivial function that has been discussed before: np.minmax(). My thinking is that it would return a 2xN array (where N is whatever size of the result that would be returned if just np.min() was used). This would allow one to do min, max = np.minmax(X). Are there any other functions that others feel are missing from numpy and would like to see for v1.8? Let's discuss them here. Cheers! Ben Root
Re: [Numpy-discussion] nanmean(), nanstd() and other missing functions for 1.8
On Wed, May 1, 2013 at 1:13 AM, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: Of course, the documentation for discussed before: np.minmax(). My thinking is that it would return a 2xN array How about a tuple: (min, max)? I am not familiar enough with numpy internals to know which is the better approach to implement. I kind of feel that the 2xN array approach would be more flexible in case a user wants all of this information in a single array, while still allowing for unpacking as if it was a tuple. I would rather enable unforeseen use-cases rather than needlessly restricting them. Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Proposal of new function: iteraxis()
On Mon, Apr 29, 2013 at 2:10 PM, Andrew Giessel andrew_gies...@hms.harvard.edu wrote: Matthew: Thanks for the link to array order discussion. Any more thoughts on Phil's slice() function? I rather like Phil's solution. Just some caveats. Will it always return views or copies? It should be one or the other (I haven't looked closely enough to check), and it should be documented to that effect. Plus, tests should be added to make sure it does that. Cheers! Ben Root
Re: [Numpy-discussion] nanmean(), nanstd() and other missing functions for 1.8
So, to summarize the thread so far: Consensus: np.nanmean() np.nanstd() np.minmax() np.argminmax() Vague Consensus: np.sincos() No Consensus (possibly out of scope for this topic): Better constructors for complex types I can probably whip up the PR for the nanmean() and nanstd(), and can certainly help out with the minmax funcs. Cheers! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
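For illustration, a minimal nanmean() along the lines of what such a PR would provide. This is a sketch, not the actual implementation that landed in NumPy 1.8, and it ignores edge cases like all-NaN slices.

```python
import numpy as np

def nanmean(a, axis=None):
    """Mean ignoring NaNs: sum the non-NaN values and divide by
    the count of non-NaN values."""
    a = np.asarray(a, dtype=float)
    mask = np.isnan(a)
    total = np.where(mask, 0.0, a).sum(axis=axis)
    count = (~mask).sum(axis=axis)
    return total / count

x = np.array([1.0, np.nan, 3.0])
assert nanmean(x) == 2.0
```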
Re: [Numpy-discussion] nanmean(), nanstd() and other missing functions for 1.8
I have created a PR for the first two (and got np.nanvar() for free). https://github.com/numpy/numpy/pull/3297 Cheers! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] numpy.nanmin, numpy.nanmax, and scipy.stats.nanmean
On Thu, May 16, 2013 at 6:09 PM, Phillip Feldman phillip.m.feld...@gmail.com wrote: It seems odd that `nanmin` and `nanmax` are in NumPy, while `nanmean` is in SciPy.stats. I'd like to propose that a `nanmean` function be added to NumPy. Have no fear. There is already plans for its inclusion in the next release: https://github.com/numpy/numpy/pull/3297/files Cheers! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] NumPy sprints at Scipy 2013, Austin: call for topics and hands to help
On Sat, May 25, 2013 at 12:37 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Sat, May 25, 2013 at 9:51 AM, David Cournapeau courn...@gmail.comwrote: On Sat, May 25, 2013 at 4:19 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Sat, May 25, 2013 at 8:23 AM, David Cournapeau courn...@gmail.com wrote: Hi there, I agreed to help organising NumPy sprints during the scipy 2013 conference in Austin. As some of you may know, Stéfan and me will present a tutorial on NumPy C code, so if we do our job correctly, we should have a few new people ready to help out during the sprints. It would be good to: - have some focus topics for improvements - know who is going to be there at the sprint to work on things and/or help newcomers Things I'd like to work on myself is looking into splitting things from multiarray, think about a better internal API for dtype registration/hooks (with the goal to remove any date dtype hardcoding in both multiarray and ufunc machinery), but I am sure others have more interesting ideas :) I'd like to get a 1.8 beta out or at least get to the point where we can make that leap. Sure, I am fine doing this in a branch post 1.8.x, I am not in a hurry. There is a lot of new stuff that needs to be tested, PR's to go through, and I have a suspicion that a memory allocation error might have crept in somewhere. Will you be there at the conference ? Yes. I'm not very good at sprinting though. I prefer to amble with a big screen, nice keyboard, and a cup of coffee ;) Chuck Oh, I am sure we could get you set up with a projector screen and a nice bluetooth keyboard... Now, I think you just hit on something with the coffee. I don't recall previous sprints having coffee available. Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] genfromtxt() skips comments
On Fri, May 31, 2013 at 5:08 PM, Albert Kottke albert.kot...@gmail.comwrote: I noticed that genfromtxt() did not skip comments if the keyword names is not True. If names is True, then genfromtxt() would take the first line as the names. I am proposing a fix to genfromtxt that skips all of the comments in a file, and potentially using the last comment line for names. This will allow reading files with and without comments and/or names. The difference is here: https://github.com/arkottke/numpy/compare/my-genfromtxt Careful with semantics here. First off, using the last comment line as the source for names might initially make sense, except when there are comments within the data file. I would suggest going for last comment line before the first line of data. Second, sometimes the names come from an un-commented first line, but comments are still used within the file elsewhere. Just some food for thought. I don't know if the current design is best or not. Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
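For reference, genfromtxt() already accepts a commented header line when names=True; the semantics Ben raises concern files where comments also appear elsewhere or the names line is uncommented. A minimal example of the already-supported case (the column names here are made up):

```python
import numpy as np
from io import StringIO

# Header given as a comment line directly above the data; with
# names=True, genfromtxt strips the comment marker and uses it.
text = StringIO("# a b c\n1 2 3\n4 5 6\n")
data = np.genfromtxt(text, names=True)
assert data.dtype.names == ('a', 'b', 'c')
```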
Re: [Numpy-discussion] suggested change of behavior for interp
Could non-monotonicity be detected as part of the interp process? Perhaps a sign switch in the deltas? I have been bitten by this problem too. Cheers! Ben Root On Jun 4, 2013 9:08 PM, Eric Firing efir...@hawaii.edu wrote: On 2013/06/04 2:05 PM, Charles R Harris wrote: On Tue, Jun 4, 2013 at 12:07 PM, Slavin, Jonathan jsla...@cfa.harvard.edu mailto:jsla...@cfa.harvard.edu wrote: Hi, I would like to suggest that the behavior of numpy.interp be changed regarding treatment of situations in which the x-coordinates are not monotonically increasing. Specifically, it seems to me that interp should work correctly when the x-coordinate is decreasing monotonically. Clearly it cannot work if the x-coordinate is not monotonic, but in that case it should raise an exception. Currently if x is not increasing it simply silently fails, providing incorrect values. This fix could be as simple as a monotonicity test and inversion if necessary (plus a raise statement for non-monotonic cases). Seems reasonable, although it might add a bit of execution time. The monotonicity test should be an option if it is available at all; when interpolating a small number of points from a large pair of arrays, the single sweep through the whole array could dominate the execution time. Checking for increasing versus decreasing, in contrast, can be done fast, so handling the decreasing case transparently is reasonable. Eric Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
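A sketch of the proposed behavior as a wrapper around the existing np.interp: detect the sign of the deltas, reverse a decreasing x-coordinate, and raise on anything non-monotonic. The helper name is hypothetical.

```python
import numpy as np

def interp_checked(x, xp, fp):
    """np.interp that handles decreasing xp and rejects
    non-monotonic xp instead of silently returning garbage."""
    d = np.diff(xp)
    if np.all(d > 0):
        return np.interp(x, xp, fp)
    if np.all(d < 0):
        # reverse both coordinates so xp is increasing
        return np.interp(x, xp[::-1], fp[::-1])
    raise ValueError("xp must be monotonically increasing or decreasing")

xp = np.array([3.0, 2.0, 1.0])   # decreasing: plain np.interp fails here
fp = np.array([30.0, 20.0, 10.0])
assert interp_checked(2.5, xp, fp) == 25.0
```

As Eric notes, the increasing-vs-decreasing check itself is cheap; it is the full monotonicity sweep that adds a pass over the array, which is why it might be optional in a real implementation.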
Re: [Numpy-discussion] floats coerced to string with {:f}.format() ?
You can treat a record in a record array like a tuple or a dictionary when it comes to formatting. So, either refer to the index element you want formatted as a float, or refer to it by name (in the formatting language). By just doing '{:f}', you are just grabbing the first one, which is XXYYZZ and trying to format that. But remember, you can only do this to a single record at a time, not the entire record array at once. Regular numpy arrays can not be formatted in this manner, hence your other attempt failures. Cheers! Ben Root On Thu, Jun 6, 2013 at 3:48 PM, Maccarthy, Jonathan K jkm...@lanl.gov wrote: Hi everyone, I've looked in the mailing list archives and with the googles, but haven't yet found any hints with this question... I have a float field in a NumPy record that looks like it's being substituted as a string in the Python '{:f}'.format() mini-language, thus throwing an error: In [1]: tmp = np.rec.array([('XYZZ', 2001123, -23.82396)], dtype=np.dtype([('sta', '|S6'), ('ondate', 'i8'), ('lat', 'f4')]))[0] In [2]: type(tmp) Out[3]: numpy.core.records.record In [3]: tmp Out[3]: ('XYZZ', 2001123, -23.823917388916016) In [4]: tmp.sta, tmp.ondate, tmp.lat Out[4]: ('XYZZ', 2001123, -23.823917) # strings and integers work In [5]: '{0.sta:6.6s} {0.ondate:8d}'.format(tmp) Out[5]: 'XYZZ 2001123' # lat is a float, but it seems to be coerced to a string first, and failing In [6]: '{0.sta:6.6s} {0.ondate:8d} {0.lat:11.6f}'.format(tmp) --- ValueError Traceback (most recent call last) /Users/jkmacc/ipython-input-312-bff8066cfde8 in module() 1 '{0.sta:6.6s} {0.ondate:8d} {0.lat:11.6f}'.format(tmp) ValueError: Unknown format code 'f' for object of type 'str' # string formatting doesn't fail In [7]: '{0.sta:6.6s} {0.ondate:8d} {0.lat:11.6s}'.format(tmp) Out[7]: 'XYZZ 2001123 -23.82' This also fails: In [7]: '{:f}'.format(np.array(3.2, dtype='f4')) --- ValueError Traceback (most recent call last) /Users/jkmacc/ipython-input-314-33119128e3e6 in module() 1 '{:f}'.format(np.array(3.2, dtype='f4')) ValueError: Unknown format code 'f' for object of type 'str' Does anyone understand what's happening? Thanks for your help. Best, Jon
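A practical workaround for the failure Jon shows: convert the float32 scalar to a Python float before applying the 'f' format code. This is a sketch using the record layout from the post.

```python
import numpy as np

rec = np.rec.array([('XYZZ', 2001123, -23.82396)],
                   dtype=[('sta', 'S6'), ('ondate', 'i8'), ('lat', 'f4')])[0]

# Coercing the numpy float32 scalar to a Python float sidesteps the
# string coercion seen in the traceback above.
line = '{:11.6f}'.format(float(rec.lat))
assert line.strip().startswith('-23.82')
```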
Re: [Numpy-discussion] numpy.filled, again
On Thu, Jun 13, 2013 at 9:36 AM, Aldcroft, Thomas aldcr...@head.cfa.harvard.edu wrote: On Wed, Jun 12, 2013 at 2:55 PM, Eric Firing efir...@hawaii.edu wrote: On 2013/06/12 8:13 AM, Warren Weckesser wrote: That's why I suggested 'filledwith' (add the underscore if you like). This also allows a corresponding masked implementation, 'ma.filledwith', without clobbering the existing 'ma.filled'. Consensus on np.filled? absolutely not, you do not have a consensus. np.filledwith or filled_with: fine with me, maybe even with everyone--let's see. I would prefer the underscore version. +1 on np.filled_with. It's unique and the meaning is extremely obvious. We do use np.ma.filled in astropy so a big -1 on deprecating that (which would then require doing numpy version checks to get the right method). Even when there is an NA dtype the numpy.ma users won't go away anytime soon. I like np.filled_with(), but just to be devil's advocate, think of the syntax: np.filled_with((10, 24), np.nan) As I read that, I am filling the array with (10, 24), not NaNs. Minor issue, for sure, but just thought I'd raise that. -1 on deprecation of np.ma.filled(). -1 on np.filled() due to collision with np.ma (both conceptually and programmatically). np.values() might be a decent alternative. Cheers! Ben Root
Re: [Numpy-discussion] numpy.filled, again
On Fri, Jun 14, 2013 at 1:21 PM, Robert Kern robert.k...@gmail.com wrote: On Fri, Jun 14, 2013 at 6:18 PM, Eric Firing efir...@hawaii.edu wrote: On 2013/06/14 5:15 AM, Alan G Isaac wrote: On 6/14/2013 9:27 AM, Aldcroft, Thomas wrote: If I just saw np.values(..) in some code I would never guess what it is doing from the name That suggests np.fromvalues. But more important than the name I think is allowing broadcasting of the values, based on NumPy's broadcasting rules. Broadcasting a scalar is then a special case, even if it is the case that has dominated this thread. True, but this looks to me like mission creep. All of this fuss is about replacing two lines of user code with a single line. If it can't be kept simple, both in implementation and in documentation, it shouldn't be done at all. I'm not necessarily opposed to your suggestion, but I'm skeptical. It's another two-liner: [~] |1 x = np.empty([3,4,5]) [~] |2 x[...] = np.arange(5) [~] |3 x array([[[ 0., 1., 2., 3., 4.], [ 0., 1., 2., 3., 4.], [ 0., 1., 2., 3., 4.], [ 0., 1., 2., 3., 4.]], [[ 0., 1., 2., 3., 4.], [ 0., 1., 2., 3., 4.], [ 0., 1., 2., 3., 4.], [ 0., 1., 2., 3., 4.]], [[ 0., 1., 2., 3., 4.], [ 0., 1., 2., 3., 4.], [ 0., 1., 2., 3., 4.], [ 0., 1., 2., 3., 4.]]]) It's wafer-thin! True, but wouldn't we rather want to encourage the use of broadcasting in the numerical operations rather than creating new arrays from broadcasted arrays? a = np.arange(5) + np.ones((3, 4, 5)) b = np.filled((3, 4, 5), np.arange(5)) + np.ones((3, 4, 5)) The first one is much easier to read, and is more efficient than the second (theoretical) one because it needs to create two (3, 4, 5) arrays rather than just one. That being said, one could make a similar argument against ones(), zeros(), etc. Cheers! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
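Robert's two-liner wrapped as a function is essentially the proposal under discussion. The name `filled` here is the proposal's, not an existing NumPy function; NumPy 1.8 ultimately shipped the scalar case of this behavior as np.full.

```python
import numpy as np

def filled(shape, value):
    """Sketch of the proposed constructor: allocate and assign,
    letting broadcasting handle scalars and arrays alike."""
    a = np.empty(shape)
    a[...] = value
    return a

# scalar fill, equivalent to what became np.full
assert (filled((2, 2), 7.0) == 7.0).all()

# broadcast fill, the generalization Alan suggested
b = filled((3, 4, 5), np.arange(5))
assert b.shape == (3, 4, 5) and b[2, 3].tolist() == [0.0, 1.0, 2.0, 3.0, 4.0]
```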
Re: [Numpy-discussion] numpy.filled, again
On Fri, Jun 14, 2013 at 1:22 PM, Nathaniel Smith n...@pobox.com wrote: On Wed, Jun 12, 2013 at 7:43 PM, Eric Firing efir...@hawaii.edu wrote: On 2013/06/12 2:10 AM, Nathaniel Smith wrote: Personally I think that overloading np.empty is horribly ugly, will continue confusing newbies and everyone else indefinitely, and I'm 100% convinced that we'll regret implementing such a warty interface for something that should be so idiomatic. (Unfortunately I got busy and didn't actually say this in the previous thread though.) So I think we should just merge the PR as is. The only downside is the np.ma inconsistency, but, np.ma is already inconsistent (cf. masked_array.fill versus masked_array.filled!), somewhat deprecated, somewhat deprecated? Really? Since when? By whom? Replaced by what? Sorry, not trying to start a fight, just trying to summarize the situation. As far as I can tell: Oh... (puts away iron knuckles) Despite heroic efforts on the part of its authors, numpy.ma has a number of weird quirks (masked data can still trigger invalid value errors), misfeatures (hard versus soft masks), and just plain old pain points (ongoing issues with whether any given operation will respect or preserve the mask). Actually, now that we have a context manager for warning capture, we could actually fix that. It's been in deep maintenance mode for some time; we merge the occasional bug fix that people send in, and that's it. (To be fair, numpy as a whole is fairly slow-moving, but numpy.ma still gets much less attention.) Even if there were active maintainers, no-one really has any idea how to fix any of the problems above; they're not so much bugs as intrinsic limitations of the design. Therefore, my impression is that a majority (not all, but a majority) of numpy developers strongly recommend against the use of numpy.ma in new projects. Such a recommendation should be in writing in the documentation and elsewhere. Furthermore, a proper replacement would also be needed. 
Just simply deprecating it without some sort of decent alternative leaves everybody in a lurch. I have high hopes for NA to be that replacement, and the sooner, the better. I could be wrong! And I know there's nothing to really replace it. I'd like to fix that. But I think semi-deprecated is not an unfair shorthand for the above. You will have to pry np.ma from my cold, dead hands! (or distract me with a sufficiently shiny alternative) (I'll even admit that I'd *like* to actually deprecate it. But what I mean by that is, I don't think it's possible to fix it to the point where it's actually a solid/clean/robust library, so I'd like to reach a point where everyone who's currently using it is happier switching to something else and is happy to sign off on deprecating it.) As far as many people are concerned, it is a solid, clean, robust library. and AFAICT there are far more people who will benefit from a clean np.filled idiom than who actually use np.ma (and in particular its fill-value functionality). So there would be two I think there are more np.ma users than you realize. Everyone who uses matplotlib is using np.ma at least implicitly, if not explicitly. Many of the matplotlib examples put np.ma to good use. np.ma.filled is an essential long-standing part of the np.ma API. I don't see any good rationale for generating a conflict with it, when an adequate non-conflicting alternative ('np.initialized', maybe others) exists. I'm aware of that. If I didn't care about the opinions of numpy.ma users, I wouldn't go starting long and annoying mailing list threads about features that are only problematic because of their effect on numpy.ma :-). But, IMHO given the issues with numpy.ma, our number #1 priority ought to be making numpy proper as clean and beautiful as possible; my position that started this thread is basically just that we shouldn't make numpy proper worse just for numpy.ma's sake. That's the tail wagging the dog. 
And this 'conflict' seems a bit overstated given that (1) np.ma.filled already has multiple names (and 3/4 of the uses in matplotlib use the method version, not the function version), (2) even if we give it a non-conflicting name, np.ma's lack of maintenance means that it'd probably be years before someone got around to actually adding a parallel function to np.ma. [Unless this thread spurs someone into submitting one just to prove me wrong ;-).] Actually, IIRC, np.ma does some sort of auto-wrapping of numpy functions. This is why adding np.filled() would cause a namespace clobbering, I think. Cheers! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
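For readers revisiting this thread later: the function that eventually landed in NumPy core was named np.full (added in NumPy 1.8), sidestepping the clash with np.ma.filled entirely. A short sketch of the two distinct operations being conflated in the discussion above:

```python
import numpy as np
import numpy.ma as ma

# np.full (the name that landed in NumPy 1.8 instead of the proposed
# np.filled) creates a brand-new array of a given shape and fill value.
a = np.full((2, 3), 7.0)

# np.ma.filled is unrelated: it replaces the *masked* entries of an
# existing masked array with a fill value, returning a plain ndarray.
m = ma.masked_array([1.0, 2.0, 3.0], mask=[False, True, False])
b = ma.filled(m, -999.0)
```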
Re: [Numpy-discussion] time to revisit NA/ma ideas
On Fri, Jun 14, 2013 at 6:38 PM, Eric Firing efir...@hawaii.edu wrote: A nice summary of the discussions from a year ago is here: http://www.numpy.org/NA-overview.html It provides food for thought. Eric Perhaps a BoF session should be put together for SciPy 2013, possibly with a Google Hangout for it, to bring interested parties into the discussion? Ben Root 
Re: [Numpy-discussion] Allow == and != to raise errors
I can see what you are getting at, but I have to disagree. First of all, when a comparison between two mis-shaped arrays occurs, you get back a bona fide Python boolean, not a NumPy array of bools. So any action taken on the result of such a comparison that assumed the result was some sort of array would fail (yes, this does make it a bit difficult to trace back the source of the problem, but not impossible). Second, no semantics are broken with this. Are the arrays equal or not? If they aren't broadcastable, then returning False for == and True for != makes perfect sense to me. At least, that is my take on it. Cheers! Ben Root On Fri, Jul 12, 2013 at 8:38 AM, Sebastian Berg sebast...@sipsolutions.net wrote: Hey, the array comparisons == and != never raise errors but instead simply return False for invalid comparisons. The main examples are arrays of non-matching dimensions, and object arrays with invalid element-wise comparisons: In [1]: np.array([1,2,3]) == np.array([1,2]) Out[1]: False In [2]: np.array([1, np.array([2, 3])], dtype=object) == [1, 2] Out[2]: False This seems wrong to me, and I am sure not just to me. I doubt any large project makes use of such comparisons, and I assume most would prefer the shape mismatch to raise an error, so I would like to change it. But I am a bit unsure, especially about smaller projects. To keep the transition a bit safer, I could imagine implementing a FutureWarning for these cases (which would at least notify new users that what they are doing doesn't seem like the right thing). So the question is: is such a change safe enough, or is there some good reason for the current behavior that I am missing? 
Regards, Sebastian (There may be other issues with structured types that would continue returning False, I think, because neither side knows how to compare)
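To make the behavior under discussion concrete, here is a small sketch. Note that the mismatched-shape case is version-dependent: the plain Python False described above was later replaced by a DeprecationWarning and, in recent NumPy releases, an error, so the sketch accepts either outcome:

```python
import numpy as np
import warnings

# Same-shape comparison: an elementwise boolean array, as expected.
eq = np.array([1, 2, 3]) == np.array([1, 2, 4])

# Mismatched shapes: at the time of this thread this returned a plain
# Python False; later NumPy versions warn or raise instead.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    try:
        res = np.array([1, 2, 3]) == np.array([1, 2])
    except Exception:
        res = "raised"
```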
Re: [Numpy-discussion] What should be the result in some statistics corner cases?
This is going to need to be heavily documented with doctests. Also, just to clarify, are we talking about a ValueError for doing a nansum on an empty array as well, or will that now return a zero? Ben Root On Mon, Jul 15, 2013 at 9:52 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Sun, Jul 14, 2013 at 3:35 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Sun, Jul 14, 2013 at 2:55 PM, Warren Weckesser warren.weckes...@gmail.com wrote: On 7/14/13, Charles R Harris charlesr.har...@gmail.com wrote: Some corner cases in the mean, var, std. *Empty arrays* I think these cases should either raise an error or just return nan. Warnings seem ineffective to me, as they are only issued once by default. In [3]: ones(0).mean() /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:61: RuntimeWarning: invalid value encountered in double_scalars ret = ret / float(rcount) Out[3]: nan In [4]: ones(0).var() /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:76: RuntimeWarning: invalid value encountered in true_divide out=arrmean, casting='unsafe', subok=False) /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100: RuntimeWarning: invalid value encountered in double_scalars ret = ret / float(rcount) Out[4]: nan In [5]: ones(0).std() /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:76: RuntimeWarning: invalid value encountered in true_divide out=arrmean, casting='unsafe', subok=False) /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100: RuntimeWarning: invalid value encountered in double_scalars ret = ret / float(rcount) Out[5]: nan *ddof >= number of elements* I think these should just raise errors. The results for ddof >= #elements are happenstance, and certainly negative numbers should never be returned. 
In [6]: ones(2).var(ddof=2) /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100: RuntimeWarning: invalid value encountered in double_scalars ret = ret / float(rcount) Out[6]: nan In [7]: ones(2).var(ddof=3) Out[7]: -0.0 *nansum* Currently returns nan for empty arrays. I suspect it should return nan for slices that are all nan, but 0 for empty slices. That would make it consistent with sum in the empty case. For nansum, I would expect 0 even in the case of all nans. The point of these functions is to simply ignore nans, correct? So I would aim for this behaviour: nanfunc(x) behaves the same as func(x[~isnan(x)]) Agreed, although that changes current behavior. What about the other cases? Looks like there isn't much interest in the topic, so I'll just go ahead with the following choices: Non-NaN case 1) Empty array -> ValueError. The current behavior with stats is an accident, i.e., the nan arises from 0/0. I like to think that in this case the result is any number, rather than not a number, so *the* value is simply not defined. So in this case raise a ValueError for an empty array. 2) ddof >= n -> ValueError. If the number of elements, n, is not zero and ddof >= n, raise a ValueError for the ddof value. NaN case 1) Empty array -> ValueError 2) Empty slice -> NaN 3) For slice ddof >= n -> NaN Chuck 
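For concreteness, the accidental 0/0 behavior described above can still be reproduced (the warning texts and file paths differ across NumPy versions, but the NaN results are stable):

```python
import numpy as np
import warnings

# Mean of an empty array: 0/0 inside the implementation yields NaN,
# accompanied by a RuntimeWarning rather than an exception.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)
    m = np.ones(0).mean()

# ddof equal to the number of elements: again 0/0, again NaN.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)
    v = np.ones(2).var(ddof=2)
```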
Re: [Numpy-discussion] What should be the result in some statistics corner cases?
On Jul 15, 2013 11:47 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Jul 15, 2013 at 8:58 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Jul 15, 2013 at 8:34 AM, Sebastian Berg sebast...@sipsolutions.net wrote: On Mon, 2013-07-15 at 07:52 -0600, Charles R Harris wrote: On Sun, Jul 14, 2013 at 3:35 PM, Charles R Harris charlesr.har...@gmail.com wrote: snip For nansum, I would expect 0 even in the case of all nans. The point of these functions is to simply ignore nans, correct? So I would aim for this behaviour: nanfunc(x) behaves the same as func(x[~isnan(x)]) Agreed, although that changes current behavior. What about the other cases? Looks like there isn't much interest in the topic, so I'll just go ahead with the following choices: Non-NaN case 1) Empty array -> ValueError. The current behavior with stats is an accident, i.e., the nan arises from 0/0. I like to think that in this case the result is any number, rather than not a number, so *the* value is simply not defined. So in this case raise a ValueError for an empty array. To be honest, I don't mind the current behaviour much: sum([]) = 0, len([]) = 0, so it is in a way well defined. At least I am not sure I would always prefer an error. I am a bit worried that just changing it might break code out there, such as plotting code where it makes perfect sense to plot a NaN (i.e. nothing), but if that is the case it would probably become visible fast. 2) ddof >= n -> ValueError. If the number of elements, n, is not zero and ddof >= n, raise a ValueError for the ddof value. Makes sense to me, especially for ddof > n. Just returning nan in all cases for backward compatibility would be fine with me too. Currently if ddof > n it returns a negative number for variance; the NaN only comes when ddof == 0 and n == 0, leading to 0/0 (float is NaN, integer is zero division). 
NaN case 1) Empty array -> ValueError 2) Empty slice -> NaN 3) For slice ddof >= n -> NaN Personally, I would somewhat prefer it if 1) and 2) at least defaulted to the same thing. But I don't use the nanfuncs anyway. I was wondering about adding an option for the user to pick what the fill is (e.g., if it is None (maybe the default) -> ValueError). We could also allow this for normal reductions without an identity, but I am not sure it is useful there. In the NaN case some slices may be empty, others not. My reasoning is that that is going to be data dependent, not operator error, but if the array is empty the writer of the code should deal with that. In the case of nanvar and nanstd, it might make more sense to handle ddof as 1) if ddof is >= the axis size, raise ValueError 2) if ddof is >= the number of values after removing NaNs, return NaN The first would be consistent with the non-nan case; the second accounts for the variable nature of data containing NaNs. Chuck I think this is a good idea in that it naturally follows the conventions of what to do with empty arrays / empty slices with nanmean, etc. Note, however, I am not a very big fan of the idea of having two different behaviors for what I see as semantically the same thing. But my objections are not strong enough to veto it, and I do think this proposal is well thought-out. Ben Root 
Re: [Numpy-discussion] What should be the result in some statistics corner cases?
To add a bit of context to the question of nansum on empty results: we currently differ from MATLAB and R in this respect; they return zero no matter what. Personally, I think it should return zero, but our current behavior of returning nans has existed for a long time, so I think we need a deprecation warning and possibly to wait until 2.0 to change this, with plenty of warning that it will change. Ben Root On Jul 15, 2013 8:46 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Jul 15, 2013 at 6:22 PM, Stéfan van der Walt ste...@sun.ac.za wrote: On Mon, 15 Jul 2013 08:33:47 -0600, Charles R Harris wrote: On Mon, Jul 15, 2013 at 8:25 AM, Benjamin Root ben.r...@ou.edu wrote: This is going to need to be heavily documented with doctests. Also, just to clarify, are we talking about a ValueError for doing a nansum on an empty array as well, or will that now return a zero? I was going to leave nansum as is, as it seems that the result was by choice rather than by accident. That makes sense--I like Sebastian's explanation whereby operations that define an identity yield that identity upon empty input. So nansum should return zeros rather than the current NaNs? Chuck 
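The resolution this thread converged on did ship in a later NumPy release: nansum now follows the func(x[~isnan(x)]) rule, so empty and all-NaN inputs yield the sum identity, 0, matching MATLAB and R:

```python
import numpy as np

# nansum treats NaNs as absent, so the empty and all-NaN cases
# both reduce to the sum of no elements, whose identity is 0.
s_empty = np.nansum([])
s_allnan = np.nansum([np.nan, np.nan])
s_mixed = np.nansum([1.0, np.nan, 2.0])
```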
Re: [Numpy-discussion] azip
Forgive my ignorance, but have numpy and scipy stopped doing that weird doc editing thing that existed back in the days of Trac? I have actually held back on submitting doc edits because I hated using that thing so much. 
Re: [Numpy-discussion] azip
Well, that's nice to know now. However, I distinctly remember being told that any changes made to the docstrings directly in the source would end up getting replaced by whatever was in the doc edit system whenever a merge from it happened. Therefore, if one wanted their edits to be persistent, they had to submit them through the doc edit system. Note, much of my animosity towards the doc edit system was due to scipy.org being so sluggish back then, and the length of time it took for any edits to finally make it down to the docstrings. Now that scipy.org is much more responsive, and numpy and scipy have moved on to git, perhaps those two issues are gone now? Sorry for hijacking the thread; this is just the first I am hearing that one can submit documentation edits via PRs, and I was surprised. Cheers! Ben Root On Thu, Jul 18, 2013 at 1:51 PM, Pauli Virtanen p...@iki.fi wrote: 18.07.2013 20:18, Benjamin Root kirjoitti: Forgive my ignorance, but have numpy and scipy stopped doing that weird doc editing thing that existed back in the days of Trac? I have actually held back on submitting doc edits because I hated using that thing so much. You were never required to use it. -- Pauli Virtanen 
Re: [Numpy-discussion] fresh performance hits: numpy.linalg.pinv 30% slowdown
On Mon, Jul 22, 2013 at 10:55 AM, Yaroslav Halchenko li...@onerussian.com wrote: At some point I hope to tune up the report with an option of viewing the plot using e.g. nvd3 JS so it could be easier to pinpoint/analyze interactively. shameless plug... the soon-to-be-finalized matplotlib-1.3 has a WebAgg backend that allows for interactivity. Cheers! Ben Root 
Re: [Numpy-discussion] .flat (was: add .H attribute?)
On Tue, Jul 23, 2013 at 10:11 AM, Stéfan van der Walt ste...@sun.ac.za wrote: On Tue, Jul 23, 2013 at 3:39 PM, Alan G Isaac alan.is...@gmail.com wrote: On 7/23/2013 9:09 AM, Pauli Virtanen wrote: .flat which I think is rarely used Don't assume .flat is not commonly used. A common idiom in Matlab is a[:] to flatten an array. When porting code over from Matlab, it is typical to replace that with either a.flat or a.flatten(), depending on whether an iterator or an array is needed. Cheers! Ben Root 
Re: [Numpy-discussion] .flat
On Tue, Jul 23, 2013 at 10:46 AM, Pauli Virtanen p...@iki.fi wrote: 23.07.2013 17:34, Benjamin Root kirjoitti: [clip] Don't assume .flat is not commonly used. A common idiom in matlab is a[:] to flatten an array. When porting code over from matlab, it is typical to replace that with either a.flat or a.flatten(), depending on whether an iterator or an array is needed. It is much more rarely used than `ravel()` and `flatten()`, as can be verified by grepping e.g. the matplotlib source code. The matplotlib source code is not a port from Matlab, so grepping that wouldn't prove anything. Meanwhile, the NumPy for Matlab users page notes that a.flatten() makes a copy. A newbie to NumPy would then (correctly) look up the documentation for a.flatten() and see in the See Also section that a.flat is just an iterator rather than a copy, and would often use that to avoid the copy. Cheers! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
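The distinction being argued here is easy to demonstrate. A small sketch, assuming a C-contiguous array (which is what lets ravel() return a view rather than a copy):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# a.flatten() always returns a copy: writing to it leaves `a` untouched.
f = a.flatten()
f[0] = 99
flatten_is_copy = (a[0, 0] == 0)

# a.ravel() returns a view when it can (here: contiguous input),
# so writing through it modifies `a`.
r = a.ravel()
r[0] = 99
ravel_is_view = (a[0, 0] == 99)

# a.flat is a 1-D iterator over `a`; indexed assignment writes through.
a.flat[1] = 42
```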
Re: [Numpy-discussion] fresh performance hits: numpy.linalg.pinv 30% slowdown
On Mon, Jul 22, 2013 at 1:28 PM, Yaroslav Halchenko li...@onerussian.com wrote: On Mon, 22 Jul 2013, Benjamin Root wrote: At some point I hope to tune up the report with an option of viewing the plot using e.g. nvd3 JS so it could be easier to pinpoint/analyze interactively. shameless plug... the soon-to-be-finalized matplotlib-1.3 has a WebAgg backend that allows for interactivity. that's just sick! do you know about any motion in python-sphinx world on supporting it? is there any demo page you would recommend to assess what to expect supported in upcoming webagg? Oldie but goodie: http://mdboom.github.io/blog/2012/10/11/matplotlib-in-the-browser-its-coming/ Official announcement: http://matplotlib.org/1.3.0/users/whats_new.html#webagg-backend Note, this is different from what is now available in the IPython Notebook (it isn't really interactive there). As for what is supported: just about everything you can do normally can be done in WebAgg. I have no clue about sphinx-level support. Now, back to your regularly scheduled program. Cheers! Ben Root 
Re: [Numpy-discussion] add .H attribute?
On Wed, Jul 24, 2013 at 8:47 AM, Daπid davidmen...@gmail.com wrote: An idea: If .H is ideally going to be a view, and we want to keep it this way, we could have a .h() method with the present implementation. This would preserve the name .H for the conjugate view --when someone finds the way to do it. This way we would increase the readability, simplify some matrix algebra code, and keep the API consistency. I could get behind a .h() method until .H attribute is ready. +1 Cheers! Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
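For reference, here is what the proposed ndarray.H attribute (or the interim .h() method) would compute: the conjugate transpose. Spelled out today as a.conj().T, the .conj() step makes a copy for complex input, so this is not a view -- which is exactly the objection discussed above:

```python
import numpy as np

# The conjugate transpose that a hypothetical A.H would return.
# Note: .conj() copies complex data, so this expression is not a view.
A = np.array([[1 + 2j, 3 - 1j],
              [0 + 1j, 2 + 0j]])
AH = A.conj().T
```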
Re: [Numpy-discussion] import overhead of numpy.testing
On Aug 10, 2013 12:50 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Sat, Aug 10, 2013 at 5:21 PM, Andrew Dalke da...@dalkescientific.com wrote: [Short version: It doesn't look like my proposal or any simple alternative is tenable.] On Aug 10, 2013, at 10:28 AM, Ralf Gommers wrote: It does break backwards compatibility though, because now you can do: import numpy as np np.testing.assert_equal(x, y) Yes, it does. I realize that a design goal in numpy was that (most?) submodules are available without any additional imports. This is the main reason for the import numpy overhead. The tension between ease-of-use for some and overhead for others is well known. For example, Sage tickets 3577, 6494, and 11714 relate to deferring numpy import during startup. The three relevant questions are: 1) is numpy.testing part of that promise? This can be split into multiple ways. o The design goal could be that only the numerics that people use for interactive/REPL computing are accessible without additional explicit imports, which implies that the import of numpy.testing is an implementation side-effect of providing submodule-level test() and bench() APIs o all NumPy modules with user-facing APIs should be accessible from numpy without additional imports While I would like to believe that the import of numpy.testing is an implementation side-effect of providing test() and bench(), I believe that I'm unlikely to convince the majority. It likely is a side-effect rather than intentional design, but at this point that doesn't matter much anymore. There never was a clear distinction between private and public modules and now, as your investigation shows, the cost of removing the import is quite high. For justifiable reasons, the numpy project is loath to break backwards compatibility, and I don't think there's an existing bright-line policy which would say that import numpy; numpy.testing should be avoided. 
2) If it isn't a promise that numpy.testing is usable after an import numpy then how many people will be affected by an implementation change, and at what level of severity? I looked to see which packages might fail. A Debian code search of numpy.testing showed no problems, and no one uses np.testing. I did a code search at http://code.ohloh.net . Of the first 200 or so hits for numpy.testing, nearly all of them fell into uses like: from numpy.testing import Tester from numpy.testing import assert_equal, TestCase from numpy.testing.utils import * from numpy.testing import * There were, however, several packages which would fail: test_io.py and test_image.py and test_array_bridge.py in MediPy (Interestingly, test_circle.py has a import numpy.testing, so it's not universal practice in that package) calculators_test.py in OpenQuake Engine ForcePlatformsExtractorTest.py in b-tk Note that these failures are in the test programs, and not in the main body code, so are unlikely to break end-user programs. HOWEVER! The real test is for people who do import numpy as np then refer to np.testing. There are about 454 such matches in Ohloh. One example is 'test_polygon.py' from scikit-image. Others are: test_anova.py in statsmodel test_graph.py in scikit-learn test_rmagic.py in IPython test_mlab.py in matplotlib Nearly all the cases I looked at were in files starting test, or a handful which ended in test.py or Test.py. Others use np.test only as part of a unit test, such as: affine_grid.py and others in pyMor (as part of in-file unit tests) technical_indicators.py in QuantPy (as part of unittest.TestCase) coord_tools.py in NiPy-OLD (as part of in-file unit tests) predstd.py and others in statsmodels (as a main-line unit test) galsim_test_helpers.py in GalSim These would likely not break end-user code. Sadly, not all are that safe. 
For examples: simple_contrast.py example program for nippy try_signal_lti.py in joePython run.py in python-seminar verify.py in bell_d_project (a final project for a CS class) ex_shrink_pickle.py in statsmodels (as an example?) parametric_design.py in nippy (uses assert_almost_equal to verify an example) model.py in pymc-devs's pymc model.py in duffy zipline in olmar utils.py in MNE and I gave up at result 320 of 454. Based on this, about 1% of the programs which use numpy.testing would break. This tells me that there are enough user programs which would fail that I don't think numpy will decide to make this change. And the third question is 3) Are there other alternatives? Or as Ralf Gommers wrote: Do you have more detailed timings? I'm guessing the bottleneck is importing nose. I do have more detailed timings. nose is not imported during an import numpy. (For one, import nose takes a full 0.11 seconds on my laptop and adds 199 modules to sys.modules!) The hit is
Re: [Numpy-discussion] import overhead of numpy.testing
On Aug 11, 2013 5:02 AM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Sun, Aug 11, 2013 at 3:35 AM, Benjamin Root ben.r...@ou.edu wrote: Would there be some sort of way to detect that numpy.testing wasn't explicitly imported and issue a deprecation warning? Say, move the code into numpy._testing, import it into the namespace as testing, but then have the testing.py file set a flag in _testing to indicate that an explicit import has occurred? Eventually, even _testing would no longer get imported by default and all would be well. Of course, that might be too convoluted? I'm not sure how that would work (you didn't describe how to decide that the import was explicit), but imho the impact would be too high. Ralf The idea would be that within numpy (and we should fix SciPy as well), we would always import numpy._testing as testing, and not import testing.py ourselves. Then, there would be a flag in _testing.py that would, by default, emit warnings about using np.testing without an explicit import, stating by which version all code will have to be switched (perhaps 2.0?). testing.py would do a from _testing import *, but also set the flag in _testing to not emit warnings, because only a non-numpy (and non-SciPy) module would have imported it. It isn't foolproof. If a project has multiple dependencies that use np.testing, and only one of them explicitly imports np.testing, then the warning becomes hidden for the non-compliant parts. However, if we make sure that the core SciPy projects use np._testing, it would go a long way towards getting the word out. Again, I am just throwing it out there as an idea. The speedups we are getting right now are nice, so it is entirely possible that this kludge is just not worth the last remaining bits of extra time. Cheers! Ben Root 
Re: [Numpy-discussion] import overhead of numpy.testing
On Aug 11, 2013 4:37 PM, Andrew Dalke da...@dalkescientific.com wrote: On Aug 11, 2013, at 10:24 PM, Benjamin Root wrote: The idea would be that within numpy (and we should fix SciPy as well), we would always import numpy._testing as testing, and not import testing.py ourselves. The problem is the existing code out there which does: import numpy as np ... np.testing.utils.assert_almost_equal(x, y) (that is, without an additional import), and other code which does from numpy.testing import * I wouldn't consider having them both emit a warning. The latter one is an explicit import (albeit a horrible one). IIRC, that should import testing.py and deactivate the warnings. However, from numpy import testing would be a problem... Drat... Forget I said anything. The idea wouldn't work. Ben 
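Whatever the import mechanics, the user-visible contract this thread kept circling back to can be stated in two lines (recent NumPy versions satisfy it lazily via module-level __getattr__, which recovers most of the import-time cost debated here):

```python
import sys
import numpy as np

# The compatibility contract: after a bare `import numpy`, np.testing
# must be usable with no further import statement.
np.testing.assert_equal([1, 2], [1, 2])

# Once np.testing has been touched, the submodule is in sys.modules
# regardless of whether it was loaded eagerly or on first access.
loaded = "numpy.testing" in sys.modules
```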
Re: [Numpy-discussion] RAM problem during code execution - Numpy arrays
On Fri, Aug 23, 2013 at 10:34 AM, Francesc Alted franc...@continuum.io wrote: Hi José, The code is somewhat longish for a pure visual inspection, but my advice is that you install memory profiler (https://pypi.python.org/pypi/memory_profiler). This will help you determine which line or lines are hugging the memory the most. Saludos, Francesc On Fri, Aug 23, 2013 at 3:58 PM, Josè Luis Mietta joseluismie...@yahoo.com.ar wrote: Hi experts. I need your help with a RAM problem during execution of my script. I wrote the code below. I use SAGE. In 1-2 hours of execution time the RAM of my laptop (8 GB) is filled and the system crashes:

    from scipy.stats import uniform
    import numpy as np

    cant_de_cadenas = [700, 800, 900]

    cantidad_de_cadenas = np.array([])
    for k in cant_de_cadenas:
        cantidad_de_cadenas = np.append(cantidad_de_cadenas, k)
    cantidad_de_cadenas = np.transpose(cantidad_de_cadenas)

    b = 10
    h = b
    Longitud = 1
    numero_experimentos = 150

    densidad_de_cadenas = cantidad_de_cadenas/(b**2)

    prob_perc = np.array([])
    tiempos = np.array([])
    S_int = np.array([])
    S_medio = np.array([])
    desviacion_standard = np.array([])
    desviacion_standard_nuevo = np.array([])
    anisotropia_macroscopica_porcentual = np.array([])
    componente_y = np.array([])
    componente_x = np.array([])

    import time

    for N in cant_de_cadenas:
        empieza = time.clock()
        PERCOLACION = np.array([])
        size_medio_intuitivo = np.array([])
        size_medio_nuevo = np.array([])
        std_dev_size_medio_intuitivo = np.array([])
        std_dev_size_medio_nuevo = np.array([])
        comp_y = np.array([])
        comp_x = np.array([])

        for u in xrange(numero_experimentos):
            perco = False
            array_x1 = uniform.rvs(loc=-b/2, scale=b, size=N)
            array_y1 = uniform.rvs(loc=-h/2, scale=h, size=N)
            array_angle = uniform.rvs(loc=-0.5*(np.pi), scale=np.pi, size=N)
            array_pendiente_x = 1./np.tan(array_angle)

            random = uniform.rvs(loc=-1, scale=2, size=N)
            lambda_sign = np.zeros([N])
            for t in xrange(N):
                if random[t] < 0:
                    lambda_sign[t] = -1
                else:
                    lambda_sign[t] = 1
            array_lambdas = (lambda_sign*Longitud)/np.sqrt(1+array_pendiente_x**2)

            array_x2 = array_x1 + array_lambdas*array_pendiente_x
            array_y2 = array_y1 + array_lambdas*1

            array_x1 = np.append(array_x1, [-b/2, b/2, -b/2, -b/2])
            array_y1 = np.append(array_y1, [-h/2, -h/2, -h/2, h/2])
            array_x2 = np.append(array_x2, [-b/2, b/2, b/2, b/2])
            array_y2 = np.append(array_y2, [h/2, h/2, -h/2, h/2])

            M = np.zeros([N+4, N+4])

            for j in xrange(N+4):
                if j > 0:
                    x_A1B1 = array_x2[j]-array_x1[j]
                    y_A1B1 = array_y2[j]-array_y1[j]
                    x_A1A2 = array_x1[0:j]-array_x1[j]
                    y_A1A2 = array_y1[0:j]-array_y1[j]
                    x_A2A1 = -1*x_A1A2
                    y_A2A1 = -1*y_A1A2
                    x_A2B2 = array_x2[0:j]-array_x1[0:j]
                    y_A2B2 = array_y2[0:j]-array_y1[0:j]
                    x_A1B2 = array_x2[0:j]-array_x1[j]
                    y_A1B2 = array_y2[0:j]-array_y1[j]
                    x_A2B1 = array_x2[j]-array_x1[0:j]
                    y_A2B1 = array_y2[j]-array_y1[0:j]
                    p1 = x_A1B1*y_A1A2 - y_A1B1*x_A1A2
                    p2 = x_A1B1*y_A1B2 - y_A1B1*x_A1B2
                    p3 = x_A2B2*y_A2B1 - y_A2B2*x_A2B1
                    p4 = x_A2B2*y_A2A1 - y_A2B2*x_A2A1
                    condicion_1 = p1*p2
                    condicion_2 = p3*p4
                    for k in xrange(j):
                        if condicion_1[k] <= 0 and condicion_2[k] <= 0:
                            M[j, k] = 1
                    del condicion_1
                    del condicion_2
                if j+1 < N+4:
                    x_A1B1 = array_x2[j]-array_x1[j]
                    y_A1B1 = array_y2[j]-array_y1[j]
                    x_A1A2 = array_x1[j+1:]-array_x1[j]
                    y_A1A2 = array_y1[j+1:]-array_y1[j]
                    x_A2A1 = -1*x_A1A2
                    y_A2A1 = -1*y_A1A2
                    x_A2B2 = array_x2[j+1:]-array_x1[j+1:]
                    y_A2B2 = array_y2[j+1:]-array_y1[j+1:]
                    x_A1B2 = array_x2[j+1:]-array_x1[j]
                    y_A1B2 = array_y2[j+1:]-array_y1[j]
                    x_A2B1 = array_x2[j]-array_x1[j+1:]
                    y_A2B1 = array_y2[j]-array_y1[j+1:]
                    p1 = x_A1B1*y_A1A2 - y_A1B1*x_A1A2
                    p2 = x_A1B1*y_A1B2 - y_A1B1*x_A1B2
                    p3 = x_A2B2*y_A2B1 - y_A2B2*x_A2B1
                    p4 = x_A2B2*y_A2A1 - y_A2B2*x_A2A1
                    condicion_1 = p1*p2
                    condicion_2 = p3*p4
                    for k in xrange((N+4)-j-1):
                        if condicion_1[k] <= 0 and condicion_2[k] <= 0:
                            M[j, k+j+1] = 1
                    del condicion_1
                    del condicion_2

            M[N, N+2] = 0
            M[N, N+3] = 0
            M[N+1, N+2] = 0
            M[N+1, N+3] = 0
            M[N+2, N] = 0
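If installing memory_profiler is not an option, the standard library's tracemalloc (Python 3.4+) serves the same purpose Francesc describes: pinpointing which lines hold the memory. A minimal sketch, using a plain allocation as a stand-in for one iteration of the script above:

```python
import tracemalloc

tracemalloc.start()

# Stand-in for an allocation-heavy step such as M = np.zeros([N+4, N+4]).
M = [[0.0] * 704 for _ in range(704)]

# Group currently live allocations by source line, biggest first.
snapshot = tracemalloc.take_snapshot()
top = snapshot.statistics("lineno")
```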
Re: [Numpy-discussion] 1.8.0 branch reminder
On Mon, Aug 26, 2013 at 11:01 AM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Sun, Aug 18, 2013 at 6:36 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Sun, Aug 18, 2013 at 12:17 PM, Charles R Harris charlesr.har...@gmail.com wrote: Just a reminder that 1.8.0 will be branched tonight. I've put up a big STY: PR https://github.com/numpy/numpy/pull/3635 that removes trailing whitespace and fixes spacing after commas. I would like to apply it before the branch, but it may cause merge difficulties down the line. I'd like feedback on that option. I've also run autopep8 on the code and it does a nice job of cleaning things up. It gets a little lost in deeply nested lists, but there aren't too many of those. By default it doesn't fix spaces around operators (it seems). I can apply that also if there is interest in doing so. Depends on how many lines of code it touches. For scipy we decided not to do this, because it would make git blame pretty much useless. Ralf At some point, you just have to bite the bullet. Matplotlib has been doing PEP8 work for about a year now. We adopted very specific rules on how that work was to be done (make PEP8-only PRs, each PEP8 PR touches at most one module at a time, etc.). Yes, git blame does make it look like NelleV has taken over the project, but the trade-off is readability. We even discovered a non-trivial number of bugs this way. For a core library like NumPy that has lots of very obscure-looking code that almost never gets changed, avoiding PEP8 is problematic because it always becomes Somebody Else's Problem. Of course, it is entirely up to you, the devs, on what to do for NumPy and SciPy, but that is what matplotlib is doing. Cheers! Ben Root 
Re: [Numpy-discussion] Array addition inconsistency
On Thu, Aug 29, 2013 at 8:04 AM, Robert Kern robert.k...@gmail.com wrote: On Thu, Aug 29, 2013 at 12:00 PM, Martin Luethi lue...@vaw.baug.ethz.ch wrote: Dear all, After some surprise, I noticed an inconsistency while adding array slices: a = np.arange(5) a[1:] = a[1:] + a[:-1] a array([0, 1, 3, 5, 7]) versus inplace a = np.arange(5) a[1:] += a[:-1] a array([ 0, 1, 3, 6, 10]) My suspicion is that the second variant does not create intermediate storage, and thus works on the intermediate result, effectively performing a.cumsum(). Correct. Not creating intermediate storage is the point of using augmented assignment. This can be very sneaky. a = np.arange(5) a[:-1] = a[:-1] + a[1:] a array([1, 3, 5, 7, 4]) a = np.arange(5) a[:-1] += a[1:] a array([1, 3, 5, 7, 4]) So, if someone is semi-careful and tries out that example, they might (incorrectly) assume that such operations are safe, without realizing that it was safe only because the values of a[1:] were ahead of the values of a[:-1] in memory. I could easily imagine a situation where views of an array are passed around, only to finally end up in an in-place operation like this and sometimes be right and sometimes be wrong. Maybe there is some simple check that could be performed to detect this sort of situation? Cheers! Ben Root 
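The check asked for at the end of this email did eventually land: since NumPy 1.13, ufuncs detect memory overlap between inputs and output and copy to a temporary, so the two spellings agree on modern NumPy; on the 2013-era NumPy of this thread, the in-place form effectively computed a cumulative sum. A sketch accepting either era's result:

```python
import numpy as np

# Explicit temporary on the right-hand side: always [0, 1, 3, 5, 7].
a = np.arange(5)
a[1:] = a[1:] + a[:-1]

# In-place form: same result on NumPy >= 1.13 (overlap is detected and
# a temporary is made); cumsum-like [0, 1, 3, 6, 10] on older versions.
b = np.arange(5)
b[1:] += b[:-1]
```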
Re: [Numpy-discussion] Relative speed
On Aug 29, 2013 4:11 PM, Jonathan T. Niehof jnie...@lanl.gov wrote: On 08/29/2013 01:48 PM, Ralf Gommers wrote: Thanks. I had read that quite differently, and I'm sure I'm not the only one. Some context would have helped. My apologies -- that was a rather obtuse reference.

Just for future reference, the language and the community are full of references like these. IDLE is named for Eric Idle, one of the members of Monty Python, while Guido's title of BDFL is a reference to a sketch. But I am sure you never expected that... :-p Cheers! Ben Root
Re: [Numpy-discussion] Bug (?) converting list to array
The two lists are of different lengths -- had to count twice to catch that. (The first row has six elements, the second seven, so NumPy cannot form a rectangular 2-D array and falls back to a 1-D object array of lists.) Ben Root

On Mon, Sep 9, 2013 at 9:46 AM, Chad Kidder cckid...@gmail.com wrote: I'm trying to enter a 2-D array and np.array() is returning a 1-D array of lists. I'm using Python (x,y) on Windows 7 with numpy 1.7.1. Here's the code that is giving me issues:

>>> f1 = [[15.207, 15.266, 15.181, 15.189, 15.215, 15.198],
...       [-45, -57, -62, -70, -72, -73.5, -77]]
>>> f1a = np.array(f1)
>>> f1a
array([[15.207, 15.266, 15.181, 15.189, 15.215, 15.198],
       [-45, -57, -62, -70, -72, -73.5, -77]], dtype=object)

What am I missing?
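To illustrate the pitfall: equal-length rows give a proper 2-D numeric array, ragged rows do not. As added context not in the thread, this behaviour has since tightened -- NumPy 1.24 and later raise a ValueError for ragged input unless dtype=object is passed explicitly:

```python
import numpy as np

# Equal-length rows: a rectangular 2-D float array, as intended.
good = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
assert good.shape == (2, 3)

# Ragged rows cannot form a rectangle. A simple length check like
# this catches the bug in the original post early.
rows = [[1.0, 2.0, 3.0], [4.0, 5.0]]
lengths = {len(r) for r in rows}
assert len(lengths) > 1  # rows have differing lengths: ragged input

# Old NumPy silently produced a 1-D object array of lists here;
# NumPy >= 1.24 requires dtype=object to be spelled out.
ragged = np.array(rows, dtype=object)
assert ragged.shape == (2,)
```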
Re: [Numpy-discussion] PEP8
On Sat, Sep 7, 2013 at 7:56 PM, Charles R Harris charlesr.har...@gmail.com wrote: Hi All, I've been doing some PEP8 work using autopep8. One problem that has turned up is that the default behavior of autopep8 is version dependent. I'd like to put a script in numpy tools that runs autopep8 with some features disabled, namely:

1. E226 -- puts spaces around arithmetic operators (+, -, *, /, **).
2. E241 -- allows only single spaces after ','.

Something we have done in matplotlib is that we have made PEP8 a part of the tests. We are transitioning, but the idea is that eventually, with Travis, all pull requests will get PEP8-checked. I am very leery of automatic PEP8-ing. I would rather have the tests fail and let me fix things manually than have the code changed automatically.

The first [E226] leaves expression formatting in the hands of the coder and avoids things like 2 ** 3. The second [E241] allows array entries to be vertically aligned, which can be useful in clarifying the values used in tests. A few other things that might need decisions:

1. [:,:, 2] or [:, :, 2]
2. Blank line before the first function after class Foo():

For the first one, I prefer spaces. For the second one, I prefer no blank lines. Cheers! Ben Root
Re: [Numpy-discussion] can argmax be used to return row and column indices?
On Fri, Sep 13, 2013 at 4:27 AM, Mark Bakker mark...@gmail.com wrote: Thanks, Gregorio. I would like it if argmax had a keyword option to return the row, column index automatically (or whatever the dimension of the array). After all, argmax already knows the shape of the array. Calling np.unravel_index(np.argmax(A), A.shape) seems unnecessarily long. But it works well, though! I am not sure that such a PR would get much support. Thanks again, Mark

What should it do when np.argmax() gets an axis=1 argument? I see confusion occurring when parsing the returned results for arbitrary-dimension inputs. -1 on any such PR. +1 on making sure all arg*() functions have unravel_index() very prominent in their documentation (which argmax() does have right now). Ben Root
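For reference, the idiom under discussion, as a minimal sketch (the array values are illustrative):

```python
import numpy as np

# argmax flattens the array and returns a single integer offset;
# unravel_index converts that offset back to an N-d (row, column)
# index using the array's shape.
A = np.array([[1, 9, 2],
              [3, 4, 5]])
idx = np.unravel_index(np.argmax(A), A.shape)
assert idx == (0, 1)      # the maximum, 9, sits at row 0, column 1
assert A[idx] == A.max()  # indexing with the tuple recovers it
```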
Re: [Numpy-discussion] Indexing changes/deprecations
On Fri, Sep 27, 2013 at 8:27 AM, Sebastian Berg sebast...@sipsolutions.net wrote: Hey, since I am working on the indexing, I was wondering about a few smaller things:

* 0-d boolean array: `np.array(0)[True]` (which will work now) would give np.array([0]) as a copy, instead of the original array. I guess I could add a FutureWarning or so, but I am not sure, and overall the chance of creating bugs seems low. (The boolean index should always add 1 dimension and here remove 0 dimensions, giving a 1-d result.)

* All index operations return a view, never the object itself. This means that `v = arr[...]` is slightly slower. But since it does not affect `arr[...] = vals`, I think the speed implications are negligible.

* Does anyone have an idea if there is a way to change the subclass logic that view-based item setting is implemented as: np.asarray(subclass[index]) = vals? I somewhat think the subclass should rather implement `__setitem__` instead of relying on numpy calling its `__getitem__`, but I don't see how it can be changed.

* Still thinking a bit about implementing a keepdims keyword or function, to handle matrix-type logic mostly in the C code.

And most importantly, is there any behaviour thing in the index machinery that is bugging you, which I may have forgotten until now? - Sebastian

Boolean indexing could use a facelift. First, consider the following (albeit minor) annoyance:

>>> import numpy as np
>>> a = np.arange(5)
>>> a[[True, False, True, False, True]]
array([1, 0, 1, 0, 1])
>>> b = np.array([True, False, True, False, True])
>>> a[b]
array([0, 2, 4])

Next, it would be nice if boolean indexing returned a view (wishful thinking, I know):

>>> c = a[b]
>>> c
array([0, 2, 4])
>>> c[1] = 7
>>> c
array([0, 7, 4])
>>> a
array([0, 1, 2, 3, 4])

Cheers! Ben Root
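A small illustration of Ben's two points, with one caveat: in the 2013 NumPy quoted above, a plain Python list of booleans was coerced to the integer indices 0/1 (hence the "annoyance"), whereas modern NumPy treats such a list as a mask too, so the sketch below sticks to an explicit boolean array:

```python
import numpy as np

# Boolean-array indexing selects elements where the mask is True and
# always returns a copy, never a view.
a = np.arange(5)
mask = np.array([True, False, True, False, True])

c = a[mask]
assert c.tolist() == [0, 2, 4]

# Writing to the copy does not touch the original...
c[1] = 7
assert a.tolist() == [0, 1, 2, 3, 4]

# ...but assigning through the mask on the left-hand side does.
a[mask] = [10, 20, 30]
assert a.tolist() == [10, 1, 20, 3, 30]
```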
Re: [Numpy-discussion] [SciPy-Dev] 1.8.0rc1
On Wed, Oct 2, 2013 at 11:43 AM, Charles R Harris charlesr.har...@gmail.com wrote: Hi Stefan, On Wed, Oct 2, 2013 at 9:29 AM, Stéfan van der Walt ste...@sun.ac.za wrote: Hi Chuck, On Tue, Oct 1, 2013 at 1:07 AM, Charles R Harris charlesr.har...@gmail.com wrote: I'll bet the skimage problems come from https://github.com/numpy/numpy/pull/3811. They may be doing something naughty... Reverting that commit fixes those skimage failures. However, there are a number of python2.7 failures that look pretty strange.

What is the exact change in behavior with that PR? I'm trying to figure out what skimage does wrong in this case.

The current master (reverted for the 1.8 release only) is stricter about np.bool only taking the values 0 or 1. Apparently convolve returns boolean output for boolean input (I haven't checked), and consequently the check whether the return value matches the number of 1 elements in the convolution kernel will fail when that number is greater than one. That is why the proposed fix is to view the boolean array as uint8 instead. Note that out=(boolean) will still cause problems. Chuck

So, just to be clear... what would happen if I had an array of floats between 0 and 1 inclusive and I cast it as boolean using astype()? Ben Root
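To answer Ben's question directly: astype(bool) does not round. It maps exactly zero to False and every nonzero value, including 0.5, to True. A quick check, together with the uint8 view workaround Chuck mentions:

```python
import numpy as np

# Casting floats to bool: only exact 0.0 becomes False.
x = np.array([0.0, 0.25, 0.5, 1.0])
b = x.astype(bool)
assert b.tolist() == [False, True, True, True]

# Viewing booleans as uint8 reinterprets the same bytes (True is
# stored as 1), so summing counts the True entries -- the proposed
# fix for the boolean convolution check discussed above.
flags = np.array([True, True, False])
assert flags.view(np.uint8).sum() == 2
```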
Re: [Numpy-discussion] Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.
On Wed, Oct 2, 2013 at 1:05 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Oct 2, 2013 at 10:56 AM, josef.p...@gmail.com wrote: On Wed, Oct 2, 2013 at 12:37 PM, Stéfan van der Walt ste...@sun.ac.za wrote: On 2 Oct 2013 18:04, Charles R Harris charlesr.har...@gmail.com wrote: The question is what to do when all-nan slices are encountered in the nan{max, min} and nanarg{max, min} functions. Currently in 1.8.0, the first returns nan and raises a warning, the second returns intp.min and raises a warning. It is proposed that the nanarg{max, min} functions, and possibly the nan{max, min} also, raise an error instead.

I agree with Nathan; this sounds like more reasonable behaviour to me.

If I understand what you are proposing: -1 on raising an error with nan{max, min}. An empty array is empty in all columns; an array with nans might be empty in only some columns. As far as I understand, nan{max, min} only make sense with arrays that can hold a nan, so we can return nans.

That was my original thought. If a user calls with ints or bools, then there are either no nans or the array is empty, and I don't care.

As an aside: with nanarg{max, min} I would just return 0 for an all-nan column, since the max or min is nan, and one is at index zero. (But I'm not arguing.)

That is an interesting proposal. I like it. Chuck

And it is logically consistent, I think: a[nanargmax(a)] == nanmax(a) (ignoring the silly detail that you can't do an equality test on nans). Ben Root
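Ben's consistency property can be checked directly. The all-NaN behaviour shown below is what ended up in released NumPy: nan{max, min} warn and return nan, while nanarg{max, min} raise a ValueError rather than returning 0 or intp.min (the array values are illustrative):

```python
import warnings
import numpy as np

# With some NaNs present, nanargmax points at the position of nanmax.
a = np.array([np.nan, 3.0, np.nan, 7.0, 5.0])
i = np.nanargmax(a)
assert i == 3
assert a[i] == np.nanmax(a)

# All-NaN input: nanmax emits a RuntimeWarning and returns nan...
b = np.array([np.nan, np.nan])
with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)
    assert np.isnan(np.nanmax(b))

# ...while nanargmax raises, since no valid index exists.
try:
    np.nanargmax(b)
except ValueError:
    pass
else:
    raise AssertionError("expected ValueError for all-NaN slice")
```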