[Numpy-discussion] PyArray_EMPTY and Cython

2008-12-02 Thread Gabriel Gellner
After some discussion on the Cython lists I thought I would try my hand at
writing some Cython accelerators for empty and zeros. This will involve using
PyArray_EMPTY; I have a simple prototype I would like to get working, but
currently it segfaults. Any tips on what I might be missing?

import numpy as np
cimport numpy as np

cdef extern from "numpy/arrayobject.h":
    object PyArray_EMPTY(int ndims, np.npy_intp* dims, int type, bint fortran)

cdef np.ndarray empty(np.npy_intp length):
    cdef np.ndarray[np.double_t, ndim=1] ret
    cdef int type = np.NPY_DOUBLE
    cdef int ndims = 1

    cdef np.npy_intp* dims
    dims = &length

    print dims[0]
    print type

    ret = PyArray_EMPTY(ndims, dims, type, False)

    return ret

def test():
    cdef np.ndarray[np.double_t, ndim=1] y = empty(10)

    return y


The code seems to print out the correct dims and type info but segfaults when
the PyArray_EMPTY call is made.
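
For anyone finding this in the archive: the usual cause of a segfault at the
very first PyArray_* call is that the NumPy C API was never initialized. A
minimal sketch of the fix, assuming the standard cimport setup used above:

    cimport numpy as np

    # This must run once at module import time; without it the
    # C API function table is a NULL pointer, and the first
    # PyArray_EMPTY call dereferences it and segfaults.
    np.import_array()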

Thanks,

Gabriel



Re: [Numpy-discussion] Proposal for changing the names of inverse trigonometrical/hyperbolic functions

2008-11-24 Thread Gabriel Gellner
On Mon, Nov 24, 2008 at 07:45:56PM +0100, Francesc Alted wrote:
 Hi,
 
 After dealing with another issue, I realized that the names of inverse 
 trigonometrical/hyperbolic functions in NumPy don't follow the main 
 standards in computer science.  For example, where Python writes:
 
 asin, acos, atan, asinh, acosh, atanh
 
 NumPy chose:
 
 arcsin, arccos, arctan, arcsinh, arccosh, arctanh
 
 And not only Python: the former also seems to be the standard in 
 computer science.  Quoting:
 
 http://en.wikipedia.org/wiki/Inverse_hyperbolic_function
 
 
 The usual abbreviations for them in mathematics are arsinh, arcsinh (in 
 the USA) or asinh (in computer science).
 ...
 The acronyms arcsinh, arccosh etc. are commonly used, even though they 
 are misnomers, since the prefix arc is the abbreviation for arcus, 
 while the prefix ar stands for area.
 
 
 So, IMHO, I think it would be better to rename the inverse trigonometric 
 functions from the ``arc*`` prefix to ``a*``.  Of course, in order to do 
 that correctly, one should add the new names and add a 
 ``DeprecationWarning`` informing that people should start to use the 
 new names.  After two or three NumPy versions, the old function names 
 can be removed safely.
 
 What do people think?
 
+1
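
For concreteness, a minimal sketch of what the transition could look like.
The helper below is hypothetical, not actual NumPy code, but shows the
add-alias-then-deprecate pattern the proposal describes:

    import warnings
    import numpy as np

    def _deprecated_alias(func, old_name, new_name):
        # Hypothetical helper: keep the old name working while
        # warning users toward the new one.
        def wrapper(*args, **kwargs):
            warnings.warn("%s is deprecated, use %s instead" % (old_name, new_name),
                          DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)
        return wrapper

    asin = np.arcsin                                        # new preferred name
    arcsin = _deprecated_alias(np.arcsin, "arcsin", "asin")  # old name now warns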

Gabriel


Re: [Numpy-discussion] Proposal for changing the names of inverse trigonometrical/hyperbolic functions

2008-11-24 Thread Gabriel Gellner
 There is resistance. Please don't remove the old names. Also note that 
 your proposed change will alter people's code in subtle, but potentially 
 very interesting ways:
 
 >>> from math import *
 >>> from numpy import *
 >>> type(arcsin(1)) is type(asin(1))
 False
 >>> from numpy import arcsin as transformacion_del_arco_seno
 >>> arcsin == transformacion_del_arco_seno
 True
 
 asin(1j) raises an exception, arcsin doesn't. They are *different* 
 functions, hence the names.
 
Yet:

>>> type(np.sin(1)) == type(math.sin(1))
False

And they have the same name. Isn't this what namespaces are for? I think it is
strange that some of the math functions share a name across the modules and
some don't; I can't see how being different functions justifies this, or else
we would need to rename the ordinary trig functions as well.

I can see not wanting to break API compatibility but I don't find the
`different functions` argument compelling.
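
For the record, the domain difference is easy to check (a typical session;
traceback abbreviated):

>>> import math
>>> import numpy as np
>>> math.asin(1j)
Traceback (most recent call last):
  ...
TypeError: can't convert complex to float
>>> np.arcsin(1j)
0.88137358701954305j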

Gabriel



Re: [Numpy-discussion] setting element

2008-11-12 Thread Gabriel Gellner
On Wed, Nov 12, 2008 at 09:43:34AM -0800, Charles سمير Doutriaux wrote:
 Hello,
 
 I'm wondering if there's a quick way to do the following:
 
 s[:,5]=value
 
 in a general function
 def setval(array,index,value,axis=0):
   ## code here
 
 The issue is putting enough :'s before the index value inside the
 square brackets of the assignment.
 
Make some slice objects!

def setval(array, index, value, axis=0):
    key = [slice(None)] * len(array.shape)
    key[axis] = index
    array[tuple(key)] = value  # index with a tuple of slices, not a list
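
So for a 2-d array, setval(a, 5, v, axis=1) does the same thing as
a[:, 5] = v. A quick check (array contents arbitrary):

    import numpy as np

    a = np.zeros((3, 10))
    setval(a, 5, 1.0, axis=1)
    assert (a[:, 5] == 1.0).all()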

Gabriel



Re: [Numpy-discussion] setting element

2008-11-12 Thread Gabriel Gellner
On Wed, Nov 12, 2008 at 12:34:51PM -0600, Ryan May wrote:
 Charles سمير Doutriaux wrote:
  Hello,
  
  I'm wondering if there's a quick way to do the following:
  
  s[:,5]=value
  
  in a general function
  def setval(array,index,value,axis=0):
  ## code here
 
 Assuming that axis specifies where the index goes, that would be:
 
 def setval(array, index, value, axis=0):
   slices = [slice(None)] * len(array.shape)
   slices[axis] = index
   array[slices] = value
 
 (Adapted from the code for numpy.diff)
 
 Ryan
 
Jinx!


[Numpy-discussion] by axis iterator

2008-11-12 Thread Gabriel Gellner
Something I use a lot is a little generator that iterates over an ndarray
along a given axis. I was wondering if this is already built into numpy
(not using apply_along_axis, which I find ugly), and if not, whether there
would be interest in adding it?

the function is just:

def by_axis(ndobj, axis=0):
    index_set = [slice(None)] * len(ndobj.shape)
    for i in xrange(ndobj.shape[axis]):
        index_set[axis] = i
        yield ndobj[tuple(index_set)]

and can be used just like:

>>> [sum(x) for x in by_axis(a, 1)]
>>> for col in by_axis(a, 1):
...     print col

I use it when porting R code that uses a lot of apply-like logic. I know most
numpy functions have the axis argument built in, but when writing my own
functions I find this a real time saver.

Anyway, if someone can show me a better way I would be overjoyed, or if people
like this I can make a ticket on Trac.
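
One built-in alternative, for what it's worth: np.rollaxis moves the chosen
axis to the front, after which plain iteration yields the same subarrays
(a sketch, not a claim that it covers every use of the generator above):

    import numpy as np

    a = np.arange(12).reshape(3, 4)
    # np.rollaxis(a, 1) has shape (4, 3); iterating over it
    # walks the columns of a, like by_axis(a, 1)
    col_sums = [col.sum() for col in np.rollaxis(a, 1)]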

Gabriel


Re: [Numpy-discussion] random number generation in python compared to gsl

2008-11-05 Thread Gabriel Gellner
On Wed, Nov 05, 2008 at 03:19:09PM +0100, Matthieu Brucher wrote:
  Not in this case: I always get the same sequence with seed=0
  (different for both implementations, but the same each time I run it.)
  I got around it by installing pygsl and taking random numbers from
  there instead of from numpy.
 
  But I still find it strange to get two different sequences from two
  implementations that claim to be the same algorithm...
 
  Giovanni
 
 Hi,
 
 I didn't check which MT was used, because there are several different ones.
 MT is based on one Mersenne prime, but for instance the Boost library
 wraps two generators with two different values. Perhaps the same
 occurs here.
 Other explanations include:
 - MT must use an array of starting values, perhaps the first is 0, but
 one is 0, 1, 2, 3, ... and the other uses 0, 0, 0, ...
This is the key: there is a lot of variation in the seeding algorithm used by
Mersenne Twister implementations; usually they use some kind of linear
congruential generator to get started. I think that GSL uses the published
Mersenne Twister seeding algorithm. What you can do is find out how GSL does
this and set the full seed array in numpy.random.RandomState yourself. I have
done this to compare against R's Mersenne Twister using my own MT
implementation, and it works like a charm.
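
A sketch of the trick, assuming GSL follows the published init_genrand()
seeding from Matsumoto and Nishimura's reference code (that, and the seed
value used below, are assumptions to check against the GSL source):

    import numpy as np

    def mt_key(seed):
        # init_genrand() from the MT19937 reference implementation:
        # a linear congruential recurrence filling the 624-word state
        mt = [0] * 624
        mt[0] = seed & 0xFFFFFFFF
        for i in range(1, 624):
            mt[i] = (1812433253 * (mt[i - 1] ^ (mt[i - 1] >> 30)) + i) & 0xFFFFFFFF
        return np.array(mt, dtype=np.uint32)

    rs = np.random.RandomState()
    # state = (id, key, pos, has_gauss, cached_gaussian); pos=624
    # forces a fresh block of outputs on the next draw.
    # 4357 is reportedly what GSL substitutes for seed 0 (assumption).
    rs.set_state(('MT19937', mt_key(4357), 624, 0, 0.0))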

Gabriel 



Re: [Numpy-discussion] How to do: y[y<T] = y+T

2008-10-27 Thread Gabriel Gellner
On Mon, Oct 27, 2008 at 12:41:06PM +0100, Nicolas ROUX wrote:
 Hello,
 
 I hope this is not a silly question ;-)
 I have a Numpy array, and I want to process it with :
   if the value is lower than Threshold, then increase by Threshold
 
 
 I would like to translate it as:
  y[y < Threshold] = y + Threshold
 
You are close, you just need to have the right hand side also use vector
indexing (since otherwise you are trying to add something of length y to a
subset of y):

y[y < Threshold] = y[y < Threshold] + Threshold
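
A quick end-to-end check of the fixed version (values arbitrary):

    import numpy as np

    Threshold = 0.5
    y = np.array([0.1, 0.7, 0.3, 0.9])
    y[y < Threshold] = y[y < Threshold] + Threshold
    # y is now [0.6, 0.7, 0.8, 0.9]; the in-place form
    # y[y < Threshold] += Threshold is equivalent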

Gabriel


Re: [Numpy-discussion] [mailinglist] How to do: y[y<T] = y+T

2008-10-27 Thread Gabriel Gellner
On Mon, Oct 27, 2008 at 12:45:44PM +0100, Uwe Schmitt wrote:
 Nicolas ROUX schrieb:
  Hello,
 
  I hope this is not a silly question ;-)
  I have a Numpy array, and I want to process it with :
if the value is lower than Threshold, then increase by Threshold
 
 
  I would like to translate it as:
   y[y < Threshold] = y + Threshold

 Hi,
 
 your solution does not work, because the arrays on both
 sides do not have the same size in general.
 
 You can do it in place:
 
 y[y < T] += T
 
Nice, I didn't know this :-) Thanks.

Gabriel


Re: [Numpy-discussion] ANN: Python programs for epidemic modelling

2008-10-25 Thread Gabriel Gellner
 Since the programs heavily use the numpy, scipy and matplotlib libraries,
 I am sending this announcement to all three lists and the main python-list;
 sorry for double-posting. The announcement with the related links is at
 http://blog.deductivethinking.com/?p=29. The programs are at
 http://wiki.deductivethinking.com/wiki/Python_Programs_for_Modelling_Infectious_Diseases_book.
Thanks for this, it is great!

Gabriel


Re: [Numpy-discussion] (Late) summary of PEP-225 discussion at Scipy

2008-10-22 Thread Gabriel Gellner
On Wed, Oct 22, 2008 at 02:00:20PM -0700, Fernando Perez wrote:
 Hi all,
 
 much delayed, but here it is, finally.  The doc regarding our
 discussion about PEP 225 is attached, and I'm keeping a public copy
 for viewing convenience (with html) here:
 
 https://cirl.berkeley.edu/fperez/static/numpy-pep225/
 
 Note that, as indicated in the link above, the real doc is in bzr, so
 you can check it out to generate patches or a  branch (the preferred
 formats for big changes).
 
 This is just a first cut, going from memory and notes.  I'd appreciate
 any feedback, corrections, etc.  I'm giving a talk at the BayPiggies
 group Nov 13 about SciPy and will take the opportunity to bring this
 document up, since that user group has a lot of python people,
 including some core developers.
 
 Since this is from memory and written by me, it's likely pretty
 biased.  But I really want to clarify that I'm trying to act as a
 scribe here, not to push my own agenda (I do have one :).  So please
 bring up anything you feel is missing/misstated here; I'd really like
 to have a doc that people feel is a fair representation of the
 community's views to bring to the core python team.  That way we
 either get our operators or not once and for all, but this issue can
 be put to rest in the language for good (as of 2.7/3.1, of course).
 
 So let's have Nov 11 or so as a wrapup deadline on this discussion, so
 I can have the summary ready for the Nov 13 talk.
 
 Best,
 
 f
Thanks for taking the lead on this. Just in case you want more examples:
Fortran 90/95 allows user-defined operators, much as R does; see:
http://h21007.www2.hp.com/portal/download/files/unprot/fortran/docs/lrm/lrm0184.htm#generic_inter_op

I think that approach is ideal, as I am used to it, but at the least I vote
for a dedicated matrix-multiplication operator.
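
For anyone skimming the archive, the core pain point these operator proposals
target, in plain NumPy (values arbitrary):

>>> import numpy as np
>>> A = np.array([[1, 2], [3, 4]])
>>> B = np.array([[5, 6], [7, 8]])
>>> A * B        # elementwise product
array([[ 5, 12],
       [21, 32]])
>>> np.dot(A, B) # matrix product; what a dedicated operator would spell
array([[19, 22],
       [43, 50]])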

Gabriel



Re: [Numpy-discussion] var bias reason?

2008-10-15 Thread Gabriel Gellner
On Wed, Oct 15, 2008 at 09:45:39AM -0500, Travis E. Oliphant wrote:
 Gabriel Gellner wrote:
  Some colleagues noticed that var uses biased formulas by default in numpy;
  searching for the reason only brought up:
 
  http://article.gmane.org/gmane.comp.python.numeric.general/12438/match=var+bias
 
  which I totally agree with, but there was no response.  Any reason for this?
 I will try to respond to this as it was me who made the change.  I think 
 there have been responses, but I think I've preferred to stay quiet 
 rather than feed a flame war.   Ultimately, it is a matter of preference 
 and I don't think there would be equal weights given to all the 
 arguments surrounding the decision by everybody.
 
 I will attempt to articulate my reasons: dividing by n is the maximum 
 likelihood estimator of variance and I prefer that justification more 
 than the un-biased justification for a default (especially given that 
 bias is just one part of the error in an estimator).  Having every 
 package that computes the variance return the un-biased estimate gives 
 the concept more cultural weight than it deserves, I think.  Any 
 surprise created by the different default should be mitigated by the 
 fact that it's an opportunity to learn something about what you are 
 doing.  Here is a paper I wrote on the subject that you might find 
 useful:
 
 https://contentdm.lib.byu.edu/cdm4/item_viewer.php?CISOROOT=/EER&CISOPTR=134&CISOBOX=1&REC=1
 (Hopefully, they will resolve a link problem at the above site soon, but 
 you can read the abstract).
 
Thanks for the reply; I look forward to reading the paper when it is
available. The major issue in my mind is not the technical question but the
surprise factor. I can't think of a single other package that uses this as
the default, and since var is also a method of ndarray (which is a built-in
type and can't be monkey-patched) there is no way of taking a different view
(that is, supplying my own function) without the confusion I am seeing in my
own lab (less technical people need to understand that they shouldn't use a
method of the same name).
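
To make the surprise concrete (a minimal interactive sketch; assumes a NumPy
recent enough to have the ddof argument):

>>> import numpy as np
>>> x = np.array([1.0, 2.0, 3.0, 4.0])
>>> x.var()        # default: divide by n (the ML estimate)
1.25
>>> x.var(ddof=1)  # divide by n - 1 (the unbiased estimate)
1.6666666666666667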

I worry about having numpy take this unpopular stance (as far as packages go)
simply to fight the good fight, as a built-in method/behaviour of every
ndarray, rather than as a built-in function, which would present no such
problem, since a function allows dissent over a clearly muddy issue.

Sorry for the noise; I am happy to see there is a reason, but I can't help
finding this a wart, for purely pedagogical reasons.

Gabriel


Re: [Numpy-discussion] Record arrays

2008-06-26 Thread Gabriel Gellner
 Let's be clear, there are two very closely related things: recarrays
 and record arrays. Record arrays are just ndarrays with a complicated
 dtype. E.g.
 
 In [1]: from numpy import *
 
 In [2]: ones(3, dtype=dtype([('foo', int), ('bar', float)]))
 Out[2]:
 array([(1, 1.0), (1, 1.0), (1, 1.0)],
   dtype=[('foo', '<i4'), ('bar', '<f8')])
 
 In [3]: r = _
 
 In [4]: r['foo']
 Out[4]: array([1, 1, 1])
 
 
 recarray is a subclass of ndarray that just adds attribute access to
 record arrays.
 
 In [10]: r2 = r.view(recarray)
 
 In [11]: r2
 Out[11]:
 recarray([(1, 1.0), (1, 1.0), (1, 1.0)],
   dtype=[('foo', '<i4'), ('bar', '<f8')])
 
 In [12]: r2.foo
 Out[12]: array([1, 1, 1])
 
 
 One downside of this is that the attribute access feature slows down
 all field accesses, even the r['foo'] form, because it sticks a bunch
 of pure Python code in the middle. Much code won't notice this, but if
 you end up having to iterate over an array of records (as I have),
 this will be a hotspot for you.
 
 Record arrays are fundamentally a part of numpy, and no one is even
 suggesting that they would go away. No one is seriously suggesting
 that we should remove recarray, but some of us hesitate to recommend
 its use over plain record arrays.
 
 Does that clarify the discussion for you?
 
Thanks! This has always been something that has confused me . . . This is
awesome; I guess I built my DataFrame object for nothing :-)

Gabriel