[Numpy-discussion] PyArray_EMPTY and Cython
After some discussion on the Cython lists I thought I would try my hand at writing some Cython accelerators for empty and zeros. This will involve using PyArray_EMPTY. I have a simple prototype I would like to get working, but currently it segfaults. Any tips on what I might be missing?

    import numpy as np
    cimport numpy as np

    cdef extern from "numpy/arrayobject.h":
        PyArray_EMPTY(int ndims, np.npy_intp* dims, int type, bint fortran)

    cdef np.ndarray empty(np.npy_intp length):
        cdef np.ndarray[np.double_t, ndim=1] ret
        cdef int type = np.NPY_DOUBLE
        cdef int ndims = 1
        cdef np.npy_intp* dims
        dims = length
        print dims[0]
        print type
        ret = PyArray_EMPTY(ndims, dims, type, False)
        return ret

    def test():
        cdef np.ndarray[np.double_t, ndim=1] y = empty(10)
        return y

The code seems to print out the correct dims and type info but segfaults when the PyArray_EMPTY call is made.

Thanks,
Gabriel

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Proposal for changing the names of inverse trigonometrical/hyperbolic functions
On Mon, Nov 24, 2008 at 07:45:56PM +0100, Francesc Alted wrote:
> Hi,
>
> After dealing with another issue, I realized that the names of the inverse trigonometric/hyperbolic functions in NumPy don't follow the main standards in computer science. For example, where Python writes asin, acos, atan, asinh, acosh, atanh, NumPy chose arcsin, arccos, arctan, arcsinh, arccosh, arctanh.
>
> And it is not only Python; the former also seems to be the standard in computer science. Quoting http://en.wikipedia.org/wiki/Inverse_hyperbolic_function:
>
>     The usual abbreviations for them in mathematics are arsinh, arcsinh (in the USA) or asinh (in computer science). ... The acronyms arcsinh, arccosh etc. are commonly used, even though they are misnomers, since the prefix arc is the abbreviation for arcus, while the prefix ar stands for area.
>
> So, IMHO, it would be better to rename the inverse trigonometric functions from the ``arc*`` prefix to the ``a*`` prefix. Of course, in order to do that correctly, one should add the new names together with a ``DeprecationWarning`` informing people that they should start using the new names. After two or three NumPy versions, the old names can be removed safely. What do people think?

+1

Gabriel
Re: [Numpy-discussion] Proposal for changing the names of inverse trigonometrical/hyperbolic functions
> There is resistance. Please don't remove the old names. Also note that your proposed change will alter people's code in subtle, but potentially very interesting ways:
>
>     >>> from math import *
>     >>> from numpy import *
>     >>> type(arcsin(1)) is type(asin(1))
>     False
>     >>> from numpy import arcsin as transformacion_del_arco_seno
>     >>> arcsin == transformacion_del_arco_seno
>     True
>
> asin(1j) raises an exception; arcsin doesn't. They are *different* functions, hence the names.

Yet:

    >>> type(np.sin(1)) == type(math.sin(1))
    False

And they have the same name. Isn't this what namespaces are for? I find it strange that some of the math functions share a name and some don't; I can't see how being different functions justifies this, or else we would need to rename the normal trig functions as well. I can see not wanting to break API compatibility, but I don't find the "different functions" argument compelling.

Gabriel
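The clash described above is easy to check with explicit module prefixes (a small self-contained version of the behaviour being debated):

```python
import math
import numpy as np

# Same name, different return types: numpy returns a numpy scalar
print(type(math.sin(1)))    # the builtin float
print(type(np.sin(1)))      # a numpy float64

# The domain behaviour differs too: math.asin rejects complex input,
# while np.arcsin returns a complex result
print(np.arcsin(1j))
try:
    math.asin(1j)
except TypeError as exc:
    print("math.asin(1j) raised:", exc)
```

This is the sense in which they are "different functions" even when the names coincide.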
Re: [Numpy-discussion] setting element
On Wed, Nov 12, 2008 at 09:43:34AM -0800, Charles سمير Doutriaux wrote:
> Hello,
>
> I'm wondering if there's a quick way to do the following:
>
>     s[:, 5] = value
>
> in a general function:
>
>     def setval(array, index, value, axis=0):
>         ## code here
>
> The issue is to put enough ``:`` before the index value inside the square brackets of the assignment.

Make some slice objects!

    def setval(array, index, value, axis=0):
        key = [slice(None)] * len(array.shape)
        key[axis] = index
        array[key] = value

Gabriel
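A self-contained version of the same idea, for anyone trying it today; note the key is converted to a tuple before indexing, which newer NumPy releases require for this kind of key (the list form in the reply worked at the time):

```python
import numpy as np

def setval(array, index, value, axis=0):
    # Build a key equivalent to array[:, ..., index, ..., :] with
    # the plain index sitting at position `axis`
    key = [slice(None)] * array.ndim
    key[axis] = index
    array[tuple(key)] = value

a = np.zeros((3, 6))
setval(a, 5, 1.0, axis=1)   # same effect as a[:, 5] = 1.0
print(a)
```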
Re: [Numpy-discussion] setting element
On Wed, Nov 12, 2008 at 12:34:51PM -0600, Ryan May wrote:
> Charles سمير Doutriaux wrote:
>> Hello,
>>
>> I'm wondering if there's a quick way to do the following:
>>
>>     s[:, 5] = value
>>
>> in a general function:
>>
>>     def setval(array, index, value, axis=0):
>>         ## code here
>
> Assuming that axis specifies where the index goes, that would be:
>
>     def setval(array, index, value, axis=0):
>         slices = [slice(None)] * len(array.shape)
>         slices[axis] = index
>         array[slices] = value
>
> (Adapted from the code for numpy.diff)
>
> Ryan

Jinx!
[Numpy-discussion] by axis iterator
Something I use a lot is a little generator that iterates over an ndarray along a given axis. I was wondering whether this is already built into numpy (and not using apply_along_axis, which I find ugly) and, if not, whether there would be interest in adding it. The function is just:

    def by_axis(ndobj, axis=0):
        index_set = [slice(None)] * len(ndobj.shape)
        for i in xrange(ndobj.shape[axis]):
            index_set[axis] = i
            yield ndobj[index_set]

and can be used just like:

    [sum(x) for x in by_axis(a, 1)]

    for col in by_axis(a, 1):
        print col

I use it when porting R code that uses a lot of apply-like logic. I know most numpy functions have the axis argument built in, but when writing my own functions I find this a real time saver. Anyway, if someone can show me a better way I would be overjoyed, or if people like this I can make a ticket on Trac.

Gabriel
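For reference, a runnable version of the generator (with a tuple key, and `range` instead of the Python-2 `xrange`):

```python
import numpy as np

def by_axis(ndobj, axis=0):
    # Yield successive slices of `ndobj` taken along `axis`
    key = [slice(None)] * ndobj.ndim
    for i in range(ndobj.shape[axis]):
        key[axis] = i
        yield ndobj[tuple(key)]

a = np.arange(6).reshape(2, 3)
col_sums = [col.sum() for col in by_axis(a, axis=1)]
print(col_sums)   # sums of the three columns of a
```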
Re: [Numpy-discussion] random number generation in python compared to gsl
On Wed, Nov 05, 2008 at 03:19:09PM +0100, Matthieu Brucher wrote:
>> Not in this case: I always get the same sequence with seed=0 (different for the two implementations, but the same each time I run it). I got around it by installing pygsl and taking random numbers from there instead of from numpy. But I still find it strange to get two different sequences from two implementations that claim to be the same algorithm...
>>
>> Giovanni
>
> Hi,
>
> I didn't check which MT was used, because there are several different ones. MT is based on one Mersenne prime, but, for instance, the Boost library wraps two generators with two different values. Perhaps the same occurs here. Other explanations include: MT must use an array of starting values; perhaps one implementation starts from 0, 1, 2, 3, ... while the other uses 0, 0, 0, ...

This is the key: there is a lot of variation in the seeding algorithm used by Mersenne Twister implementations; usually they use some kind of linear congruential algorithm to get started. I think that GSL uses the published Mersenne Twister seeding algorithm. What you can do is find out how GSL does this and set the full seed array in numpy.random.RandomState yourself. I have done this for comparison with R's Mersenne Twister (using my own MT implementation) and it works like a charm.

Gabriel
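To sketch what "set the full seed array yourself" might look like: the snippet below reproduces the reference init_genrand() recurrence from the published MT19937 code and pushes the resulting 624-word state into a RandomState. The comparison at the end assumes NumPy's legacy integer seeding uses that same recurrence; to match GSL instead, you would substitute GSL's seeding routine in the fill loop.

```python
import numpy as np

def init_genrand(seed):
    # Reference MT19937 seeding: a Knuth-style linear congruential fill
    key = np.empty(624, dtype=np.uint32)
    key[0] = seed
    for i in range(1, 624):
        prev = int(key[i - 1])
        key[i] = (1812433253 * (prev ^ (prev >> 30)) + i) & 0xFFFFFFFF
    return key

rs = np.random.RandomState()
# pos=624 forces a fresh batch of outputs on the next draw
rs.set_state(('MT19937', init_genrand(42), 624, 0, 0.0))

reference = np.random.RandomState(42)
print(rs.random_sample(3))
print(reference.random_sample(3))
```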
Re: [Numpy-discussion] How to do: y[y&lt;T] = y+T
On Mon, Oct 27, 2008 at 12:41:06PM +0100, Nicolas ROUX wrote:
> Hello,
>
> I hope this is not a silly question ;-) I have a Numpy array, and I want to process it with: if the value is lower than Threshold, then increase it by Threshold. I would like to translate it as:
>
>     y[y < Threshold] = y + Threshold

You are close; you just need to have the right-hand side also use the boolean index (since otherwise you are trying to add something the length of y to a subset of y):

    y[y < Threshold] = y[y < Threshold] + Threshold

Gabriel
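A tiny worked example of the corrected form, assuming a threshold of 0.5:

```python
import numpy as np

Threshold = 0.5
y = np.array([0.2, 0.7, 0.4, 0.9])

# Both sides use the same boolean mask, so the shapes line up:
# only the entries below Threshold are selected and incremented
y[y < Threshold] = y[y < Threshold] + Threshold
print(y)
```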
Re: [Numpy-discussion] [mailinglist] How to do: y[y&lt;T] = y+T
On Mon, Oct 27, 2008 at 12:45:44PM +0100, Uwe Schmitt wrote:
> Nicolas ROUX schrieb:
>> Hello,
>>
>> I hope this is not a silly question ;-) I have a Numpy array, and I want to process it with: if the value is lower than Threshold, then increase it by Threshold. I would like to translate it as:
>>
>>     y[y < Threshold] = y + Threshold
>
> Hi,
>
> your solution does not work, because the arrays on both sides do not have the same size in general. You can do it in place:
>
>     y[y < T] += T

Nice, I didn't know this :-)

Thanks,
Gabriel
Re: [Numpy-discussion] ANN: Python programs for epidemic modelling
> Since the programs make heavy use of the numpy, scipy and matplotlib libraries, I am sending this announcement to all three lists and the main python-list; sorry for double-posting. The announcement with the related links is posted at http://blog.deductivethinking.com/?p=29. The programs are at http://wiki.deductivethinking.com/wiki/Python_Programs_for_Modelling_Infectious_Diseases_book.

Thanks for this, it is great!

Gabriel
Re: [Numpy-discussion] (Late) summary of PEP-225 discussion at Scipy
On Wed, Oct 22, 2008 at 02:00:20PM -0700, Fernando Perez wrote:
> Hi all,
>
> much delayed, but here it is, finally. The doc regarding our discussion about PEP 225 is attached, and I'm keeping a public copy for viewing convenience (with html) here:
>
> https://cirl.berkeley.edu/fperez/static/numpy-pep225/
>
> Note that, as indicated in the link above, the real doc is in bzr, so you can check it out to generate patches or a branch (the preferred formats for big changes). This is just a first cut, going from memory and notes. I'd appreciate any feedback, corrections, etc. I'm giving a talk at the BayPiggies group Nov 13 about SciPy and will take the opportunity to bring this document up, since that user group has a lot of Python people, including some core developers.
>
> Since this is from memory and written by me, it's likely pretty biased. But I really want to clarify that I'm trying to act as a scribe here, not to push my own agenda (I do have one :). So please bring up anything you feel is missing/misstated; I'd really like to have a doc that people feel is a fair representation of the community's views to bring to the core Python team. That way we either get our operators or not, once and for all, and this issue can be put to rest in the language for good (as of 2.7/3.1, of course). So let's have Nov 11 or so as a wrap-up deadline on this discussion, so I can have the summary ready for the Nov 13 talk.
>
> Best,
>
> f

Thanks for taking the lead on this. Just in case you want more examples: Fortran 90/95 allows operator definition much like R does; see http://h21007.www2.hp.com/portal/download/files/unprot/fortran/docs/lrm/lrm0184.htm#generic_inter_op. I think this is ideal, as I am used to it, but at the very least I vote for a multiplication operator.

Gabriel
Re: [Numpy-discussion] var bias reason?
On Wed, Oct 15, 2008 at 09:45:39AM -0500, Travis E. Oliphant wrote:
> Gabriel Gellner wrote:
>> Some colleagues noticed that var uses the biased formula by default in numpy; searching for the reason only brought up http://article.gmane.org/gmane.comp.python.numeric.general/12438/match=var+bias which I totally agree with, but there was no response? Any reason for this?
>
> I will try to respond to this, as it was me who made the change. I think there have been responses, but I've preferred to stay quiet rather than feed a flame war. Ultimately, it is a matter of preference, and I don't think everybody would give equal weight to all the arguments surrounding the decision. I will attempt to articulate my reasons: dividing by n is the maximum likelihood estimator of variance, and I prefer that justification over the unbiasedness justification for a default (especially given that bias is just one part of the error in an estimator). Having every package that computes the variance return the unbiased estimate gives the concept more cultural weight than it deserves, I think. Any surprise created by the different default should be mitigated by the fact that it's an opportunity to learn something about what you are doing. Here is a paper I wrote on the subject that you might find useful: https://contentdm.lib.byu.edu/cdm4/item_viewer.php?CISOROOT=/EERCISOPTR=134CISOBOX=1REC=1 (Hopefully they will resolve a link problem at the above site soon, but you can read the abstract.)

Thanks for the reply; I look forward to reading the paper when it is available. The major issue in my mind is not the technical one but the surprise factor. I can't think of a single other package that uses this as the default, and since var is also a method of ndarray (which is a built-in type and can't be monkey-patched) there is no way of taking a different view (that is, supplying my own function) without the confusion I am feeling in my own lab (less technical people need to understand that they shouldn't use the method of the same name). I worry about numpy taking this unpopular stance (as far as packages go) simply to fight the good fight, as a built-in method/behaviour of every ndarray, rather than as a built-in function, which would present no such problem, as it allows dissent over a clearly muddy issue. Sorry for the noise; I am happy to see there is a reason, but I can't help finding this a wart, for purely pedagogical reasons.

Gabriel
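For readers following the thread who just want the unbiased number: ``var`` (and ``std``) take a ``ddof`` keyword that selects the divisor, so no separate function is needed:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])

# Sum of squared deviations is 5.0 for this data
print(np.var(x))          # divides by n: the ML ("biased") estimate
print(np.var(x, ddof=1))  # divides by n - 1: the unbiased estimate
```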
Re: [Numpy-discussion] Record arrays
> Let's be clear: there are two very closely related things, recarrays and record arrays. Record arrays are just ndarrays with a complicated dtype. E.g.:
>
>     In [1]: from numpy import *
>
>     In [2]: ones(3, dtype=dtype([('foo', int), ('bar', float)]))
>     Out[2]: array([(1, 1.0), (1, 1.0), (1, 1.0)], dtype=[('foo', '<i4'), ('bar', '<f8')])
>
>     In [3]: r = _
>
>     In [4]: r['foo']
>     Out[4]: array([1, 1, 1])
>
> recarray is a subclass of ndarray that just adds attribute access to record arrays.
>
>     In [10]: r2 = r.view(recarray)
>
>     In [11]: r2
>     Out[11]: recarray([(1, 1.0), (1, 1.0), (1, 1.0)], dtype=[('foo', '<i4'), ('bar', '<f8')])
>
>     In [12]: r2.foo
>     Out[12]: array([1, 1, 1])
>
> One downside of this is that the attribute-access feature slows down all field accesses, even the r['foo'] form, because it sticks a bunch of pure Python code in the middle. Much code won't notice this, but if you end up having to iterate over an array of records (as I have), this will be a hotspot for you.
>
> Record arrays are fundamentally a part of numpy, and no one is even suggesting that they would go away. No one is seriously suggesting that we should remove recarray either, but some of us hesitate to recommend its use over plain record arrays.
>
> Does that clarify the discussion for you?

Thanks! This has always been something that has confused me. This is awesome; I guess I built my DataFrame object for nothing :-)

Gabriel
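The session above, condensed into a plain script (with explicit data instead of ones, so the two access styles are easy to tell apart):

```python
import numpy as np

# A record array: an ordinary ndarray with a structured dtype
r = np.array([(1, 1.0), (2, 2.0), (3, 3.0)],
             dtype=[('foo', int), ('bar', float)])
print(r['foo'])          # field access works on any record array

# A recarray: the same data viewed through the attribute-access subclass
r2 = r.view(np.recarray)
print(r2.foo)            # equivalent to r2['foo'], via extra Python machinery
```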