Re: [Numpy-discussion] ANN: numtraits v0.2
Hi Sylvain,

Sylvain Corlay wrote:
> Hi Thomas,
>
> This is great news!
>
> FYI, the traitlets module has been undergoing significant refactoring
> lately, improving the API to favor broader usage in the community. One
> reason for this is that several projects outside of the Jupyter
> organization are considering adopting traitlets. You can find a summary
> of the ongoing work and API changes here:
> https://github.com/ipython/traitlets/issues/48
>
> One of the items in this discussion is what would be the best place for
> a repository of trait types for standard data structures of the scipy
> stack (numpy arrays, pandas series and dataframes, etc.). It is
> unlikely that such trait types would be accepted into those libraries
> at this moment, and the main traitlets package might not be the right
> place for them either - hence the need for another repo. However, if we
> don't work on a centralized project, we will probably see a number of
> competing implementations in different libraries that are clients of
> traitlets.
>
> Hence the idea would be to propose a new project in the Jupyter
> incubator with a reference implementation. What would be cool would be
> to join forces and work on a specification, or start a discussion of
> what the ideal implementation of such trait types would look like.

I'm very open to collaborating on centralizing these kinds of scipy-stack traits. I'm not particularly attached to keeping our numtraits implementation separate, and would be very happy to merge it into a larger effort or re-use only parts of it. Realistically I won't be able to lead/write a proposal for the incubator in the next few weeks, but if no one gets to it first, I can try to work on it later in the year.
Cheers,
Tom

> On Wed, Sep 23, 2015 at 12:39 PM, Thomas Robitaille
> <thomas.robitai...@gmail.com> wrote:
>
> Hi everyone,
>
> We have released a small experimental package called numtraits that
> builds on top of the traitlets package and provides a NumericalTrait
> class that can be used to validate properties such as:
>
> * number of dimensions (for arrays)
> * shape (for arrays)
> * domain (e.g. positive, negative, range of values)
> * units (with support for astropy.units, pint, and quantities)
>
> The idea is to be able to write a class like:
>
>     class Sphere(HasTraits):
>
>         radius = NumericalTrait(domain='strictly-positive', ndim=0)
>         position = NumericalTrait(shape=(3,))
>
> and all the validation will then be done automatically when the user
> sets 'radius' or 'position'.
>
> In addition, tuples and lists can get automatically converted to
> arrays, and default values can be specified. You can read more about
> the package and see examples of it in use here:
>
> https://github.com/astrofrog/numtraits
>
> and it can be easily installed with
>
>     pip install numtraits
>
> The package supports both Python 3.3+ and Legacy Python (2.7) :)
>
> At this point, we would be very interested in feedback - the package
> is still very young and we can still change the API if needed. Please
> open issues with suggestions!
>
> Cheers,
> Tom and Francesco

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] ANN: numtraits v0.2
Hi everyone,

We have released a small experimental package called numtraits that builds on top of the traitlets package and provides a NumericalTrait class that can be used to validate properties such as:

* number of dimensions (for arrays)
* shape (for arrays)
* domain (e.g. positive, negative, range of values)
* units (with support for astropy.units, pint, and quantities)

The idea is to be able to write a class like:

    class Sphere(HasTraits):

        radius = NumericalTrait(domain='strictly-positive', ndim=0)
        position = NumericalTrait(shape=(3,))

and all the validation will then be done automatically when the user sets 'radius' or 'position'.

In addition, tuples and lists can get automatically converted to arrays, and default values can be specified. You can read more about the package and see examples of it in use here:

https://github.com/astrofrog/numtraits

and it can be easily installed with

    pip install numtraits

The package supports both Python 3.3+ and Legacy Python (2.7) :)

At this point, we would be very interested in feedback - the package is still very young and we can still change the API if needed. Please open issues with suggestions!

Cheers,
Tom and Francesco
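[Editor's note: the kind of validation the announcement describes can be sketched in a few lines of plain NumPy. The `validate` function below is a hypothetical illustration of the checks a NumericalTrait performs, not numtraits' actual implementation.]

```python
import numpy as np

def validate(value, domain=None, ndim=None, shape=None):
    """Hypothetical sketch of NumericalTrait-style validation."""
    arr = np.asarray(value)  # tuples and lists become arrays
    if ndim is not None and arr.ndim != ndim:
        raise ValueError("expected %d dimensions, got %d" % (ndim, arr.ndim))
    if shape is not None and arr.shape != shape:
        raise ValueError("expected shape %r, got %r" % (shape, arr.shape))
    if domain == 'strictly-positive' and not np.all(arr > 0):
        raise ValueError("values must be strictly positive")
    return arr

# Mirrors the Sphere example: a scalar radius and a length-3 position.
radius = validate(1.5, domain='strictly-positive', ndim=0)
position = validate([1, 2, 3], shape=(3,))
```

In numtraits itself these checks run inside the trait's descriptor, so they fire automatically on attribute assignment rather than via an explicit call.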
Re: [Numpy-discussion] Setting up a newcomers label on the issue tracker ?
The issue with 'low hanging fruit' is: who is it low-hanging fruit for? Low-hanging fruit for a core dev may be days of work for a newcomer. Also, 'newcomer' doesn't give a good idea of how long an issue will take. I would therefore like to second Tom Aldcroft's suggestion of following something like what we have in astropy:

- effort-low, effort-medium, and effort-high (= hours, days, long-term)
- package-novice, package-intermediate, package-expert

This really covers the range of options. For newcomers who want to do something quick, you can point them to package-novice effort-low. When someone new to the project wants to get more involved (e.g. for GSoC), you can point them to package-novice effort-high. If one of the core devs is bored and wants to kill some time, they can go to package-expert effort-low.

We've found this very helpful in Astropy and we use it in all related packages, so I want to put in a strong recommendation for following the same model here, and I would recommend the same for matplotlib and scipy.

Cheers,
Tom

Benjamin Root wrote:
> FWIW, matplotlib calls it "low hanging fruit". I think it is a better
> name than "newcomers".
>
> On Wed, Nov 26, 2014 at 1:19 PM, Aldcroft, Thomas
> <aldcr...@head.cfa.harvard.edu> wrote:
>
> On Wed, Nov 26, 2014 at 8:24 AM, Charles R Harris
> <charlesr.har...@gmail.com> wrote:
>
> On Wed, Nov 26, 2014 at 2:36 AM, Sebastian Berg
> <sebast...@sipsolutions.net> wrote:
>
> On Mi, 2014-11-26 at 08:44 +, David Cournapeau wrote:
>> Hi,
>>
>> Would anybody mind if I create a label "newcomers" on GH, and start
>> labelling simple issues?
>>
>> This is in anticipation of the Bloomberg lab event in London this
>> weekend. I will try to give a hand to people interested in
>> numpy/scipy.
>
> We actually have an "easy fix" label, which I think had this in mind.
> However, I admit that I think some of these issues may not be easy at
> all (I guess it depends on what you consider easy ;)). In any case, I
> think just go ahead with creating a new label or reusing the current
> one. "easy fix" might be a starting point to find some candidate
> issues.
>
> - Sebastian
>
> There is also a documentation label, and about 30 tickets with that
> label. That should be good for just practicing the mechanics.
>
> FWIW in astropy we settled on two properties, level of effort and
> level of sub-package expertise, with corresponding labels:
>
> - effort-low, effort-medium, and effort-high
> - package-novice, package-intermediate, package-expert
>
> This has been used with reasonable success.
>
> - Tom
>
> Chuck
Re: [Numpy-discussion] Setting up a newcomers label on the issue tracker ?
Just to follow on from my previous email, our labeling convention is described in more detail here:

https://github.com/astropy/astropy/wiki/Issue-labeling-convention

Cheers,
Tom

Thomas Robitaille wrote:
> The issue with 'low hanging fruit' is: who is it low-hanging fruit
> for? Low-hanging fruit for a core dev may be days of work for a
> newcomer. Also, 'newcomer' doesn't give a good idea of how long an
> issue will take. I would therefore like to second Tom Aldcroft's
> suggestion of following something like what we have in astropy:
>
> - effort-low, effort-medium, and effort-high (= hours, days, long-term)
> - package-novice, package-intermediate, package-expert
>
> [rest of quoted message trimmed]
[Numpy-discussion] Issue with np.median and array subclasses in 1.8.0rc (worked with 1.7.0)
Hi,

The behavior of ``np.median`` with array sub-classes has changed in 1.8.0rc, which breaks unit-handling code (such as the ``quantities`` package, or ``astropy.units``):

https://github.com/numpy/numpy/issues/3846

This previously worked from Numpy 1.5 (at least) to Numpy 1.7. Is there a new (and better) way to override the ``np.median`` behavior?

Cheers,
Tom
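[Editor's note: for readers finding this thread later - NumPy 1.17 added the ``__array_function__`` protocol (NEP 18), which gives sub-classes a supported way to intercept functions like ``np.median``. A minimal sketch with a toy unit-carrying subclass (``Unitful`` is hypothetical and far simpler than quantities or astropy.units):]

```python
import numpy as np

class Unitful(np.ndarray):
    """Toy subclass that carries a unit string through np.median."""

    def __new__(cls, data, unit=""):
        obj = np.asarray(data).view(cls)
        obj.unit = unit
        return obj

    def __array_finalize__(self, obj):
        self.unit = getattr(obj, "unit", "")

    def __array_function__(self, func, types, args, kwargs):
        # Run the NumPy function on plain ndarrays, then re-attach
        # the unit to the result (scalar results become 0-d Unitful).
        plain = [np.asarray(a) if isinstance(a, Unitful) else a
                 for a in args]
        return Unitful(func(*plain, **kwargs), unit=self.unit)

q = Unitful([1.0, 2.0, 3.0], unit="m")
m = np.median(q)  # dispatched through Unitful.__array_function__
```

With this protocol, ``m`` keeps both the subclass and the unit, which is exactly what broke for quantities-like classes in 1.8.0rc.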
[Numpy-discussion] Equality not working as expected with ndarray sub-class
Hi everyone,

The following example:

    import numpy as np

    class SimpleArray(np.ndarray):

        __array_priority__ = 1

        def __new__(cls, input_array, info=None):
            return np.asarray(input_array).view(cls)

        def __eq__(self, other):
            return False

    a = SimpleArray(10)
    print(np.int64(10) == a)
    print(a == np.int64(10))

gives the following output:

    $ python2.7 eq.py
    True
    False

so that in the first case, SimpleArray.__eq__ is not called. Is this a bug, and if so, can anyone think of a workaround? If this is expected behavior, how do I ensure SimpleArray.__eq__ gets called in both cases?

Thanks,
Tom

ps: cross-posting to stackoverflow
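[Editor's note: in NumPy 1.13+ the ``__array_ufunc__`` protocol (NEP 13) offers a supported workaround - returning NotImplemented for ``np.equal`` makes NumPy scalars and arrays on the left-hand side defer, so Python falls back to ``SimpleArray.__eq__``. A sketch, assuming a modern NumPy (2013-era releases behaved differently):]

```python
import numpy as np

class SimpleArray(np.ndarray):

    def __new__(cls, input_array, info=None):
        return np.asarray(input_array).view(cls)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Declining np.equal makes the left operand's __eq__ return
        # NotImplemented, so Python tries our __eq__ instead.
        if ufunc is np.equal:
            return NotImplemented
        return super().__array_ufunc__(ufunc, method, *inputs, **kwargs)

    def __eq__(self, other):
        return False

a = SimpleArray(10)
```

With this, both ``np.int64(10) == a`` and ``a == np.int64(10)`` end up in ``SimpleArray.__eq__``, so the two orders agree.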
Re: [Numpy-discussion] __array_priority__ ignored if __array__ is present
Hi Frederic,

On 16 May 2013 15:58, Frédéric Bastien <no...@nouiz.org> wrote:
> I looked rapidly in the code yesterday and didn't find the reason (I
> don't know the code well, which is probably why). But last night I
> thought of one possible cause. I found this code twice in the file
> core/src/umath/ufunc_object.c:
>
>     if (nin == 2 && nout == 1 && dtypes[1]->type_num == NPY_OBJECT) {
>         PyObject *_obj = PyTuple_GET_ITEM(args, 1);
>         if (!PyArray_CheckExact(_obj)) {
>             double self_prio, other_prio;
>             self_prio = PyArray_GetPriority(PyTuple_GET_ITEM(args, 0),
>                                             NPY_SCALAR_PRIORITY);
>             other_prio = PyArray_GetPriority(_obj, NPY_SCALAR_PRIORITY);
>             if (self_prio < other_prio &&
>                 _has_reflected_op(_obj, ufunc_name)) {
>                 retval = -2;
>                 goto fail;
>             }
>         }
>     }
>
> It is this code that calls the _has_reflected_op() function. The if
> condition includes:
>
>     dtypes[1]->type_num == NPY_OBJECT
>
> I wouldn't be surprised if dtypes[1] isn't NPY_OBJECT when you
> implement __array__. dtypes is set with this line:
>
>     retval = ufunc->type_resolver(ufunc, casting, op, type_tup, dtypes);
>
> HTH
>
> Fred

Thanks for looking into this - should this be considered a bug?

Tom

> On Thu, May 16, 2013 at 9:19 AM, Thomas Robitaille
> <thomas.robitai...@gmail.com> wrote:
>
> Hi everyone,
>
> (this was posted as part of another topic, but since it was unrelated,
> I'm reposting as a separate thread)
>
> I've also been having issues with __array_priority__ - the following
> code behaves differently for __mul__ and __rmul__:
>
>     import numpy as np
>
>     class TestClass(object):
>
>         def __init__(self, input_array):
>             self.array = input_array
>
>         def __mul__(self, other):
>             print "Called __mul__"
>
>         def __rmul__(self, other):
>             print "Called __rmul__"
>
>         def __array_wrap__(self, out_arr, context=None):
>             print "Called __array_wrap__"
>             return TestClass(out_arr)
>
>         def __array__(self):
>             print "Called __array__"
>             return np.array(self.array)
>
> with output:
>
>     In [7]: a = TestClass([1,2,3])
>
>     In [8]: print type(np.array([1,2,3]) * a)
>     Called __array__
>     Called __array_wrap__
>     <class '__main__.TestClass'>
>
>     In [9]: print type(a * np.array([1,2,3]))
>     Called __mul__
>     <type 'NoneType'>
>
> Is this also an oversight? I opened a ticket for it a little while ago:
>
> https://github.com/numpy/numpy/issues/3164
>
> Any ideas?
>
> Thanks!
> Tom
[Numpy-discussion] __array_priority__ ignored if __array__ is present
Hi everyone,

(this was posted as part of another topic, but since it was unrelated, I'm reposting as a separate thread)

I've also been having issues with __array_priority__ - the following code behaves differently for __mul__ and __rmul__:

    import numpy as np

    class TestClass(object):

        def __init__(self, input_array):
            self.array = input_array

        def __mul__(self, other):
            print "Called __mul__"

        def __rmul__(self, other):
            print "Called __rmul__"

        def __array_wrap__(self, out_arr, context=None):
            print "Called __array_wrap__"
            return TestClass(out_arr)

        def __array__(self):
            print "Called __array__"
            return np.array(self.array)

with output:

    In [7]: a = TestClass([1,2,3])

    In [8]: print type(np.array([1,2,3]) * a)
    Called __array__
    Called __array_wrap__
    <class '__main__.TestClass'>

    In [9]: print type(a * np.array([1,2,3]))
    Called __mul__
    <type 'NoneType'>

Is this also an oversight? I opened a ticket for it a little while ago:

https://github.com/numpy/numpy/issues/3164

Any ideas?

Thanks!
Tom
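[Editor's note: in NumPy 1.13 and later, the supported way for a wrapper class that is not an ndarray subclass to win both orders of the operation is to opt out of the ufunc machinery entirely with ``__array_ufunc__ = None``: ndarray's operators then return NotImplemented and Python calls the reflected method. A sketch, with the methods returning strings so the effect is visible:]

```python
import numpy as np

class TestClass:
    """Wraps an array without subclassing ndarray."""

    # Opt out of ufuncs: ndarray.__mul__ returns NotImplemented,
    # so Python falls back to TestClass.__rmul__.
    __array_ufunc__ = None

    def __init__(self, input_array):
        self.array = np.asarray(input_array)

    def __mul__(self, other):
        return "Called __mul__"

    def __rmul__(self, other):
        return "Called __rmul__"

t = TestClass([1, 2, 3])
left = np.array([1, 2, 3]) * t   # ndarray defers -> t.__rmul__
right = t * np.array([1, 2, 3])  # t.__mul__ called directly
```

This replaces the old, unreliable combination of ``__array__``, ``__array_wrap__``, and ``__array_priority__`` for classes that want full control of their operators.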
Re: [Numpy-discussion] __array_priority__ don't work for gt, lt, ... operator
I've also been having issues with __array_priority__ - the following code behaves differently for __mul__ and __rmul__:

    import numpy as np

    class TestClass(object):

        def __init__(self, input_array):
            self.array = input_array

        def __mul__(self, other):
            print "Called __mul__"

        def __rmul__(self, other):
            print "Called __rmul__"

        def __array_wrap__(self, out_arr, context=None):
            print "Called __array_wrap__"
            return TestClass(out_arr)

        def __array__(self):
            print "Called __array__"
            return np.array(self.array)

with output:

    In [7]: a = TestClass([1,2,3])

    In [8]: print type(np.array([1,2,3]) * a)
    Called __array__
    Called __array_wrap__
    <class '__main__.TestClass'>

    In [9]: print type(a * np.array([1,2,3]))
    Called __mul__
    <type 'NoneType'>

Is this also an oversight? I opened a ticket for it a little while ago:

https://github.com/numpy/numpy/issues/3164

Any ideas?

Cheers,
Tom

On 10 May 2013 18:34, Charles R Harris <charlesr.har...@gmail.com> wrote:
> On Fri, May 10, 2013 at 10:08 AM, Frédéric Bastien <no...@nouiz.org> wrote:
>> Hi,
>>
>> it popped up again on the Theano mailing list that this doesn't work:
>>
>>     np.arange(10) <= a_theano_vector
>>
>> The reason is that __array_priority__ isn't respected for that class
>> of operation. This page explains the problem and gives a workaround:
>>
>> http://stackoverflow.com/questions/14619449/how-can-i-override-comparisons-between-numpys-ndarray-and-my-type
>>
>> The workaround is to make a Python function that decides which
>> version of the comparator to call, and does the call. Then we tell
>> NumPy to use that function instead of its current one with:
>>
>>     np.set_numeric_ops(...)
>>
>> But if we do that, when we import theano we will slow down all normal
>> numpy comparisons for the user: when <= is executed, first numpy C
>> code runs, then it calls the Python function to decide which version
>> to use, and if both operands are numpy ndarrays it calls numpy C code
>> again. That isn't a good solution. We could do the same override in
>> C, but then theano wouldn't work the same when there isn't a C++
>> compiler. That isn't nice.
>>
>> What do you think of changing these operators to check for
>> __array_priority__ before doing the comparison?
>
> This looks like an oversight and should be fixed.
>
> Chuck
[Numpy-discussion] Scalar output from sub-classed Numpy array
Hi everyone,

I am currently trying to write a sub-class of Numpy ndarray, but am running into issues for functions that return scalar results rather than array results. For example, in the following case:

    import numpy as np

    class TestClass(np.ndarray):

        def __new__(cls, input_array, unit=None):
            return np.asarray(input_array).view(cls)

        def __array_finalize__(self, obj):
            if obj is None:
                return

        def __array_wrap__(self, out_arr, context=None):
            return np.ndarray.__array_wrap__(self, out_arr, context)

I get:

    In [4]: a = TestClass([1,2,3])

    In [5]: print type(np.dot(a,a))
    <type 'numpy.int64'>

    In [6]: a = TestClass([[1,2],[1,2]])

    In [7]: print type(np.dot(a,a))
    <class '__main__.TestClass'>

that is, in the case where the output is a scalar, it doesn't get wrapped, while in the case where the output is an array, it does. Could anyone explain this behavior to me and, most importantly, is there a way around this so that the above example returns a wrapped 0-d TestClass array instead of a Numpy int64?

Thanks,
Tom
[Numpy-discussion] Numpy on Travis with Python 3
Hi everyone,

I'm currently having issues with installing Numpy 1.6.2 with Python 3.1 and 3.2 using pip in Travis builds - see for example:

https://travis-ci.org/astropy/astropy/jobs/3379866

The build aborts with a cryptic message:

    ValueError: underlying buffer has been detached

Has anyone seen this kind of issue before?

Thanks for any help,

Cheers,
Tom
[Numpy-discussion] PRs for MaskedArray bugs
I've recently opened a couple of pull requests that fix bugs with MaskedArray - these are pretty straightforward, so would it be possible to consider them in time for 1.7?

https://github.com/numpy/numpy/pull/2703
https://github.com/numpy/numpy/pull/2733

Thanks!
Tom
[Numpy-discussion] Issue with dtype and nx1 arrays
Hello,

Is the following behavior normal?

    In [1]: import numpy as np

    In [2]: np.dtype([('a','f4',2)])
    Out[2]: dtype([('a', '<f4', (2,))])

    In [3]: np.dtype([('a','f4',1)])
    Out[3]: dtype([('a', '<f4')])

I.e. in the second case, the second dimension of the dtype (1) is being ignored? Is there a way to avoid this?

Thanks,
Thomas
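[Editor's note: at least in later NumPy releases, spelling the length-1 dimension as an explicit tuple avoids the collapse - the ``(1,)`` subarray shape is preserved:]

```python
import numpy as np

# An explicit tuple shape survives, even when it is (1,):
dt = np.dtype([('a', 'f4', (1,))])

# Each record's 'a' field is then a length-1 subarray, so an array of
# two records exposes the field with shape (2, 1) rather than (2,).
arr = np.zeros(2, dtype=dt)
```

A bare integer ``1``, by contrast, has historically been treated as "scalar field", which is the behavior the question observes.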
[Numpy-discussion] Array slices and number of dimensions
Hi,

I'm trying to extract sub-sections of a multidimensional array while keeping the number of dimensions the same. If I just select a specific element along a given direction, then the number of dimensions goes down by one:

    >>> import numpy as np
    >>> a = np.zeros((10,10,10))
    >>> a.shape
    (10, 10, 10)
    >>> a[0,:,:].shape
    (10, 10)

This makes sense to me. If I want to retain the initial number of dimensions, I can do

    >>> a[[0],:,:].shape
    (1, 10, 10)

However, if I try to do this along two directions, I do get a reduction in the number of dimensions:

    >>> a[[0],:,[5]].shape
    (1, 10)

I'm wondering if this is normal, or a bug? In fact, I can get what I want by doing:

    >>> a[[0],:,:][:,:,[5]].shape
    (1, 10, 1)

so I can get around the issue, but I just wanted to check whether the behavior of a[[0],:,[5]] is a bug?

Thanks,
Tom
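[Editor's note: the (1, 10) result is documented fancy-indexing behavior - the two index lists ``[0]`` and ``[5]`` broadcast together into a single shape-(1,) index, with the sliced axis appended. Two constructions that keep all three dimensions are length-1 slices and ``np.ix_``, which builds an "open mesh" so each list indexes its own axis:]

```python
import numpy as np

a = np.zeros((10, 10, 10))

# Length-1 slices keep the dimension:
assert a[0:1, :, 5:6].shape == (1, 10, 1)

# np.ix_ makes each index list apply to its own axis instead of
# being broadcast against the other lists:
assert a[np.ix_([0], np.arange(10), [5])].shape == (1, 10, 1)

# The behavior from the question: [0] and [5] broadcast to one
# shape-(1,) index, and the slice dimension is appended.
assert a[[0], :, [5]].shape == (1, 10)
```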
[Numpy-discussion] Bug in loadtxt
Hi,

I am running into a precision issue with np.loadtxt. I have a data file with the following contents:

    $ cat data.txt
    -9.61922814E-01
    -9.96192290E-01
    -9.99619227E-01
    -9.99961919E-01
    -9.6192E-01
    -9.9611E-01
    -1.E+00

If I try to read this in using loadtxt, which should read numbers in as (64-bit) floats by default, I get:

    Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29)
    [GCC 4.2.1 (Apple Inc. build 5646)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import numpy as np
    >>> np.__version__
    '2.0.0.dev8657'
    >>> np.loadtxt('data.txt')
    array([-1., -1., -1., -1., -1., -1., -1.])

If I now create a file called data2.txt with only the first line:

    $ cat data2.txt
    -9.61922814E-01

loadtxt works correctly:

    >>> np.loadtxt('data2.txt')
    array(-0.961922814)

I have submitted a bug report:

http://projects.scipy.org/numpy/ticket/1589

Cheers,
Tom
Re: [Numpy-discussion] Bug in loadtxt
josef.pktd wrote:
> are you sure this is not just a print precision problem?

Thanks for pointing this out - it does seem to be just to do with the printing precision. I didn't notice this before, because for the last few elements of the array, print still gives just -1:

    In [19]: for x in a:
        ...:     print x

    -0.9619
    -0.9962
    -0.9996
    -1.0
    -1.0
    -1.0
    -1.0

But I now realize that to get the full precision, I should have done:

    In [20]: for x in a:
        ...:     repr(x)

    Out[20]: '-0.961922814'
    Out[20]: '-0.99619229'
    Out[20]: '-0.999619227'
    Out[20]: '-0.61919'
    Out[20]: '-0.96192'
    Out[20]: '-0.99611'
    Out[20]: '-1.0'

Cheers,
Tom

--
View this message in context: http://old.nabble.com/Bug-in-loadtxt-tp29500522p29500609.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.
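[Editor's note: to make the distinction concrete - the values are stored at full float64 precision and only the default display rounds them. ``np.set_printoptions`` controls the displayed digits, and ``repr()`` of a Python float round-trips exactly. A quick check, using the first three values from the file:]

```python
import numpy as np

# Stored as exact float64 values, whatever print shows:
a = np.array([-0.961922814, -0.99619229, -0.999619227])

# Ask for more digits in array display (affects print(a), not storage):
np.set_printoptions(precision=12)

# repr of the Python float always shows enough digits to round-trip,
# which is why repr(x) revealed the true values in this thread.
full = repr(float(a[2]))
```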
Re: [Numpy-discussion] dtype.type for structured arrays
Thomas Robitaille wrote:
> I seem to remember that this used not to be the case, and that even
> for vector columns, one could access array.dtype[0].type to get the
> numerical type. Is this a bug, or deliberate?

I submitted a bug report:

http://projects.scipy.org/numpy/ticket/1557

Cheers,
Tom
Re: [Numpy-discussion] np.fromstring and Python 3
Pauli Virtanen-3 wrote:
> That's a bug. It apparently implicitly encodes the Unicode string you
> pass in to UTF-8, instead of trying to encode in ASCII and fail, like
> it does on Python 2.

Thanks! Should I file a bug report?

Cheers,
Tom
[Numpy-discussion] np.fromstring and Python 3
Hi,

The following example illustrates a problem I'm encountering with the np.fromstring function in Python 3:

    Python 3.1.2 (r312:79360M, Mar 24 2010, 01:33:18)
    [GCC 4.0.1 (Apple Inc. build 5493)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import numpy as np
    >>> string = "".join(chr(i) for i in range(256))
    >>> a = np.fromstring(string, dtype=np.int8)
    >>> print(len(string))
    256
    >>> print(len(a))
    384

The array 'a' should have the same size as 'string', since I'm using a 1-byte datatype. Is this a bug, or do I need to change the way I use this function in Python 3? I am using Numpy r8523.

Cheers,
Tom
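[Editor's note: the 384 comes from the implicit UTF-8 encoding - code points 128-255 each become two bytes (128 + 2*128 = 384). The way to make the byte interpretation explicit in Python 3 is to encode the string yourself (latin-1 maps code points 0-255 to single bytes) and pass bytes, e.g. via ``np.frombuffer``:]

```python
import numpy as np

string = "".join(chr(i) for i in range(256))

# What np.fromstring was silently doing - UTF-8 inflates the data:
assert len(string.encode("utf-8")) == 384

# Explicit one-byte-per-code-point encoding gives the expected result:
data = string.encode("latin-1")
a = np.frombuffer(data, dtype=np.uint8)
```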
[Numpy-discussion] Broadcasting and indexing
Hello,

I'm trying to understand how array broadcasting can be used for indexing. In the following, I use the term 'row' to refer to the first dimension of a 2D array, and 'column' to the second, just because that's how numpy prints them out. Consider the following example:

    >>> a = np.random.random((4,5))
    >>> b = np.random.random((5,))
    >>> a + b
    array([[ 1.45499556,  0.60633959,  0.48236157,  1.55357393,  1.4339261 ],
           [ 1.28614593,  1.11265001,  0.63308615,  1.28904227,  1.34070499],
           [ 1.26988279,  0.84683018,  0.98959466,  0.76388223,  0.79273084],
           [ 1.27859505,  0.9721984 ,  1.02725009,  1.38852061,  1.56065028]])

I understand how this works, because it behaves as described in

http://docs.scipy.org/doc/numpy/reference/ufuncs.html#broadcasting

So b gets broadcast to shape (1,5), and then, because the first dimension is 1, the operation is applied to all rows.

Now I am trying to apply this to array indexing. For example, I want to set specific columns, indicated by a boolean array, to zero, but the following fails:

    >>> c = np.array([1,0,1,0,1], dtype=bool)
    >>> a[c] = 0
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    IndexError: index (4) out of range (0<=index<=3) in dimension 0

However, if I reduce the size of c to 4, then it works, and sets rows, not columns, equal to zero:

    >>> c = np.array([1,0,1,0], dtype=bool)
    >>> a[c] = 0
    >>> a
    array([[ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
           [ 0.41526315,  0.7425491 ,  0.39872546,  0.56141914,  0.69795153],
           [ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
           [ 0.40771227,  0.60209749,  0.7928894 ,  0.66089748,  0.91789682]])

But I would have thought that the indexing array would be broadcast in the same way as for a sum, i.e. that c would be broadcast to dimensions (1,5) and would then set certain columns in all rows to zero. Why does broadcasting for indexing seem to happen differently than for operations like addition or multiplication?
For background info: I'm trying to write a routine that performs a set of operations on an n-d array, where n is not known in advance, together with a 1D array. I can use broadcasting rules for most operations without knowing the dimensionality of the n-d array, but now that I need to perform indexing, and the convention seems to change, this is a real issue.

Thanks in advance for any advice,
Thomas
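[Editor's note: the resolution is that a bare index applies to the first axis rather than being broadcast; to select columns, the boolean array goes explicitly in the second position, and for an array of unknown dimensionality an Ellipsis addresses the last axis. A sketch:]

```python
import numpy as np

a = np.arange(20, dtype=float).reshape(4, 5)
c = np.array([1, 0, 1, 0, 1], dtype=bool)

# Place the boolean mask on the axis it should select:
a[:, c] = 0  # zero columns 0, 2, 4 in every row

# For an n-d array where the mask applies to the last axis,
# Ellipsis fills in the leading axes regardless of n:
b = np.ones((2, 3, 5))
b[..., c] = 0
```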
Re: [Numpy-discussion] Structured array sorting
Warren Weckesser-3 wrote:
> Looks like 'sort' is not handling the endianness of the column data
> correctly. If you change the type of the floating point data to 'f8',
> the sort works.

Thanks for identifying the issue - should I submit a bug report?

Thomas
[Numpy-discussion] Structured array sorting
I am having trouble sorting a structured array - in the example below, sorting by the first column (col1) seems to work, but not sorting by the second column (col2). Is this a bug? I am using numpy svn r8071 on MacOS 10.6.

Thanks for any help,
Thomas

    Python 2.6.1 (r261:67515, Jul 7 2009, 23:51:51)
    [GCC 4.2.1 (Apple Inc. build 5646)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import numpy as np
    >>> data = np.array([('a ', 2.), ('b', 4.), ('d', 3.), ('c', 1.)],
    ...                 dtype=[('col1', '|S5'), ('col2', '>f8')])
    >>> data
    array([('a ', 2.0), ('b', 4.0), ('d', 3.0), ('c', 1.0)],
          dtype=[('col1', '|S5'), ('col2', '>f8')])
    >>> data.sort(order=['col1'])
    >>> data
    array([('a ', 2.0), ('b', 4.0), ('c', 1.0), ('d', 3.0)],
          dtype=[('col1', '|S5'), ('col2', '>f8')])
    >>> data.sort(order=['col2'])
    >>> data
    array([('a ', 2.0), ('d', 3.0), ('b', 4.0), ('c', 1.0)],
          dtype=[('col1', '|S5'), ('col2', '>f8')])
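[Editor's note: Warren's follow-up in this thread identifies byte order as the culprit, so the float column presumably had non-native (e.g. big-endian) byte order. The era's workaround is to cast to native byte order before sorting; the underlying sort bug was fixed in later NumPy releases, so this sketch is mainly illustrative:]

```python
import numpy as np

# Big-endian ('>f8') float column, as might come from an external file:
data = np.array([('a', 2.0), ('b', 4.0), ('d', 3.0), ('c', 1.0)],
                dtype=[('col1', 'S5'), ('col2', '>f8')])

# Workaround: cast the record to native byte order, then sort:
native = data.astype([('col1', 'S5'), ('col2', 'f8')])
native.sort(order=['col2'])
```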
Re: [Numpy-discussion] Problem with set_fill_value for masked structured array
Pierre GM-2 wrote:
> Well, that's a problem indeed, and I'd put that as a bug. However,
> you can use this syntax instead:
>
>     t.fill_value['a'] = 10
>
> or set all the fields at once:
>
>     t.fill_value = (10, 99)

Thanks for your reply - should I submit a bug report on the numpy trac site?

Thomas
[Numpy-discussion] Problem with set_fill_value for masked structured array
Hi,

The following code doesn't seem to work:

    import numpy.ma as ma

    t = ma.array(zip([1,2,3],[4,5,6]), dtype=[('a',int),('b',int)])

    print repr(t['a'])
    t['a'].set_fill_value(10)
    print repr(t['a'])

as the output is:

    masked_array(data = [1 2 3],
                 mask = [False False False],
                 fill_value = 99)

    masked_array(data = [1 2 3],
                 mask = [False False False],
                 fill_value = 99)

(and no exception is raised). Am I doing something wrong?

Thanks in advance for any help,
Thomas
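[Editor's note: Pierre GM's reply in this thread explains the cause - ``t['a']`` is a view, and setting the fill value on the view does not propagate back to the parent array. His second workaround, setting all the fields' fill values at once on the parent, can be sketched as follows (Python 3 needs ``list(zip(...))``, since zip returns an iterator):]

```python
import numpy.ma as ma

t = ma.array(list(zip([1, 2, 3], [4, 5, 6])),
             dtype=[('a', int), ('b', int)])

# Set the fill value on the parent array, not on the field view:
t.fill_value = (10, 99)
```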
Re: [Numpy-discussion] masked record arrays
Pierre GM-2 wrote:
> Mmh. With a recent (1.3) version of numpy, you should already be able to
> mask individual fields of a structured array without problems. If you need
> fields to be accessed as attributes the np.recarray way, you can give
> numpy.ma.mrecords.MaskedRecords a try. It's been a while since I last
> touched it, so you may run into the occasional bug. FYI, I'm not a big fan
> of record arrays and tend to prefer structured ones... What two
> implementations were you talking about? In any case, feel free to try and
> please report any issue you run into with MaskedRecords. Cheers

Thanks for the advice!

I'm somewhat confused by the difference between structured and record arrays. My understanding is that record arrays allow you to access fields by attribute (e.g. r.field_name), but I imagine that there are more fundamental differences for the two to be treated separately in numpy. I find the numpy documentation somewhat confusing in that respect - if you have a look at this page:

http://docs.scipy.org/doc/numpy/user/basics.rec.html

I think the 'aka record arrays' is especially confusing, as it suggests the two are the same. Is there good information anywhere about what exactly the differences between the two are? This page is also confusing:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.recarray.html

as to me "Construct an ndarray that allows field access using attributes" suggests that all a recarray is is an ndarray/structured array with overloaded __getattr__/__setattr__ methods. Is that all recarrays are? If so, why was a completely separate package developed for masked record arrays - can one not just use masked structured arrays and overload getattr/setattr?

Cheers, Thomas
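[Archive note: the attribute-access distinction being discussed can be seen in a few lines; the field names x/y here are illustrative, not from the thread:]

```python
import numpy as np

# A structured array: fields are accessed by string index
s = np.array([(1, 2.0), (3, 4.0)], dtype=[('x', 'i4'), ('y', 'f8')])

# Viewing it as a recarray exposes the same fields as attributes too
r = s.view(np.recarray)

assert r.x[0] == s['x'][0]   # same underlying data, two access styles
```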
Re: [Numpy-discussion] Automatic string length in recarray
Pierre GM-2 wrote:
> As a workaround, perhaps you could use np.object instead of np.str while
> defining your array. You can then get the maximum string length by looping,
> as David suggested, and then use .astype to transform your array...

I tried this:

np.rec.fromrecords([(1,'hello'),(2,'world')], dtype=[('a',np.int8),('b',np.object_)])

but I get a TypeError:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/Users/tom/<ipython console> in <module>()
/Users/tom/Library/Python/2.6/site-packages/numpy/core/records.pyc in fromrecords(recList, dtype, shape, formats, names, titles, aligned, byteorder)
    625     res = retval.view(recarray)
    626
--> 627     res.dtype = sb.dtype((record, res.dtype))
    628     return res
    629
/Users/tom/Library/Python/2.6/site-packages/numpy/core/records.pyc in __setattr__(self, attr, val)
    432         if attr not in fielddict:
    433             exctype, value = sys.exc_info()[:2]
--> 434             raise exctype, value
    435         else:
    436             fielddict = ndarray.__getattribute__(self, 'dtype').fields or {}

TypeError: Cannot change data-type for object array.

Is this a bug?

Thanks, Thomas
Re: [Numpy-discussion] Automatic string length in recarray
Pierre GM-2 wrote:
> Confirmed, it's a bug all right. Would you mind opening a ticket? I'll try
> to take care of it in the next few days.

Done - http://projects.scipy.org/numpy/ticket/1283

Thanks! Thomas
[Numpy-discussion] Automatic string length in recarray
Hi,

I'm having trouble with creating np.string_ fields in recarrays. If I create a recarray using

np.rec.fromrecords([(1,'hello'),(2,'world')], names=['a','b'])

the result looks fine:

rec.array([(1, 'hello'), (2, 'world')],
          dtype=[('a', 'i8'), ('b', '|S5')])

But if I want to specify the data types:

np.rec.fromrecords([(1,'hello'),(2,'world')], dtype=[('a',np.int8),('b',np.str)])

the string field is set to a length of zero:

rec.array([(1, ''), (2, '')],
          dtype=[('a', '|i1'), ('b', '|S0')])

I need to specify datatypes for all numerical types since I care about int8/16/32, etc., but I would like to benefit from the automatic string length detection that works if I don't specify datatypes. I tried replacing np.str with None, but no luck. I know I can specify '|S5' for example, but I don't know in advance what the string length should be set to. Is there a way to solve this problem without manually examining the data that is being passed to rec.fromrecords?

Thanks for any help, Thomas
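[Archive note: a sketch of the looping workaround suggested in the replies - scan the string column once for the maximum length, then build the dtype from it:]

```python
import numpy as np

records = [(1, 'hello'), (2, 'world!')]

# Scan the string column once to find the longest entry
maxlen = max(len(rec[1]) for rec in records)

arr = np.rec.fromrecords(records,
                         dtype=[('a', np.int8), ('b', 'S%d' % maxlen)])
```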
[Numpy-discussion] Random int64 and float64 numbers
Hi,

I'm trying to generate random 64-bit integer and float values using Numpy, within the entire range of valid values for each type. To generate random 32-bit floats, I can use:

np.random.uniform(low=np.finfo(np.float32).min, high=np.finfo(np.float32).max, size=10)

which gives for example

array([  1.47351436e+37,   9.93620693e+37,   2.22893053e+38,
        -3.33828977e+38,   1.08247781e+37,  -8.37481260e+37,
         2.64176554e+38,  -2.72207226e+37,   2.54790459e+38,
        -2.47883866e+38])

but if I try to use this for 64-bit numbers, i.e.

np.random.uniform(low=np.finfo(np.float64).min, high=np.finfo(np.float64).max, size=10)

I get

array([ Inf,  Inf,  Inf,  Inf,  Inf,  Inf,  Inf,  Inf,  Inf,  Inf])

Similarly, for integers, I can successfully generate random 32-bit integers:

np.random.random_integers(np.iinfo(np.int32).min, high=np.iinfo(np.int32).max, size=10)

which gives

array([-1506183689,   662982379, -1616890435, -1519456789,  1489753527,
        -604311122,  2034533014,   449680073,  -444302414, -1924170329])

but am unsuccessful for 64-bit integers, i.e.

np.random.random_integers(np.iinfo(np.int64).min, high=np.iinfo(np.int64).max, size=10)

which produces the following error:

OverflowError: long int too large to convert to int

Is this expected behavior, or are these bugs?

Thanks for any help, Thomas
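[Archive note: in later NumPy (1.17+), the Generator API can draw full-range 64-bit integers directly; a minimal sketch:]

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# endpoint=True makes the upper bound inclusive, covering the full int64 range
vals = rng.integers(np.iinfo(np.int64).min,
                    np.iinfo(np.int64).max,
                    size=10, dtype=np.int64, endpoint=True)
```

The float case is different: uniform over the full float64 range still overflows, because max - min is larger than the largest representable float64.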
[Numpy-discussion] Formatting uint64 number
Hello,

I have a question concerning uint64 numbers - let's say I want to format a uint64 number such as 2**64-1. At the moment it's necessary to wrap the numpy number in long() before formatting:

In [3]: '%40i' % np.uint64(2**64-1)
Out[3]: '                                      -1'

In [4]: '%40i' % long(np.uint64(2**64-1))
Out[4]: '                    18446744073709551615'

Would it be easy to modify numpy so that uint64 numbers are automatically converted to long() instead of int() when implicitly converted to python types?

Thanks, Thomas
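[Archive note for modern readers: Python 3 has a single arbitrary-precision int, so an explicit conversion is enough and never truncates:]

```python
import numpy as np

# int() on a uint64 scalar yields an exact arbitrary-precision Python int
s = '%40i' % int(np.uint64(2**64 - 1))
```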
[Numpy-discussion] rec_append_fields and n-dimensional fields
Hi,

I'm interested in constructing a recarray with fields that have two or more dimensions. This can be done from scratch like this:

r = np.recarray((10,), dtype=[('c1', float, (3,))])

However, I am interested in appending a field to an existing recarray. Rather than repeating existing code, I would like to use the numpy.lib.recfunctions.rec_append_fields function, but I am not sure how to specify the dimensions of each field, since it doesn't seem to be possible to specify the dtype as a tuple as above.

Thanks for any advice, Thomas
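[Archive note, a workaround not from the thread: build the extended dtype by hand and copy the fields over; the names c1/c2 are illustrative:]

```python
import numpy as np

r = np.zeros(10, dtype=[('c1', float)])

# Extend the dtype description with a (3,)-shaped sub-array field
new_dtype = r.dtype.descr + [('c2', float, (3,))]

out = np.zeros(r.shape, dtype=new_dtype)
for name in r.dtype.names:
    out[name] = r[name]          # copy the existing fields
out['c2'] = np.ones((10, 3))     # fill the new multi-dimensional field
```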
[Numpy-discussion] unpacking bytes directly in numpy
Hi,

To convert some bytes to e.g. a 32-bit int, I can do

bytes = f.read(4)
i = struct.unpack('i', bytes)[0]

and then convert it to np.int32 with

i = np.int32(i)

However, is there a more direct way of transforming bytes into an np.int32 without the intermediate struct.unpack step?

Thanks for any help, Tom
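[Archive note: one direct route is np.frombuffer, which reinterprets the raw bytes without an unpack step (np.fromstring played this role in older NumPy):]

```python
import struct

import numpy as np

raw = struct.pack('<i', 42)             # 4 bytes of a little-endian int32

# Interpret the buffer directly as a little-endian 32-bit integer
value = np.frombuffer(raw, dtype='<i4')[0]
```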
Re: [Numpy-discussion] Rasterizing points onto an array
Nathan Bell-4 wrote:
> image = np.histogram2d(x, y, bins=bins, weights=z)[0]

This works great - thanks!

Thomas
[Numpy-discussion] Rasterizing points onto an array
Hi,

I have a set of n points with real coordinates between 0 and 1, given by two numpy arrays x and y, with a value at each point represented by a third array z. I am trying to rasterize the points onto a grid of size npix*npix. I can start by converting x and y to integer pixel coordinates ix and iy. But my question is: is there an efficient way to add z[i] to the pixel given by (ix[i], iy[i])? Below is what I am doing at the moment, but the for loop becomes very inefficient for large n. I would imagine that there is a way to do this without using a loop?

---

import numpy as np

n = 1000

x = np.random.random(n)
y = np.random.random(n)
z = np.random.random(n)

npix = 100

ix = np.array(x*float(npix), int)
iy = np.array(y*float(npix), int)

image = np.zeros((npix, npix))

for i in range(len(ix)):
    image[ix[i], iy[i]] = image[ix[i], iy[i]] + z[i]

---

Thanks for any advice, Thomas
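[Archive note: besides the np.histogram2d answer given in the reply, later NumPy (1.8+) offers np.add.at, which accumulates correctly even when several points land in the same pixel:]

```python
import numpy as np

rng = np.random.default_rng(0)
n, npix = 1000, 100
x, y, z = rng.random(n), rng.random(n), rng.random(n)

ix = (x * npix).astype(int)
iy = (y * npix).astype(int)

image = np.zeros((npix, npix))
np.add.at(image, (ix, iy), z)   # unbuffered in-place addition per point
```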
Re: [Numpy-discussion] Numpy Trac site redirecting in a loop?
Pauli Virtanen-3 wrote:
> I applied the patch from the ticket; I think password resets should work
> now, so you can try using your old accounts again.

That worked, thanks! Now that I think of it, the problem started occurring after I had forgotten my password and had to reset it.

Thomas
Re: [Numpy-discussion] Numpy Trac site redirecting in a loop?
Hi,

I'm having the exact same problem: trying to log in to the Trac website for numpy, I get stuck in a redirect loop. I tried different browsers, with no luck. The browser gets stuck on

http://projects.scipy.org/numpy/prefs/account

and stops loading after a while because of too many redirects. Is there any way around this?

Thanks, Thomas
Re: [Numpy-discussion] Numpy Trac site redirecting in a loop?
Could it be linked to specific users, since the problem occurs when loading the account page? I had the same problem on two different computers with two different browsers.

Thomas
[Numpy-discussion] Concatenating string arrays
Hello,

I am trying to find an efficient way to concatenate the elements of two same-length numpy string arrays. For example, if I define the following arrays:

import numpy as np
arr1 = np.array(['a','b','c'])
arr2 = np.array(['d','e','f'])

I would like to produce a third array that would contain ['ad','be','cf']. Is there an efficient way to do this? I could do it element by element, but I need a faster method, as I need to do this on arrays with several million elements.

Thanks for any help, Thomas
Re: [Numpy-discussion] Concatenating string arrays
> import numpy as np
> arr1 = np.array(['a','b','c'])
> arr2 = np.array(['d','e','f'])
>
> I would like to produce a third array that would contain ['ad','be','cf'].
> Is there an efficient way to do this? I could do this element by element,
> but I need a faster method, as I need to do this on arrays with several
> million elements.

>>> arr1 = np.array(['a','b','c'])
>>> arr2 = np.array(['d','e','f'])
>>> arr3 = np.zeros(6, dtype='|S1')
>>> arr3[::2] = arr1
>>> arr3[1::2] = arr2
>>> arr3.view(dtype='|S2')
array(['ad', 'be', 'cf'], dtype='|S2')

Does this help?

This works wonderfully - thanks!

Tom
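[Archive note: np.char.add also does element-wise string concatenation directly, without the interleave-and-view trick; the view approach may well be faster on huge arrays, but that is not benchmarked here:]

```python
import numpy as np

arr1 = np.array(['a', 'b', 'c'])
arr2 = np.array(['d', 'e', 'f'])

# Element-wise concatenation; the result dtype widens automatically
arr3 = np.char.add(arr1, arr2)
```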