Re: [Numpy-discussion] striding through arbitrarily large files
On 4 February 2014 15:01, RayS r...@blue-cove.com wrote: I was struggling with methods of reading large disk files into numpy efficiently (not FITS or .npy, just raw files of IEEE floats from numpy.tostring()). When loading arbitrarily large files it would be nice to not bother reading more than the plot can display before zooming in. There apparently are no built-in methods that allow skipping/striding... Since you mentioned the plural "files", are your datasets entirely contained within a single file? If not, you might be interested in Biggus ( https://pypi.python.org/pypi/Biggus). It's a small pure-Python module that lets you glue together arrays (such as those from np.memmap) into a single arbitrarily large virtual array. You can then step over the virtual array and it maps it back to the underlying sources. Richard ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
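For the single-file case described above, np.memmap already gives lazy, strided access to a raw file of IEEE floats; a minimal sketch (the file here is a hypothetical stand-in for the poster's data, written with tofile()):

```python
import numpy as np
import os
import tempfile

# Hypothetical setup: a raw file of float64 values, as produced by
# arr.tofile() or by writing the bytes from numpy.tostring().
path = os.path.join(tempfile.mkdtemp(), 'samples.f64')
np.arange(1_000_000, dtype=np.float64).tofile(path)

# Memory-map the file: no data is read until elements are accessed.
data = np.memmap(path, dtype=np.float64, mode='r')

# Keep every 1000th sample for a quick-look plot; only the touched
# pages are actually read from disk.
decimated = np.array(data[::1000])
print(decimated.shape)  # (1000,)
```

Biggus then glues several such memmaps into one virtual array when the dataset spans multiple files.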
Re: [Numpy-discussion] Question about typenum
Hi Valentin, On 8 October 2013 13:23, Valentin Haenel valen...@haenel.co wrote: Certain functions, like `PyArray_SimpleNewFromData` and `PyArray_SimpleNew`, take a typenum. Is there any way to go from typenum to something that can be passed to the dtype constructor, like mapping 12 -> 'f8'? If you just want the corresponding dtype instance (aka PyArray_Descr) then `PyArray_DescrFromType` should be what you're after. But if you really need the 'f8' string then I'd be tempted to get the PyArray_Descr and then use the Python API (e.g. PyObject_GetAttrString) to request the `str` attribute. Under the hood this attribute is implemented by `arraydescr_protocol_typestr_get` but that's not part of the public API. Regards, Richard Hattersley
Re: [Numpy-discussion] Bug in numpy.correlate documentation
Hi Bernhard, Looks like you're on to something - two other people have raised this discrepancy before: https://github.com/numpy/numpy/issues/2588. Unfortunately, when it comes to resolving the discrepancy, one of the previous comments takes the opposite view: namely, that the docstring is correct and the code is wrong. Do different domains use different conventions here? Are there references to back up one stance or the other? But all else being equal, I'm guessing there'll be far more appetite for updating the documentation than the code. Regards, Richard Hattersley On 7 October 2013 22:09, Bernhard Spinnler bernhard.spinn...@gmx.net wrote: The numpy.correlate documentation says: correlate(a, v): z[k] = sum_n a[n] * conj(v[n+k]) In [1]: a = [1, 2] In [2]: v = [2, 1j] In [3]: z = correlate(a, v, 'full') In [4]: z Out[4]: array([ 0.-1.j, 2.-2.j, 4.+0.j]) However, according to the documentation, z should be z[-1] = a[1] * conj(v[0]) = 4.+0.j z[0] = a[0] * conj(v[0]) + a[1] * conj(v[1]) = 2.-2.j z[1] = a[0] * conj(v[1]) = 0.-1.j which is the time-reversed version of what correlate() calculates. IMHO, the correlate() code is correct. The correct formula in the docs (which is also the correlation formula in standard textbooks) should be z[k] = sum_n a[n+k] * conj(v[n]) Cheers, Bernhard
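The two candidate formulas are easy to check numerically against the implementation; this sketch confirms that Bernhard's textbook formula reproduces np.correlate's output for his example:

```python
import numpy as np

a = np.array([1, 2])
v = np.array([2, 1j])
z = np.correlate(a, v, 'full')

# Evaluate z[k] = sum_n a[n+k] * conj(v[n]) directly, zero-padding `a`
# so every shift k of the 'full' output is covered.
pad = len(v) - 1
a_pad = np.concatenate([np.zeros(pad), a, np.zeros(pad)])
manual = np.array([sum(a_pad[n + k] * np.conj(v[n]) for n in range(len(v)))
                   for k in range(len(a) + len(v) - 1)])

print(np.allclose(manual, z))  # True
```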
Re: [Numpy-discussion] Question about typenum
On 8 October 2013 19:56, Valentin Haenel valen...@haenel.co wrote: I ended up using `PyArray_TypeObjectFromType` from Cython, so: np.dtype(cnp.PyArray_TypeObjectFromType(self.ndtype)).str Maybe I can avoid the np.dtype call when using PyArray_Descr? In short: yes. `PyArray_TypeObjectFromType` first uses `PyArray_DescrFromType` to figure out the dtype from the type number, and then it returns the corresponding array scalar type. Passing this array scalar type to `np.dtype` gets you back to the dtype that had just been looked up inside TypeObjectFromType. Regards, Richard
Re: [Numpy-discussion] Indexing changes/deprecations
On 27 September 2013 13:27, Sebastian Berg sebast...@sipsolutions.net wrote: And most importantly, is there any behaviour thing in the index machinery that is bugging you, which I may have forgotten until now? Well, since you asked... I'd *love* to see the fancy indexing behaviour moved to a separate method(s). Yes, I know! I'm not realistically expecting that to be tackled right now. And it sometimes seems like something of a sacred idol that one is not supposed to question. But I've kept quiet on the issue for too long and would love to know if anyone else thinks the same. It confuses people. Actually, it confuses the hell out of people. I'm *still* finding out new quirks of its behaviour and I've been using NumPy in a professional role for years... although you should bear in mind I could just be a slow learner. ;-)
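One concrete example of the kind of quirk being referred to: when advanced indices are separated by a slice, the broadcast advanced-index dimensions move to the front of the result.

```python
import numpy as np

a = np.zeros((3, 4, 5))
idx = np.array([0, 1])

# Advanced index at the end: the indexed axis is replaced in place.
print(a[:, :, idx].shape)  # (3, 4, 2)

# An integer and an advanced index separated by a slice: the broadcast
# advanced-index dimensions jump to the front of the result.
print(a[0, :, idx].shape)  # (2, 4)
```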
Re: [Numpy-discussion] Removal of numarray and oldnumeric packages.
On 23 September 2013 18:03, Charles R Harris charlesr.har...@gmail.com wrote: I have gotten no feedback on the removal of the numarray and oldnumeric packages. Consequently the removal will take place on 9/28. Scream now or never... I know I always like to get feedback either way ... so +1 for removal. Thanks.
Re: [Numpy-discussion] PEP8
Something we have done in matplotlib is that we have made PEP8 a part of the tests. In Iris and Cartopy we've also done this and it works well. While we transition we have an exclusion list (which is gradually getting shorter). We've had mixed experiences with automatic reformatting, so prefer to keep the human in the loop. Richard
Re: [Numpy-discussion] Automatic custom dtype
On 21 June 2013 19:57, Charles R Harris charlesr.har...@gmail.com wrote: You could check the numpy/core/src/umath/test_rational.c.src code to see if you are missing something. In v1.7+ the difference in behaviour between my code and the rational test case is because my scalar type doesn't subclass np.generic (aka. PyGenericArrType_Type). In v1.6 this requirement doesn't exist ... mostly ... In other words, it works as long as the supplied scalars are contained within a sequence. So: np.array([scalar]) -> np.array([scalar], dtype=my_dtype) But: np.array(scalar) -> np.array(scalar, dtype=object) For one of my scalar/dtype combos I can easily work around the 1.7+ issue by just adding the subclass relationship. But another of my dtypes is wrapping a third-party type so I can't modify the subclass relationship. :-( So I guess I have three questions. Firstly, is there some cunning workaround when defining a dtype for a third-party type? Secondly, is the subclass-generic requirement in v1.7+ desirable and/or intended? Or just an accidental regression? And thirdly, assuming it's desirable to remove the subclass-generic requirement, would it also make sense to make it work for scalars which are not within a sequence? NB. If we decide there's some work which needs doing here, then I should be able to put time on it. Thanks, Richard
Re: [Numpy-discussion] Automatic custom dtype
On 28 June 2013 17:33, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Jun 28, 2013 at 5:27 AM, Richard Hattersley rhatters...@gmail.com wrote: So: np.array([scalar]) -> np.array([scalar], dtype=my_dtype) But: np.array(scalar) -> np.array(scalar, dtype=object) So the scalar case (0-dimensional array) doesn't work right. Hmm, what happens when you index the first array? Does subclassing the generic type work in 1.6? Indexing into the first array works fine. So something like `a[0]` calls my_dtype->f->getitem which creates a new scalar instance, and something like `a[:1]` creates a new view with the correct dtype. My impression is that subclassing the generic type should be required, but I don't see where it is documented :( Can you elaborate on why the generic type should be required? Do you think it might cause problems elsewhere? (FYI I've also tested with a patched version of v1.6.2 which fixes the typo that prevents the use of user-defined dtypes with ufuncs, and that functionality seems to work fine too.) Anyway, what is the problem with the third party code? Is there no chance that you can get hold of it to fix it? Unfortunately it's out of my control. Regards, Richard
Re: [Numpy-discussion] Automatic custom dtype
On 21 June 2013 19:57, Charles R Harris charlesr.har...@gmail.com wrote: You could check the numpy/core/src/umath/test_rational.c.src code to see if you are missing something. My code is based in large part on exactly those examples (I don't think I could have got this far using the documentation alone!), but I've rechecked and there's nothing obvious missing. That said I think there may be something funny going on with error handling within getitem and friends so I'm still following up on that. Richard
[Numpy-discussion] Automatic custom dtype
Hi all, In my continuing adventures in the Land of Custom Dtypes I've come across some rather disappointing behaviour in 1.7 and 1.8. I've defined my own class `Time360`, and a corresponding dtype `time360` which references Time360 as its scalar type. Now with 1.6.2 I can do: >>> t = Time360(2013, 6, 29) >>> np.array([t]).dtype dtype('Time360') And since all the instances supplied to the function were instances of the scalar type for my dtype, numpy automatically created an array using my dtype. Happy days! But in 1.7 and 1.8 I get: >>> np.array([t]).dtype dtype('O') So now I just get a plain old object array. Boo! Hiss! Is this expected? Desirable? An unwanted regression? Richard
Re: [Numpy-discussion] Automatic custom dtype
On 21 June 2013 14:49, Charles R Harris charlesr.har...@gmail.com wrote: Bit short on detail here ;) How did you create/register the dtype? The dtype is created/registered during module initialisation with:

    dtype = PyObject_New(PyArray_Descr, &PyArrayDescr_Type);
    dtype->typeobj = &Time360Type;
    ...
    PyArray_RegisterDataType(dtype);

Where Time360Type is my new type definition:

    static PyTypeObject Time360Type = { ... };

which is initialised prior to the dtype creation. If the detail matters then should I assume this is unexpected behaviour and maybe I can fix my code so it works? Richard
Re: [Numpy-discussion] Parameterised dtypes
Hi Nathaniel, Thanks for the useful feedback - it'll definitely save me some time chasing around the code base. dtype callbacks and ufuncs don't in general get access to the dtype object, so they can't access whatever parameters exist Indeed - it is a little awkward. But I'm hoping I can use the `data` argument to supply this. You don't even need 'metadata' or 'c_metadata' -- this is Python, we already have a totally standard way to add new fields, just subclass the dumb thing. That would be nice ... but Py_TPFLAGS_BASETYPE is not set for PyArrayDescr_Type so that class is final. 1) No, you can't hook into the dtype string parser. Though, are you sure you really want to? Surely it's nicer to use Python syntax instead of inventing a new syntax and then having to write a parser for it from scratch? Thank you - that's good to know. As you say, I'd *far* rather avoid parsing, but I'd like my dtype to be a good citizen so I was asking out of completeness. (Off at a tangent: The blaze project is a good example of what happens if you do add more parsing. In my opinion it's not the way to go.) 2) I have some vague plans worked out to fix all this so dtypes are just ordinary python objects, but haven't written it down yet due to a combination of lack of time to do so, and lack of anyone with time to actually implement the plan even if it were written down. I mention this just in case someone wants to volunteer, which would move it up my stack. Would you have the time to sketch out the intended benefits? Richard
Re: [Numpy-discussion] Parameterised dtypes
Hi Andrew, Maybe a stupid question, but do you know a reference I could look at for the metadata and c_metadata fields you described? Sorry ... no. I've not found anything. :-( If I remember correctly, I got wind of the metadata aspect from the mailing list discussions of datetime64. So for my current work I've just been scratching around in the datetime64 code looking for example usage. Regards, Richard
Re: [Numpy-discussion] NumPy sprints at Scipy 2013, Austin: call for topics and hands to help
Hi David, On 25 May 2013 15:23, David Cournapeau courn...@gmail.com wrote: As some of you may know, Stéfan and me will present a tutorial on NumPy C code, so if we do our job correctly, we should have a few new people ready to help out during the sprints. Is there any chance you'll be repeating this at EuroSciPy? Things I'd like to work on myself is looking into splitting things from multiarray, think about a better internal API for dtype registration/hooks (with the goal to remove any date dtype hardcoding in both multiarray and ufunc machinery), but I am sure others have more interesting ideas :) I'm not able to get to SciPy so I understand if my vote of support doesn't count ;-), but I'm very interested in the work on the dtype API. And if it was on the radar for EuroSciPy there's a good chance I'd be able to help out. (The combination of a NumPy C tutorial and dtype API work would make a pretty compelling case for my managers.) Regards, Richard
[Numpy-discussion] Parameterised dtypes
Hi all, I'm in the process of defining some new dtypes to handle non-physical calendars (such as the 360-day calendar used in the climate modelling world). This is all going fine[*] so far, but I'd like to know a little bit more about how much is ultimately possible. The PyArray_Descr members `metadata` and `c_metadata` allow run-time parametrisation, but is it possible to hook into the dtype('...') parsing mechanism to supply those parameters? Or is there some other dtype mechanism for supplying parameters? As an example, would it be possible to supply month lengths? a = np.zeros(n, dtype='my_date[34,33,31,30,30,29,29,30,31,32,34,35]') Or is the intended use of parametrisation more like: weird = my_stuff.make_dtype([34,33,31,30,30,29,29,30,31,32,34,35]) a = np.zeros(n, dtype=weird) [*] The docs could do with updating, and the examples would benefit from standardising (or at least explaining the significance of the differences). I intend to post updates where possible. Richard
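For the second style, the Python-level `metadata` argument of np.dtype (largely undocumented, but exposed) can carry the calendar parameters without any string parsing; a sketch, using a plain int64 storage type as a stand-in for the custom one:

```python
import numpy as np

# Sketch: attach the calendar parameters to a dtype via the (largely
# undocumented) `metadata` argument, rather than a parsed dtype string.
# np.int64 here is just a stand-in storage type for illustration.
month_lengths = [34, 33, 31, 30, 30, 29, 29, 30, 31, 32, 34, 35]
weird = np.dtype(np.int64, metadata={'month_lengths': month_lengths})

print(weird.metadata['month_lengths'][0])  # 34
```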
Re: [Numpy-discussion] Parameterised dtypes
On 24 May 2013 15:12, Richard Hattersley rhatters...@gmail.com wrote: Or is the intended use of parametrisation more like: weird = my_stuff.make_dtype([34,33,31,30,30,29,29,30,31,32,34,35]) a = np.zeros(n, dtype=weird) Or to put it another way I have a working `make_dtype` function (which could easily be extended to do dtype caching), but is that the right way to go about things? Richard
Re: [Numpy-discussion] bug in deepcopy() of rank-zero arrays?
+1 for getting rid of this inconsistency. We've hit this with Iris (a met/ocean analysis package - see github), and have had to add several workarounds. On 19 April 2013 16:55, Chris Barker - NOAA Federal chris.bar...@noaa.gov wrote: Hi folks, In [264]: np.__version__ Out[264]: '1.7.0' I just noticed that deep copying a rank-zero array yields a scalar -- probably not what we want. In [242]: a1 = np.array(3) In [243]: type(a1), a1 Out[243]: (numpy.ndarray, array(3)) In [244]: a2 = copy.deepcopy(a1) In [245]: type(a2), a2 Out[245]: (numpy.int32, 3) regular copy.copy() seems to work fine: In [246]: a3 = copy.copy(a1) In [247]: type(a3), a3 Out[247]: (numpy.ndarray, array(3)) Higher-rank arrays seem to work fine: In [253]: a1 = np.array((3,4)) In [254]: type(a1), a1 Out[254]: (numpy.ndarray, array([3, 4])) In [255]: a2 = copy.deepcopy(a1) In [256]: type(a2), a2 Out[256]: (numpy.ndarray, array([3, 4])) Array scalars seem to work fine as well: In [257]: s1 = np.float32(3) In [258]: s2 = copy.deepcopy(s1) In [261]: type(s1), s1 Out[261]: (numpy.float32, 3.0) In [262]: type(s2), s2 Out[262]: (numpy.float32, 3.0) There are other ways to copy arrays, but in this case, I had a dict with a bunch of arrays in it, and needed a deepcopy of the dict. I was surprised to find that my rank-0 array got turned into a scalar. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
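A possible interim workaround for the dict-of-arrays case, sketched here: np.copy preserves rank-zero arrays, so copying the arrays explicitly sidesteps deepcopy's scalar conversion (the dict contents are purely illustrative):

```python
import numpy as np

a1 = np.array(3)

# np.copy (like copy.copy) preserves the rank-zero ndarray...
a2 = np.copy(a1)
print(type(a2), a2.shape)  # ndarray, ()

# ...so for a dict of arrays, copy each array explicitly instead of
# deep-copying the whole dict (illustrative data, not a numpy API):
d = {'x': a1, 'y': np.arange(4)}
d_copy = {k: v.copy() for k, v in d.items()}
print(type(d_copy['x']))   # ndarray
```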
Re: [Numpy-discussion] fast numpy.fromfile skipping data chunks
Since the files are huge, and would make me run out of memory, I need to read data skipping some records Is it possible to describe what you're doing with the data once you have subsampled it? And if there were a way to work with the full resolution data, would that be desirable? I ask because I've been dabbling with a pure-Python library for handling larger-than-memory datasets - https://github.com/SciTools/biggus, and it uses similar chunking techniques as mentioned in the other replies to process data at the full streaming I/O rate. It's still in the early stages of development so the design can be fluid, so maybe it's worth seeing if there's enough in common with your needs to warrant adding your use case. Richard On 13 March 2013 13:45, Andrea Cimatoribus andrea.cimatori...@nioz.nl wrote: Hi everybody, I hope this has not been discussed before, I couldn't find a solution elsewhere. I need to read some binary data, and I am using numpy.fromfile to do this. Since the files are huge, and would make me run out of memory, I need to read data skipping some records (I am reading data recorded at high frequency, so basically I want to read subsampling). At the moment, I came up with the code below, which is then compiled using cython. Despite the significant performance increase from the pure python version, the function is still much slower than numpy.fromfile, and only reads one kind of data (in this case uint32), otherwise I do not know how to define the array type in advance. I have basically no experience with cython nor c, so I am a bit stuck. How can I try to make this more efficient and possibly more generic? Thanks

    import numpy as np
    # For cython!
    cimport numpy as np
    from libc.stdint cimport uint32_t

    def cffskip32(fid, int count=1, int skip=0):
        cdef int k = 0
        cdef np.ndarray[uint32_t, ndim=1] data = np.zeros(count, dtype=np.uint32)
        if skip >= 0:
            while k < count:
                try:
                    data[k] = np.fromfile(fid, count=1, dtype=np.uint32)
                    fid.seek(skip, 1)
                    k += 1
                except ValueError:
                    data = data[:k]
                    break
        return data
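One pure-numpy alternative to the Cython loop, sketched here on a small throwaway file: declare the bytes to be skipped as void padding inside a structured dtype, and let np.fromfile do the striding in a single call.

```python
import numpy as np
import os
import tempfile

# Hypothetical file of contiguous uint32 records, for illustration.
path = os.path.join(tempfile.mkdtemp(), 'records.bin')
np.arange(8, dtype=np.uint32).tofile(path)

# Read every second value: each 8-byte "record" is one uint32 we keep
# plus 4 bytes of void padding covering the value we skip.
dt = np.dtype([('value', '<u4'), ('pad', 'V4')])
subsampled = np.fromfile(path, dtype=dt)['value']
print(subsampled)  # [0 2 4 6]
```

The padding size generalises to any record length, and the kept field can itself be any dtype, which addresses the "only reads uint32" limitation.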
Re: [Numpy-discussion] ANN: NumPy 1.7.0b2 release
Hi, [First of all - thanks to everyone involved in the 1.7 release. Especially Ondřej - it takes a lot of time and energy to coordinate something like this.] Is there an up-to-date release schedule anywhere? The trac milestone still references June. Regards, Richard Hattersley On 20 September 2012 07:24, Ondřej Čertík ondrej.cer...@gmail.com wrote: Hi, I'm pleased to announce the availability of the second beta release of NumPy 1.7.0b2. Sources and binary installers can be found at https://sourceforge.net/projects/numpy/files/NumPy/1.7.0b2/ Please test this release and report any issues on the numpy-discussion mailing list. Since beta1, we've fixed most of the known (back then) issues, except: http://projects.scipy.org/numpy/ticket/2076 http://projects.scipy.org/numpy/ticket/2101 http://projects.scipy.org/numpy/ticket/2108 http://projects.scipy.org/numpy/ticket/2150 And many other issues that were reported since the beta1 release. The log of changes is attached. The full list of issues that we still need to work on is at: https://github.com/numpy/numpy/issues/396 Any help is welcome, the best is to send a PR fixing any of the issues -- against master, and I'll then back-port it to the release branch (unless it is something release specific, in which case just send the PR against the release branch). Cheers, Ondrej
* f217517 Release 1.7.0b2
* 50f71cb MAINT: silence Cython warnings about changes dtype/ufunc size.
* fcacdcc FIX: use py24-compatible version of virtualenv on Travis
* d01354e FIX: loosen numerical tolerance in test_pareto()
* 65ec87e TST: Add test for boolean insert
* 9ee9984 TST: Add extra test for multidimensional inserts.
* 8460514 BUG: Fix for issues #378 and #392 This should fix the problems with numpy.insert(), where the input values were not checked for all scalar types and where values did not get inserted properly, but got duplicated by default.
* 07e02d0 BUG: fix npymath install location.
* 6da087e BUG: fix custom post_check.
* 095a3ab BUG: forgot to build _dotblas in bento build.
* cb0de72 REF: remove unused imports in bscript.
* 6e3e289 FIX: Regenerate mtrand.c with Cython 0.17
* 3dc3b1b Retain backward compatibility. Enforce C order.
* 5a471b5 Improve ndindex execution speed.
* 2f28db6 FIX: Add a test for Ticket #2066
* ca29849 BUG: Add a test for Ticket #2189
* 1ee4a00 BUG: Add a test for Ticket #1588
* 7b5dba0 BUG: Fix ticket #1588/gh issue #398, refcount error in clip
* f65ff87 FIX: simplify the import statement
* 124a608 Fix returned copy
* 996a9fb FIX: bug in np.where and recarray swapping
* 7583adc MAINT: silence DeprecationWarning in np.safe_eval().
* 416af9a pavement.py: rename yop to atlas
* 3930881 BUG: fix bento build.
* fbad4a7 Remove test_recarray_from_long_formats
* 5cb80f8 Add test for long number in shape specifier of dtype string
* 24da7f6 Add test for long numbers in numpy.rec.array formats string
* 77da3f8 Allow long numbers in numpy.rec.array formats string
* 99c9397 Use PyUnicode_DecodeUTF32()
* 31660d0 Follow the C guidelines
* d5d6894 Fix memory leak in concatenate.
* 8141e1e FIX: Make sure the tests produce valid unicode
* d67785b FIX: Fixes the PyUnicodeObject problem in py-3.3
* a022015 Re-enable unpickling optimization for large py3k bytes objects.
* 470486b Copy bytes object when unpickling an array
* d72280f Fix tests for empty shape, strides and suboffsets on Python 3.3
* a1561c2 [FIX] Add missing header so separate compilation works again
* ea23de8 TST: set raise-on-warning behavior of NoseTester to release mode.
* 28ffac7 REL: set version number to 1.7.0rc1-dev.
Re: [Numpy-discussion] easy way to change part of only unmasked elements value?
Hi Chao, If you don't mind modifying masked values, then if you write to the underlying ndarray it won't touch the mask: >>> a = np.ma.masked_less(np.arange(10), 5) >>> a.base[3:6] = 1 >>> a masked_array(data = [-- -- -- -- -- 1 6 7 8 9], mask = [ True True True True True False False False False False], fill_value = 99) Regards, Richard Hattersley On 10 September 2012 17:43, Chao YUE chaoyue...@gmail.com wrote: Dear all numpy users, what's the easy way if I just want to change part of the unmasked array elements into another new value? like an example below: in my real case, I would like to change a subgrid of a masked numpy array to another value, but this grid includes both masked and unmasked data. If I do a simple array[index1:index2, index3:index4] = another_value, those data with original True mask will change into False. I am using numpy 1.6.2. Thanks for any ideas. In [91]: a = np.ma.masked_less(np.arange(10),5) In [92]: or_mask = a.mask.copy() In [93]: a Out[93]: masked_array(data = [-- -- -- -- -- 5 6 7 8 9], mask = [ True True True True True False False False False False], fill_value = 99) In [94]: a[3:6]=1 In [95]: a Out[95]: masked_array(data = [-- -- -- 1 1 1 6 7 8 9], mask = [ True True True False False False False False False False], fill_value = 99) In [96]: a = np.ma.masked_array(a,mask=or_mask) In [97]: a Out[97]: masked_array(data = [-- -- -- -- -- 1 6 7 8 9], mask = [ True True True True True False False False False False], fill_value = 99) Chao -- Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax: 01.69.08.77.16
Re: [Numpy-discussion] How to debug reference counting errors
Hi, re: valgrind - to get better results you might try the suggestions from: http://svn.python.org/projects/python/trunk/Misc/README.valgrind Richard On 31 August 2012 09:03, Ondřej Čertík ondrej.cer...@gmail.com wrote: Hi, There is a segfault reported here: http://projects.scipy.org/numpy/ticket/1588 I've managed to isolate the problem and even provide a simple patch that fixes it here: https://github.com/numpy/numpy/issues/398 however the patch simply doesn't decrease the proper reference, so it might leak. I've used bisection (took the whole evening unfortunately...) but the good news is that I've isolated the commits that actually broke it. See the github issue #398 for details, diffs etc. Unfortunately, it's 12 commits from Mark and the individual commits raise an exception on the segfaulting code, so I can't pinpoint the problem further. In general, how can I debug this sort of problem? I tried to use valgrind, with a debugging build of numpy, but it provides tons of false (?) positives: https://gist.github.com/3549063 Mark, by looking at the changes that broke it, as well as at my fix, do you see where the problem could be? I suspect it is something with the changes in PyArray_FromAny() or PyArray_FromArray() in ctors.c. But I don't see anything so far that could cause it. Thanks for any help. This is one of the issues blocking the 1.7.0 release. Ondrej
Re: [Numpy-discussion] Dropping support for Python 2.4 in NumPy 1.8
The project/environment we work with already targets Python 2.7, so it'd be fine for us and our collaborators. But it's hard to comment in a more altruistic way without knowing the impact of the change. Is it possible to summarise the benefits? (e.g. Simplifies NumPy codebase; allows better support for XXX under 2.5+; ...) On 28 June 2012 13:25, Travis Oliphant tra...@continuum.io wrote: Hey all, I'd like to propose dropping support for Python 2.4 in NumPy 1.8 (not the 1.7 release). What does everyone think of that? -Travis
Re: [Numpy-discussion] Missing data wrap-up and request for comments
For what it's worth, I'd prefer ndmasked. As has been mentioned elsewhere, some algorithms can't really cope with missing data. I'd very much rather they fail than silently give incorrect results. Working in the climate prediction business (as with many other domains I'm sure), even the *potential* for incorrect results can be damaging. On 11 May 2012 06:14, Travis Oliphant tra...@continuum.io wrote: On May 10, 2012, at 12:21 AM, Charles R Harris wrote: On Wed, May 9, 2012 at 11:05 PM, Benjamin Root ben.r...@ou.edu wrote: On Wednesday, May 9, 2012, Nathaniel Smith wrote: My only objection to this proposal is that committing to this approach seems premature. The existing masked array objects act quite differently from numpy.ma, so why do you believe that they're a good foundation for numpy.ma, and why will users want to switch to their semantics over numpy.ma's semantics? These aren't rhetorical questions; it seems like they must have concrete answers, but I don't know what they are. Based on the design decisions made in the original NEP, a re-made numpy.ma would have to lose _some_ features, particularly the ability to share masks. Save for that and some very obscure behaviors that are undocumented, it is possible to remake numpy.ma as a compatibility layer. That being said, I think that there are some fundamental questions that remain a concern. If I recall, there were unresolved questions about behaviors surrounding assignments to elements of a view. I see the project as broken down like this: 1.) internal architecture (largely ABI issues) 2.) external architecture (hooks throughout numpy to utilize the new features where possible, such as the where= argument) 3.) getter/setter semantics 4.) mathematical semantics At this moment, I think we have pieces of 2 and they are fairly non-controversial. It is 1 that I see as being the immediate hold-up here.
3 and 4 are non-trivial, but because they are mostly about interfaces, I think we can be willing to accept some very basic, fundamental, barebones components here in order to lay the groundwork for a more complete API later. To talk of Travis's proposal, doing nothing is a no-go. Not moving forward would dishearten the community. Making an ndmasked type is very intriguing. I see it as a step towards eventually deprecating ndarray? Also, how would it behave with np.asarray() and np.asanyarray()? My other concern is a possible violation of DRY. How difficult would it be to maintain two ndarrays in parallel? As for the flag approach, this still doesn't solve the problem of legacy code (or did I misunderstand?) My understanding of the flag is to allow the code to stay in and get reworked and experimented with while keeping it from contaminating conventional use. The whole point of putting the code in was to experiment and adjust. The rather bizarre idea that it needs to be perfect from the get-go is disheartening, and is seldom how new things get developed. Sure, there is a plan up front, but there needs to be feedback and change. And in fact, I haven't seen much feedback about the actual code; I don't even know that the people complaining have tried using it to see where it hurts. I'd like that sort of feedback. I don't think anyone is saying it needs to be perfect from the get-go. What I am saying is that this is fundamental enough to downstream users that this kind of thing is best done as a separate object. The flag could still be used to make all Python-level array constructors build ndmasked objects. But this doesn't address the C-level story, where there is quite a bit of downstream use where people have used the NumPy array as just a pointer to memory without considering that there might be a mask attached that should be inspected as well.
The NEP addresses this a little bit for those C or C++ consumers of the ndarray who always use PyArray_FromAny, which can fail if the array has non-NULL mask contents. However, it is *not* true that all downstream users use PyArray_FromAny. A large number of users just use something like PyArray_Check and then PyArray_DATA to get the pointer to the data buffer, and then go from there thinking of their data as a strided memory chunk only (no extra mask). The NEP fundamentally changes this simple invariant that has been in NumPy, and Numeric before it, for a long, long time. I really don't see how we can do this in a 1.7 release. It has too many unknown, and I think unknowable, downstream effects. But I think we could introduce another arrayobject that is the masked_array, with a Python-level flag that makes it the default array in Python. There are a few more subtleties: PyArray_Check by default will pass sub-classes, so if the new ndmask array were a sub-class then it would be passed (just like current numpy.ma arrays and matrices would pass that check today). However, there is a PyArray_CheckExact macro which could
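The subclass-check distinction being drawn here (PyArray_Check passing subclasses, PyArray_CheckExact not) has a direct Python-level analogue. This sketch uses today's numpy.ma.MaskedArray as a stand-in for the hypothetical ndmasked type, and also illustrates Ben's np.asarray/np.asanyarray question:

```python
import numpy as np

a = np.arange(3)
m = np.ma.masked_array(a, mask=[False, True, False])  # an ndarray subclass

# PyArray_Check passes subclasses, like isinstance():
print(isinstance(m, np.ndarray))   # True
# PyArray_CheckExact rejects subclasses, like an exact type test:
print(type(m) is np.ndarray)       # False
print(type(a) is np.ndarray)       # True

# np.asarray strips the subclass (silently dropping the mask!);
# np.asanyarray preserves it:
print(type(np.asarray(m)) is np.ndarray)           # True
print(type(np.asanyarray(m)) is np.ma.MaskedArray) # True
```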
Re: [Numpy-discussion] record arrays and vectorizing
Sounds like it could be a good match for `scipy.spatial.cKDTree`. It can handle single-element queries...

    element = numpy.arange(1, 8)
    targets = numpy.random.uniform(0, 8, (1000, 7))
    tree = scipy.spatial.cKDTree(targets)
    distance, index = tree.query(element)
    targets[index]
    array([ 1.68457267, 4.26370212, 3.14837617, 4.67616512, 5.80572286, 6.46823904, 6.12957534])

Or even multi-element queries (shown here searching for 3 elements in one call)...

    elements = numpy.linspace(1, 8, 21).reshape((3, 7))
    elements
    array([[ 1.  , 1.35, 1.7 , 2.05, 2.4 , 2.75, 3.1 ],
           [ 3.45, 3.8 , 4.15, 4.5 , 4.85, 5.2 , 5.55],
           [ 5.9 , 6.25, 6.6 , 6.95, 7.3 , 7.65, 8.  ]])
    distances, indices = tree.query(elements)
    targets[indices]
    array([[ 0.24314961, 2.77933521, 2.00092505, 3.25180563, 2.05392726, 2.80559459, 4.43030939],
           [ 4.19270199, 2.89257994, 3.91366449, 3.29262138, 3.6779851 , 4.06619636, 4.7183393 ],
           [ 6.58055518, 6.59232922, 7.00473346, 5.22612494, 7.07170015, 6.54570121, 7.59566404]])

Richard Hattersley

On 2 May 2012 19:06, Moroney, Catherine M (388D) catherine.m.moro...@jpl.nasa.gov wrote: Hello, Can somebody give me some hints as to how to code up this function in pure python, rather than dropping down to Fortran? I will want to compare a 7-element vector (called element) to a large list of similarly-dimensioned vectors (called target), and pick out the vector in target that is the closest to element (determined by minimizing the Euclidean distance). For instance, in (slow) brute force form it would look like:

    element = numpy.array([1, 2, 3, 4, 5, 6, 7])
    target = numpy.array(range(0, 49)).reshape(7, 7)*0.1
    min_length = numpy.inf
    min_index = -1
    for i in xrange(0, 7):
        distance = (element - target[i])**2
        distance = numpy.sqrt(distance.sum())
        if distance < min_length:
            min_length = distance
            min_index = i

Now of course, the actual problem will be of a much larger scale. I will have an array of elements, and a large number of potential targets. 
I was thinking of having element be an array where each element itself is a numpy.ndarray, and then vectorizing the code above so as an output I would have an array of the min_index and min_length values. I can get the following simple test to work so I may be on the right track:

    import numpy

    dtype = [('x', numpy.ndarray)]

    def single(data):
        return data[0].min()

    multiple = numpy.vectorize(single)

    if __name__ == '__main__':
        a = numpy.arange(0, 16).reshape(4, 4)
        b = numpy.recarray((4,), dtype=dtype)
        for i in xrange(0, b.shape[0]):
            b[i]['x'] = a[i, :]
        print a
        print b
        x = multiple(b)
        print x

What is the best way of constructing b from a? I tried b = numpy.recarray((4,), dtype=dtype, buf=a) but I get a segmentation fault when I try to print b. Is there a way to perform this larger task efficiently with record arrays and vectorization, or am I off on the wrong track completely? How can I do this efficiently without dropping down to Fortran? Thanks for any advice, Catherine ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
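Besides the cKDTree approach above, the brute-force loop can be vectorised directly with broadcasting, with no record arrays (or Fortran) needed. A sketch, with random data purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
elements = rng.uniform(0, 8, (5, 7))      # 5 query vectors
targets = rng.uniform(0, 8, (1000, 7))    # 1000 candidate vectors

# (5, 1, 7) - (1, 1000, 7) broadcasts to (5, 1000, 7); summing over the
# last axis gives all pairwise squared distances in one shot.
sq_dist = ((elements[:, None, :] - targets[None, :, :]) ** 2).sum(axis=-1)

min_index = sq_dist.argmin(axis=1)        # closest target for each element
min_length = np.sqrt(sq_dist[np.arange(len(elements)), min_index])

# Cross-check against a brute-force scan for the first element.
brute = min(np.linalg.norm(elements[0] - t) for t in targets)
assert np.isclose(brute, min_length[0])
```

The intermediate (5, 1000, 7) array can get large for big problems, which is where a kd-tree (or chunking the queries) starts to pay off.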
Re: [Numpy-discussion] A crazy masked-array thought
On 27 April 2012 17:42, Travis Oliphant tra...@continuum.io wrote: 1) There is a lot of code out there that does not know anything about masks and is not used to checking for masks. It enlarges the basic abstraction in a way that is not backwards compatible *conceptually*. This smells fishy to me and I could see a lot of downstream problems from libraries that rely on NumPy. That's exactly why I'd love to see plain arrays remain functionally unchanged. It's just a small, random sample, but here's how a few routines from NumPy and SciPy sanitise their inputs...

    numpy.trapz (aka scipy.integrate.trapz) - numpy.asanyarray
    scipy.spatial.KDTree - numpy.asarray
    scipy.spatial.cKDTree - numpy.ascontiguousarray
    scipy.integrate.odeint - PyArray_ContiguousFromObject
    scipy.interpolate.interp1d - numpy.array
    scipy.interpolate.griddata - numpy.asanyarray and numpy.ascontiguousarray

So, assuming numpy.ndarray became a strict subclass of some new masked array, it looks plausible that adding just a few checks to numpy.ndarray to exclude the masked superclass would prevent much downstream code from accidentally operating on masked arrays. 2) We cannot agree on how masks should be handled and consequently don't have a real plan for migrating numpy.ma to use these masks. So, we are just growing the API and introducing uncertainty for unclear benefit --- especially for the person that does not want to use masks. I've not yet looked at how numpy.ma users could be migrated. But if we make masked arrays a strict superclass and leave the numpy/ndarray interface and behaviour unchanged, API growth shouldn't be an issue. End-users will be able to completely ignore the existence of masked arrays (except for the minority(?) for whom the ABI/re-compile issue would be relevant). 3) Subclassing in C in Python requires that C-structures are *binary* compatible. This implies that all subclasses have *more* attributes than the superclass. 
The way it is currently implemented, that means that POAs would have these extra pointers they don't need sitting there to satisfy that requirement. From a C-struct perspective it therefore makes more sense for MAs to inherit from POAs. Ideally, that shouldn't drive the design, but it's part of the landscape in NumPy 1.X. I'd hate to see the logical class hierarchy inverted (or collapsed to a single class) just to save a pointer or two from the struct. Now seems like a golden opportunity to fix the relationship between masked and plain arrays. I'm assuming (and implicitly checking that assumption with this statement!) that there's far more code using the Python interface to NumPy than there is code using the C interface. So I'm urging that the logical consistency of the Python interface (and even the C and Cython interfaces) takes precedence over the C-struct memory saving. I'm not sure I agree with "extra pointers they don't need". If we make plain arrays a subclass of masked arrays, aren't these pointers essential to ensure masked array methods can continue to work on plain arrays without requiring special code paths? I have some ideas about how to move forward, but I'm anxiously awaiting the write-up that Mark and Nathaniel are working on to inform and enhance those ideas. +1 As an aside, the implication of preserving the behaviour of the numpy/ndarray interface is that masked arrays will need a *new* interface. For example:

    import mumpy  # Yes - I know it's a terrible name! But I had to write *something* ... sorry! ;-)
    import numpy
    a = mumpy.array(...)  # makes a masked array
    b = numpy.array(...)  # makes a plain array
    isinstance(a, mumpy.ndarray)  # True
    isinstance(b, mumpy.ndarray)  # True
    isinstance(a, numpy.ndarray)  # False
    isinstance(b, numpy.ndarray)  # True

Richard Hattersley
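The sanitising checks Richard describes could be sketched against today's numpy.ma (the hypothetical mumpy superclass doesn't exist, so MaskedArray stands in for it here; the function name is invented for illustration):

```python
import numpy as np

def as_plain_array(obj):
    """Coerce to an array, but refuse masked input rather than
    silently dropping the mask (illustrative sketch)."""
    if isinstance(obj, np.ma.MaskedArray):
        raise TypeError("masked arrays are not supported by this routine")
    return np.asanyarray(obj)

print(as_plain_array([1, 2, 3]))  # plain input passes through

try:
    as_plain_array(np.ma.masked_array([1.0, 2.0], mask=[False, True]))
except TypeError as exc:
    print("rejected:", exc)
```

The key point is that the check happens *before* np.asanyarray, since asanyarray would happily return the masked subclass unchanged.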
Re: [Numpy-discussion] A crazy masked-array thought
I know I used a somewhat jokey tone in my original posting, but fundamentally it was a serious question concerning a live topic. So I'm curious about the lack of response. Has this all been covered before? Sorry if I'm being too impatient! On 25 April 2012 16:58, Richard Hattersley rhatters...@gmail.com wrote: The masked array discussions have brought up all sorts of interesting topics - too many to usefully list here - but there's one aspect I haven't spotted yet. Perhaps that's because it's flat out wrong, or crazy, or just too awkward to be helpful. But ... Shouldn't masked arrays (MA) be a superclass of the plain-old-array (POA)? In the library I'm working on, the introduction of MAs (via numpy.ma) required us to sweep through the library and make a fair few changes. That's not the sort of thing one would normally expect from the introduction of a subclass. Putting aside the ABI issue, would it help downstream API compatibility if the POA was a subclass of the MA? Code that's expecting/casting-to a POA might continue to work and, where appropriate, could be upgraded in their own time to accept MAs. Richard Hattersley
Re: [Numpy-discussion] A crazy masked-array thought
Hi all, Thanks for all your responses and for your patience with a newcomer. Don't worry - I'm not going to give up yet. It's all just part of my learning the ropes. On 27 April 2012 14:05, Benjamin Root ben.r...@ou.edu wrote: <snip> Your idea is interesting, but doesn't it require C++? Or maybe you are thinking of creating a new C type object that would contain all the new features and hold a pointer and function interface to the original POA. Essentially, the new type would act as a wrapper around the original ndarray? </snip> When talking about subclasses I'm just talking about the end-user experience within Python. In other words, I'm starting from issubclass(POA, MA) == True, and trying to figure out what the Python API implications would be. On 27 April 2012 14:55, Nathaniel Smith n...@pobox.com wrote: On Fri, Apr 27, 2012 at 11:32 AM, Richard Hattersley rhatters...@gmail.com wrote: I know I used a somewhat jokey tone in my original posting, but fundamentally it was a serious question concerning a live topic. So I'm curious about the lack of response. Has this all been covered before? Sorry if I'm being too impatient! That's fine, I know I did read it, but I wasn't sure what to make of it to respond :-) On 25 April 2012 16:58, Richard Hattersley rhatters...@gmail.com wrote: The masked array discussions have brought up all sorts of interesting topics - too many to usefully list here - but there's one aspect I haven't spotted yet. Perhaps that's because it's flat out wrong, or crazy, or just too awkward to be helpful. But ... Shouldn't masked arrays (MA) be a superclass of the plain-old-array (POA)? In the library I'm working on, the introduction of MAs (via numpy.ma) required us to sweep through the library and make a fair few changes. That's not the sort of thing one would normally expect from the introduction of a subclass. Putting aside the ABI issue, would it help downstream API compatibility if the POA was a subclass of the MA? 
Code that's expecting/casting-to a POA might continue to work and, where appropriate, could be upgraded in their own time to accept MAs. This makes a certain amount of sense from a traditional OO modeling perspective, where classes are supposed to refer to sets of objects, subclasses are subsets, and superclasses are supersets. This is the property that's needed to guarantee that if A is a subclass of B, then any code that expects a B can also handle an A, since all A's are B's, which is what you need if you're doing type-checking or type-based dispatch. And indeed, from this perspective, MAs are a superclass of POAs, because for every POA there's an equivalent MA (the one with the mask set to all-true), but not vice-versa. But that model of OO doesn't have much connection to Python. In Python's semantics, classes are almost irrelevant; they're mostly just some convenience tools for putting together the objects you want, and what really matters is the behavior of each object (the famous duck typing). You can call isinstance() if you want, but it's just an ordinary function that looks at some attributes on an object; the only magic involved is that some of those attributes have underscores in their name. In Python, subclassing mostly does two things: (1) it's a quick way to set up a class that's similar to another class (though this is a worse idea than it looks -- you're basically doing 'from other_class import *' with all the usual tight-coupling problems that 'import *' brings). (2) When writing Python objects at the C level, subclassing lets you achieve memory layout compatibility (which is important because C does *not* do duck typing), and it lets you add new fields to a C struct. So at this level, MAs are a subclass of POAs, because MAs have an extra field that POAs don't... 
So I don't know what to think about subclasses/superclasses here, because they're such confusing and contradictory concepts that it's hard to tell what the actual resulting API semantics would be. It doesn't seem essential that MAs have an extra field that POAs don't. If POA was a subclass of MA, instances of POA could have the extra field set to an all-valid/nothing-is-masked value. Granted, you'd want that to be a special value so you're not lugging around a load of redundant data (and you can optimise your processing for that), but I'm guessing you'd probably want that kind of capability within MA anyway. On 27 April 2012 15:33, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Apr 27, 2012 at 8:15 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Apr 25, 2012 at 9:58 AM, Richard Hattersley rhatters...@gmail.com wrote: The masked array discussions have brought up all sorts of interesting topics - too many to usefully list here - but there's one aspect I haven't spotted yet. Perhaps that's because it's flat out wrong, or crazy, or just too
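Incidentally, numpy.ma already implements something like the special all-valid value described above: until a real mask is needed, the mask attribute is the shared np.ma.nomask singleton, so unmasked instances carry no per-element mask storage. A quick demonstration:

```python
import numpy as np

a = np.ma.masked_array([1.0, 2.0, 3.0])   # no mask supplied
# The mask is a shared sentinel, not a per-element boolean array:
print(a.mask is np.ma.nomask)             # True

b = np.ma.masked_array([1.0, 2.0], mask=[False, True])
# Supplying a mask materialises a genuine boolean array:
print(b.mask)                             # [False  True]
```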
[Numpy-discussion] A crazy masked-array thought
The masked array discussions have brought up all sorts of interesting topics - too many to usefully list here - but there's one aspect I haven't spotted yet. Perhaps that's because it's flat out wrong, or crazy, or just too awkward to be helpful. But ... Shouldn't masked arrays (MA) be a superclass of the plain-old-array (POA)? In the library I'm working on, the introduction of MAs (via numpy.ma) required us to sweep through the library and make a fair few changes. That's not the sort of thing one would normally expect from the introduction of a subclass. Putting aside the ABI issue, would it help downstream API compatibility if the POA was a subclass of the MA? Code that's expecting/casting-to a POA might continue to work and, where appropriate, could be upgraded in their own time to accept MAs. Richard Hattersley
Re: [Numpy-discussion] Style for pad implementation in 'pad' namespace or functions under np.lib
1) The use of string constants to identify NumPy processes. It would seem better to use library-defined constants (ufuncs?) for better future-proofing, maintenance, etc. I don't see how this would help with future-proofing or maintenance -- can you elaborate? If this were C, I'd agree; using an enum would have a number of benefits:

-- easier to work with than strings (== and switch work, no memory management hassles)
-- compiler will notice if you accidentally misspell the enum name
-- since you always in effect 'import *', getting access to additional constants doesn't require any extra effort

But in Python none of these advantages apply, so I find it more convenient to just use strings. Using constants provides for tab-completion and associated help text. The help text can be particularly useful if the choice of constant affects which extra keyword arguments can be specified. And on a minor note, and far more subjectively (time for another bike-shedding reference!), there's the cleanliness of the API. (e.g. Strings don't feel like a good match. There are an infinite number of strings, but only a small number are valid. There's nothing machine-readable you can interrogate to find valid values.) Under the hood you'll have to use the string to do a lookup, but the constant can *be* the result of the lookup. Why re-invent the wheel when the language gives it to you for free? Note also that we couldn't use ufuncs here, because we're specifying a rather unusual sort of operation -- there is no ufunc for padding with a linear ramp etc. Using mean as the example is misleading in this respect -- it's not really the same as np.mean. 2) Why does only pad use this style of interface? If it's a good idea for pad, perhaps it should be applied more generally? numpy.aggregate(MEAN, ...), numpy.group(MEAN, ...), etc., anyone? The mode=foo interface style is actually used in other places, e.g. np.linalg.qr. My mistake - I misinterpreted the API earlier, so we're talking at cross-purposes. 
My comment/question isn't really about pad's mode, but about NumPy more generally. But it still stands - albeit somewhat hypothetically, since it's hard to imagine such a change taking place. Richard
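A minimal sketch of the constants-over-strings idea being discussed (the class and function names here are invented purely for illustration):

```python
import numpy as np

class Aggregation:
    """A library-defined constant: the constant *is* the lookup result."""
    def __init__(self, name, func):
        self.name = name
        self.func = func  # the NumPy (or user-defined) aggregation function
    def __repr__(self):
        return self.name

MEAN = Aggregation('MEAN', np.mean)
MAX = Aggregation('MAX', np.max)

def scrunch(stat, data, axis=None):
    # No string table to consult; tab-completion and help() work on
    # MEAN itself, and an invalid constant fails loudly at lookup time.
    return stat.func(data, axis=axis)

print(scrunch(MEAN, np.arange(10)))  # 4.5
print(scrunch(MAX, np.arange(10)))   # 9
```

The same constants could then be reused across several operations (scrunch, squish, ...), which is the reuse Richard describes.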
Re: [Numpy-discussion] Style for pad implementation in 'pad' namespace or functions under np.lib
I like where this is going. Driven by a desire to avoid a million different methods on a single class, we've done something similar in our library. So instead of:

    thing.mean()
    thing.max(...)
    etc.

we have:

    thing.scrunch(MEAN, ...)
    thing.scrunch(MAX, ...)
    etc.

Where constants like MEAN and MAX encapsulate the process to be performed - including a reference to a NumPy/user-defined aggregation function, as well as some other transformation details. We then found we could reuse the same constants in other operations:

    thing.scrunch(MEAN, ...)
    thing.squish(MEAN, ...)
    thing.rolling_squish(MEAN, ...)

So I have two minor concerns with the current proposal. 1) The use of string constants to identify NumPy processes. It would seem better to use library-defined constants (ufuncs?) for better future-proofing, maintenance, etc. 2) Why does only pad use this style of interface? If it's a good idea for pad, perhaps it should be applied more generally? numpy.aggregate(MEAN, ...), numpy.group(MEAN, ...), etc., anyone? Richard Hattersley On 30 March 2012 02:55, Travis Oliphant tra...@continuum.io wrote: On Mar 29, 2012, at 12:53 PM, Tim Cera wrote: I was hoping pad would get finished some day. Maybe 1.9? You have been a great sport about this process. I think it will result in something quite nice. Alright - I do like the idea of passing a function to pad, with a bunch of pre-made functions in place. Maybe something like:

    a = np.arange(10)
    b = pad('mean', a, 2, stat_length=3)

where if the first argument is a string, use one of the built-in functions. If instead you passed in a function:

    def padwithzeros(vector, pad_width, iaxis, **kwargs):
        bvector = np.zeros(pad_width[0])
        avector = np.zeros(pad_width[1])
        return bvector, avector

    b = pad(padwithzeros, a, 2)

Would that have some goodness? 
+1 -Travis
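For reference, a function-passing mode did eventually land in numpy.pad, although with an in-place signature rather than the (bvector, avector) return sketched in this thread. With a reasonably recent NumPy it looks roughly like this:

```python
import numpy as np

def pad_with_zeros(vector, pad_width, iaxis, kwargs):
    # np.pad hands the function an enlarged copy of each 1-D slice;
    # the padded ends are filled in place.
    vector[:pad_width[0]] = 0
    if pad_width[1]:
        vector[-pad_width[1]:] = 0

a = np.arange(1, 5)
print(np.pad(a, 2, pad_with_zeros))  # [0 0 1 2 3 4 0 0]
```

(Of course, zero padding specifically is also available directly via mode='constant'.)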
Re: [Numpy-discussion] Numpy Memory Error with corrcoef
Both work on my computer, while your example indeed leads to a MemoryError (because shape 459375*459375 would be a decently big matrix...) Nicely understated :) For 32-bit values, decently big = 786 GB
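The arithmetic behind that figure, for anyone sizing a similar problem (assuming 4-byte elements, as in the thread):

```python
n = 459375
bytes_needed = n * n * 4        # a full n-by-n float32 correlation matrix
print(round(bytes_needed / 2**30))  # 786 (GiB)
```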
Re: [Numpy-discussion] label NA and datetime as experimental
Hi, My team are currently experimenting with extending datetime to allow alternative, non-physical calendars (e.g. the 360-day calendar used by climate modellers). Once we've got a handle on the options we'd like to propose the extensions/changes back to NumPy. Obviously we'd like to avoid wasted effort, so are there some aspects of datetime64 which are more experimental than others? Is there a summary of unresolved issues and/or plans for change? Thanks, Richard Hattersley On 25 March 2012 13:57, Ralf Gommers ralf.gomm...@googlemail.com wrote: Hi, We decided to label both NA and datetime APIs as experimental for the 1.7.0 release. I made a PR that does this, please review: https://github.com/numpy/numpy/pull/240 Ralf
Re: [Numpy-discussion] label NA and datetime as experimental
OK - that's useful feedback. Thanks! On 26 March 2012 21:03, Ralf Gommers ralf.gomm...@googlemail.com wrote: On Mon, Mar 26, 2012 at 5:42 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Mar 26, 2012 at 2:29 AM, Richard Hattersley rhatters...@gmail.com wrote: Hi, My team are currently experimenting with extending datetime to allow alternative, non-physical calendars (e.g. the 360-day calendar used by climate modellers). Once we've got a handle on the options we'd like to propose the extensions/changes back to NumPy. Obviously we'd like to avoid wasted effort, so are there some aspects of datetime64 which are more experimental than others? Is there a summary of unresolved issues and/or plans for change? I believe datetime is already used by Pandas, so I don't think there will be major changes there. I'm not aware of open issues, but I could be wrong. The calendars are a bit independent, so I think the best procedure is to go ahead with your work. We want to leave some wiggle room since new features often need a little time to mature. That's how it looks to me anyway. That's my understanding too. Perhaps Mark can comment on the current status. The status and changes still need to be described in the release notes, by the way. The experimental tag is mostly due to the datetime history: it was introduced in 1.4.0, removed again in 1.4.1, reintroduced in 1.6.0, the API then labeled not useful (http://thread.gmane.org/gmane.comp.python.numeric.general/44162/focus=44385), then more changes for this release. I hope it's stable now, but seeing what came before, and that it still doesn't work with MinGW, it's hard to be sure. Ralf
Re: [Numpy-discussion] Using logical function on more than 2 arrays, availability of a between function ?
What do you mean by efficient? Are you trying to get it to execute faster? Or use less memory? Or have more concise source code? Less memory: - numpy.vectorize would let you get to the end result without any intermediate arrays, but will be slow. - Using the out parameter of numpy.logical_and will let you avoid one of the intermediate arrays. More speed?: Perhaps putting all three boolean temporary results into a single boolean array (using the out parameter of numpy.greater, etc.) and using numpy.all might benefit from logical short-circuiting. And watch out for divide-by-zero from aNirChannel/aBlueChannel. Regards, Richard Hattersley On 19 March 2012 11:04, Matthieu Rigal ri...@rapideye.net wrote: Dear Numpy fellows, I have actually a double question, which only aims to answer a single one: how to get the following line processed more efficiently:

    array = numpy.logical_and(numpy.logical_and(aBlueChannel < 1.0, aNirChannel > (aBlueChannel * 1.0)), aNirChannel < (aBlueChannel * 1.8))

One possibility would have been to have logical_and able to handle more than two arrays. Another would have been to be able to make a double comparison, or a between, like the following:

    array = numpy.logical_and((aBlueChannel < 1.0), (1.0 < aNirChannel/aBlueChannel < 1.8))

Is there any way to get things to work this way? Would it else be a possible improvement for 1.7 or a later version? Best Regards, Matthieu Rigal, RapidEye AG, Brandenburg an der Havel, Germany 
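The out-parameter approach Richard suggests might look like this (array names follow the original question; the data here is random, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
aBlueChannel = rng.uniform(0.5, 1.5, 10000)
aNirChannel = rng.uniform(0.5, 2.5, 10000)

# One result buffer and one scratch buffer, reused for every comparison,
# instead of a fresh temporary array per sub-expression.
result = np.less(aBlueChannel, 1.0)                # aBlueChannel < 1.0
tmp = np.greater(aNirChannel, aBlueChannel)        # aNirChannel > aBlueChannel * 1.0
np.logical_and(result, tmp, out=result)
np.less(aNirChannel, aBlueChannel * 1.8, out=tmp)  # aNirChannel < aBlueChannel * 1.8
np.logical_and(result, tmp, out=result)

# Same answer as the nested logical_and version from the question.
expected = (aBlueChannel < 1.0) & (aNirChannel > aBlueChannel) \
           & (aNirChannel < aBlueChannel * 1.8)
assert np.array_equal(result, expected)
```

Note that `aBlueChannel * 1.8` still allocates one float temporary; that too could be avoided with `np.multiply(aBlueChannel, 1.8, out=...)` at the cost of another scratch array.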
Re: [Numpy-discussion] Proposed Roadmap Overview
+1 on the NEP guideline. As part of a team building a scientific analysis library, I'm attempting to understand the current state of NumPy development and its likely future (with a view to contributing if appropriate). The proposed NEP process would make that a whole lot easier. And if nothing else, it would reduce the chance of me posting questions about topics that had already been discussed/decided! Without the process, the NEPs become another potential source of confusion and mixed messages. On 1 March 2012 03:02, Travis Oliphant wrote: I would like to hear the opinions of others on that point, but yes, I think that is an appropriate procedure. Travis -- Travis Oliphant (on a mobile) 512-826-7480 On Feb 29, 2012, at 10:54 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Wed, Feb 29, 2012 at 1:46 AM, Travis Oliphant tra...@continuum.io wrote: We already use the NEP process for such decisions. This discussion came simply from the *idea* of writing such a NEP. Nothing has been decided. Only opinions have been shared that might influence the NEP. This is all pretty premature, though --- migration to C++ features on a trial branch is some months away were it to happen. Fernando can correct me if I'm wrong, but I think he was asking a governance question. That is: would you (as BDF$N) consider the following guideline: As a condition for accepting significant changes to Numpy, for each significant change, there will be a NEP. The NEP shall follow the same model as the Python PEPs - that is - there will be a summary of the changes, the issues arising, the for / against opinions and alternatives offered. There will usually be a draft implementation. 
The NEP will contain the resolution of the discussion as it relates to the code. For example, the masked array NEP, although very substantial, contains little discussion of the controversy arising, or the intended resolution of the controversy: https://github.com/numpy/numpy/blob/3f685a1a990f7b6e5149c80b52436fb4207e49f5/doc/neps/missing-data.rst I mean, although it is useful, it is not in the form of a PEP, as Fernando has described it. Would you accept extending the guidelines to the NEP format? Best, Matthew