Re: [Numpy-discussion] numpy gsoc ideas (was: numpy gsoc topic idea: configurable algorithm precision and vector math library integration)
On Wed, Mar 5, 2014 at 9:11 PM, Nathaniel Smith n...@pobox.com wrote:

On Mon, Mar 3, 2014 at 7:20 PM, Julian Taylor jtaylor.deb...@googlemail.com wrote: hi, as the numpy gsoc topic page is a little short on options I was thinking about adding two topics for interested students. But as I have no experience with gsoc or mentoring and the ideas are not very fleshed out yet I'd like to ask if it might make sense at all:

1. configurable algorithm precision [...] with np.precmode(default=fast): np.abs(complex_array) or fast everything except sum and hypot with np.precmode(default=fast, sum=kahan, hypot=standard): np.sum(d) [...]

Not a big fan of this one -- it seems like the bulk of the effort would be in figuring out a non-horrible API for exposing these things and getting consensus around it, which is not a good fit for the SoC structure. I'm pretty nervous about the datetime proposal that's currently on the wiki, for similar reasons -- I'm not sure it's actually doable in the SoC context.

2. vector math library integration

This is a great suggestion -- clear scope, clear benefit.

Two more ideas:

3. Using Cython in the numpy core

The numpy core contains tons of complicated C code implementing elaborate operations like indexing, casting, ufunc dispatch, etc. It would be really nice if we could use Cython to write some of these things. However, there is a practical problem: Cython assumes that each .pyx file generates a single compiled module with its own Cython-defined API. Numpy, however, contains a large number of .c files which are all compiled together into a single module, with its own home-brewed system for defining the public API. And we can't rewrite the whole thing. So for this to be viable, we would need some way to compile a bunch of .c *and .pyx* files together into a single module, and allow the .c and .pyx files to call each other.
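The `np.precmode` API quoted in item 1 is only a proposal; nothing like it exists in NumPy. As a rough illustration of how such a mode switch could be prototyped, here is a minimal thread-local context manager in pure Python -- all names here (`precmode`, `mode_for`, the mode strings) are assumptions taken from the proposal, not an existing interface:

```python
import contextlib
import threading

# Hypothetical sketch of the proposed precision-mode API; none of this
# exists in NumPy. Modes are stored per-thread so nested or threaded use
# doesn't leak state between callers.
_state = threading.local()

@contextlib.contextmanager
def precmode(default="standard", **per_func):
    """Temporarily set a default precision mode, with per-function overrides."""
    prev = getattr(_state, "modes", None)
    _state.modes = dict(per_func, default=default)
    try:
        yield
    finally:
        _state.modes = prev

def mode_for(func_name):
    """The mode a ufunc implementation would consult before dispatching."""
    modes = getattr(_state, "modes", None) or {"default": "standard"}
    return modes.get(func_name, modes["default"])
```

With this, `with precmode(default=fast, sum=kahan): ...` as written in the proposal would make a `sum` loop pick its Kahan implementation while everything else uses the fast path; the real design work is deciding how the ufunc machinery consults this state.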
This might involve changes to Cython, some sort of clever post-processing or glue code to get existing Cython-generated source code to play nicely with the rest of numpy, or something else. So this project would have the following goals, depending on how practical this turns out to be: (1) produce a hacky proof-of-concept system for doing the above, (2) turn the hacky proof-of-concept into something actually viable for use in real life (possibly this would require getting changes upstream into Cython, etc.), (3) use this system to actually port some interesting numpy code into Cython.

Having to synchronise two projects may be hard for a GSoC, no? Otherwise, I am a bit worried about Cython being used on the current C code as is, because the core and the Python C API are so intertwined (especially multiarray). Maybe one could use Cython on the non-core numpy parts that are still in C? It is not as sexy of a project, though.

4. Pythonic dtypes

The current dtype system is klugey. It basically defines its own class system, in parallel to Python's, and unsurprisingly, this new class system is not as good. In particular, it has limitations around the storage of instance-specific data which rule out a large variety of interesting user-defined dtypes, and cause us to need some truly nasty hacks to support the built-in dtypes we do have. And it makes defining a new dtype much more complicated than defining a new Python class.

This project would be to implement a new dtype system for numpy, in which np.dtype becomes a near-empty base class, different dtypes (e.g., float64, float32) are simply different subclasses of np.dtype, and dtype objects are simply instances of these classes. Further enhancements would be to make it possible to define new dtypes in pure Python by subclassing np.dtype and implementing special methods for the various dtype operations, and to make it possible for ufunc loops to see the dtype objects.
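To make the proposed design concrete, here is a purely illustrative sketch of what dtypes-as-ordinary-classes could look like. None of these classes exist in NumPy and a real implementation would live in the core; the point is that instance-specific data (here, a category list) becomes trivial to store once a dtype is just a Python instance:

```python
import struct

# Toy illustration of the proposed dtype-as-class design (hypothetical).
class DType:
    itemsize = None
    def pack(self, value):
        raise NotImplementedError

class Float64(DType):
    itemsize = 8
    def pack(self, value):
        return struct.pack("<d", value)  # little-endian IEEE double

class Categorical(DType):
    # Instance-specific data: hard in the current dtype system,
    # trivial when a dtype is just an instance of a Python class.
    def __init__(self, categories):
        self.categories = list(categories)
        self.itemsize = 1
    def pack(self, value):
        return bytes([self.categories.index(value)])
```

In this scheme `Float64()` and `Categorical(["a", "b"])` are both simply instances, and an `isinstance(d, DType)` check plays the role of today's np.dtype test.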
This project would provide the key enabling piece for a wide variety of interesting new features: missing value support, better handling of strings and categorical data, unit handling, automatic differentiation, and probably a bunch more I'm forgetting right now. If we get someone who's up to handling the dtype thing then I can mentor or co-mentor.

What do y'all think? (I don't think I have access to update that wiki page -- or maybe I'm just not clever enough to figure out how -- so it would be helpful if someone who can, could?)

-- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] numpy gsoc topic idea: configurable algorithm precision and vector math library integration
On 03.03.2014 at 20:20, Julian Taylor jtaylor.deb...@googlemail.com wrote:

hi, as the numpy gsoc topic page is a little short on options I was thinking about adding two topics for interested students. But as I have no experience with gsoc or mentoring and the ideas are not very fleshed out yet I'd like to ask if it might make sense at all:

2. vector math library integration

some operations like powers, sin, cos etc. are relatively slow in numpy depending on the C library used. There are now a few free libraries available that make use of modern hardware to speed these operations up, e.g. sleef and yeppp (also MKL, but I have no interest in supporting non-free software). It might be interesting to investigate if these libraries can be integrated with numpy. This also somewhat ties in with the configurable precision mode, as the vector math libraries often have different options depending on precision and speed requirements.

I have been exhuming an old package I once wrote that wraps the vectorized math functions from Intel's MKL for use with numpy. I made it available at https://github.com/geggo/uvml . I don't have access to MKL anymore, so I have no idea whether this package still works with recent numpy. If still useful, adapting it to work with other libraries should not be difficult, since they all provide a similar API. For serious work other packages like numexpr, numba or theano are much better. Nevertheless some might want to pick up this approach.

Gregor ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] numpy apply_along_axis named arguments
Hi All, I am working with the *apply_along_axis* method and I would like to apply a function that requires named arguments (scipy.stats.mstats.mquantiles with prob[]). But currently, this is not possible with *apply_along_axis*. I wonder if it would make sense to add the possibility to pass named arguments. I am also aware that it could be implemented in other ways (a loop over each row). That's why I would like to ask whether or not it makes sense to request this. In any case, I managed to modify the code, which is simple: http://pastebin.com/pBn0TbgK ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
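Until such a keyword-passing feature exists, the usual workaround is to bind the named arguments before handing the function to `apply_along_axis`, with `functools.partial` or a lambda. A small self-contained example (using `np.percentile` in place of `scipy.stats.mstats.mquantiles` so it stays NumPy-only):

```python
import numpy as np
from functools import partial

a = np.arange(12.0).reshape(3, 4)

# Bind the named argument q=[25, 50, 75] before apply_along_axis sees it.
quartiles = np.apply_along_axis(partial(np.percentile, q=[25, 50, 75]), 1, a)

# Equivalent with a lambda:
quartiles2 = np.apply_along_axis(lambda row: np.percentile(row, q=[25, 50, 75]), 1, a)
```

Each row of the result holds the three requested percentiles of the corresponding row of `a`; `partial` reads better when several keyword arguments are bound at once.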
Re: [Numpy-discussion] numpy apply_along_axis named arguments
Please find below the patch file for numpy 1.8.0: http://pastebin.com/D33fFpjH

On 06/03/14 12:17, Albert Jornet Puig wrote:

Hi All, I am working with the *apply_along_axis* method and I would like to apply a function that requires named arguments (scipy.stats.mstats.mquantiles with prob[]). But currently, this is not possible with *apply_along_axis*. I wonder if it would make sense to add the possibility to pass named arguments. I am also aware that it could be implemented in other ways (a loop over each row). That's why I would like to ask whether or not it makes sense to request this. In any case, I managed to modify the code, which is simple: http://pastebin.com/pBn0TbgK

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg)
On Wed, 2014-03-05 at 10:21 -0800, David Goldsmith wrote:

From: Sebastian Berg sebast...@sipsolutions.net Subject: [Numpy-discussion] Adding weights to cov and corrcoef

Hi all, in Pull Request https://github.com/numpy/numpy/pull/3864 Neol Dawe suggested adding new parameters to our `cov` and `corrcoef` functions to implement weights, which already exists for `average` (the PR still needs to be adapted).

Do you mean adopted?

What I meant was that the suggestion isn't actually implemented in the PR at this time. So you can't pull it in to try things out.

However, we may have missed something obvious, or maybe it is already getting too statistical for NumPy, or the keyword arguments might better be `uncertainties` and `frequencies`. So comments and insights are very welcome :).

+1 for it being too baroque for NumPy -- it should go in SciPy (if it isn't already there): IMHO, NumPy should be kept as lean and mean as possible; embellishments are what SciPy is for. (Again, IMO.)

Well, on the other hand, scipy does not actually have a `std` function of its own, I think. So if it is quite useful I think this may be an option (I don't think I ever used weights with std, so I can't argue strongly for inclusion myself). Unless adding new functions to `scipy.stats` (or just statsmodels) which implement different types of weights is the longer-term plan, then things might bite...

DG ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
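For reference, the `weights` support in `average` that the PR takes as its model is existing NumPy API (this shows the current `np.average` behaviour, not the proposed `cov`/`corrcoef` extension):

```python
import numpy as np

a = np.array([1.0, 2.0, 4.0])
w = np.array([3.0, 1.0, 1.0])

# Weighted mean: sum(w * a) / sum(w)
avg = np.average(a, weights=w)

# returned=True also hands back the sum of the weights,
# which is the quantity that plays the role of N.
avg2, wsum = np.average(a, weights=w, returned=True)
```

The open question in this thread is precisely what the analogous `N` should be for `cov` and `corrcoef` under each kind of weight.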
Re: [Numpy-discussion] numpy gsoc ideas (was: numpy gsoc topic idea: configurable algorithm precision and vector math library integration)
On Thu, Mar 6, 2014 at 5:17 AM, Sturla Molden sturla.mol...@gmail.com wrote:

Nathaniel Smith n...@pobox.com wrote: 3. Using Cython in the numpy core. The numpy core contains tons of complicated C code implementing elaborate operations like indexing, casting, ufunc dispatch, etc. It would be really nice if we could use Cython to write some of these things.

So the idea of having NumPy as a pure C library in the core is abandoned?

This question doesn't make sense to me so I think I must be missing some context. Nothing is abandoned: this is one email by one person on one mailing list suggesting a project to explore the feasibility of something. And anyway, Cython is just a C code generator, similar in principle to (though vastly more sophisticated than) the ones we already use. It's not like we've ever promised our users we'll keep stable which kind of code generators we use internally.

However, there is a practical problem: Cython assumes that each .pyx file generates a single compiled module with its own Cython-defined API. Numpy, however, contains a large number of .c files which are all compiled together into a single module, with its own home-brewed system for defining the public API. And we can't rewrite the whole thing. So for this to be viable, we would need some way to compile a bunch of .c *and .pyx* files together into a single module, and allow the .c and .pyx files to call each other.

Cython takes care of that already. http://docs.cython.org/src/userguide/sharing_declarations.html#cimport http://docs.cython.org/src/userguide/external_C_code.html#using-cython-declarations-from-c

Linking multiple .c and .pyx files together into a single .so/.dll is much more complicated than just using 'cimport'. Try it if you don't believe me :-).

-n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
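For the simple case -- one `.pyx` plus some plain `.c` helpers compiled into a single extension -- Cython's build integration does already handle mixed sources. A build-config sketch (the module and file names here are made up for illustration):

```python
# Hypothetical setup.py fragment: cythonize() translates the .pyx to C,
# and the extra .c files listed in `sources` are compiled and linked
# into the same extension module.
from setuptools import Extension, setup
from Cython.Build import cythonize

ext = Extension(
    "mymod",
    sources=["mymod.pyx", "helpers.c"],
)
setup(ext_modules=cythonize([ext]))
```

The hard case described in this thread is different: numpy's many `.c` files export their API through a shared function table consumed across the whole module, not as helpers linked into one Cython extension, and that is where several `.pyx` files per module would be needed.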
Re: [Numpy-discussion] numpy gsoc ideas (was: numpy gsoc topic idea: configurable algorithm precision and vector math library integration)
On Thu, Mar 6, 2014 at 9:11 AM, David Cournapeau courn...@gmail.com wrote:

On Wed, Mar 5, 2014 at 9:11 PM, Nathaniel Smith n...@pobox.com wrote: So this project would have the following goals, depending on how practical this turns out to be: (1) produce a hacky proof-of-concept system for doing the above, (2) turn the hacky proof-of-concept into something actually viable for use in real life (possibly this would require getting changes upstream into Cython, etc.), (3) use this system to actually port some interesting numpy code into Cython.

Having to synchronise two projects may be hard for a GSoC, no?

Yeah, if someone is interested in this it would be nice to get someone from Cython involved too. But that's why the primary goal is to produce a proof-of-concept -- even if all that comes out is that we learn that this cannot be done in an acceptable manner, then that's still a successful (albeit disappointing) result.

Otherwise, I am a bit worried about Cython being used on the current C code as is, because the core and the Python C API are so intertwined (especially multiarray).

I don't understand this objection. The whole advantage of Cython is that it makes it much, much easier to write code that involves intertwining complex algorithms and heavy use of the Python C API :-). There's tons of bug-prone spaghetti in numpy for doing boring things like refcounting, exception passing, and argument parsing.

-n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] 1.8.1rc1 on sourceforge.
On Wed, Mar 5, 2014 at 7:28 PM, Matthew Brett matthew.br...@gmail.com wrote:

Hi, On Wed, Mar 5, 2014 at 3:29 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, I built (and tested) some numpy wheels for the rc1: http://nipy.bic.berkeley.edu/numpy-dist/ Now building, installing, testing, uploading wheels nightly on OSX 10.9: http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7 http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3 and downloading, testing built wheels on OSX 10.6: http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7-downloaded http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3-downloaded

Chuck - are you release manager for this cycle? Would you mind sending me your public ssh key so I can give you access to the buildbots for custom builds and so on? Cheers,

Julian has done most of the work for 1.8.1. I did the 1.8.0 release because it needed doing, but building releases isn't my strong point and Ralf actually did the builds for that. So I'll happily send you my ssh key, but either Ralf or Julian might be a better bet for getting the work done :)

Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] 1.8.1rc1 on sourceforge.
On Thu, Mar 6, 2014 at 10:35 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Mar 5, 2014 at 7:28 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Wed, Mar 5, 2014 at 3:29 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, I built (and tested) some numpy wheels for the rc1: http://nipy.bic.berkeley.edu/numpy-dist/ Now building, installing, testing, uploading wheels nightly on OSX 10.9: http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7 http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3 and downloading, testing built wheels on OSX 10.6: http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7-downloaded http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3-downloaded Chuck - are you release manager for this cycle? Would you mind sending me your public ssh key so I can give you access to the buildbots for custom builds and so on? Cheers, Julian has done most of the work for 1.8.1. I did the 1.8.0 release because it needed doing, but building releases isn't my strong point and Ralf actually did the builds for that. So I'll happily send you my ssh key, but either Ralf or Julian might be a better bet for getting the work done :)

Or, I might add, yourself, if you are interested in taking over that role.

Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] 1.8.1 release
Hi, Should [1] be considered a release blocker for 1.8.1? Skipper [1] https://github.com/numpy/numpy/issues/4442 ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] 1.8.1 release
On 06.03.2014 19:46, Skipper Seabold wrote: Hi, Should [1] be considered a release blocker for 1.8.1? Skipper [1] https://github.com/numpy/numpy/issues/4442

as far as I can tell it's a regression from the 1.8.0 release, not the 1.8.1 release, so I wouldn't consider it a blocker. But it's definitely a very nice-to-have. Unfortunately it is probably also complicated and invasive to fix, as it would need either modifications to nditer or gufuncs (or a revert to non-gufunc), which are both quite complicated pieces of code. ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] 1.8.1rc1 on sourceforge.
thanks for the report, this should be fixed with https://github.com/numpy/numpy/pull/4455 which will be in the final 1.8.1.

On 04.03.2014 13:49, Thomas Unterthiner wrote:

Hi there! I just tried setting up a new installation using numpy 1.8.1rc1 (+ scipy 0.13.3 and matplotlib 1.3.1). I ran into problems when installing matplotlib 1.3.1. The attached logfile shows the full log, but it ends with:

src/_png.cpp:329:15: error: ‘npy_PyFile_Dup’ was not declared in this scope if ((fp = npy_PyFile_Dup(py_file, rb))) ^ src/_png.cpp:577:13: error: ‘npy_PyFile_DupClose’ was not declared in this scope if (npy_PyFile_DupClose(py_file, fp)) { ^ error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

The problem went away (and matplotlib installed cleanly) when I re-did the whole shebang using numpy 1.8.0, so I suspect this was caused by something in the rc. Cheers Thomas

On 2014-03-03 17:23, Charles R Harris wrote: Hi All, Julian Taylor has put windows binaries and sources for the 1.8.1 release candidate up on sourceforge http://sourceforge.net/projects/numpy/files/NumPy/1.8.1rc1/. If things go well, it will be taken to a full release in a week or so. Chuck

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] 1.8.1rc1 on sourceforge.
Hi, On Thu, Mar 6, 2014 at 9:37 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Thu, Mar 6, 2014 at 10:35 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Mar 5, 2014 at 7:28 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Wed, Mar 5, 2014 at 3:29 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, I built (and tested) some numpy wheels for the rc1: http://nipy.bic.berkeley.edu/numpy-dist/ Now building, installing, testing, uploading wheels nightly on OSX 10.9: http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7 http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3 and downloading, testing built wheels on OSX 10.6: http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7-downloaded http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3-downloaded Chuck - are you release manager for this cycle? Would you mind sending me your public ssh key so I can give you access to the buildbots for custom builds and so on? Cheers, Julian has done most of the work for 1.8.1. I did the 1.8.0 release because it needed doing, but building releases isn't my strong point and Ralf actually did the builds for that. So I'll happily send you my ssh, but either Ralph or Julian might be a better bet for getting the work done :) Or, I might add, yourself, if you are interested in taking over that role. I don't know the code well enough to be the release manager, but I'm very happy to do the OSX binary builds. So - release manager VP of OSX maybe? Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] 1.8.1rc1 on sourceforge.
On Thu, Mar 6, 2014 at 12:05 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Thu, Mar 6, 2014 at 9:37 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Thu, Mar 6, 2014 at 10:35 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Mar 5, 2014 at 7:28 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Wed, Mar 5, 2014 at 3:29 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, I built (and tested) some numpy wheels for the rc1: http://nipy.bic.berkeley.edu/numpy-dist/ Now building, installing, testing, uploading wheels nightly on OSX 10.9: http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7 http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3 and downloading, testing built wheels on OSX 10.6: http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7-downloaded http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3-downloaded Chuck - are you release manager for this cycle? Would you mind sending me your public ssh key so I can give you access to the buildbots for custom builds and so on? Cheers, Julian has done most of the work for 1.8.1. I did the 1.8.0 release because it needed doing, but building releases isn't my strong point and Ralf actually did the builds for that. So I'll happily send you my ssh key, but either Ralf or Julian might be a better bet for getting the work done :) Or, I might add, yourself, if you are interested in taking over that role. I don't know the code well enough to be the release manager, but I'm very happy to do the OSX binary builds. So - release manager VP of OSX maybe?

That would be helpful. Ralf does those now and I suspect he would welcome the extra hands. The two sites for release builds are Sourceforge and Pypi. I don't know if the wheel builds are good enough/accepted on Pypi, but if you would like permissions on Sourceforge we can extend them to you.
We have been trying to do releases for OSX 10.5, which needs a machine running an obsolete OS, but perhaps we should consider dropping that in the future. Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Adding weights to cov and corrcoef
is interested in this it would be nice to get someone from Cython involved too. But that's why the primary goal is to produce a proof-of-concept -- even if all that comes out is that we learn that this cannot be done in an acceptable manner, then that's still a successful (albeit disappointing) result.

Otherwise, I am a bit worried about Cython being used on the current C code as is, because the core and the Python C API are so intertwined (especially multiarray).

I don't understand this objection. The whole advantage of Cython is that it makes it much, much easier to write code that involves intertwining complex algorithms and heavy use of the Python C API :-). There's tons of bug-prone spaghetti in numpy for doing boring things like refcounting, exception passing, and argument parsing.

-n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] 1.8.1rc1 on sourceforge.
Hi, On Thu, Mar 6, 2014 at 11:21 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Thu, Mar 6, 2014 at 12:05 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Thu, Mar 6, 2014 at 9:37 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Thu, Mar 6, 2014 at 10:35 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Wed, Mar 5, 2014 at 7:28 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Wed, Mar 5, 2014 at 3:29 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, I built (and tested) some numpy wheels for the rc1: http://nipy.bic.berkeley.edu/numpy-dist/ Now building, installing, testing, uploading wheels nightly on OSX 10.9: http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7 http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3 and downloading, testing built wheels on OSX 10.6: http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7-downloaded http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3-downloaded Chuck - are you release manager for this cycle? Would you mind sending me your public ssh key so I can give you access to the buildbots for custom builds and so on? Cheers, Julian has done most of the work for 1.8.1. I did the 1.8.0 release because it needed doing, but building releases isn't my strong point and Ralf actually did the builds for that. So I'll happily send you my ssh, but either Ralph or Julian might be a better bet for getting the work done :) Or, I might add, yourself, if you are interested in taking over that role. I don't know the code well enough to be the release manager, but I'm very happy to do the OSX binary builds. So - release manager VP of OSX maybe? That would be helpful. Ralf does those now and I suspect he would welcome the extra hands. The two sites for release builds are Sourceforge and Pypi. I don't know if the wheels builds are good enough/accepted on Pypi, but if you would like permissions on Sourceforge we can extend them to you. 
We have been trying to do releases for OSX 10.5, which needs a machine running an obsolete OS, but perhaps we should consider dropping that in the future.

Ralf - any thoughts?

pypi is accepting wheels: http://pythonwheels.com/ https://pypi.python.org/pypi/pyzmq/14.0.1 Chris B - any comments here?

As for the numpy wheels specifically - I believe the ones I posted are correct - but I would very much like to get feedback. And - yes please for access to the sourceforge site so I can upload the wheels for testing.

I'd recommend dropping 10.5 compatibility and going for 10.6. Apple hasn't updated 10.5 since 2009. For example, Firefox dropped support for it in 2012. I do have a couple of machines running 10.5 if you need it though.

Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Adding weights to cov and corrcoef
On Wed, Mar 5, 2014 at 4:45 PM, Sebastian Berg sebast...@sipsolutions.net wrote:

Hi all, in Pull Request https://github.com/numpy/numpy/pull/3864 Neol Dawe suggested adding new parameters to our `cov` and `corrcoef` functions to implement weights, which already exists for `average` (the PR still needs to be adapted). The idea right now would be to add `weights` and `frequencies` keyword arguments to these functions. In more detail: the situation is a bit more complex for `cov` and `corrcoef` than for `average`, because there are different types of weights. The current plan would be to add two new keyword arguments:

* weights: uncertainty weights, which cause `N` to be recalculated accordingly (this is R's `cov.wt` default, I believe).
* frequencies: when given, `N = sum(frequencies)` and the values are weighted by their frequency.

I don't understand this description at all. One of them recalculates N, and the other sets N according to some calculation? Is there a standard reference on how these are supposed to be interpreted? When you talk about per-value uncertainties, I start imagining that we're trying to estimate a population covariance given a set of samples each corrupted by independent measurement noise, and then there's some natural hierarchical Bayesian model one could write down and get an ML estimate of the latent covariance via empirical Bayes or something. But this requires a bunch of assumptions, and is that really what we want to do? (Or maybe it collapses down into something simpler if the measurement noise is gaussian or something?)

-n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Adding weights to cov and corrcoef
On Thu, Mar 6, 2014 at 2:51 PM, Nathaniel Smith n...@pobox.com wrote:

On Wed, Mar 5, 2014 at 4:45 PM, Sebastian Berg sebast...@sipsolutions.net wrote: Hi all, in Pull Request https://github.com/numpy/numpy/pull/3864 Neol Dawe suggested adding new parameters to our `cov` and `corrcoef` functions to implement weights, which already exists for `average` (the PR still needs to be adapted). The idea right now would be to add `weights` and `frequencies` keyword arguments to these functions. In more detail: the situation is a bit more complex for `cov` and `corrcoef` than for `average`, because there are different types of weights. The current plan would be to add two new keyword arguments: * weights: uncertainty weights, which cause `N` to be recalculated accordingly (this is R's `cov.wt` default, I believe). * frequencies: when given, `N = sum(frequencies)` and the values are weighted by their frequency.

I don't understand this description at all. One of them recalculates N, and the other sets N according to some calculation? Is there a standard reference on how these are supposed to be interpreted? When you talk about per-value uncertainties, I start imagining that we're trying to estimate a population covariance given a set of samples each corrupted by independent measurement noise, and then there's some natural hierarchical Bayesian model one could write down and get an ML estimate of the latent covariance via empirical Bayes or something. But this requires a bunch of assumptions, and is that really what we want to do? (Or maybe it collapses down into something simpler if the measurement noise is gaussian or something?)

I think the idea is that if you write formulas involving correlation or covariance using matrix notation, then these formulas can be generalized in several different ways by inserting some non-negative or positive diagonal matrices into the formulas in various places. The diagonal entries could be called 'weights'.
If they are further restricted to sum to 1 then they could be called 'frequencies'. Or maybe this is too cynical and the jargon has a more standard meaning in this context. ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
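Nathaniel's "insert a diagonal matrix into the formulas" reading can be made concrete in a few lines. This is only a sketch with made-up data (the array names are ours, and the normalization matches the "restricted to sum to 1" frequencies case above); it is not the API being proposed:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(2, 6))        # 2 variables, 6 observations
w = np.array([1.0, 2.0, 1.0, 3.0, 1.0, 2.0])

# Normalize the weights to sum to 1 ("frequencies" in the sense above)
f = w / w.sum()

# Insert diag(f) into the usual covariance formula C = D D^T / n:
m = X @ f                          # weighted mean of each variable
D = X - m[:, None]                 # deviations from the weighted mean
C = D @ np.diag(f) @ D.T           # biased weighted covariance matrix
```

Changing where the diagonal matrix is inserted, and whether it is normalized, gives the different "types of weights" the thread is debating.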
Re: [Numpy-discussion] 1.8.1rc1 on sourceforge.
On Thu, Mar 6, 2014 at 11:21 AM, Charles R Harris charlesr.har...@gmail.com wrote: That would be helpful. Ralf does those now and I suspect he would welcome the extra hands. The two sites for release builds are Sourceforge and Pypi. I don't know if the wheels builds are good enough/accepted on Pypi, Would anyone decide that other than this group? but if you would like permissions on Sourceforge we can extend them to you. We have been trying to do releases for OSX 10.5, which needs a machine running an obsolete OS, but perhaps we should consider dropping that in the future. Drop that baby! First, it's a bit odd -- as I understand it, the python.org builds support either 10.3.9+ or 10.6+. As 10.5 has not been supported by Apple for a couple years, and 10.6 is getting pretty darn long in the tooth, the only reason to support that older build is for PPC support - I wonder how many folks are still running PPCs? I thought I was one of the holdouts, and I dropped it over a year ago. I'd love to know if it is something that the community still needs to support. And thanks for doing this Matthew! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
Re: [Numpy-discussion] 1.8.1rc1 on sourceforge.
On Thu, Mar 6, 2014 at 11:38 AM, Matthew Brett matthew.br...@gmail.com wrote: pypi is accepting wheels: http://pythonwheels.com/ https://pypi.python.org/pypi/pyzmq/14.0.1 Chris B - any comments here? It's my understanding that pypi accepts wheels built for the python.org releases -- and pip should be able to get the right ones in that case. As far as I know, it's up to the project managers to decide what to put up there. Also, I _think_ that macports, homebrew, and hopefully the Apple builds, won't match to the python.org names, so people won't accidentally get a mis-matched binary wheel. As for the numpy wheels specifically - I believe the ones I posted are correct - but I would very much like to get feedback. I, for one, will try to test on a couple machines. -Chris
Re: [Numpy-discussion] numpy gsoc ideas (was: numpy gsoc topic idea: configurable algorithm precision and vector math library integration)
On Wed, Mar 5, 2014 at 9:17 PM, Sturla Molden sturla.mol...@gmail.com wrote: we could use Cython to write some of these things. So the idea of having NumPy as a pure C library in the core is abandoned? And at some point, there was the idea of a numpy_core library that could be used entirely independently of cPython. I think Enthought did some work on this for MS, to create a .NET numpy, maybe? I do still like that idea. But there could be a core numpy, and an "other stuff that is cPython-specific" layer that Cython would be great for. -Chris
Re: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg)
On Thu, Mar 6, 2014 at 1:40 PM, Sebastian Berg sebast...@sipsolutions.net wrote: On Mi, 2014-03-05 at 10:21 -0800, David Goldsmith wrote: Date: Wed, 05 Mar 2014 17:45:47 +0100 From: Sebastian Berg sebast...@sipsolutions.net Subject: [Numpy-discussion] Adding weights to cov and corrcoef To: numpy-discussion@scipy.org Message-ID: 1394037947.21356.20.camel@sebastian-t440 Content-Type: text/plain; charset=UTF-8 Hi all, in Pull Request https://github.com/numpy/numpy/pull/3864 Neol Dawe suggested adding new parameters to our `cov` and `corrcoef` functions to implement weights, which already exists for `average` (the PR still needs to be adapted). Do you mean adopted? What I meant was that the suggestion isn't actually implemented in the PR at this time. So you can't pull it in to try things out. However, we may have missed something obvious, or maybe it is already getting too statistical for NumPy, or the keyword arguments might be better as `uncertainties` and `frequencies`. So comments and insights are very welcome :). +1 for it being too baroque for NumPy--should go in SciPy (if it isn't already there): IMHO, NumPy should be kept as lean and mean as possible, embellishments are what SciPy is for. (Again, IMO.) Well, on the other hand, scipy does not actually have a `std` function of its own, I think. So if it is quite useful I think this may be an option (I don't think I ever used weights with std, so I can't argue strongly for inclusion myself). Unless adding new functions to `scipy.stats` (or just statsmodels) which implement different types of weights is the longer-term plan, then things might bite... AFAIK there's currently no such plan. Ralf
Re: [Numpy-discussion] 1.8.1rc1 on sourceforge.
On Thu, Mar 6, 2014 at 1:32 PM, Chris Barker chris.bar...@noaa.gov wrote: On Thu, Mar 6, 2014 at 11:21 AM, Charles R Harris charlesr.har...@gmail.com wrote: That would be helpful. Ralf does those now and I suspect he would welcome the extra hands. The two sites for release builds are Sourceforge and Pypi. I don't know if the wheels builds are good enough/accepted on Pypi, Would anyone decide that other than this group? but if you would like permissions on Sourceforge we can extend them to you. We have been trying to do releases for OSX 10.5, which needs a machine running an obsolete OS, but perhaps we should consider dropping that in the future. Drop that baby! First, it's a bit odd -- as I understand it, the python.org builds support either 10.3.9+ or 10.6+. As 10.5 has not been supported by Apple for a couple years, and 10.6 is getting pretty darn long in the tooth, the only reason to support that older build is for PPC support - I wonder how many folks are still running PPCs? I thought I was one of the holdouts, and I dropped it over a year ago. I'd love to know if it is something that the community still needs to support. Now that I look on sourceforge, I don't see any OS X 10.5 builds, they are all 10.6+. So that bit of support seems to have dropped in reality, if not officially. Chuck
Re: [Numpy-discussion] 1.8.1rc1 on sourceforge.
Hi, On Thu, Mar 6, 2014 at 12:36 PM, Chris Barker chris.bar...@noaa.gov wrote: On Thu, Mar 6, 2014 at 11:38 AM, Matthew Brett matthew.br...@gmail.com wrote: pypi is accepting wheels: http://pythonwheels.com/ https://pypi.python.org/pypi/pyzmq/14.0.1 Chris B - any comments here? It's my understanding that pypi accepts wheels built for the python.org releases -- and pip should be able to get the right ones in that case. As far as I know, it's up to the project managers to decide what to put up there. Also, I _think_ that macports, homebrew, and hopefully the Apple builds, won't match to the python.org names, so people won't accidentally get a mis-matched binary wheel. I believe that the wheels built against python.org python will in any case work with system python. I've just tested the wheel I built [1] on a 10.7 machine in a system python virtualenv - all tests pass. In any case, unless we do something extra, the built wheel won't install into system python by default, because the wheel name can't match the name system python expects. Here I tested the situation I'd expect when the wheel is on pypi, by downloading the wheel to the current directory of a 10.7 machine and: pip install --pre --find-links . numpy pip doesn't accept the wheel and starts a source install. This is because of the platform tag [2]. System python expects a platform tag that matches the result of `distutils.util.get_platform()`. The python.org builds always have `10_6_intel` for this. On a 10.7 machine:

$ /Library/Frameworks/Python.framework/Versions/2.7/bin/python -c "import distutils.util; print(distutils.util.get_platform())"
macosx-10.6-intel
$ /Library/Frameworks/Python.framework/Versions/3.3/bin/python3 -c "import distutils.util; print(distutils.util.get_platform())"
macosx-10.6-intel

System python has the actual OSX version.
On 10.7 again:

$ /usr/bin/python -c "import distutils.util; print(distutils.util.get_platform())"
macosx-10.7-intel

On 10.9:

$ /usr/bin/python -c "import distutils.util; print(distutils.util.get_platform())"
macosx-10.9-intel

In fact, if I rename my wheel from `numpy-1.8.1rc1-cp27-none-macosx_10_6_intel.whl` to `numpy-1.8.1rc1-cp27-none-macosx_10_6_intel.macosx_10_7_intel.whl`, system python will pick up this wheel, but obviously this could get boring for lots of OSX versions, and in any case, it's not really our target market for the wheels. [4] Min RK actually has a pull request in to relax this OSX version specificity [3], because the wheels should be (and seem to be) interoperable, but the take-home is that we're not likely to run into trouble with system python. Cheers, Matthew [1] http://nipy.bic.berkeley.edu/numpy-dist/numpy-1.8.1rc1-cp27-none-macosx_10_6_intel.whl [2] http://legacy.python.org/dev/peps/pep-0425/ [3] https://github.com/pypa/pip/pull/1465 [4] http://legacy.python.org/dev/peps/pep-0425/#compressed-tag-sets
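For what it's worth, the relation between the `distutils` platform string and the tag in the wheel filename is just PEP 425's normalization: dashes and dots become underscores. A minimal sketch, with a helper name of our own (this is not pip's API):

```python
import distutils.util

def platform_tag(plat):
    """Convert a distutils platform string (e.g. 'macosx-10.6-intel')
    to a PEP 425 wheel platform tag (e.g. 'macosx_10_6_intel')."""
    return plat.replace("-", "_").replace(".", "_")

# The tag pip expects on a given machine comes from get_platform():
print(platform_tag(distutils.util.get_platform()))
print(platform_tag("macosx-10.7-intel"))  # macosx_10_7_intel
```

pip rejects a wheel whose platform tag doesn't match this value, which is exactly why the `macosx_10_6_intel` wheel is refused by a system python reporting `macosx-10.7-intel`.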
Re: [Numpy-discussion] 1.8.1rc1 on sourceforge.
On Thu, Mar 6, 2014 at 2:10 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Thu, Mar 6, 2014 at 1:32 PM, Chris Barker chris.bar...@noaa.gov wrote: [...] I'd love to know if it is something that the community still needs to support. Now that I look on sourceforge, I don't see any OS X 10.5 builds, they are all 10.6+. So that bit of support seems to have dropped in reality, if not officially. The last release to support earlier than that was 1.7.1, which supported 10.3 and that has 643 downloads total. Chuck
Re: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg)
On Thu, Mar 6, 2014 at 3:49 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: On Thu, Mar 6, 2014 at 1:40 PM, Sebastian Berg sebast...@sipsolutions.net wrote: [...] Unless adding new functions to `scipy.stats` (or just statsmodels) which implement different types of weights is the longer term plan, then things might bite... AFAIK there's currently no such plan. Since numpy has taken over all the basic statistics, var, std, cov, corrcoef, and scipy.stats dropped those, I don't see any reason to resurrect them. The only question IMO is which ddof for weighted std, ...
statsmodels has the basic statistics with frequency weights, but they are largely in support of t-test and similar hypothesis tests. Josef Ralf
Re: [Numpy-discussion] 1.8.1rc1 on sourceforge.
On Thu, Mar 6, 2014 at 1:14 PM, Matthew Brett matthew.br...@gmail.com wrote: I believe that the wheels built against python.org python will in any case work with system python. IIUC, the system python is built against an up-to-date SDK, so it wouldn't run on an older OS version -- and why would anyone want it to -- it comes with the system. Our wheels are built with the 10.6 SDK -- OS-X is supposed to be backward compatible, so 10.6 SDK code will run on newer OS versions -- but could there be any clashes if a shared lib is linked in that uses a different SDK than the host application? I have no idea if this is expected to be robust -- though the linker doesn't give an error -- so maybe. I've just tested the wheel I built [1] on a 10.7 machine in a system python virtualenv - all tests pass. Good start. In any case, unless we do something extra, the built wheel won't install into system python by default, because the wheel name can't match the name system python expects. Here I tested the situation I'd expect when the wheel is on pypi, by downloading the wheel to the current directory of a 10.7 machine and: pip install --pre --find-links . numpy pip doesn't accept the wheel and starts a source install. This is because of the platform tag [2]. System python expects a platform tag that matches the result of `distutils.util.get_platform()`. The python.org builds always have `10_6_intel` for this. On a 10.7 machine: System python has the actual OSX version. On 10.7 again:

$ /usr/bin/python -c "import distutils.util; print(distutils.util.get_platform())"
macosx-10.7-intel

Interesting -- the intel part of that means it SHOULD be a universal binary with 32 and 64 bit in there. And on my 10.7 system, it is. So maybe this should work.
In fact, if I rename my wheel from `numpy-1.8.1rc1-cp27-none-macosx_10_6_intel.whl` to `numpy-1.8.1rc1-cp27-none-macosx_10_6_intel.macosx_10_7_intel.whl`, system python will pick up this wheel, but obviously this could get boring for lots of OSX versions, and in any case, it's not really our target market for the wheels. [4] Exactly, though if we can support it easily, maybe good to do. Min RK actually has a pull request in to relax this OSX version specificity [3] because the wheels should be (and seem to be) interoperable, but the take-home is that we're not likely to run into trouble with system python. If we relax things too much, will we also get homebrew and macports and built-it-myself pythons, and will they work? -Chris
Re: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg)
josef.p...@gmail.com wrote: The only question IMO is which ddof for weighted std, ... Something like this? sum_weights - (ddof/float(n))*sum_weights Sturla
Re: [Numpy-discussion] Adding weights to cov and corrcoef
On Do, 2014-03-06 at 19:51 +0000, Nathaniel Smith wrote: On Wed, Mar 5, 2014 at 4:45 PM, Sebastian Berg sebast...@sipsolutions.net wrote: [...] Is there a standard reference on how these are supposed to be interpreted? [...] (Or maybe it collapses down into something simpler if the measurement noise is gaussian or something?) I had really hoped someone who knows this stuff very well would show up ;). I think these weights were uncertainties under a gaussian assumption, and the other types of weights are different; see `aweights` here: http://www.stata.com/support/faqs/statistics/weights-and-summary-statistics/, but I did not check a statistics book or have one here right now (e.g.
wikipedia is less than helpful). Frankly, unless there is some obviously right thing (for a statistician), I would be careful adding such new features. And while I thought before that this might be the case, it isn't clear to me. - Sebastian -n
Re: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg)
On Do, 2014-03-06 at 16:30 -0500, josef.p...@gmail.com wrote: On Thu, Mar 6, 2014 at 3:49 PM, Ralf Gommers ralf.gomm...@gmail.com wrote: [...] AFAIK there's currently no such plan. since numpy has taken over all the basic statistics, var, std, cov, corrcoef, and scipy.stats dropped those, I don't see any reason to resurrect them.
The only question IMO is which ddof for weighted std, ... I am right now a bit unsure about whether or not the weights would be aweights or different... R seems not to care about the scale of the weights, which seems a bit odd to me for an unbiased estimator? I always assumed that we can do the statistics behind using the ddof... But even if we can figure out the right way, what I am doubting a bit is that if we add weights, their names should be clear enough to not clash with possibly different kinds of (interesting) weights in other functions. statsmodels has the basic statistics with frequency weights, but they are largely in support of t-test and similar hypothesis tests. Josef Ralf
Re: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg)
Sebastian Berg sebast...@sipsolutions.net wrote: I am right now a bit unsure about whether or not the weights would be aweights or different... R seems to not care about the scale of the weights which seems a bit odd to me for an unbiased estimator? I always assumed that we can do the statistics behind using the ddof... But even if we can figure out the right way, what I am doubting a bit is that if we add weights, their names should be clear enough to not clash with possibly different kind of (interesting) weights in other functions. http://en.wikipedia.org/wiki/Weighted_arithmetic_mean#Weighted_sample_covariance
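For reference, the unbiased "reliability weights" estimator on that Wikipedia page can be sketched as follows (made-up data; the variable names are ours). Note that the result is unchanged if the weights are rescaled by a constant, which may be why R appears not to care about their scale:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(3, 8))            # 3 variables, 8 observations
w = rng.uniform(0.5, 2.0, size=8)      # reliability-type ("a") weights

v1, v2 = w.sum(), (w ** 2).sum()
m = (x * w).sum(axis=1) / v1           # weighted mean of each variable
d = x - m[:, None]
# Unbiased weighted sample covariance (Wikipedia's reliability-weights form):
cov = (d * w) @ d.T / (v1 - v2 / v1)
```

Multiplying `w` by a constant c scales the numerator by c and the denominator by c as well (c*v1 - c^2*v2/(c*v1) = c*(v1 - v2/v1)), so the estimate is invariant to the overall scale of the weights.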
Re: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg)
Sturla Molden sturla.mol...@gmail.com wrote: josef.p...@gmail.com wrote: The only question IMO is which ddof for weighted std, ... Something like this? sum_weights - (ddof/float(n))*sum_weights Please ignore.
Re: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg)
On Thu, Mar 6, 2014 at 8:38 PM, Sturla Molden sturla.mol...@gmail.com wrote: Sebastian Berg sebast...@sipsolutions.net wrote: [...] http://en.wikipedia.org/wiki/Weighted_arithmetic_mean#Weighted_sample_covariance Just as additional motivation (I'm not into the definition of weights right now :) I was just reading a chapter on robust covariance estimation, and one of the steps in many of the procedures requires weighted covariances and weighted variances. Weights are just to reduce the influence of outlying observations. Josef
Re: [Numpy-discussion] Adding weights to cov and corrcoef
On Thu, Mar 6, 2014 at 2:51 PM, Nathaniel Smith n...@pobox.com wrote: On Wed, Mar 5, 2014 at 4:45 PM, Sebastian Berg sebast...@sipsolutions.net wrote: [...] I don't understand this description at all. One of them recalculates N, and the other sets N according to some calculation? Is there a standard reference on how these are supposed to be interpreted? [...] In general, going mostly based on Stata: frequency weights are just a shortcut if you have repeated observations. In my unit tests, the results are the same as using np.repeat IIRC. The total number of observations is the sum of weights. aweights and pweights are mainly like weights in WLS, reflecting the uncertainty of each observation.
The number of observations is equal to the number of rows. (Stata internally rescales the weights.) One explanation is that observations are measured with different noise, another that observations represent the mean of subsamples with different numbers of observations. There is an additional degrees-of-freedom correction in one of the proposed calculations, modeled after other packages, that I never figured out. (Aside: statsmodels does not normalize the scale in WLS, in contrast to Stata, and it is now equivalent to GLS with diagonal sigma. The meaning of weight=1 depends on the user. nobs is the number of rows.) No Bayesian analysis involved, but I guess someone could come up with a Bayesian interpretation. I think the two proposed weight types, weights and frequencies, should be able to handle almost all cases. Josef -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
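Josef's point that frequency weights are just shorthand for repeated observations can be checked directly. A sketch with made-up data and our own variable names (integer frequencies, `N = sum(f)`):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=(2, 5))          # 2 variables, 5 distinct observations
f = np.array([1, 3, 2, 1, 2])        # integer frequency weights

# Frequency-weighted covariance: N = sum(f), denominator N - 1
m = (x * f).sum(axis=1) / f.sum()    # weighted mean of each variable
d = x - m[:, None]
cov_w = (d * f) @ d.T / (f.sum() - 1)

# Same result as literally repeating each observation f[i] times:
cov_r = np.cov(np.repeat(x, f, axis=1))
print(np.allclose(cov_w, cov_r))     # True
```

Both the weighted mean and the weighted sum of squared deviations coincide with those of the expanded data, so the two estimates agree exactly.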