Re: [Numpy-discussion] NumPy-Discussion OpenBLAS and dotblas
On Wed, Aug 13, 2014 at 12:47 AM, Sturla Molden sturla.mol...@gmail.com wrote:

Robert Kern robert.k...@gmail.com wrote:

BLAS/LAPACK are heavy dependencies that often give problems, which is why you don't want to require them for the casual user that only needs numpy arrays to make some plots for examples.

Maybe we are not talking about the same thing, but isn't blas_lite.c and lapack_lite.c more or less f2c'd versions of reference BLAS and reference LAPACK?

Not all of them, no. Just the routines that numpy itself uses. Hence, lite.

I thought it got the 'lite' name because Netlib up to LAPACK 3.1.1 had packages named 'lapack-lite-3.1.1.tgz' in addition to 'lapack-3.1.1.tgz'. (I am not sure what the differences between the packages were.)

No. https://github.com/numpy/numpy/blob/master/numpy/linalg/lapack_lite/README

The lapack_lite.c file looks rather complete, but it seems the build process somehow extracts only parts of it.

I assume you mean dlapack_lite.c? It is incomplete. It is the end product of taking the full LAPACK 3.0 distribution, stripping out the routines that are not used in numpy, and f2cing the subset. Go ahead and look for the routines in LAPACK 3.0 systematically, and you will find many of them missing.

--
Robert Kern

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] New function `count_unique` to generate contingency tables.
On Tue, Aug 12, 2014 at 12:51 PM, Eelco Hoogendoorn hoogendoorn.ee...@gmail.com wrote:

Ah yes, that's also an issue I was trying to deal with. The semantics I prefer in these types of operators is (as a default) to have every array be treated as a sequence of keys, so if calling unique(arr_2d), you'd get unique rows, unless you pass axis=None, in which case the array is flattened.

I also agree that the extension you propose here is useful; but ideally, with a little more discussion on these subjects, we can converge on an even more comprehensive overhaul.

On Tue, Aug 12, 2014 at 6:33 PM, Joe Kington joferking...@gmail.com wrote:

On Tue, Aug 12, 2014 at 11:17 AM, Eelco Hoogendoorn hoogendoorn.ee...@gmail.com wrote:

Thanks. Prompted by that stackoverflow question, and similar problems I had to deal with myself, I started working on a much more general extension to numpy's functionality in this space. Like you noted, things get a little panda-y, but I think there is a lot of pandas' functionality that could or should be part of the numpy core, a robust set of grouping operations in particular. See pastebin here: http://pastebin.com/c5WLWPbp

On a side note, this is related to a pull request of mine from a while back: https://github.com/numpy/numpy/pull/3584

There was a lot of disagreement on the mailing list about what to call a "unique slices along a given axis" function, so I wound up closing the pull request pending more discussion. At any rate, I think it's a useful thing to have in base numpy.

Update: I renamed the function to `table` in the pull request: https://github.com/numpy/numpy/pull/4958

Warren
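For reference, the "unique rows" semantics Eelco describes above can be tried in NumPy releases newer than this thread (1.13 and up, if I recall correctly), where np.unique accepts an `axis` argument. Note that the default there is still axis=None (flatten), which is the opposite of the default Eelco prefers. The array `arr_2d` below is made-up sample data, not something from the thread.

```python
import numpy as np

# Made-up sample data: rows 0 and 2 are identical.
arr_2d = np.array([[1, 2],
                   [3, 4],
                   [1, 2],
                   [3, 5]])

# axis=0 treats each row as one key and returns the distinct rows,
# sorted lexicographically.
unique_rows = np.unique(arr_2d, axis=0)

# The default (axis=None) flattens the array before looking for
# unique values.
unique_values = np.unique(arr_2d)

print(unique_rows)
print(unique_values)
```

With this data the first call keeps three rows and the second returns the five distinct scalars.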
Re: [Numpy-discussion] New function `count_unique` to generate contingency tables.
The ever-wonderful pylab mode in matplotlib has a table function for plotting a table of text in a plot. If I remember correctly, what would happen is that matplotlib's table() function will simply obliterate numpy's table function. This isn't a show-stopper, I just wanted to point that out.

Personally, while I wasn't a particular fan of count_unique because I wouldn't necessarily think of it when needing a contingency table, I do like that it is verb-ish. table(), in this sense, is not a verb. That said, I am perfectly fine with it if you are fine with the name collision in pylab mode.
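The kind of collision Benjamin describes can be illustrated without either library. The sketch below is only an analogy: `fake_numpy` and `fake_pylab` are hypothetical stand-in modules built in memory, and the import order (numpy names first, matplotlib names second) is an assumption that follows his "if I remember correctly". It emulates two successive star imports into one namespace, where the later binding of a name replaces the earlier one.

```python
import types

# Hypothetical stand-in modules; neither is the real numpy or matplotlib.
fake_numpy = types.ModuleType("fake_numpy")
fake_numpy.table = lambda: "contingency table"

fake_pylab = types.ModuleType("fake_pylab")
fake_pylab.table = lambda: "text table drawn on a plot"

# Emulate `from fake_numpy import *` followed by `from fake_pylab import *`:
# each star import copies the module's public names into the namespace,
# so a later import silently rebinds any name that already exists.
namespace = {}
for module in (fake_numpy, fake_pylab):
    public = {k: v for k, v in vars(module).items() if not k.startswith("_")}
    namespace.update(public)

result = namespace["table"]()
print(result)
```

No error or warning is raised at any point, which is why the shadowing is easy to miss in an interactive session.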
Re: [Numpy-discussion] New function `count_unique` to generate contingency tables.
On Wed, Aug 13, 2014 at 5:15 PM, Benjamin Root ben.r...@ou.edu wrote:

The ever-wonderful pylab mode in matplotlib has a table function for plotting a table of text in a plot. If I remember correctly, what would happen is that matplotlib's table() function will simply obliterate numpy's table function. This isn't a show-stopper, I just wanted to point that out. Personally, while I wasn't a particular fan of count_unique because I wouldn't necessarily think of it when needing a contingency table, I do like that it is verb-ish. table(), in this sense, is not a verb. That said, I am perfectly fine with it if you are fine with the name collision in pylab mode.

Thanks for pointing that out. I only changed it to have something that sounded more table-ish, like the Pandas, R and Matlab functions. I won't update it right now, but if there is interest in putting it into numpy, I'll rename it to avoid the pylab conflict. Anything along the lines of `crosstab`, `xtable`, etc., would be fine with me.

Warren
Re: [Numpy-discussion] New function `count_unique` to generate contingency tables.
It's pretty easy to implement this table functionality and more on top of the code I linked above. I still think such a comprehensive overhaul of arraysetops is worth discussing.

    import numpy as np
    import grouping   # the module from the pastebin linked earlier in the thread

    x = [1, 1, 1, 1, 2, 2, 2, 2, 2]
    y = [3, 4, 3, 3, 3, 4, 5, 5, 5]
    z = np.random.randint(0, 2, (9, 2))

    def table(*keys):
        """
        desired table implementation, building on the index object
        cleaner, and more functionality
        performance should be the same
        """
        indices  = [grouping.as_index(k, axis=0) for k in keys]
        uniques  = [i.unique for i in indices]
        inverses = [i.inverse for i in indices]
        shape    = [i.groups for i in indices]
        t = np.zeros(shape, int)    # np.int was only an alias for the builtin int
        # index with a tuple: one index array per axis of t
        np.add.at(t, tuple(inverses), 1)
        return tuple(uniques), t

    # here is how to use
    print(table(x, y))
    # but we can use fancy keys as well; here a composite key and a row-key
    print(table((x, y), z))
    # this effectively creates a sparse matrix equivalent of your desired table
    print(grouping.count((x, y)))
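Eelco's code above depends on the `grouping` module from his pastebin. For readers who want to try the basic idea without it, here is a minimal sketch using only stock NumPy. It is not the implementation from either pull request: the name `crosstab` is hypothetical (borrowed from Warren's list of candidate names), and it handles only plain 1-D keys, not the composite or row keys that the `grouping` version supports.

```python
import numpy as np

def crosstab(*keys):
    """Count co-occurrences of the values in equal-length 1-D key sequences.

    Hypothetical helper for illustration; returns (uniques, counts), where
    counts[i, j, ...] is how often uniques[0][i], uniques[1][j], ...
    appear together at the same position.
    """
    uniques, inverses = [], []
    for key in keys:
        # return_inverse gives, for each element, its index into the
        # sorted array of unique values.
        u, inv = np.unique(np.asarray(key), return_inverse=True)
        uniques.append(u)
        inverses.append(inv.ravel())
    counts = np.zeros([len(u) for u in uniques], dtype=int)
    # np.add.at accumulates correctly when the same cell is hit repeatedly,
    # unlike counts[idx] += 1 with fancy indexing.
    np.add.at(counts, tuple(inverses), 1)
    return tuple(uniques), counts

# Same sample data as in the message above.
x = [1, 1, 1, 1, 2, 2, 2, 2, 2]
y = [3, 4, 3, 3, 3, 4, 5, 5, 5]
(x_values, y_values), counts = crosstab(x, y)
print(x_values, y_values)
print(counts)
```

Rows of `counts` correspond to the unique values of x and columns to those of y, so the entries sum to the number of input pairs.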