Re: [Numpy-discussion] Move scipy.org docs to Github?
I have always put my docs on Amazon S3 (examples: http://mdtraj.org/1.8.0/, http://msmbuilder.org/3.7.0/). For static webpages, you can't beat the cost, and there's a lot of tooling in the wild for uploading pages to S3. It might be an option to consider. -Robert On Thu, Mar 16, 2017 at 5:08 PM, Pauli Virtanen <p...@iki.fi> wrote: > Thu, 16 Mar 2017 08:15:08 +0100, Didrik Pinte kirjoitti: > >> The advantage of something like github pages is that it's big enough > >> that it *does* have dedicated ops support. > > > > Agreed. One issue is that we are working with a lot of legacy. Github > > will more than likely be a great solution for hosting static web pages, but > > the evaluation for the shift needs to get into all the funky legacy > > redirects/rewrites we have in place, etc. This is probably not a real > > issue for docs.scipy.org but would be for other services. > > IIRC, there aren't that many of them, so in principle it could be possible > to cobble them together with redirects. > > >> As long as we can fit under the 1 gig size limit then GH pages seems > >> like the best option so far... it's reliable, widely understood, and > >> all of the limits besides the 1 gig size are soft limits where they say > >> they'll work with us to figure things out. > > > > Another option would be to just host the content under S3 with > > Cloudfront. > > It would also be a pretty simple setup, scale nicely, and won't have > > many restrictions on sizing. > > Some minor-ish disadvantages of this are that it brings a new set of > credentials to manage, it will be somewhat less transparent, and the > tooling will be less familiar to people (e.g. release managers) who have to > deal with it. > -- -Robert ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] ANN: SfePy 2017.1
I am pleased to announce release 2017.1 of SfePy. Description --- SfePy (simple finite elements in Python) is software for solving systems of coupled partial differential equations by the finite element method or by isogeometric analysis (limited support). It is distributed under the new BSD license. Home page: http://sfepy.org Mailing list: http://groups.google.com/group/sfepy-devel Git (source) repository, issue tracker: https://github.com/sfepy/sfepy Highlights of this release -- - spline-box parametrization of an arbitrary field - conda-forge recipe (thanks to Daniel Wheeler) - fixes for Python 3.6 For full release notes see http://docs.sfepy.org/doc/release_notes.html#id1 (rather long and technical). Cheers, Robert Cimrman --- Contributors to this release in alphabetical order: Siwei Chen Robert Cimrman Jan Heczko Vladimir Lukes Matyas Novak ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Fortran order in recarray.
Just as a note, Appveyor supports uploading modules to "public websites": https://packaging.python.org/appveyor/ The main issue I see with this is that PyPI's setup leaves my password stored on my machine in a plain-text file. I'm not sure whether there's a way to provide Appveyor with an SSH key instead. On Wed, Feb 22, 2017 at 4:23 PM, Alex Rogozhnikov < alex.rogozhni...@yandex.ru> wrote: > Hi Francesc, > thanks a lot for your reply and for your impressive job on bcolz! > > Bcolz seems to put its stress on compression, which is not of much interest > for me, but the *ctable*, and chunked operations look very appropriate to > me now. (Of course, I'll need to test it much before I can say this for > sure; that's my current impression.) > > The strongest concern with bcolz so far is that it seems to be completely > non-trivial to install on Windows systems, while pip provides binaries for > most (or all?) OSes for numpy. > I didn't build pip binary wheels myself, but is it hard / impossible to > cook pip-installable binaries? > > You can change shapes of numpy arrays, but that usually involves copies > of the whole container. > > sure, but this is ok for me, as I plan to organize column editing in > 'batches', so this should seldom require copying. > It would be nice to see an example to understand how deep I need to go > inside numpy. > > Cheers, > Alex. > > > > > On 22 Feb 2017, at 17:03, Francesc Alted <fal...@gmail.com> wrote: > > Hi Alex, > > 2017-02-22 12:45 GMT+01:00 Alex Rogozhnikov <alex.rogozhni...@yandex.ru>: > >> Hi Nathaniel, >> >> >> pandas >> >> >> yup, the idea was to have minimal pandas.DataFrame-like storage (which I >> was using for a long time), >> but without irritating problems with its row indexing and some other >> problems like interaction with matplotlib. >> >> A dict of arrays? >> >> >> that's what I've started from and implemented, but at some point I >> decided that I'm reinventing the wheel and numpy has something already. 
In >> principle, I can ignore this 'column-oriented' storage requirement, but >> potentially it may turn out to be quite slow if the dtype's size is large. >> >> Suggestions are welcome. >> > > You may want to try bcolz: > > https://github.com/Blosc/bcolz > > bcolz is a columnar storage, basically as you require, but data is > compressed by default even when stored in-memory (although you can disable > compression if you want to). > > > >> >> Another strange question: >> in general, it is considered that once a numpy array is created, its shape >> does not change. >> But if I want to keep the same recarray and change its dtype and/or >> shape, is there a way to do this? >> > > You can change the shapes of numpy arrays, but that usually involves copies > of the whole container. With bcolz you can change the length and add/del > columns without copies. If your containers are large, it is better to > inform bcolz of its final estimated size. See: > > http://bcolz.blosc.org/en/latest/opt-tips.html > > Francesc > > >> >> Thanks, >> Alex. >> >> >> >> On 22 Feb 2017, at 3:53, Nathaniel Smith <n...@pobox.com> wrote: >> >> On Feb 21, 2017 3:24 PM, "Alex Rogozhnikov" <alex.rogozhni...@yandex.ru> >> wrote: >> >> Ah, got it. Thanks, Chris! >> I thought a recarray could only be one-dimensional (like tables with named >> columns). >> >> Maybe it's better to ask directly what I was looking for: >> something that works like a table with named columns (but no labelling >> for rows), and keeps data (of different dtypes) in a column-by-column way >> (and this is numpy, not pandas). >> >> Is there such a magic thing? >> >> >> Well, that's what pandas is for... >> >> A dict of arrays? 
>> >> -n >> ___ >> NumPy-Discussion mailing list >> NumPy-Discussion@scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> ___ >> NumPy-Discussion mailing list >> NumPy-Discussion@scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > Francesc Alted > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Robert McLeod, Ph.D. Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universität Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcl...@unibas.ch robert.mcl...@bsse.ethz.ch <robert.mcl...@ethz.ch> robbmcl...@gmail.com ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
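To make the "dict of arrays" suggestion from this thread concrete, here is a minimal sketch (my own illustration, not code from the thread) contrasting a column-oriented dict of arrays with a row-oriented structured array:

```python
import numpy as np

# A minimal column-oriented "table": a dict of equal-length 1-D arrays.
# Adding or replacing a column never touches the other columns.
table = {
    "x": np.arange(5, dtype=np.float64),
    "y": np.linspace(0.0, 1.0, 5),
}
table["z"] = 2.0 * table["x"] + table["y"]   # column-wise math is plain ufuncs

# A structured (record) array stores the same fields row-by-row instead,
# so adding a column means allocating a new array with a wider dtype.
rec = np.zeros(5, dtype=[("x", "f8"), ("y", "f8")])
rec["x"], rec["y"] = table["x"], table["y"]
```

The dict keeps each column contiguous in memory (good for column-wise operations, as Alex wants), while the structured array keeps each row contiguous (good for row-wise access).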
Re: [Numpy-discussion] ANN: NumExpr3 Alpha
Hi Juan, A guy on reddit suggested looking at SymPy for just such a thing. I know that Dask also represents its process as a graph. https://www.reddit.com/r/Python/comments/5um04m/numexpr3/ I'll think about it some more, but it still seems a little abstract. To a certain extent the NE3 compiler already works this way. The compiler has a dictionary in which keys are `ast.Node` types, and each value is a function pointer, which knows how to handle that particular node. Providing an external interface to this would be the most natural extension. There are quite a few things to do before I would think about a functional interface. The things I mentioned in my original mail: pickling of the C-object so that it can be used within modules like `multiprocessing`; having a pre-allocated shared memory region shared among threads for temporaries and parameters; etc. If someone else wants to dabble in it they are welcome to. Robert On Sun, Feb 19, 2017 at 4:19 AM, Juan Nunez-Iglesias <jni.s...@gmail.com> wrote: > Hi everyone, > > Thanks for this. It looks absolutely fantastic. I've been putting off > using numexpr but it looks like I don't have a choice anymore. ;) > > Regarding feature requests, I've always found it off-putting that I have > to wrap my expressions in a string to speed them up. Has anyone explored > the possibility of using Python 3.6's frame evaluation API to do this? I > remember a vague discussion on this list a while back but I don't know > whether anything came of it. > > Thanks! > > Juan. > > On 18 Feb 2017, 3:42 AM +1100, Robert McLeod <robbmcl...@gmail.com>, > wrote: > > Hi David, > > Thanks for your comments, reply below the fold. > > On Fri, Feb 17, 2017 at 4:34 PM, Daπid <davidmen...@gmail.com> wrote: > >> This is very nice indeed! >> >> On 17 February 2017 at 12:15, Robert McLeod <robbmcl...@gmail.com> wrote: >> > * bytes and unicode support >> > * reductions (mean, sum, prod, std) >> >> I use both a lot, maybe I can help you get them working. 
>> >> Also, regarding "Vectorization hasn't been done yet with cmath >> functions for real numbers (such as sqrt(), exp(), etc.), only for >> complex functions". What is the bottleneck? Is it in GCC or just >> someone has to sit down and adapt it? > > > I just haven't done it yet. Basically I'm moving from Switzerland to > Canada in a week so this was the gap to push something out that's usable if > not perfect. Rather I just import cmath functions, which are inlined but I > suspect what's needed is to break them down into their components. For > example, the complex arccos function looks like this: > > static void > nc_acos( npy_intp n, npy_complex64 *x, npy_complex64 *r) > { > npy_complex64 a; > for( npy_intp I = 0; I < n; I++ ) { > a = x[I]; > _inline_mul( x[I], x[I], r[I] ); > _inline_sub( Z_1, r[I], r[I] ); > _inline_sqrt( r[I], r[I] ); > _inline_muli( r[I], r[I] ); > _inline_add( a, r[I], r[I] ); > _inline_log( r[I] , r[I] ); > _inline_muli( r[I], r[I] ); > _inline_neg( r[I], r[I]); > } > } > > I haven't sat down and inspected whether the cmath versions get > vectorized, but there's not a huge speed difference between NE2 and 3 for > such a function on float (but their is for complex), so my suspicion is > they aren't. Another option would be to add a library such as Yeppp! as > LIB_YEPPP or some other library that's faster than glib. For example the > glib function "fma(a,b,c)" is slower than doing "a*b+c" in NE3, and that's > not how it should be. Yeppp is also built with Python generating C code, > so it could either be very easy or very hard. > > On bytes and unicode, I haven't seen examples for how people use it, so > I'm not sure where to start. Since there's practically not a limitation on > the number of operations now (the library is 1.3 MB now, compared to 1.2 MB > for NE2 with gcc 5.4) the string functions could grow significantly from > what we have in NE2. 
> > With regards to reductions, NumExpr never multi-threaded them, and could > only do outer reductions, so in the end there was no speed advantage to be > had compared to having NumPy do them on the result. I suspect the primary > value there was in PyTables and Pandas where the expression had to do > everything. One of the things I've moved away from in NE3 is doing output > buffering (rather it pre-allocates the output array), so for reductions the > understanding NumExpr has of broadcasting would have to be deeper. > > In any event contributions would certainly be welcome. > > Robert > > -- > Robert McLeod, Ph.D. >
Re: [Numpy-discussion] ANN: NumExpr3 Alpha
Hi David, Thanks for your comments, reply below the fold. On Fri, Feb 17, 2017 at 4:34 PM, Daπid <davidmen...@gmail.com> wrote: > This is very nice indeed! > > On 17 February 2017 at 12:15, Robert McLeod <robbmcl...@gmail.com> wrote: > > * bytes and unicode support > > * reductions (mean, sum, prod, std) > > I use both a lot, maybe I can help you get them working. > > Also, regarding "Vectorization hasn't been done yet with cmath > functions for real numbers (such as sqrt(), exp(), etc.), only for > complex functions". What is the bottleneck? Is it in GCC or just > someone has to sit down and adapt it? I just haven't done it yet. Basically I'm moving from Switzerland to Canada in a week, so this was the gap to push something out that's usable if not perfect. Rather, I just import cmath functions, which are inlined, but I suspect what's needed is to break them down into their components. For example, the complex arccos function looks like this:

static void
nc_acos( npy_intp n, npy_complex64 *x, npy_complex64 *r)
{
    npy_complex64 a;
    for( npy_intp I = 0; I < n; I++ ) {
        a = x[I];
        _inline_mul( x[I], x[I], r[I] );
        _inline_sub( Z_1, r[I], r[I] );
        _inline_sqrt( r[I], r[I] );
        _inline_muli( r[I], r[I] );
        _inline_add( a, r[I], r[I] );
        _inline_log( r[I], r[I] );
        _inline_muli( r[I], r[I] );
        _inline_neg( r[I], r[I] );
    }
}

I haven't sat down and inspected whether the cmath versions get vectorized, but there's not a huge speed difference between NE2 and 3 for such a function on float (but there is for complex), so my suspicion is they aren't. Another option would be to add a library such as Yeppp! as LIB_YEPPP or some other library that's faster than glibc. For example the glibc function "fma(a,b,c)" is slower than doing "a*b+c" in NE3, and that's not how it should be. Yeppp is also built with Python generating C code, so it could either be very easy or very hard. On bytes and unicode, I haven't seen examples for how people use it, so I'm not sure where to start. 
Since there's practically no limitation on the number of operations now (the library is 1.3 MB now, compared to 1.2 MB for NE2 with gcc 5.4), the string functions could grow significantly from what we have in NE2. With regards to reductions, NumExpr never multi-threaded them, and could only do outer reductions, so in the end there was no speed advantage to be had compared to having NumPy do them on the result. I suspect the primary value there was in PyTables and Pandas, where the expression had to do everything. One of the things I've moved away from in NE3 is doing output buffering (rather it pre-allocates the output array), so for reductions the understanding NumExpr has of broadcasting would have to be deeper. In any event contributions would certainly be welcome. Robert -- Robert McLeod, Ph.D. Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universität Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcl...@unibas.ch robert.mcl...@bsse.ethz.ch robbmcl...@gmail.com ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
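As a quick check of that inlined decomposition (my own sketch, not from the thread): the `_inline_*` steps in `nc_acos` above compute the principal-branch identity acos(z) = -i log(z + i sqrt(1 - z^2)), which can be verified against NumPy directly:

```python
import numpy as np

# The _inline_* steps in nc_acos compute, in complex arithmetic:
#   r = -1j * log(z + 1j * sqrt(1 - z*z))
# i.e. the principal-branch identity for the complex arccos.
z = np.array([0.3 + 0.2j, 0.5 - 0.4j, -1.2 + 0.7j], dtype=np.complex128)
manual = -1j * np.log(z + 1j * np.sqrt(1.0 - z * z))
```

Vectorizing the real-valued version would mean doing the same decomposition with real sqrt/log plus the appropriate domain handling, rather than calling into cmath.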
[Numpy-discussion] ANN: NumExpr3 Alpha
* vectorize real functions (such as exp, sqrt, log) similar to the complex_functions.hpp vectorization. * Add a keyword (likely 'yield') to indicate that a token is intended to be changed by a generator inside a loop with each call to NumExpr.run() If you have any thoughts or find any issues please don't hesitate to open an issue at the Github repo. Although unit tests have been run over the operation space there are undoubtedly a number of bugs to squash. Sincerely, Robert -- Robert McLeod, Ph.D. Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universität Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcl...@unibas.ch robert.mcl...@bsse.ethz.ch robbmcl...@gmail.com ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] composing Euler rotation matrices
Instead of trying to decipher what someone wrote on a Wikipedia page, why don't you look at a working piece of source code? e.g. https://github.com/3dem/relion/blob/master/src/euler.cpp Robert On Wed, Feb 1, 2017 at 4:27 AM, Seb <splu...@gmail.com> wrote: > On Tue, 31 Jan 2017 21:23:55 -0500, > Joseph Fox-Rabinovitz <jfoxrabinov...@gmail.com> wrote: > > > Could you show what you are doing to get the statement "However, I > > cannot reproduce this matrix via composition; i.e. by multiplying the > > underlying rotation matrices."? I would guess something involving the > > `*` operator instead of `@`, but guessing probably won't help you > > solve your issue. > > Sure, although composition is not something I can take credit for, as > it's a well-described operation for generating linear transformations. > It is the matrix multiplication of two or more transformation matrices. > In the case of Euler transformations, it's matrices specifying rotations > around 3 orthogonal axes by 3 given angles. I'm using `numpy.dot' to > perform matrix multiplication on 2D arrays representing matrices. > > However, it's not obvious from the link I provided what particular > rotation matrices are multiplied and in what order (i.e. what > composition) is used to arrive at the Z1Y2X3 rotation matrix shown. > Perhaps I'm not understanding the conventions used therein. 
This is one > of my attempts at reproducing that rotation matrix via composition: > > ---<cut here---start-- > ->--- > import numpy as np > > angles = np.radians(np.array([30, 20, 10])) > > def z1y2x3(alpha, beta, gamma): > """Z1Y2X3 rotation matrix given Euler angles""" > return np.array([[np.cos(alpha) * np.cos(beta), > np.cos(alpha) * np.sin(beta) * np.sin(gamma) - > np.cos(gamma) * np.sin(alpha), > np.sin(alpha) * np.sin(gamma) + > np.cos(alpha) * np.cos(gamma) * np.sin(beta)], > [np.cos(beta) * np.sin(alpha), > np.cos(alpha) * np.cos(gamma) + > np.sin(alpha) * np.sin(beta) * np.sin(gamma), > np.cos(gamma) * np.sin(alpha) * np.sin(beta) - > np.cos(alpha) * np.sin(gamma)], > [-np.sin(beta), np.cos(beta) * np.sin(gamma), > np.cos(beta) * np.cos(gamma)]]) > > euler_mat = z1y2x3(angles[0], angles[1], angles[2]) > > ## Now via composition > > def rotation_matrix(theta, axis, active=False): > """Generate rotation matrix for a given axis > > Parameters > -- > > theta: numeric, optional > The angle (degrees) by which to perform the rotation. Default is > 0, which means return the coordinates of the vector in the rotated > coordinate system, when rotate_vectors=False. > axis: int, optional > Axis around which to perform the rotation (x=0; y=1; z=2) > active: bool, optional > Whether to return active transformation matrix. 
> > Returns > --- > numpy.ndarray > 3x3 rotation matrix > """ > theta = np.radians(theta) > if axis == 0: > R_theta = np.array([[1, 0, 0], > [0, np.cos(theta), -np.sin(theta)], > [0, np.sin(theta), np.cos(theta)]]) > elif axis == 1: > R_theta = np.array([[np.cos(theta), 0, np.sin(theta)], > [0, 1, 0], > [-np.sin(theta), 0, np.cos(theta)]]) > else: > R_theta = np.array([[np.cos(theta), -np.sin(theta), 0], > [np.sin(theta), np.cos(theta), 0], > [0, 0, 1]]) > if active: > R_theta = np.transpose(R_theta) > return R_theta > > ## The rotations are given as active > xmat = rotation_matrix(angles[2], 0, active=True) > ymat = rotation_matrix(angles[1], 1, active=True) > zmat = rotation_matrix(angles[0], 2, active=True) > ## The operation seems to imply this composition > euler_comp_mat = np.dot(xmat, np.dot(ymat, zmat)) > ---<cut here---end > ->--- > > I believe the matrices `euler_mat' and `euler_comp_mat' should be the > same, but they aren't, so it's unclear to me what particular composition > is meant to produce the matrix specified by this Z1Y2X3 transformation. > What am I missing? > > -- > Seb > > ___ > NumPy-Discussion mailing list > NumP
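For what it's worth, one composition that does reproduce the quoted Z1Y2X3 matrix — a sketch of my own, using the standard active (untransposed) rotation matrices — is Rz(alpha) @ Ry(beta) @ Rx(gamma):

```python
import numpy as np

def Rx(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def Ry(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def Rz(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

a, b, g = np.radians([30.0, 20.0, 10.0])

# the Z1Y2X3 matrix from the thread, written out explicitly
euler_mat = np.array([
    [np.cos(a) * np.cos(b),
     np.cos(a) * np.sin(b) * np.sin(g) - np.cos(g) * np.sin(a),
     np.sin(a) * np.sin(g) + np.cos(a) * np.cos(g) * np.sin(b)],
    [np.cos(b) * np.sin(a),
     np.cos(a) * np.cos(g) + np.sin(a) * np.sin(b) * np.sin(g),
     np.cos(g) * np.sin(a) * np.sin(b) - np.cos(a) * np.sin(g)],
    [-np.sin(b), np.cos(b) * np.sin(g), np.cos(b) * np.cos(g)]])

# composition of standard active rotations, z applied first in the
# intrinsic z-y'-x'' (Tait-Bryan) sense
composed = Rz(a) @ Ry(b) @ Rx(g)
```

The quoted code transposes each factor (active=True) and multiplies in the reverse order, which yields the inverse of this matrix rather than the matrix itself.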
Re: [Numpy-discussion] Question about numpy.random.choice with probabilties
On Mon, Jan 23, 2017 at 9:41 AM, Nadav Har'El <n...@scylladb.com> wrote: > > On Mon, Jan 23, 2017 at 4:52 PM, aleba...@gmail.com <aleba...@gmail.com> wrote: >> >> 2017-01-23 15:33 GMT+01:00 Robert Kern <robert.k...@gmail.com>: >>> >>> I don't object to some Notes, but I would probably phrase it more like we are providing the standard definition of the jargon term "sampling without replacement" in the case of non-uniform probabilities. To my mind (or more accurately, with my background), "replace=False" obviously picks out the implemented procedure, and I would have been incredibly surprised if it did anything else. If the option were named "unique=True", then I would have needed some more documentation to let me know exactly how it was implemented. >>> >> FWIW, I totally agree with Robert > > With my own background (MSc. in Mathematics), I agree that this algorithm is indeed the most natural one. And as I said, when I wanted to implement something myself when I wanted to choose random combinations (k out of n items), I wrote exactly the same one. But when it didn't produce the desired probabilities (even in cases where I knew that doing this was possible), I wrongly assumed numpy would do things differently - only to realize it uses exactly the same algorithm. So clearly, the documentation didn't quite explain what it does or doesn't do. In my experience, I have seen "without replacement" mean only one thing. If the docstring had said "returns unique items", I'd agree that it doesn't explain what it does or doesn't do. The only issue is that "without replacement" is jargon, and it is good to recapitulate the definitions of such terms for those who aren't familiar with them. > Also, Robert, I'm curious: beyond explaining why the existing algorithm is reasonable (which I agree), could you give me an example of where it is actually *useful* for sampling? The references I previously quoted list a few. One is called "multistage sampling proportional to size". 
The idea being that you draw (without replacement) from larger units (say, congressional districts) before sampling within them. It is similar to the situation you outline, but it is probably more useful at a different scale: lots of larger units (where your algorithm is likely to provide no solution) rather than a handful. It is probably less useful in terms of survey design, where you are trying to *design* a process to get a result, than it is in queueing theory and related fields, where you are trying to *describe* and simulate a process that is pre-defined. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Question about numpy.random.choice with probabilties
On Mon, Jan 23, 2017 at 9:22 AM, Anne Archibald <peridot.face...@gmail.com> wrote: > > > On Mon, Jan 23, 2017 at 3:34 PM Robert Kern <robert.k...@gmail.com> wrote: >> >> I don't object to some Notes, but I would probably phrase it more like we are providing the standard definition of the jargon term "sampling without replacement" in the case of non-uniform probabilities. To my mind (or more accurately, with my background), "replace=False" obviously picks out the implemented procedure, and I would have been incredibly surprised if it did anything else. If the option were named "unique=True", then I would have needed some more documentation to let me know exactly how it was implemented. > > > It is what I would have expected too, but we have a concrete example of a user who expected otherwise; where one user speaks up, there are probably more who didn't (some of whom probably have code that's not doing what they think it does). So for the cost of adding a Note, why not help some of them? That's why I said I'm fine with adding a Note. I'm just suggesting a re-wording so that the cautious language doesn't lead anyone who is familiar with the jargon to think we're doing something ad hoc while still providing the details for those who aren't so familiar. > As for the standardness of the definition: I don't know, have you a reference where it is defined? More natural to me would be to have a list of items with integer multiplicities (as in: "cat" 3 times, "dog" 1 time). I'm hesitant to claim ours is a standard definition unless it's in a textbook somewhere. But I don't insist on my phrasing. Textbook, I'm not so sure, but it is the *only* definition I've ever encountered in the literature: http://epubs.siam.org/doi/abs/10.1137/0209009 http://www.sciencedirect.com/science/article/pii/S002001900500298X -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Question about numpy.random.choice with probabilties
On Mon, Jan 23, 2017 at 6:27 AM, Anne Archibald <peridot.face...@gmail.com> wrote: > > On Wed, Jan 18, 2017 at 4:13 PM Nadav Har'El <n...@scylladb.com> wrote: >> >> On Wed, Jan 18, 2017 at 4:30 PM, <josef.p...@gmail.com> wrote: >>> >>>> Having more sampling schemes would be useful, but it's not possible to implement sampling schemes with impossible properties. >>> >>> BTW: sampling 3 out of 3 without replacement is even worse >>> >>> No matter what sampling scheme and what selection probabilities we use, we always have every element with probability 1 in the sample. >> >> I agree. The random-sample function of the type I envisioned will be able to reproduce the desired probabilities in some cases (like the example I gave) but not in others. Because doing this correctly involves a set of n linear equations in comb(n,k) variables, it can have no solution, or many solutions, depending on the n and k, and the desired probabilities. A function of this sort could return an error if it can't achieve the desired probabilities. > > It seems to me that the basic problem here is that the numpy.random.choice docstring fails to explain what the function actually does when called with weights and without replacement. Clearly there are different expectations; I think numpy.random.choice chose one that is easy to explain and implement but not necessarily what everyone expects. So the docstring should be clarified. Perhaps a Notes section: > > When numpy.random.choice is called with replace=False and non-uniform probabilities, the resulting distribution of samples is not obvious. numpy.random.choice effectively follows the procedure: when choosing the kth element in a set, the probability of element i occurring is p[i] divided by the total probability of all not-yet-chosen (and therefore eligible) elements. 
This approach is always possible as long as the sample size is no larger than the population, but it means that the probability that element i occurs in the sample is not exactly p[i]. I don't object to some Notes, but I would probably phrase it more like we are providing the standard definition of the jargon term "sampling without replacement" in the case of non-uniform probabilities. To my mind (or more accurately, with my background), "replace=False" obviously picks out the implemented procedure, and I would have been incredibly surprised if it did anything else. If the option were named "unique=True", then I would have needed some more documentation to let me know exactly how it was implemented. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
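A small simulation (my own illustration, not from the thread) makes the point concrete: under the sequential renormalizing procedure described above, the probability that an item appears in the sample is not proportional to its weight:

```python
import numpy as np

np.random.seed(12345)

# Draw k=2 of 3 items without replacement, weights p.  At each step the
# remaining items' weights are renormalized, so the marginal probability
# that item i appears in the sample is NOT simply k * p[i].
p = np.array([0.2, 0.4, 0.4])
n_trials = 20000
counts = np.zeros(3)
for _ in range(n_trials):
    counts[np.random.choice(3, size=2, replace=False, p=p)] += 1
inclusion = counts / n_trials

# Exact inclusion probability of item 0 for this p and k=2:
#   0.2 + 0.4*(0.2/0.6) + 0.4*(0.2/0.6) = 7/15 ~ 0.467,  not 2*0.2 = 0.4
```

Each sample contributes exactly two items, so the inclusion probabilities always sum to 2, but their individual values follow the sequential procedure rather than the raw weights.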
Re: [Numpy-discussion] Fwd: Backslash operator A\b and np/sp.linalg.solve
On Mon, Jan 9, 2017 at 7:10 PM, Ilhan Polat <ilhanpo...@gmail.com> wrote: > > Yes, that's precisely the case, but when we know the structure we can just choose the appropriate solver anyhow with a little bit of overhead. What I mean is that, to my knowledge, FORTRAN routines for checking for triangularity etc. are absent. I'm responding to that. The reason that they don't have those FORTRAN routines for testing for structure inside of a generic dense matrix is that in FORTRAN it's more natural (and efficient) to just use the explicit packed structure and associated routines instead. You would only use a generic dense matrix if you know that there isn't structure in the matrix. So there are no routines for detecting that structure in generic dense matrices. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Fwd: Backslash operator A\b and np/sp.linalg.solve
On Mon, Jan 9, 2017 at 5:09 PM, Ilhan Polat <ilhanpo...@gmail.com> wrote: > So every test in the polyalgorithm is cheaper than the next one. I'm not exactly sure what might be the best strategy yet hence the question. It's really interesting that LAPACK doesn't have this type of fast checks. In Fortran LAPACK, if you have a special structured matrix, you usually explicitly use packed storage and call the appropriate function type on it. It's only when you go to a system that only has a generic, unstructured dense matrix data type that it makes sense to do those kinds of checks. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
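To illustrate the kind of polyalgorithm check being discussed — a sketch of my own, not how LAPACK or any backslash implementation actually does it — a cheap O(n^2) structure test on a generic dense matrix can dispatch to a cheaper solver when the structure is present:

```python
import numpy as np

def solve_with_structure_check(A, b):
    """Solve A x = b, dispatching to O(n^2) forward substitution when A
    is detected to be lower triangular; the detection itself is a cheap
    O(n^2) comparison against the lower-triangular part of A."""
    if np.array_equal(A, np.tril(A)):
        n = A.shape[0]
        x = np.empty(n)
        for i in range(n):
            # subtract the already-solved components, divide by the pivot
            x[i] = (b[i] - A[i, :i] @ x[:i]) / A[i, i]
        return x
    # no detected structure: fall back to the generic LU-based solver
    return np.linalg.solve(A, b)
```

In Fortran LAPACK this check never exists because a triangular system would be stored in packed form and solved with the triangular routines directly; the check only pays off in systems (like MATLAB's backslash or a hypothetical NumPy polyalgorithm) where everything arrives as a generic dense array.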
[Numpy-discussion] ANN: SfePy 2016.4
I am pleased to announce release 2016.4 of SfePy. Description --- SfePy (simple finite elements in Python) is software for solving systems of coupled partial differential equations by the finite element method or by isogeometric analysis (limited support). It is distributed under the new BSD license. Home page: http://sfepy.org Mailing list: http://groups.google.com/group/sfepy-devel Git (source) repository, issue tracker: https://github.com/sfepy/sfepy Highlights of this release -- - support tensor product element meshes with one-level hanging nodes - improve homogenization support for large deformations - parallel calculation of homogenized coefficients and related sub-problems - evaluation of second derivatives of Lagrange basis functions For full release notes see http://docs.sfepy.org/doc/release_notes.html#id1 (rather long and technical). Cheers, Robert Cimrman --- Contributors to this release in alphabetical order: Robert Cimrman Vladimir Lukes Matyas Novak ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] array comprehension
On Fri, Nov 4, 2016 at 6:36 AM, Neal Becker <ndbeck...@gmail.com> wrote: > > Francesc Alted wrote: > > > 2016-11-04 13:06 GMT+01:00 Neal Becker <ndbeck...@gmail.com>: > > > >> I find I often write: > >> np.array ([some list comprehension]) > >> > >> mainly because list comprehensions are just so sweet. > >> > >> But I imagine this isn't particularly efficient. > >> > > > > Right. Using a generator and np.fromiter() will avoid the creation of the > > intermediate list. Something like: > > > > np.fromiter((i for i in range(x)), dtype=int) # use xrange for Python 2 > > > > > Does this generalize to >1 dimensions? No. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
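As a sketch (my own, not from the thread) of what does and doesn't work: `np.fromiter` needs an explicit dtype and only ever builds 1-D arrays, so for more dimensions one common workaround is to flatten the generator and reshape the result:

```python
import numpy as np

# np.fromiter consumes a generator without building an intermediate list;
# it requires an explicit dtype, and count= lets it preallocate the output.
a = np.fromiter((i * i for i in range(5)), dtype=np.int64, count=5)

# fromiter itself only builds 1-D arrays; for 2-D, one workaround is to
# flatten the generator and reshape afterwards.
b = np.fromiter((i * j for i in range(3) for j in range(4)),
                dtype=np.int64, count=12).reshape(3, 4)
```

The reshape is free (it returns a view), so this keeps the single-pass, no-intermediate-list property while producing a multidimensional result.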
Re: [Numpy-discussion] missing from contributor list?
Because Github (or maybe git) doesn't track the history of the file through all of the renames. It is only reporting the contributors of changes to the file at its current location. If you go back to the time just prior to the commit that renamed the file, you do show up in the list: https://github.com/numpy/numpy/blob/f179ec92d8ddb0dc5f7445255022be5c4765a704/numpy/build_utils/src/apple_sgemv_fix.c On Wed, Nov 2, 2016 at 3:38 PM, Sturla Molden <sturla.mol...@gmail.com> wrote: > Why am I missing from the contributor list here? > > https://github.com/numpy/numpy/blob/master/numpy/_build_utils/src/apple_sgemv_fix.c > > > Sturla > > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
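Locally, git itself can follow a file across renames when asked explicitly: plain `git log -- <path>` stops at the commit that introduced the current path, while `git log --follow` continues through the rename. A self-contained demonstration (my own sketch, driving git from Python in a scratch repository):

```python
import os
import subprocess
import tempfile

def git(*args, cwd):
    """Run a git command in the given directory and return its stdout."""
    return subprocess.run(["git", *args], cwd=cwd, check=True,
                          capture_output=True, text=True).stdout

# Build a throwaway repo: add a file, then rename it in a second commit.
repo = tempfile.mkdtemp()
git("init", "-q", cwd=repo)
with open(os.path.join(repo, "apple_sgemv_fix.c"), "w") as f:
    f.write("/* sgemv fix */\n")
git("add", "apple_sgemv_fix.c", cwd=repo)
git("-c", "user.name=demo", "-c", "user.email=demo@example.com",
    "commit", "-qm", "add file", cwd=repo)
git("mv", "apple_sgemv_fix.c", "src_apple_sgemv_fix.c", cwd=repo)
git("-c", "user.name=demo", "-c", "user.email=demo@example.com",
    "commit", "-qm", "rename file", cwd=repo)

# Plain log sees only the rename commit; --follow also finds the original.
plain = git("log", "--oneline", "--", "src_apple_sgemv_fix.c", cwd=repo)
followed = git("log", "--follow", "--oneline", "--",
               "src_apple_sgemv_fix.c", cwd=repo)
```

GitHub's contributors view behaves like the plain log here, which is why contributions from before the rename disappear from the listing.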
Re: [Numpy-discussion] How to use user input as equation directly
On Thu, Oct 27, 2016 at 11:35 PM, Benjamin Root <ben.v.r...@gmail.com> wrote: > Perhaps the numexpr package might be safer? Not exactly meant for this > situation (meant for optimizations), but the evaluator is pretty darn safe. > > It would not be able to evaluate something like 'np.arange(50)' for example, since it only has a limited subset of numpy functionality. In the example provided, that or linspace is likely the natural input for the variable 't'. -- Robert McLeod, Ph.D. Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universität Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcl...@unibas.ch robert.mcl...@bsse.ethz.ch <robert.mcl...@ethz.ch> robbmcl...@gmail.com ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
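For reference, a minimal sketch of the pattern being suggested (assuming numexpr is installed; the expression and variable names are illustrative). numexpr parses a restricted expression grammar, so input like 'np.arange(50)' or arbitrary attribute access is rejected rather than executed:

```python
import numpy as np
import numexpr as ne

t = np.linspace(0.0, 2.0 * np.pi, 100)  # the caller supplies the domain

# A user-supplied equation evaluated over t; ne.evaluate() looks up
# free variables such as 't' in the calling frame by default.
user_expr = "sin(t)**2 + cos(t)**2"
result = ne.evaluate(user_expr)
```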
Re: [Numpy-discussion] Intel random number package
On Thu, Oct 27, 2016 at 10:45 AM, Todd <toddr...@gmail.com> wrote: > > On Thu, Oct 27, 2016 at 12:12 PM, Nathaniel Smith <n...@pobox.com> wrote: >> >> Ever notice how Anaconda doesn't provide pyfftw? They can't legally ship both MKL and pyfftw, and they picked MKL. > > Anaconda does ship GPL code [1]. They even ship GPL code that depends on numpy, such as cvxcanon and pystan, and there doesn't seem to be anything that prevents me from installing them alongside the MKL version of numpy. So I don't see how it would be any different for pyfftw. I think we've exhausted the relevance of this tangent to Oleksandr's contributions. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Intel random number package
Releasing NumPy under GPL would make it incompatible with SciPy, which may be _slightly_ inconvenient to the scientific Python community: https://scipy.github.io/old-wiki/pages/License_Compatibility.html https://mail.scipy.org/pipermail/scipy-dev/2013-August/019149.html Robert On Thu, Oct 27, 2016 at 5:14 PM, Julian Taylor <jtaylor.deb...@googlemail.com> wrote: > On 10/27/2016 04:52 PM, Todd wrote: >> On Thu, Oct 27, 2016 at 10:43 AM, Julian Taylor <jtaylor.deb...@googlemail.com> wrote: >> On 10/27/2016 04:30 PM, Todd wrote: >> On Thu, Oct 27, 2016 at 4:25 AM, Ralf Gommers <ralf.gomm...@gmail.com> wrote: >> On Thu, Oct 27, 2016 at 10:25 AM, Pavlyk, Oleksandr <oleksandr.pav...@intel.com> wrote: >> Please see responses inline. >> *From:* NumPy-Discussion [mailto:numpy-discussion-boun...@scipy.org] *On Behalf Of* Todd *Sent:* Wednesday, October 26, 2016 4:04 PM *To:* Discussion of Numerical Python <numpy-discussion@scipy.org> *Subject:* Re: [Numpy-discussion] Intel random number package >> On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr <oleksandr.pav...@intel.com> wrote: >> Another point already raised by Nathaniel is that numpy's randomness ideally should provide a way to override the default algorithm for sampling from a particular distribution. For example, a RandomState object that implements PCG may rely on the default acceptance-rejection algorithm for sampling from Gamma, while the RandomState object that provides an interface to MKL might want to call into MKL directly. >> The approach that pyfftw uses, at least for scipy, which may also work here, is that you can monkey-patch the scipy.fftpack module at runtime, replacing it with pyfftw's drop-in replacement. scipy then proceeds to use pyfftw instead of its built-in fftpack implementation. Might such an approach work here? Users can either use this alternative randomstate replacement directly, or they can replace numpy's with it at runtime and numpy will then proceed to use the alternative. >> The only reason that pyfftw uses monkeypatching is that the better approach is not possible due to license constraints with FFTW (it's GPL). >> Yes, that is exactly why I brought it up. Better approaches are also not possible with MKL due to license constraints. It is a very similar situation overall. > It's not that similar: the better approach is certainly possible with FFTW, since the GPL is compatible with numpy's license. It is only a concern for users of binary distributions. Nobody has provided the code to use fftw yet, but it would certainly be accepted. >> Although it is technically compatible, it would make numpy effectively GPL. Suggestions for this have been explicitly
Re: [Numpy-discussion] Intel random number package
On Wed, Oct 26, 2016 at 12:41 PM, Warren Weckesser < warren.weckes...@gmail.com> wrote: > > On Wed, Oct 26, 2016 at 3:24 PM, Nathaniel Smith <n...@pobox.com> wrote: >> The patch also adds ~10,000 lines of code; here's an example of what >> some of it looks like: >> >> https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L1724-L1833 >> >> I don't see how we can realistically commit to maintaining this. > > FYI: numpy already maintains code exactly like that: https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions.c#L262-L397 > > Perhaps the point should be that the numpy devs won't want to maintain two nearly identical versions of that code. Indeed. That's how the algorithm was published. The /* sigh ... */ is my own. ;-) -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Intel random number package
On Wed, Oct 26, 2016 at 9:36 AM, Sebastian Berg <sebast...@sipsolutions.net> wrote: > > On Mi, 2016-10-26 at 09:29 -0700, Robert Kern wrote: > > On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor <jtaylor.debian@google > > mail.com> wrote: > > > > > > On 10/26/2016 06:00 PM, Julian Taylor wrote: > > > > >> I prefer for the full functionality of numpy to stay available > > with a > > >> stack of community owned software, even if it may be less powerful > > that > > >> way. > > > > > > But then if this is really just the same random numbers numpy > > already provides just faster, it is probably acceptable in principle. > > I haven't actually looked at the PR yet. > > > > I think the stream is different in some places, at least. And it's > > not a silent backend drop-in like np.linalg being built against an > > optimized BLAS, just a separate module that is inoperative without > > MKL. > > I might be swayed, but my gut feeling would be that a backend change > (if the default stream changes, an explicit one, though maybe one could > make a "fastest") would be the only reasonable way to provide such a > thing in numpy itself. That mostly argues for distributing it as a separate package, not part of numpy at all. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Intel random number package
On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor < jtaylor.deb...@googlemail.com> wrote: > > On 10/26/2016 06:00 PM, Julian Taylor wrote: >> I prefer for the full functionality of numpy to stay available with a >> stack of community owned software, even if it may be less powerful that >> way. > > But then if this is really just the same random numbers numpy already provides just faster, it is probably acceptable in principle. I haven't actually looked at the PR yet. I think the stream is different in some places, at least. And it's not a silent backend drop-in like np.linalg being built against an optimized BLAS, just a separate module that is inoperative without MKL. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Intel random number package
On Tue, Oct 25, 2016 at 10:22 PM, Charles R Harris < charlesr.har...@gmail.com> wrote: > > On Tue, Oct 25, 2016 at 10:41 PM, Robert Kern <robert.k...@gmail.com> wrote: >> >> On Tue, Oct 25, 2016 at 9:34 PM, Charles R Harris < charlesr.har...@gmail.com> wrote: >> > >> > Hi All, >> > >> > There is a proposed random number package PR now up on github: https://github.com/numpy/numpy/pull/8209. It is from >> > oleksandr-pavlyk and implements the number random number package using MKL for increased speed. I think we are definitely interested in the improved speed, but I'm not sure numpy is the best place to put the package. I'd welcome any comments on the PR itself, as well as any thoughts on the best way organize or use of this work. Maybe scikit-random >> >> This is what ng-numpy-randomstate is for. >> >> https://github.com/bashtage/ng-numpy-randomstate > > Interesting, despite old fashioned original ziggurat implementation of the normal and gnu c style... Does that project seek to preserve all the bytestreams or is it still in flux? I would assume some flux for now, but you can ask the author by submitting a corrected ziggurat PR as a trial balloon. ;-) -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Intel random number package
On Tue, Oct 25, 2016 at 9:34 PM, Charles R Harris <charlesr.har...@gmail.com> wrote: > > Hi All, > > There is a proposed random number package PR now up on github: https://github.com/numpy/numpy/pull/8209. It is from > oleksandr-pavlyk and implements the numpy random number package using MKL for increased speed. I think we are definitely interested in the improved speed, but I'm not sure numpy is the best place to put the package. I'd welcome any comments on the PR itself, as well as any thoughts on the best way to organize or use this work. Maybe scikit-random This is what ng-numpy-randomstate is for. https://github.com/bashtage/ng-numpy-randomstate -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Preserving NumPy views when pickling
On Tue, Oct 25, 2016 at 7:05 PM, Feng Yu <rainwood...@gmail.com> wrote: > > Hi, > > Just another perspective. base' and 'data' in PyArrayObject are two > separate variables. > > base can point to any PyObject, but it is `data` that defines where > data is accessed in memory. > > 1. There is no clear way to pickle a pointer (`data`) in a meaningful > way. In order for `data` member to make sense we still need to > 'readout' the values stored at `data` pointer in the pickle. > > 2. By definition base is not necessary a numpy array but it is just > some other object for managing the memory. In general, yes, but most often it's another ndarray, and the child is related to the parent by a slice operation that could be computed by comparing the `data` tuples. The exercise here isn't to always represent the general case in this way, but to see what can be done opportunistically and if that actually helps solve a practical problem. > 3. One can surely pickle the `base` object as a reference, but it is > useless if the data memory has been reconstructed independently during > unpickling. > > 4. Unless there is clear way to notify the referencing numpy array of > the new data pointer. There probably isn't. > > BTW, is the stride information is lost during pickling, too? The > behavior shall probably be documented if not yet. The stride information may be lost, yes. We reserve the right to retain it, though (for example, if .T is contiguous then we might well serialize the transposed data linearly and return a view on that data upon deserialization). I don't believe that we guarantee that the unpickled result is contiguous. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Preserving NumPy views when pickling
On Tue, Oct 25, 2016 at 5:09 PM, Matthew Harrigan < harrigan.matt...@gmail.com> wrote: > > It seems pickle keeps track of references for basic python types. > > x = [1] > y = [x] > x,y = pickle.loads(pickle.dumps((x,y))) > x.append(2) > print(y) > >>> [[1,2]] > > Numpy arrays are different but references are forgotten after pickle/unpickle. Shared objects do not remain shared. Based on the quote below it could be considered bug with numpy/pickle. Not a bug, but an explicit design decision on numpy's part. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
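A quick illustration of that design decision (a sketch; array contents are arbitrary): each ndarray serializes its own data, so sharing is not preserved even when base and view travel through the same pickle:

```python
import pickle

import numpy as np

base = np.zeros(10)
view = base[:5]
assert np.shares_memory(base, view)  # shared before pickling

# Round-trip both objects through a single pickle.
base2, view2 = pickle.loads(pickle.dumps((base, view)))

# Each array was serialized independently: the copies no longer share
# memory, so a write to base2 is invisible to view2.
base2[0] = 99.0
```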
Re: [Numpy-discussion] Preserving NumPy views when pickling
On Tue, Oct 25, 2016 at 3:07 PM, Stephan Hoyer <sho...@gmail.com> wrote: > > On Tue, Oct 25, 2016 at 1:07 PM, Nathaniel Smith <n...@pobox.com> wrote: >> >> Concretely, what do would you suggest should happen with: >> >> base = np.zeros(1) >> view = base[:10] >> >> # case 1 >> pickle.dump(view, file) >> >> # case 2 >> pickle.dump(base, file) >> pickle.dump(view, file) >> >> # case 3 >> pickle.dump(view, file) >> pickle.dump(base, file) >> >> ? > > I see what you're getting at here. We would need a rule for when to include the base in the pickle and when not to. Otherwise, pickle.dump(view, file) always contains data from the base pickle, even with view is much smaller than base. > > The safe answer is "only use views in the pickle when base is already being pickled", but that isn't possible to check unless all the arrays are together in a custom container. So, this isn't really feasible for NumPy. It would be possible with a custom Pickler/Unpickler since they already keep track of objects previously (un)pickled. That would handle [base, view] okay but not [view, base], so it's probably not going to be all that useful outside of special situations. It would make a neat recipe, but I probably would not provide it in numpy itself. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
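To sketch the recipe (not something numpy provides; the names ViewPickler/ViewUnpickler are made up, and it assumes C-contiguous bases that own their data): the persistent-ID hooks let a Pickler remember arrays it has already written, so a view encountered after its base can be stored as (base key, byte offset, shape, strides) instead of a data copy:

```python
import io
import pickle

import numpy as np

class ViewPickler(pickle.Pickler):
    """Pickle a view encountered *after* its base as a reference into
    that base.  Handles [base, view] ordering only, not [view, base]."""

    def __init__(self, file, **kwargs):
        super().__init__(file, **kwargs)
        self._seen = {}  # id(array) -> integer key

    def persistent_id(self, obj):
        if not isinstance(obj, np.ndarray):
            return None  # pickle everything else normally
        base = obj.base
        if isinstance(base, np.ndarray) and id(base) in self._seen:
            # A view of an array we already wrote: store coordinates only.
            offset = (obj.__array_interface__['data'][0]
                      - base.__array_interface__['data'][0])
            return ('view', self._seen[id(base)], offset,
                    obj.shape, obj.strides, obj.dtype.str)
        # Assumes a C-contiguous array that owns its data.
        key = len(self._seen)
        self._seen[id(obj)] = key
        return ('base', key, obj.dtype.str, obj.shape, obj.tobytes())

class ViewUnpickler(pickle.Unpickler):
    def __init__(self, file, **kwargs):
        super().__init__(file, **kwargs)
        self._arrays = {}  # key -> reconstructed base array

    def persistent_load(self, pid):
        if pid[0] == 'base':
            _, key, dtype, shape, buf = pid
            # bytearray() makes the restored buffer writable.
            arr = np.frombuffer(bytearray(buf), dtype=dtype).reshape(shape)
            self._arrays[key] = arr
            return arr
        if pid[0] == 'view':
            _, key, offset, shape, strides, dtype = pid
            return np.ndarray(shape, dtype=dtype, offset=offset,
                              strides=strides, buffer=self._arrays[key])
        raise pickle.UnpicklingError('unknown persistent id: %r' % (pid,))

base = np.arange(10.0)
view = base[2:8]
f = io.BytesIO()
ViewPickler(f).dump([base, view])
f.seek(0)
base2, view2 = ViewUnpickler(f).load()
```

As the message says, this only works when the base happens to come first; a view seen before (or without) its base simply falls back to being pickled as an independent copy.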
Re: [Numpy-discussion] automatically avoiding temporary arrays
On Wed, Oct 5, 2016 at 1:11 PM, srean <srean.l...@gmail.com> wrote: > Thanks Francesc, Robert for giving me a broader picture of where this fits > in. I believe numexpr does not handle slicing, so that might be another > thing to look at. > Dereferencing would be relatively simple to add into numexpr, as it would just be some getattr() calls. Personally I will add that at some point because it will clean up my code. Slicing, maybe only for contiguous blocks in memory? I.e. imageStack[0,:,:] would be possible, but imageStack[:, ::2, ::2] would not be trivial (I think...). I seem to remember someone asked David Cooke about slicing and he said something along the lines of, "that's what Numba is for." Perhaps NumPy back-ended by Numba is more so what you are looking for, as it hooks into the byte compiler? The main advantage of numexpr is that a series of numpy functions can be enclosed in ne.evaluate("") and it provides a big acceleration for little programmer effort, but it's not nearly as sophisticated as Numba or PyPy. > On Wed, Oct 5, 2016 at 4:26 PM, Robert McLeod <robbmcl...@gmail.com> > wrote: > >> >> As Francesc said, Numexpr is going to get most of its power through >> grouping a series of operations so it can send blocks to the CPU cache and >> run the entire series of operations on the cache before returning the block >> to system memory. If it was just used to back-end NumPy, it would only >> gain from the multi-threading portion inside each function call. >> > > Is that so ? > > I thought numexpr also cuts down on the number of temporary buffers that get > filled (in other words copy operations) if the same expression was written > as a series of operations. My understanding can be wrong, and would > appreciate correction. > > The 'out' parameter in ufuncs can eliminate extra temporaries but it's not > composable. Right now I have to manually carry along the array where the > in-place operations take place. I think the goal here is to eliminate that. 
> The numexpr virtual machine does create temporaries where needed when it parses the abstract syntax tree for all the operations it has to do. I believe the main advantage is that the temporaries are created on the CPU cache, and not in system memory. It's certainly true that numexpr doesn't create a lot of OP_COPY operations, rather it's optimized to minimize them, so probably it's fewer ops than naive successive calls to numpy within python, but I'm unsure if there's any difference in operation count between a hand-optimized numpy with out= set and numexpr. Numexpr just does it for you. This blog post from Tim Hochberg is useful for understanding the performance advantages of blocking versus multithreading: http://www.bitsofbits.com/2014/09/21/numpy-micro-optimization-and-numexpr/ Robert -- Robert McLeod, Ph.D. Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universität Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcl...@unibas.ch robert.mcl...@bsse.ethz.ch <robert.mcl...@ethz.ch> robbmcl...@gmail.com ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
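To make the comparison concrete, here is a sketch of the hand-carried out= style srean describes, next to the naive version that allocates a fresh temporary per operation (plain numpy; the expression is illustrative):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 100_000)
y = np.linspace(1.0, 2.0, 100_000)

# Naive: x*x, 2.0*y, and their sum each allocate a full temporary array.
naive = x * x + 2.0 * y + 1.0

# Hand-threaded out=: two preallocated buffers are reused across the
# whole chain, so no further temporaries are created.
out = np.empty_like(x)
scratch = np.empty_like(y)
np.multiply(x, x, out=out)        # out = x*x
np.multiply(y, 2.0, out=scratch)  # scratch = 2*y
np.add(out, scratch, out=out)     # out = x*x + 2*y
np.add(out, 1.0, out=out)         # out = x*x + 2*y + 1
```

This is exactly the non-composable bookkeeping that ne.evaluate("x*x + 2.0*y + 1.0") hides from the programmer.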
Re: [Numpy-discussion] automatically avoiding temporary arrays
All, On Wed, Oct 5, 2016 at 11:46 AM, Francesc Alted <fal...@gmail.com> wrote: > 2016-10-05 8:45 GMT+02:00 srean <srean.l...@gmail.com>: > >> Good discussion, but was surprised by the absence of numexpr in the >> discussion., given how relevant it (numexpr) is to the topic. >> >> Is the goal to fold in the numexpr functionality (and beyond) into Numpy ? >> > > Yes, the question about merging numexpr into numpy has been something that > periodically shows up in this list. I think mostly everyone agree that it > is a good idea, but things are not so easy, and so far nobody provided a > good patch for this. Also, the fact that numexpr relies on grouping an > expression by using a string (e.g. (y = ne.evaluate("x**3 + tanh(x**2) + > 4")) does not play well with the way in that numpy evaluates expressions, > so something should be suggested to cope with this too. > As Francesc said, Numexpr is going to get most of its power through grouping a series of operations so it can send blocks to the CPU cache and run the entire series of operations on the cache before returning the block to system memory. If it was just used to back-end NumPy, it would only gain from the multi-threading portion inside each function call. I'm not sure how one would go about grouping successive numpy expressions without modifying the Python interpreter? I put a bit of effort into extending numexpr to use 4-byte word opcodes instead of 1-byte. Progress has been very slow, however, due to time constraints, but I have most of the numpy data types (u[1-4], i[1-4], f[4,8], c[8,16], S[1-4], U[1-4]). On Tuesday I finished writing a Python generator script that writes all the C-side opcode macros for opcodes.hpp. Now I have about 900 opcodes, and this could easily grow into thousands if more functions are added, so I also built a reverse lookup tree (based on collections.defaultdict) for the Python-side of numexpr. Robert -- Robert McLeod, Ph.D. 
Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universität Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcl...@unibas.ch robert.mcl...@bsse.ethz.ch <robert.mcl...@ethz.ch> robbmcl...@gmail.com ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] ANN: SfePy 2016.3
I am pleased to announce release 2016.3 of SfePy.

Description
-----------

SfePy (simple finite elements in Python) is software for solving systems of coupled partial differential equations by the finite element method or by isogeometric analysis (limited support). It is distributed under the new BSD license.

Home page: http://sfepy.org
Mailing list: http://groups.google.com/group/sfepy-devel
Git (source) repository, issue tracker: http://github.com/sfepy/sfepy

Highlights of this release
--------------------------

- Python 3 support
- testing with Travis CI
- new classes for homogenized coefficients
- using argparse instead of optparse

For full release notes see http://docs.sfepy.org/doc/release_notes.html#id1 (rather long and technical).

Cheers,
Robert Cimrman

---

Contributors to this release in alphabetical order:

Robert Cimrman
Jan Heczko
Thomas Kluyver
Vladimir Lukes

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Using library-specific headers
Pavlyk, NumExpr optionally includes MKL's VML at compile-time. You may want to look at its implementation. From what I recall it relies on a function in a bootstrapped __config__.py to determine if MKL is present. Robert On Thu, Sep 29, 2016 at 7:27 PM, Pavlyk, Oleksandr <oleksandr.pav...@intel.com> wrote: > Hi Julian, > > Thank you very much for the response. It appears to work. > > I work on "Intel Distribution for Python" at Intel Corp. This question was > motivated by work needed to > prepare pull requests with our changes/optimizations to numpy source code. > In particular, the numpy.random_intel package > > https://mail.scipy.org/pipermail/numpy-discussion/2016-June/075693.html > > relies on MKL, but its potential inclusion in numpy should not break the > build if MKL is unavailable. > > Also our benchmarking was pointing at Numpy's sequential memory copying as > a bottleneck. > I am working to open a pull request into the main trunk of numpy to take > advantage of multithreaded > MKL's BLAS dcopy function to do memory copying in parallel for > sufficiently large sizes. > > Related to numpy.random_intel, I noticed that the randomstate package, > which extends numpy.random, was > not being made a part of numpy, but rather published on PyPI as a > stand-alone module. Does that mean that > the community decided against including it in numpy's codebase? If so, I > would appreciate if someone could > elaborate on or point me to the reasoning behind that decision. 
> > Thank you, > Oleksandr > > > > -----Original Message----- > From: NumPy-Discussion [mailto:numpy-discussion-boun...@scipy.org] On > Behalf Of Julian Taylor > Sent: Thursday, September 29, 2016 8:10 AM > To: numpy-discussion@scipy.org > Subject: Re: [Numpy-discussion] Using library-specific headers > > On 09/27/2016 11:09 PM, Pavlyk, Oleksandr wrote: > > Suppose I would like to take advantage of some functions from MKL in > > numpy C source code, which would require to use > > > > #include "mkl.h" > > > > Ideally this include line must not break the build of numpy when MKL > > is not present, so my initial approach was to use > > > > #if defined(SCIPY_MKL_H) > > > > #include "mkl.h" > > > > #endif > > > > Unfortunately, this did not work when building with gcc on a machine > > where MKL is present on default LD_LIBRARY_PATH, because then the > > distutils code was setting the SCIPY_MKL_H preprocessor variable, even > > though mkl headers are not on the C_INCLUDE_PATH. > > > > What is the preferred solution to include an external library header > > to ensure that the code-base continues to build in most common cases? > > > > One approach I can think of is to set a preprocessor variable, say > > HAVE_MKL_HEADERS in numpy/core/includes/numpy/config.h depending on the > > outcome of building a simple _configtest.c using > > config.try_compile(), like it is done in numpy/core/setup.py > > > > Is there a simpler, or a better way? > > hi, > you could put the header into OPTIONAL_HEADERS in > numpy/core/setup_common.py. This will define HAVE_HEADERFILENAME_H for you > but this will not check that the corresponding library actually exists > and can be linked. > For that SCIPY_MKL_H is probably the right macro, though its name is > confusing as it does not check for the header presence ... > > Can you tell us more about what from mkl you are attempting to add and for > what purpose, e.g. 
is it something that should go into numpy proper or just > for personal/internal use? > > cheers, > Julian > > > > > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Robert McLeod, Ph.D. Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universität Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcl...@unibas.ch robert.mcl...@bsse.ethz.ch <robert.mcl...@ethz.ch> robbmcl...@gmail.com ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Using library-specific headers
On Thu, Sep 29, 2016 at 6:27 PM, Pavlyk, Oleksandr <oleksandr.pav...@intel.com> wrote: > Related to numpy.random_intel, I noticed that the randomstate package, which extends numpy.random, was > not being made a part of numpy, but rather published on PyPI as a stand-alone module. Does that mean that > the community decided against including it in numpy's codebase? If so, I would appreciate if someone could > elaborate on or point me to the reasoning behind that decision. No, we are just working out the API and the extensibility machinery in a separate package before committing to backwards compatibility. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] New Indexing Methods Revival #N (subclasses!)
On Tue, Sep 6, 2016 at 8:46 AM, Sebastian Berg <sebast...@sipsolutions.net> wrote: > > On Di, 2016-09-06 at 09:37 +0200, Sebastian Berg wrote: > > On Mo, 2016-09-05 at 18:31 -0400, Marten van Kerkwijk wrote: > > > > > > Actually, on those names: an alternative to your proposal would be > > > to > > > introduce only one new method which can do all types of indexing, > > > depending on a keyword argument, i.e., something like > > > ``` > > > def getitem(self, item, mode='outer'): > > > ... > > > ``` > > Have I been overthinking this, eh? Just making it `__getitem__(self, > > index, mode=...)` and then from `vindex` calling the subclasses > > `__getitem__(self, index, mode="vector")` or so would already solve > > the > > issue almost fully? Only thing I am not quite sure about: > > > > 1. Is `__getitem__` in some way special to make this difficult (also > > considering some new ideas like allowing object[a=4]? > > OK; I think the C-side slot cannot get the kwarg likely, but probably > you can find a solution for that Well, the solution is to use a different name, I think. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
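To make the shape of that suggestion concrete, a toy sketch (the API is entirely hypothetical; numpy's indexing takes no mode keyword, which is exactly the C-slot problem raised in the thread): a subclass funnels all indexing through one overridable method, and vindex-style helpers would differ only in the mode they pass.

```python
import numpy as np

class ModalArray(np.ndarray):
    """Toy subclass: all indexing funnels through one overridable
    method carrying a `mode` keyword."""

    def getitem(self, index, mode="outer"):
        # A real implementation would dispatch on `mode`; this toy
        # just records the mode that was requested.
        self.last_mode = mode
        return np.ndarray.__getitem__(self, index)

    def __getitem__(self, index):
        # Plain a[index] routes through the default mode; a vindex
        # helper would call self.getitem(index, mode="vector").
        return self.getitem(index, mode="outer")

a = np.arange(6).view(ModalArray)
b = a[2:4]  # records mode "outer" on a, returns a view
```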
Re: [Numpy-discussion] Reading in a mesh file
On Thu, Sep 1, 2016 at 3:49 PM, Florian Lindner <mailingli...@xgm.de> wrote:
> Hello,
>
> thanks for your reply which was really helpful!
>
> My problem is that I discovered that the data I got is rather unordered.
>
> The documentation for reshape says: Read the elements of a using this index order, and place the elements into the reshaped array using this index order. 'C' means to read / write the elements using C-like index order, with the last axis index changing fastest, back to the first axis index changing slowest. 'F' means to read / write the elements using Fortran-like index order, with the first index changing fastest, and the last index changing slowest.
>
> With my data both dimensions change, so there is no specific ordering of the points, just a bunch of arbitrarily mixed "x y z value" data.
>
> My idea is:
>
> out = np.loadtxt(...)
> x = np.unique(out[:,0])
> y = np.unique(out[:,1])
> xx, yy = np.meshgrid(x, y)
>
> values = lookup(xx, yy, out)
>
> lookup is a ufunc (I hope that term is correct here) that looks up the value of every x and y in out, like
> x_filtered = out[ out[:,0] == x, :]
> y_filtered = out[ out[:,1] == y, :]
> return y_filtered[2]
>
> (untested, just a sketch)
>
> Would this work? Any better way?

If the (x, y) values are actually drawn from a rectilinear grid, then you can use np.lexsort() to sort the rows before reshaping.

[~/scratch]
|4> !cat random-mesh.txt
0.3 0.3 21
0 0 10
0 0.3 11
0.3 0.6 22
0 0.6 12
0.6 0.3 31
0.3 0 20
0.6 0.6 32
0.6 0 30

[~/scratch]
|5> scrambled_nodes = np.loadtxt('random-mesh.txt')

# Note! Put the "faster" column before the "slower" column!
[~/scratch]
|6> i = np.lexsort([scrambled_nodes[:, 1], scrambled_nodes[:, 0]])

[~/scratch]
|7> sorted_nodes = scrambled_nodes[i]

[~/scratch]
|8> sorted_nodes
array([[ 0. ,  0. , 10. ],
       [ 0. ,  0.3, 11. ],
       [ 0. ,  0.6, 12. ],
       [ 0.3,  0. , 20. ],
       [ 0.3,  0.3, 21. ],
       [ 0.3,  0.6, 22. ],
       [ 0.6,  0. , 30. ],
       [ 0.6,  0.3, 31. ],
       [ 0.6,  0.6, 32. ]])

Then carry on with the reshape()ing as before. If the grid points that "ought to be the same" are not actually identical, then you may end up with some problems, e.g. if you had "0.3001 0.0 20.0" as a row, but all of the other "x=0.3" rows had "0.3", then that row would get sorted out of order. You would have to clean up the grid coordinates a bit first.

-- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Reading in a mesh file
On Wed, Aug 31, 2016 at 4:00 PM, Florian Lindner <mailingli...@xgm.de> wrote:
> Hello,
>
> I have a mesh (more exactly: just a bunch of nodes) description with values associated to the nodes in a file, e.g. for a 3x3 mesh:
>
> 0 0 10
> 0 0.3 11
> 0 0.6 12
> 0.3 0 20
> 0.3 0.3 21
> 0.3 0.6 22
> 0.6 0 30
> 0.6 0.3 31
> 0.6 0.6 32
>
> What is the best way to read it in and get data structures like the ones I get from np.meshgrid?
>
> Of course, I know about np.loadtxt, but I'm having trouble getting the resulting arrays (x, y, values) in the right form and to retain association to the values.

For this particular case (known shape and ordering), this is what I would do. Maybe throw in a .T or three depending on exactly how you want them to be laid out.

[~/scratch]
|1> !cat mesh.txt
0 0 10
0 0.3 11
0 0.6 12
0.3 0 20
0.3 0.3 21
0.3 0.6 22
0.6 0 30
0.6 0.3 31
0.6 0.6 32

[~/scratch]
|2> nodes = np.loadtxt('mesh.txt')

[~/scratch]
|3> nodes
array([[ 0. ,  0. , 10. ],
       [ 0. ,  0.3, 11. ],
       [ 0. ,  0.6, 12. ],
       [ 0.3,  0. , 20. ],
       [ 0.3,  0.3, 21. ],
       [ 0.3,  0.6, 22. ],
       [ 0.6,  0. , 30. ],
       [ 0.6,  0.3, 31. ],
       [ 0.6,  0.6, 32. ]])

[~/scratch]
|4> reshaped = nodes.reshape((3, 3, -1))

[~/scratch]
|5> reshaped
array([[[ 0. ,  0. , 10. ],
        [ 0. ,  0.3, 11. ],
        [ 0. ,  0.6, 12. ]],

       [[ 0.3,  0. , 20. ],
        [ 0.3,  0.3, 21. ],
        [ 0.3,  0.6, 22. ]],

       [[ 0.6,  0. , 30. ],
        [ 0.6,  0.3, 31. ],
        [ 0.6,  0.6, 32. ]]])

[~/scratch]
|7> x = reshaped[..., 0]

[~/scratch]
|8> y = reshaped[..., 1]

[~/scratch]
|9> values = reshaped[..., 2]

[~/scratch]
|10> x
array([[ 0. ,  0. ,  0. ],
       [ 0.3,  0.3,  0.3],
       [ 0.6,  0.6,  0.6]])

[~/scratch]
|11> y
array([[ 0. ,  0.3,  0.6],
       [ 0. ,  0.3,  0.6],
       [ 0. ,  0.3,  0.6]])

[~/scratch]
|12> values
array([[ 10.,  11.,  12.],
       [ 20.,  21.,  22.],
       [ 30.,  31.,  32.]])

-- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Include last element when subindexing numpy arrays?
On Wed, Aug 31, 2016 at 1:34 PM, Matti Viljamaa <mvilja...@kapsi.fi> wrote: > > On 31 Aug 2016, at 15:22, Robert Kern <robert.k...@gmail.com> wrote: > > On Wed, Aug 31, 2016 at 12:28 PM, Matti Viljamaa <mvilja...@kapsi.fi> wrote: > > > > Is there a clean way to include the last element when subindexing numpy arrays? > > Since the default behaviour of numpy arrays is to omit the “stop index”. > > > > So for, > > > > >>> A > > array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) > > >>> A[0:5] > > array([0, 1, 2, 3, 4]) > > A[5:] > > -- > Robert Kern > > No that returns the subarray starting from index 5 to the end. > > What I want to be able to return > > array([0, 1, 2, 3, 4, 5]) > > (i.e. last element 5 included) > > but without the funky A[0:6] syntax, which looks like it should return > > array([0, 1, 2, 3, 4, 5, 6]) > > but since bumpy arrays omit the last index, returns > > array([0, 1, 2, 3, 4, 5]) > > which syntactically would be more reasonable to be A[0:5]. Ah, I see what you are asking now. The answer is "no"; this is just the way that slicing works in Python in general. numpy merely follows suit. It is something that you will get used to with practice. My sense of "funkiness" and "reasonableness" is the opposite of yours, for instance. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
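A small illustration of why the half-open convention composes well (plain Python lists behave identically to numpy arrays here):

```python
A = list(range(10))

# Adjacent half-open slices tile the sequence with no overlap and no gap:
assert A[:5] + A[5:] == A
# The slice length is simply stop - start:
assert len(A[2:7]) == 7 - 2
# So including the element at index 5 means using stop index 6:
assert A[0:6] == [0, 1, 2, 3, 4, 5]
```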
Re: [Numpy-discussion] State-of-the-art to use a C/C++ library from Python
On Wed, Aug 31, 2016 at 12:28 PM, Michael Bieri <mibi...@gmail.com> wrote: > > Hi all > > There are several ways on how to use C/C++ code from Python with NumPy, as given in http://docs.scipy.org/doc/numpy/user/c-info.html . Furthermore, there's at least pybind11. > > I'm not quite sure which approach is state-of-the-art as of 2016. How would you do it if you had to make a C/C++ library available in Python right now? > > In my case, I have a C library with some scientific functions on matrices and vectors. You will typically call a few functions to configure the computation, then hand over some pointers to existing buffers containing vector data, then start the computation, and finally read back the data. The library also can use MPI to parallelize. I usually reach for Cython: http://cython.org/ http://docs.cython.org/en/latest/src/userguide/memoryviews.html -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Why np.fft.rfftfreq only returns up to Nyqvist?
On Wed, Aug 31, 2016 at 1:14 PM, Matti Viljamaa <mvilja...@kapsi.fi> wrote: > > What’s the reasonability of np.fft.rfftfreq returning frequencies only up to Nyquist, rather than for the full sample rate? The answer to the question that you asked is that np.fft.rfft() only computes values for frequencies up to Nyquist, so np.fft.rfftfreq() must give you the frequencies to match. For a real-valued input, the coefficients above Nyquist are just the complex conjugates of those below it, so they carry no extra information. I'm not sure if there is another misunderstanding lurking that needs to be clarified. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
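To make that concrete (a small sketch): rfft of a real length-n signal returns n//2 + 1 bins, DC through Nyquist, and rfftfreq is sized to match, while fftfreq covers the full circle.

```python
import numpy as np

n, d = 8, 1.0  # 8 samples, unit sample spacing

# rfft of a real length-n signal returns n//2 + 1 bins: DC through Nyquist.
x = np.ones(n)
assert len(np.fft.rfft(x)) == n // 2 + 1

# rfftfreq matches that, bin for bin; fftfreq covers the full sample rate.
half = np.fft.rfftfreq(n, d)
full = np.fft.fftfreq(n, d)
assert len(half) == n // 2 + 1
assert len(full) == n
assert half[-1] == 0.5 / d  # the Nyquist frequency
```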
Re: [Numpy-discussion] Include last element when subindexing numpy arrays?
On Wed, Aug 31, 2016 at 12:28 PM, Matti Viljamaa <mvilja...@kapsi.fi> wrote: > > Is there a clean way to include the last element when subindexing numpy arrays? > Since the default behaviour of numpy arrays is to omit the “stop index”. > > So for, > > >>> A > array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) > >>> A[0:5] > array([0, 1, 2, 3, 4]) A[5:] -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] coordinate bounds
On Sat, Aug 20, 2016 at 9:16 PM, Alan Isaac <alan.is...@gmail.com> wrote: > > Is there a numpy equivalent to Mma's CoordinateBounds command? > http://reference.wolfram.com/language/ref/CoordinateBounds.html The first signature can be computed like so: np.transpose([coords.min(axis=0), coords.max(axis=0)]) -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
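Spelled out with a hypothetical point set, the one-liner gives one [min, max] pair per coordinate axis, like Mma's CoordinateBounds:

```python
import numpy as np

# A hypothetical set of 2-D points.
coords = np.array([[0.0, 5.0],
                   [1.5, 2.0],
                   [-1.0, 7.0]])

# Row i of the result is [min, max] along coordinate axis i.
bounds = np.transpose([coords.min(axis=0), coords.max(axis=0)])
assert np.allclose(bounds, [[-1.0, 1.5], [2.0, 7.0]])
```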
Re: [Numpy-discussion] Numpy set_printoptions, silent failure, bug?
On Tue, Jul 19, 2016 at 10:41 PM, John Ladasky <jlada...@itu.edu> wrote: > Should this be considered a Numpy bug, or is there some reason that set_printoptions would legitimately need to accept a dictionary as a single argument? There is no such reason. One could certainly add more validation to the arguments to np.set_printoptions(). -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
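A hypothetical defensive wrapper (not numpy's actual implementation) showing what such validation might look like:

```python
import numpy as np

def set_printoptions_checked(precision=None, **kwargs):
    # Reject the easy mistake of passing an options dict positionally
    # where an integer precision was intended.
    if precision is not None and not isinstance(precision, int):
        raise TypeError("precision must be an int, got %r" % (precision,))
    np.set_printoptions(precision=precision, **kwargs)

# set_printoptions_checked({'precision': 3}) would raise TypeError;
# the intended spelling works as usual:
set_printoptions_checked(precision=3)
```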
Re: [Numpy-discussion] Design feedback solicitation
On Fri, Jul 15, 2016 at 2:53 AM, Pavlyk, Oleksandr < oleksandr.pav...@intel.com> wrote: > > Hi Robert, > > Thank you for the pointers. > > I think numpy.random should have a mechanism to choose between methods for generating the underlying randomness dynamically, at a run-time, as well as an extensible framework, where developers could add more methods. The default would be MT19937 for backwards compatibility. It is important to be able to do this at a run-time, as it would allow one to use different algorithms in different threads (like different members of the parallel Mersenne twister family of generators, see MT2203). > > The framework should allow to define randomness as a bit stream, a stream of fixed size integers, or a stream of uniform reals (32 or 64 bits). This is a lot of like MKL’s abstract method for basic pseudo-random number generation. > > Each method should provide routines to sample from uniform distributions over reals (in floats and doubles), as well as over integers. > > All remaining non-uniform distributions build on top of these uniform streams. ng-numpy-randomstate does all of these. > I think it is pretty important to refactor numpy.random to allow the underlying generators to produce a given number of independent variates at a time. There could be convenience wrapper functions to allow to get one variate for backwards compatibility, but this change in design would allow for better efficiency, as sampling a vector of random variates at once is often faster than repeated sampling of one at a time due to set-up cost, vectorization, etc. The underlying C implementation is an implementation detail, so the refactoring that you suggest has no backwards compatibility constraints. > Finally, methods to sample particular distribution should uniformly support method keyword argument. Because method names vary from distribution to distribution, it should ideally be programmatically discoverable which methods are supported for a given distribution. 
For instance, the standard normal distribution could support method=’Inversion’, method=’Box-Muller’, method=’Ziggurat’, method=’Box-Muller-Marsaglia’ (the one used in numpy.random right now), as well as bunch of non-named methods based on transformed rejection method (see http://statistik.wu-wien.ac.at/anuran/ ) That is one of the items under discussion. I personally prefer that one simply exposes named methods for each different scheme (e.g. ziggurat_normal(), etc.). > It would also be good if one could dynamically register a new method to sample from a non-uniform distribution. This would allow, for instance, to automatically add methods to sample certain non-uniform distribution by directly calling into MKL (or other library), when available, instead of building them from uniforms (which may remain a fall-through method). > > The linked project is a good start, but the choice of the underlying algorithm needs to be made at a run-time, That's what happens. You instantiate the RandomState class that you want. > as far as I understood, and the only provided interface to query random variates is one at a time, just like it is currently the case > in numpy.random. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Design feedback solicitation
On Fri, Jun 17, 2016 at 4:08 PM, Pavlyk, Oleksandr < oleksandr.pav...@intel.com> wrote: > > Hi, > > I am new to this list, so I will start with an introduction. My name is Oleksandr Pavlyk. I now work at Intel Corp. on the Intel Distribution for Python, and previously worked at Wolfram Research for 12 years. My latest project was to write a mirror to numpy.random, named numpy.random_intel. The module uses MKL to sample from different distributions for efficiency. It provides support for different underlying algorithms for basic pseudo-random number generation, i.e. in addition to MT19937, it also provides SFMT19937, MT2203, etc. > > I recently published a blog about it: > > https://software.intel.com/en-us/blogs/2016/06/15/faster-random-number-generation-in-intel-distribution-for-python > > I originally attempted to simply replace numpy.random in the Intel Distribution for Python with the new module, but due to fixed seed backwards incompatibility this results in numerous test failures in numpy, scipy, pandas and other modules. > > Unlike numpy.random, the new module generates a vector of random numbers at a time, which can be done faster than repeatedly generating the same number of variates one at a time. > > The source code for the new module is not upstreamed yet, and this email is meant to solicit early community feedback to allow for faster acceptance of the proposed changes. Cool! You can find pertinent discussion here: https://github.com/numpy/numpy/issues/6967 And the current effort for adding new core PRNGs here: https://github.com/bashtage/ng-numpy-randomstate -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Indexing with floats
On Fri, Jun 10, 2016 at 12:15 PM, Fabien <fabien.mauss...@gmail.com> wrote: > > Hi, > > I really tried to do my homework before asking this here, but I just couldn't find the relevant information anywhere... > > My question is about the rationale behind forbidding indexing with floats, i.e.: > > >>> x[2.] > __main__:1: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future > > I don't find this very handy from a user's perspective, and I'd be grateful for pointers on discussion threads and/or PRs where this has been discussed, so that I can understand why it's important. https://mail.scipy.org/pipermail/numpy-discussion/2012-December/064705.html https://github.com/numpy/numpy/issues/2810 https://github.com/numpy/numpy/pull/2891 https://github.com/numpy/numpy/pull/3243 https://mail.scipy.org/pipermail/numpy-discussion/2015-July/073125.html Note that the future is coming in the next numpy release: https://github.com/numpy/numpy/pull/6271 -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
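On the user side, the fix is an explicit conversion, which makes the intended rounding rule visible instead of relying on silent truncation (a small sketch):

```python
import numpy as np

x = np.arange(10) * 10  # [0, 10, 20, ..., 90]
i = 2.0                 # e.g. the result of a floating-point computation

# Scalar index: convert explicitly instead of writing x[i].
assert x[int(i)] == 20

# Index arrays: cast to an integer dtype first.
idx = np.array([1.0, 3.0, 5.0])
assert np.array_equal(x[idx.astype(np.intp)], [10, 30, 50])
```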
Re: [Numpy-discussion] Fwd: ifft padding
Allen, Probably it needs to work in n-dimensions, like the existing np.fft.fftshift function does, with an optional axis=tuple parameter. I recall that fftshift is just an array indexing trick? It would be helpful to see what's faster, two fftshifts and an edge padding or your inter-padding. Probably it's faster to make a new zeros array of the appropriate padded size and do 2*ndim copies? Robert On Wed, May 25, 2016 at 9:35 PM, Allen Welkie <allen.wel...@gmail.com> wrote: > I'd like to get some feedback on my [pull request]( > https://github.com/numpy/numpy/pull/7593). > > This pull request adds a function `ifftpad` which pads a spectrum by > inserting zeros where the highest frequencies would be. This is necessary > because the padding that `ifft` does simply inserts zeros at the end of the > array. But because of the way the spectrum is laid out, this changes which > bins represent which frequencies and in general messes up the result of > `ifft`. If you pad with the proposed `ifftpad` function, the zeros will be > inserted in the middle of the spectrum and the time signal that results > from `ifft` will be an interpolated version of the unpadded time signal. > See the discussion in [issue #1346]( > https://github.com/numpy/numpy/issues/1346). > > The following is a script to demonstrate what I mean: > > ``` > import numpy > from numpy import concatenate, zeros > from matplotlib import pyplot > > def correct_padding(a, n, scale=True): > """ A copy of the proposed `ifftpad` function. 
""" > spectrum = concatenate((a[:len(a) // 2], > zeros(n - len(a)), > a[len(a) // 2:])) > if scale: > spectrum *= (n / len(a)) > return spectrum > > def plot_real(signal, label): > time = numpy.linspace(0, 1, len(signal) + 1)[:-1] > pyplot.plot(time, signal.real, label=label) > > def main(): > spectrum = numpy.zeros(10, dtype=complex) > spectrum[-1] = 1 + 1j > > signal = numpy.fft.ifft(spectrum) > signal_bad_padding = numpy.fft.ifft(10 * spectrum, 100) > signal_good_padding = numpy.fft.ifft(correct_padding(spectrum, 100)) > > plot_real(signal, 'No padding') > plot_real(signal_bad_padding, 'Bad padding') > plot_real(signal_good_padding, 'Good padding') > > pyplot.legend() > pyplot.show() > > > if __name__ == '__main__': > main() > ``` > > > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Robert McLeod, Ph.D. Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universität Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcl...@unibas.ch robert.mcl...@bsse.ethz.ch <robert.mcl...@ethz.ch> robbmcl...@gmail.com ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Proposal: numpy.random.random_seed
On Mon, May 23, 2016 at 5:41 PM, Chris Barker <chris.bar...@noaa.gov> wrote: > > On Sun, May 22, 2016 at 2:35 AM, Robert Kern <robert.k...@gmail.com> wrote: >> >> Well, I mean, engineers want lots of things. I suspect that most engineers *really* just want to call `numpy.random.seed(8675309)` at the start and never explicitly pass around separate streams. There's an upside to that in terms of code simplicity. There are also significant limitations and constraints. Ultimately, the upside against the alternative of passing around RandomState objects is usually overweighed by the limitations, so best practice is to pass around RandomState objects. > > Could we do something like the logging module, and have numpy.random "manage" a bunch of stream objects for you -- so you could get the default single stream easily, and also get access to specific streams without needing to pass around the objects? No, I don't think so. The logging module's namespacing doesn't really have an equivalent use case for PRNGs. We would just be making a much more complicated global state to manage. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Proposal: numpy.random.random_seed
On Wed, May 18, 2016 at 7:56 PM, Nathaniel Smith <n...@pobox.com> wrote: > > On Wed, May 18, 2016 at 5:07 AM, Robert Kern <robert.k...@gmail.com> wrote: > > On Wed, May 18, 2016 at 1:14 AM, Nathaniel Smith <n...@pobox.com> wrote: > >> ...anyway, the real reason I'm a bit grumpy is because there are solid > >> engineering reasons why users *want* this API, > > > > I remain unconvinced on this mark. Grumpily. > > Sorry for getting grumpy :-). And my apologies for some unwarranted hyperbole. I think we're both converging on a reasonable approach, though. > The engineering reasons seem pretty > obvious to me though? Well, I mean, engineers want lots of things. I suspect that most engineers *really* just want to call `numpy.random.seed(8675309)` at the start and never explicitly pass around separate streams. There's an upside to that in terms of code simplicity. There are also significant limitations and constraints. Ultimately, the upside against the alternative of passing around RandomState objects is usually overweighed by the limitations, so best practice is to pass around RandomState objects. I acknowledge that there exists an upside to the splitting API, but I don't think it's a groundbreaking improvement over the alternative current best practice. It's also unclear to me how often situations that really demonstrate the upside come into play; in my experience a lot of these situations are already structured such that preallocating N streams is the natural thing to do. The limitations and constraints are currently underexplored, IMO; and in this conservative field, pessimism is warranted. > If you have any use case for independent streams > at all, and you're writing code that's intended to live inside a > library's abstraction barrier, then you need some way to choose your > streams to avoid colliding with arbitrary other code that the end-user > might assemble alongside yours as part of their final program. 
So > AFAICT you have two options: either you need a "tree-style" API for > allocating these streams, or else you need to add some explicit API to > your library that lets the end-user control in detail which streams > you use. Both are possible, but the latter is obviously undesireable > if you can avoid it, since it breaks the abstraction barrier, making > your library more complicated to use and harder to evolve. ACK > >> so whether or not it > >> turns out to be possible I think we should at least be allowed to have > >> a discussion about whether there's some way to give it to them. > > > > I'm not shutting down discussion of the option. I *implemented* the option. > > I think that discussing whether it should be part of the main API is > > premature. There probably ought to be a paper or three out there supporting > > its safety and utility first. Let the utility function version flourish > > first. > > OK -- I guess this particularly makes sense given how > extra-tightly-constrained we currently are in fixing mistakes in > np.random. But I feel like in the end the right place for this really > is inside the RandomState interface, because the person implementing > RandomState is the one best placed to understand (a) the gnarly > technical details here, and (b) how those change depending on the > particular PRNG in use. I don't want to end up with a bunch of > subtly-buggy utility functions in non-specialist libraries like dask > -- so we should be trying to help downstream users figure out how to > actually get this into np.random? I think this is an open research area. An enterprising grad student could milk this for a couple of papers analyzing how to do this safely for a variety of PRNGs. I don't think we can hash this out in an email thread or PR. So yeah, eventually there might be an API on RandomState for this, but it's way too premature to do so right now, IMO. Maybe start with a specialized subclass of RandomState that adds this experimental API. 
In ng-numpy-randomstate. ;-) But if someone has spare time to work on numpy.random, for God's sake, use it to review @gfyoung's PRs instead. > >> It's > >> not even 100% out of the question that we conclude that existing PRNGs > >> are buggy because they don't take this use case into account -- it > >> would be far from the first time that numpy found itself going beyond > >> the limits of older numerical tools that weren't designed to build the > >> kind of large composable systems that numpy gets used for. > >> > >> MT19937's state space is large enough that you could explicitly encode > >> a "tree seed" into it, even if you don't trust the laws of probability > >> -- e.g., you start with a RandomState with id [], then i
Re: [Numpy-discussion] Proposal: numpy.random.random_seed
On Wed, May 18, 2016 at 6:20 PM, <josef.p...@gmail.com> wrote: > > On Wed, May 18, 2016 at 12:01 PM, Robert Kern <robert.k...@gmail.com> wrote: >> >> On Wed, May 18, 2016 at 4:50 PM, Chris Barker <chris.bar...@noaa.gov> wrote: >> >> >> >> > ...anyway, the real reason I'm a bit grumpy is because there are solid >> >> > engineering reasons why users *want* this API, >> > >> > Honestly, I am lost in the math -- but like any good engineer, I want to accomplish something anyway :-) I trust you guys to get this right -- or at least document what's "wrong" with it. >> > >> > But, if I'm reading the use case that started all this correctly, it closely matches my use-case. That is, I have a complex model with multiple independent "random" processes. And we want to be able to re-produce EXACTLY simulations -- our users get confused when the results are "different" even if in a statistically insignificant way. >> > >> > At the moment we are using one RNG, with one seed for everything. So we get reproducible results, but if one thing is changed, then the entire simulation is different -- which is OK, but it would be nicer to have each process using its own RNG stream with it's own seed. However, it matters not one whit if those seeds are independent -- the processes are different, you'd never notice if they were using the same PRN stream -- because they are used differently. So a "fairly low probability of a clash" would be totally fine. >> >> Well, the main question is: do you need to be able to spawn dependent streams at arbitrary points to an arbitrary depth without coordination between processes? The necessity for multiple independent streams per se is not contentious. > > I'm similar to Chris, and didn't try to figure out the details of what you are talking about. > > However, if there are functions getting into numpy that help in using a best practice even if it's not bullet proof, then it's still better than home made approaches. 
> If it get's in soon, then we can use it in a few years (given dependency lag). At that point there should be more distributed, nested simulation based algorithms where we don't know in advance how far we have to go to get reliable numbers or convergence. > > (But I don't see anything like that right now.) Current best practice is to use PRNGs with settable streams (or fixed jumpahead for those PRNGs cursed to not have settable streams but blessed to have super-long periods). The way to get those into numpy is to help Kevin Sheppard finish: https://github.com/bashtage/ng-numpy-randomstate He's done nearly all of the hard work already. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Proposal: numpy.random.random_seed
On Wed, May 18, 2016 at 4:50 PM, Chris Barker <chris.bar...@noaa.gov> wrote: >> >> > ...anyway, the real reason I'm a bit grumpy is because there are solid >> > engineering reasons why users *want* this API, > > Honestly, I am lost in the math -- but like any good engineer, I want to accomplish something anyway :-) I trust you guys to get this right -- or at least document what's "wrong" with it. > > But, if I'm reading the use case that started all this correctly, it closely matches my use-case. That is, I have a complex model with multiple independent "random" processes. And we want to be able to re-produce EXACTLY simulations -- our users get confused when the results are "different" even if in a statistically insignificant way. > > At the moment we are using one RNG, with one seed for everything. So we get reproducible results, but if one thing is changed, then the entire simulation is different -- which is OK, but it would be nicer to have each process using its own RNG stream with it's own seed. However, it matters not one whit if those seeds are independent -- the processes are different, you'd never notice if they were using the same PRN stream -- because they are used differently. So a "fairly low probability of a clash" would be totally fine. Well, the main question is: do you need to be able to spawn dependent streams at arbitrary points to an arbitrary depth without coordination between processes? The necessity for multiple independent streams per se is not contentious. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
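The current best-practice version of that setup — one explicitly seeded RandomState per "random" process — might look like the following sketch (the component names and distributions are hypothetical):

```python
import numpy as np

# One stream per independent stochastic component, each with its own seed.
def simulate_wind(rs, n):
    return rs.normal(5.0, 1.0, size=n)     # hypothetical wind-speed draws

def simulate_current(rs, n):
    return rs.uniform(0.0, 2.0, size=n)    # hypothetical current draws

wind = simulate_wind(np.random.RandomState(12345), 3)
current = simulate_current(np.random.RandomState(67890), 3)

# Changing one component leaves the other's stream untouched, and each
# component is reproducible from its own seed:
assert np.allclose(simulate_wind(np.random.RandomState(12345), 3), wind)
```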
Re: [Numpy-discussion] Proposal: numpy.random.random_seed
On Wed, May 18, 2016 at 1:14 AM, Nathaniel Smith <n...@pobox.com> wrote: > > On Tue, May 17, 2016 at 10:41 AM, Robert Kern <robert.k...@gmail.com> wrote: > > On Tue, May 17, 2016 at 6:24 PM, Nathaniel Smith <n...@pobox.com> wrote: > >> > >> On May 17, 2016 1:50 AM, "Robert Kern" <robert.k...@gmail.com> wrote: > >> > > >> [...] > >> > What you want is a function that returns many RandomState objects that > >> > are hopefully spread around the MT19937 space enough that they are > >> > essentially independent (in the absence of true jumpahead). The better > >> > implementation of such a function would look something like this: > >> > > >> > def spread_out_prngs(n, root_prng=None): > >> > if root_prng is None: > >> > root_prng = np.random > >> > elif not isinstance(root_prng, np.random.RandomState): > >> > root_prng = np.random.RandomState(root_prng) > >> > sprouted_prngs = [] > >> > for i in range(n): > >> > seed_array = root_prng.randint(1<<32, size=624) # > >> > dtype=np.uint32 under 1.11 > >> > sprouted_prngs.append(np.random.RandomState(seed_array)) > >> > return spourted_prngs > >> > >> Maybe a nice way to encapsulate this in the RandomState interface would be > >> a method RandomState.random_state() that generates and returns a new child > >> RandomState. > > > > I disagree. This is a workaround in the absence of proper jumpahead or > > guaranteed-independent streams. I would not encourage it. > > > >> > Internally, this generates seed arrays of about the size of the MT19937 > >> > state so make sure that you can access more of the state space. That will at > >> > least make the chance of collision tiny. And it can be easily rewritten to > >> > take advantage of one of the newer PRNGs that have true independent streams: > >> > > >> > https://github.com/bashtage/ng-numpy-randomstate > >> > >> ... 
But unfortunately I'm not sure how to make my interface suggestion > >> above work on top of one of these RNGs, because for RandomState.random_state > >> you really want a tree of independent RNGs and the fancy new PRNGs only > >> provide a single flat namespace :-/. And even more annoyingly, the tree API > >> is actually a nicer API, because with a flat namespace you have to know up > >> front about all possible RNGs your code will use, which is an unfortunate > >> global coupling that makes it difficult to compose programs out of > >> independent pieces, while the RandomState.random_state approach composes > >> beautifully. Maybe there's some clever way to allocate a 64-bit namespace to > >> make it look tree-like? I'm not sure 64 bits is really enough... > > > > MT19937 doesn't have a "tree" any more than the others. It's the same flat > > state space. You are just getting the illusion of a tree by hoping that you > > never collide. You ought to think about precisely the same global coupling > > issues with MT19937 as you do with guaranteed-independent streams. > > Hope-and-prayer isn't really a substitute for properly engineering your > > problem. It's just a moral hazard to promote this method to the main API. > > Nonsense. > > If your definition of "hope and prayer" includes assuming that we > won't encounter a random collision in a 2**19937 state space, then > literally all engineering is hope-and-prayer. A collision could > happen, but if it does it's overwhelmingly more likely to happen > because of a flaw in the mathematical analysis, or a bug in the > implementation, or because random quantum fluctuations caused you and > your program to suddenly be transported to a parallel world where 1 + > 1 = 1, than that you just got unlucky with your random state. And all > of these hazards apply equally to both MT19937 and more modern PRNGs. Granted. 
> ...anyway, the real reason I'm a bit grumpy is because there are solid > engineering reasons why users *want* this API, I remain unconvinced on this mark. Grumpily. > so whether or not it > turns out to be possible I think we should at least be allowed to have > a discussion about whether there's some way to give it to them. I'm not shutting down discussion of the option. I *implemented* the option. I think that discussing whether it should be part of the main API is premature. There probably ought to be a paper or three out there supporting its safety and utility first. Let the utility function version flourish first.
Re: [Numpy-discussion] Proposal: numpy.random.random_seed
On Tue, May 17, 2016 at 6:24 PM, Nathaniel Smith <n...@pobox.com> wrote: > > On May 17, 2016 1:50 AM, "Robert Kern" <robert.k...@gmail.com> wrote: > > > [...] > > What you want is a function that returns many RandomState objects that are hopefully spread around the MT19937 space enough that they are essentially independent (in the absence of true jumpahead). The better implementation of such a function would look something like this: > > > > def spread_out_prngs(n, root_prng=None): > > if root_prng is None: > > root_prng = np.random > > elif not isinstance(root_prng, np.random.RandomState): > > root_prng = np.random.RandomState(root_prng) > > sprouted_prngs = [] > > for i in range(n): > > seed_array = root_prng.randint(1<<32, size=624) # dtype=np.uint32 under 1.11 > > sprouted_prngs.append(np.random.RandomState(seed_array)) > > return spourted_prngs > > Maybe a nice way to encapsulate this in the RandomState interface would be a method RandomState.random_state() that generates and returns a new child RandomState. I disagree. This is a workaround in the absence of proper jumpahead or guaranteed-independent streams. I would not encourage it. > > Internally, this generates seed arrays of about the size of the MT19937 state so make sure that you can access more of the state space. That will at least make the chance of collision tiny. And it can be easily rewritten to take advantage of one of the newer PRNGs that have true independent streams: > > > > https://github.com/bashtage/ng-numpy-randomstate > > ... But unfortunately I'm not sure how to make my interface suggestion above work on top of one of these RNGs, because for RandomState.random_state you really want a tree of independent RNGs and the fancy new PRNGs only provide a single flat namespace :-/. 
And even more annoyingly, the tree API is actually a nicer API, because with a flat namespace you have to know up front about all possible RNGs your code will use, which is an unfortunate global coupling that makes it difficult to compose programs out of independent pieces, while the RandomState.random_state approach composes beautifully. Maybe there's some clever way to allocate a 64-bit namespace to make it look tree-like? I'm not sure 64 bits is really enough... MT19937 doesn't have a "tree" any more than the others. It's the same flat state space. You are just getting the illusion of a tree by hoping that you never collide. You ought to think about precisely the same global coupling issues with MT19937 as you do with guaranteed-independent streams. Hope-and-prayer isn't really a substitute for properly engineering your problem. It's just a moral hazard to promote this method to the main API. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Proposal: numpy.random.random_seed
On Tue, May 17, 2016 at 2:40 PM, Sturla Molden <sturla.mol...@gmail.com> wrote: > > Stephan Hoyer <sho...@gmail.com> wrote: > > I have recently encountered several use cases for randomly generate random > > number seeds: > > > > 1. When writing a library of stochastic functions that take a seed as an > > input argument, and some of these functions call multiple other such > > stochastic functions. Dask is one such example [1]. > > > > 2. When a library needs to produce results that are reproducible after > > calling numpy.random.seed, but that do not want to use the functions in > > numpy.random directly. This came up recently in a pandas pull request [2], > > because we want to allow using RandomState objects as an alternative to > > global state in numpy.random. A major advantage of this approach is that it > > provides an obvious alternative to reusing the private numpy.random._mtrand > > [3]. > > What about making numpy.random a finite state machine, and keeping a stack > of RandomState seeds? That is, something similar to what OpenGL does for > its matrices? Then we get two functions, numpy.random.push_seed and > numpy.random.pop_seed. I don't think that addresses the issues brought up here. It's just more global state to worry about. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Proposal: numpy.random.random_seed
On Tue, May 17, 2016 at 9:09 AM, Stephan Hoyer <sho...@gmail.com> wrote: > > On Tue, May 17, 2016 at 12:18 AM, Robert Kern <robert.k...@gmail.com> wrote: >> >> On Tue, May 17, 2016 at 4:54 AM, Stephan Hoyer <sho...@gmail.com> wrote: >> > 1. When writing a library of stochastic functions that take a seed as an input argument, and some of these functions call multiple other such stochastic functions. Dask is one such example [1]. >> >> Can you clarify the use case here? I don't really know what you are doing here, but I'm pretty sure this is not the right approach. > > Here's a contrived example. Suppose I've written a simulator for cars that consists of a number of loosely connected components (e.g., an engine, brakes, etc.). The behavior of each component of our simulator is stochastic, but we want everything to be fully reproducible, so we need to use seeds or RandomState objects. > > We might write our simulate_car function like the following:
>
> def simulate_car(engine_config, brakes_config, seed=None):
>     rs = np.random.RandomState(seed)
>     engine = simulate_engine(engine_config, seed=rs.random_seed())
>     brakes = simulate_brakes(brakes_config, seed=rs.random_seed())
>     ...
>
> The problem with passing the same RandomState object (either explicitly or dropping the seed argument entirely and using the global state) to both simulate_engine and simulate_brakes is that it breaks encapsulation -- if I change what I do inside simulate_engine, it also affects the brakes.

That's a little too contrived, IMO. In most such simulations, the different components interact with each other in the normal course of the simulation; that's why they are both joined together in the same simulation instead of being two separate runs. Unless the components are being run across a process or thread boundary (a la dask below), where true nondeterminism comes into play, I don't think you want these semi-independent streams.
This seems to be the advice du jour from the agent-based modeling community. > The dask use case is actually pretty different -- the intent is to create many random numbers in parallel using multiple threads or processes (possibly in a distributed fashion). I know that skipping ahead is the standard way to get independent number streams for parallel sampling, but that isn't exposed in numpy.random, and setting distinct seeds seems like a reasonable alternative for scientific computing use cases. Forget about integer seeds. Those are for human convenience. If you're not jotting them down in your lab notebook in pen, you don't want an integer seed. What you want is a function that returns many RandomState objects that are hopefully spread around the MT19937 space enough that they are essentially independent (in the absence of true jumpahead). The better implementation of such a function would look something like this:

def spread_out_prngs(n, root_prng=None):
    if root_prng is None:
        root_prng = np.random
    elif not isinstance(root_prng, np.random.RandomState):
        root_prng = np.random.RandomState(root_prng)
    sprouted_prngs = []
    for i in range(n):
        seed_array = root_prng.randint(1 << 32, size=624)  # dtype=np.uint32 under 1.11
        sprouted_prngs.append(np.random.RandomState(seed_array))
    return sprouted_prngs

Internally, this generates seed arrays of about the size of the MT19937 state to make sure that you can access more of the state space. That will at least make the chance of collision tiny. And it can be easily rewritten to take advantage of one of the newer PRNGs that have true independent streams: https://github.com/bashtage/ng-numpy-randomstate -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
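[Editor's note: a self-contained, runnable version of the helper sketched in the email above. It is an illustrative convenience function from this thread, not a NumPy API; the explicit dtype keyword is added here so the seed words stay in the uint32 range on all platforms.]

```python
import numpy as np

def spread_out_prngs(n, root_prng=None):
    # Sprout n RandomStates whose 624-word seed arrays are drawn from a
    # root PRNG, spreading them around the MT19937 state space.
    if root_prng is None:
        root_prng = np.random
    elif not isinstance(root_prng, np.random.RandomState):
        root_prng = np.random.RandomState(root_prng)
    sprouted_prngs = []
    for _ in range(n):
        # One full-sized seed array per child; uint32 keeps every word valid.
        seed_array = root_prng.randint(1 << 32, size=624, dtype=np.uint32)
        sprouted_prngs.append(np.random.RandomState(seed_array))
    return sprouted_prngs

streams = spread_out_prngs(3, root_prng=12345)
draws = [rs.uniform() for rs in streams]  # three separately seeded streams
```

Because the root PRNG is seeded, the whole family of child streams is reproducible from the single root seed, which is the property the thread is after.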
Re: [Numpy-discussion] Proposal: numpy.random.random_seed
On Tue, May 17, 2016 at 4:54 AM, Stephan Hoyer <sho...@gmail.com> wrote: > > I have recently encountered several use cases for randomly generating random number seeds: > > 1. When writing a library of stochastic functions that take a seed as an input argument, and some of these functions call multiple other such stochastic functions. Dask is one such example [1]. Can you clarify the use case here? I don't really know what you are doing here, but I'm pretty sure this is not the right approach. > 2. When a library needs to produce results that are reproducible after calling numpy.random.seed, but that do not want to use the functions in numpy.random directly. This came up recently in a pandas pull request [2], because we want to allow using RandomState objects as an alternative to global state in numpy.random. A major advantage of this approach is that it provides an obvious alternative to reusing the private numpy.random._mtrand [3]. It's only pseudo-private. This is an authorized use of it. However, for this case, I usually just pass around the numpy.random module itself and let duck-typing take care of the rest. > [3] On a side note, if there's no longer a good reason to keep this object private, perhaps we should expose it in our public API. It would certainly be useful -- scikit-learn is already using it (see links in the pandas PR above). Adding a public get_global_random_state() function might be in order. Originally, I wanted there to be *some* barrier to entry, but just grabbing it to use as a default RandomState object is definitely an intended use of it. It's not going to disappear. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] ANN: SfePy 2016.2
I am pleased to announce release 2016.2 of SfePy. Description --- SfePy (simple finite elements in Python) is software for solving systems of coupled partial differential equations by the finite element method or by isogeometric analysis (preliminary support). It is distributed under the new BSD license. Home page: http://sfepy.org Mailing list: http://groups.google.com/group/sfepy-devel Git (source) repository, issue tracker, wiki: http://github.com/sfepy Highlights of this release -- - partial shell10x element implementation - parallel computation of homogenized coefficients - clean up of elastic terms - read support for msh file mesh format of gmsh For full release notes see http://docs.sfepy.org/doc/release_notes.html#id1 (rather long and technical). Best regards, Robert Cimrman on behalf of the SfePy development team --- Contributors to this release in alphabetical order: Robert Cimrman Vladimir Lukes ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Floor divison on int returns float
On Wed, Apr 13, 2016 at 3:17 AM, Antony Lee <antony@berkeley.edu> wrote: > > This kind of issue (see also https://github.com/numpy/numpy/issues/3511) has become more annoying now that indexing requires integers (indexing with a float raises a VisibleDeprecationWarning). The argument "dividing an uint by an int may give a result that does not fit in an uint nor in an int" does not sound very convincing to me, It shouldn't because that's not the rule that numpy follows. The range of the result is never considered. Both *inputs* are cast to the same type that can represent the full range of either input type (for that matter, the actual *values* of the inputs are also never considered). In the case of uint64 and int64, there is no really good common type (the integer hierarchy has to top out somewhere), but float64 merely loses resolution rather than cutting off half of the range of uint64. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
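[Editor's note: a quick illustration of the rule described above. NumPy picks a common input type without looking at values or at the result's range, and for uint64/int64 that common type is float64.]

```python
import numpy as np

# The only type that can hold (most of) both ranges is float64.
print(np.promote_types(np.uint64, np.int64))   # float64

a = np.uint64(2**63)
b = np.int64(2)
# Floor division follows the same promotion: float64 loses resolution,
# but it does not cut off half of uint64's range the way int64 would.
print((a // b).dtype)                          # float64
```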
Re: [Numpy-discussion] mtrand.c update 1.11 breaks my crappy code
On Wed, Apr 6, 2016 at 4:17 PM, Neal Becker <ndbeck...@gmail.com> wrote: > I prefer to use a single instance of a RandomState so that there are > guarantees about the independence of streams generated from python random > functions, and from my c++ code. True, there are simpler approaches - but > I'm a purist. Consider using PRNGs that actually expose truly independent streams instead of a single shared stream: https://github.com/bashtage/ng-numpy-randomstate -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] mtrand.c update 1.11 breaks my crappy code
On Wed, Apr 6, 2016 at 2:18 PM, Neal Becker <ndbeck...@gmail.com> wrote: > > I have C++ code that tries to share the mtrand state. It unfortunately > depends on the layout of RandomState which used to be: > > struct __pyx_obj_6mtrand_RandomState { > PyObject_HEAD > rk_state *internal_state; > PyObject *lock; > }; > > But with 1.11 it's: > struct __pyx_obj_6mtrand_RandomState { > PyObject_HEAD > struct __pyx_vtabstruct_6mtrand_RandomState *__pyx_vtab; > rk_state *internal_state; > PyObject *lock; > PyObject *state_address; > }; > > So > 1. Why the change? > 2. How can I write portable code? There is no C API to RandomState at this time, stable, portable or otherwise. It's all private implementation detail. If you would like a stable and portable C API for RandomState, you will need to contribute one using PyCapsules to expose the underlying rk_state* pointer. https://docs.python.org/2.7/c-api/capsule.html -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] linux wheels coming soon
I suspect that many of the maintainers of major scipy-ecosystem projects are aware of these (or other similar) travis wheel caches, but would guess that the pool of travis-ci python users who weren't aware of these wheel caches is much much larger. So there will still be a lot of travis-ci clock cycles saved by manylinux wheels. -Robert On Thu, Mar 24, 2016 at 10:46 PM, Nathaniel Smith <n...@pobox.com> wrote: > On Thu, Mar 24, 2016 at 11:44 AM, Peter Cock <p.j.a.c...@googlemail.com> > wrote: > > On Thu, Mar 24, 2016 at 6:37 PM, Nathaniel Smith <n...@pobox.com> wrote: > >> On Mar 24, 2016 8:04 AM, "Peter Cock" <p.j.a.c...@googlemail.com> > wrote: > >>> > >>> Hi Nathaniel, > >>> > >>> Will you be providing portable Linux wheels aka manylinux1? > >>> https://www.python.org/dev/peps/pep-0513/ > >> > >> Matthew Brett will (probably) do the actual work, but yeah, that's the > idea > >> exactly. Note the author list on that PEP ;-) > >> > >> -n > > > > Yep - I was partly double checking, but also aware many folk > > skim the NumPy list and might not be aware of PEP-513 and > > the standardisation efforts going on. > > > > Also in addition to http://travis-dev-wheels.scipy.org/ and > > http://travis-wheels.scikit-image.org/ mentioned by Ralf there > > is http://wheels.scipy.org/ which I presume will get the new > > Linux wheels once they go live. > > The new wheels will go up on pypi, and I guess once everyone has > wheels on pypi then these ad-hoc wheel servers that existed only as a > way to distribute Linux wheels will become obsolete. > > (travis-dev-wheels will remain useful, though, because its purpose is > to hold up-to-the-minute builds of project master branches to allow > downstream projects to get early warning of breaking changes -- we > don't plan to upload to pypi after every commit :-).) > > -n > > -- > Nathaniel J. 
Smith -- https://vorpus.org > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- -Robert ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] ANN: SfePy 2016.1
I am pleased to announce release 2016.1 of SfePy. Description --- SfePy (simple finite elements in Python) is software for solving systems of coupled partial differential equations by the finite element method or by isogeometric analysis (preliminary support). It is distributed under the new BSD license. Home page: http://sfepy.org Mailing list: http://groups.google.com/group/sfepy-devel Git (source) repository, issue tracker, wiki: http://github.com/sfepy Highlights of this release -- - major simplification of finite element field code - automatic checking of shapes of term arguments - improved mesh parametrization code and documentation - support for fieldsplit preconditioners of PETSc For full release notes see http://docs.sfepy.org/doc/release_notes.html#id1 (rather long and technical). Best regards, Robert Cimrman on behalf of the SfePy development team --- Contributors to this release in alphabetical order: Robert Cimrman Vladimir Lukes ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] proposal: new logspace without the log in the argument
On Fri, Feb 19, 2016 at 12:10 PM, Andrew Nelson <andyf...@gmail.com> wrote: > > With respect to geomspace proposals: instead of specifying start and end values and the number of points I'd like to have an option where I can set the start and end points and the ratio. The function would then work out the correct number of points to get closest to the end value. > > E.g. geomspace(start=1, finish=2, ratio=1.03) > > The first entries would be 1.0, 1.03, 1*1.03**2, etc. > > I have a requirement for the correct ratio between the points, and it's a right bind having to calculate the exact number of points needed. At the risk of extending the twisty little maze of names, all alike, I would probably call a function with this signature geomrange() instead. It is more akin to arange(start, stop, step) than linspace(start, stop, num_steps). -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
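[Editor's note: a rough sketch of what such a geomrange() could look like. The function is hypothetical (it does not exist in NumPy); like arange(), it stops at the last point that does not pass `stop`.]

```python
import numpy as np

def geomrange(start, stop, ratio):
    # Hypothetical helper: geometric points start, start*ratio, start*ratio**2, ...
    # ending at the last point below `stop` (arange-like semantics).
    n = int(np.floor(np.log(stop / start) / np.log(ratio))) + 1
    return start * ratio ** np.arange(n)

pts = geomrange(1.0, 2.0, ratio=1.03)
# pts starts at 1.0, each consecutive ratio is 1.03, and pts stops short of 2.0
```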
Re: [Numpy-discussion] proposal: new logspace without the log in the argument
On Thu, Feb 18, 2016 at 10:19 PM, Alan Isaac <alan.is...@gmail.com> wrote: > > On 2/18/2016 2:44 PM, Robert Kern wrote: >> >> In a new function not named `linspace()`, I think that might be fine. I do occasionally want to swap between linear and logarithmic/geometric spacing based on a parameter, so this >> doesn't violate the van Rossum Rule of Function Signatures. > > Would such a new function correct the apparent mistake (?) of > `linspace` including the endpoint by default? > Or is the current API justified by its Matlab origins? > (Or have I missed the point altogether?) The last, I'm afraid. Different use cases, different conventions. Integer ranges are half-open because that is the most useful convention in a 0-indexed ecosystem. Floating point ranges don't interface with indexing, and the closed intervals are the most useful (or at least the most common). > If this query is annoying, please ignore it. It is not meant to be. The same for my answer. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] proposal: new logspace without the log in the argument
On Thu, Feb 18, 2016 at 7:38 PM, Nathaniel Smith <n...@pobox.com> wrote: > > Some questions it'd be good to get feedback on: > > - any better ideas for naming it than "geomspace"? It's really too bad > that the 'logspace' name is already taken. geomspace() is a perfectly cromulent name, IMO. > - I guess the alternative interface might be something like > > np.linspace(start, stop, steps, spacing="log") > > what do people think? In a new function not named `linspace()`, I think that might be fine. I do occasionally want to swap between linear and logarithmic/geometric spacing based on a parameter, so this doesn't violate the van Rossum Rule of Function Signatures. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
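[Editor's note: for context, the proposed function wraps the usual take-logs/exponentiate dance. A sketch of the equivalence; the np.geomspace comparison assumes a NumPy recent enough to include the function, which did eventually land.]

```python
import numpy as np

start, stop, num = 1.0, 1000.0, 4

# Geometric spacing by hand: linear spacing in log-space, then exponentiate.
manual = 10 ** np.linspace(np.log10(start), np.log10(stop), num)
print(manual)                          # [   1.   10.  100. 1000.]

# The function discussed in this thread, as it eventually shipped:
print(np.geomspace(start, stop, num))
```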
Re: [Numpy-discussion] making "low" optional in numpy.randint
He was talking consistently about "random integers" not "random_integers()". :-) On Wednesday, 17 February 2016, G Young <gfyoun...@gmail.com> wrote: > Your statement is a little self-contradictory, but in any case, you > shouldn't worry about random_integers getting removed from the code-base. > However, it has been deprecated in favor of randint. > > On Wed, Feb 17, 2016 at 11:48 PM, Juan Nunez-Iglesias <jni.s...@gmail.com> wrote: > >> Also fwiw, I think the 0-based, half-open interval is one of the best >> features of Python indexing and yes, I do use random integers to index into >> my arrays and would not appreciate having to litter my code with "-1" >> everywhere. >> >> On Thu, Feb 18, 2016 at 10:29 AM, Alan Isaac <alan.is...@gmail.com> wrote: >> >>> On 2/17/2016 3:42 PM, Robert Kern wrote: >>> >>>> random.randint() was the one big exception, and it was considered a >>>> mistake for that very reason, soft-deprecated in favor of >>>> random.randrange(). >>>> >>> >>> randrange also has its detractors: >>> https://code.activestate.com/lists/python-dev/138358/ >>> and following. >>> >>> I think if we start citing persistent conventions, the >>> persistent convention across *many* languages that the bounds >>> provided for a random integer range are inclusive also counts for >>> something, especially when the names are essentially shared. >>> >>> But again, I am just trying to be clear about what is at issue, >>> not push for a change. I think citing non-existent standards >>> is not helpful. I think the discrepancy between the Python >>> standard library and numpy for a function going by a common >>> name is harmful. (But then, I teach.)
>>> >>> fwiw, >>> >>> Alan >>> >>> ___ >>> NumPy-Discussion mailing list >>> NumPy-Discussion@scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> ___ >> NumPy-Discussion mailing list >> NumPy-Discussion@scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] making "low" optional in numpy.randint
On Wed, Feb 17, 2016 at 8:43 PM, G Young <gfyoun...@gmail.com> wrote: > Josef: I don't think we are making people think more. They're all keyword arguments, so if you don't want to think about them, then you leave them as the defaults, and everyone is happy. I believe that Josef has the code's reader in mind, not the code's writer. As a reader of other people's code (and I count 6-months-ago-me as one such "other people"), I am sure to eventually encounter all of the different variants, so I will need to know all of them. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] making "low" optional in numpy.randint
On Wed, Feb 17, 2016 at 8:30 PM, Alan Isaac <alan.is...@gmail.com> wrote: > > On 2/17/2016 12:28 PM, G Young wrote: >> >> Perhaps, but we are not coding in Haskell. We are coding in Python, and >> the standard is that the endpoint is excluded, which renders your point >> moot I'm afraid. > > I am not sure what "standard" you are talking about. > I thought we were talking about the user interface. It is a persistent and consistent convention (i.e. "standard") across Python APIs that deal with integer ranges (range(), slice(), random.randrange(), ...), particularly those that end up related to indexing; e.g. `x[np.random.randint(0, len(x))]` to pull a random sample from an array. random.randint() was the one big exception, and it was considered a mistake for that very reason, soft-deprecated in favor of random.randrange(). -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
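[Editor's note: the indexing idiom mentioned above, spelled out. With a half-open interval the high bound is simply len(x), with no off-by-one adjustment.]

```python
import numpy as np
import random

x = np.arange(10, 20)
rs = np.random.RandomState(0)

i = rs.randint(0, len(x))        # numpy: high is exclusive, so i is in [0, len(x))
j = random.randrange(len(x))     # stdlib counterpart with the same convention
sample = x[i]                    # always a valid index; no "-1" needed
```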
Re: [Numpy-discussion] making "low" optional in numpy.randint
On Wed, Feb 17, 2016 at 4:40 PM, Alan Isaac <alan.is...@gmail.com> wrote: > > Behavior of random integer generation: > Python randint[a,b] > MATLAB randi [a,b] > Mma RandomInteger [a,b] > haskell randomR [a,b] > GAUSS rndi[a,b] > Maple rand[a,b] > > In short, NumPy's `randint` is non-standard (and, > I would add, non-intuitive). Presumably this was due > to relying on a float draw from [0,1) along > with the use of floor. No, never was. It is implemented so because Python uses semi-open integer intervals by preference because it plays most nicely with 0-based indexing. Not sure about all of those systems, but some at least are 1-based indexing, so closed intervals do make sense. The Python stdlib's random.randint() closed interval is considered a mistake by python-dev leading to the implementation and preference for random.randrange() instead. > The divergence in behavior between the (later) Python > function of the same name is particularly unfortunate. Indeed, but unfortunately, this mistake dates way back to Numeric times, and easing the migration to numpy was a priority in the heady days of numpy 1.0. > So I suggest further work on this function is > not called for, and use of `random_integers` > should be encouraged. Probably NumPy's `randint` > should be deprecated. Not while I'm here. Instead, `random_integers()` is discouraged and perhaps might eventually be deprecated. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Fwd: Numexpr-3.0 proposal
On Mon, Feb 15, 2016 at 10:43 AM, Gregor Thalhammer < gregor.thalham...@gmail.com> wrote: > > Dear Robert, > > thanks for your effort on improving numexpr. Indeed, vectorized math > libraries (VML) can give a large boost in performance (~5x), except for a > couple of basic operations (add, mul, div), which current compilers are > able to vectorize automatically. With recent gcc even more functions are > vectorized, see https://sourceware.org/glibc/wiki/libmvec But you need > special flags depending on the platform (SSE, AVX present?), runtime > detection of processor capabilities would be nice for distributing > binaries. Some time ago, since I lost access to Intels MKL, I patched > numexpr to use Accelerate/Veclib on os x, which is preinstalled on each > mac, see https://github.com/geggo/numexpr.git veclib_support branch. > > As you increased the opcode size, I could imagine providing a bit to > switch (during runtime) between internal functions and vectorized ones, > that would be handy for tests and benchmarks. > Dear Gregor, Your suggestion to separate the opcode signature from the library used to execute it is very clever. Based on your suggestion, I think that the natural evolution of the opcodes is to specify them by function signature and library, using a two-level dict, i.e. numexpr.interpreter.opcodes['exp_f8f8f8'][gnu] = some_enum numexpr.interpreter.opcodes['exp_f8f8f8'][msvc] = some_enum +1 numexpr.interpreter.opcodes['exp_f8f8f8'][vml] = some_enum + 2 numexpr.interpreter.opcodes['exp_f8f8f8'][yeppp] = some_enum +3 I want to procedurally generate opcodes.cpp and interpreter_body.cpp. If I do it the way you suggested funccodes.hpp and all the many #define's regarding function codes in the interpreter can hopefully be removed and hence simplify the overall codebase. One could potentially take it a step further and plan (optimize) each expression, similar to what FFTW does with regards to matrix shape. 
That is, the basic way to control the library would be with a singleton library argument, i.e.: result = ne.evaluate( "A*log(foo**2 / bar**2)", lib=vml ) However, we could also permit a tuple to be passed in, where each element of the tuple reflects the library to use for each operation in the AST tree: result = ne.evaluate( "A*log(foo**2 / bar**2)", lib=(gnu,gnu,gnu,yeppp,gnu) ) In this case the ops are (mul,mul,div,log,mul). The op-code picking is done by the Python side, and this tuple could be potentially optimized by numexpr rather than hand-optimized, by trying various permutations of the linked C math libraries. The wisdom from the planning could be pickled and saved in a wisdom file. Currently Numexpr has cacheDict in util.py but there's no reason this can't be pickled and saved to disk. I've done a similar thing by creating wrappers for PyFFTW already. Robert -- Robert McLeod, Ph.D. Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universität Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcl...@unibas.ch robert.mcl...@bsse.ethz.ch <robert.mcl...@ethz.ch> robbmcl...@gmail.com ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
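[Editor's note: a toy sketch of the two-level opcode registry and per-library lookup discussed in this exchange. The library tags and enum values are purely illustrative, not numexpr's actual internals.]

```python
# Illustrative library tags; a real build would define these from what is linked in.
GNU, MSVC, VML, YEPPP = range(4)

SOME_ENUM = 0x100
opcodes = {
    # signature -> {library: opcode enum}, as in the two-level dict above
    'exp_f8f8f8': {GNU: SOME_ENUM, MSVC: SOME_ENUM + 1,
                   VML: SOME_ENUM + 2, YEPPP: SOME_ENUM + 3},
    'mul_f8f8f8': {GNU: 0x200},
}

def pick_opcode(signature, lib=GNU):
    # Fall back to the generic (GNU) implementation when a vendor
    # library does not provide this signature.
    table = opcodes[signature]
    return table.get(lib, table[GNU])

pick_opcode('exp_f8f8f8', lib=VML)   # 0x102
pick_opcode('mul_f8f8f8', lib=VML)   # falls back to the GNU entry, 0x200
```

A per-expression `lib` tuple would then just map each op in the AST through `pick_opcode` before handing the program to the virtual machine.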
Re: [Numpy-discussion] Numexpr-3.0 proposal
On Mon, Feb 15, 2016 at 7:28 AM, Ralf Gommers <ralf.gomm...@gmail.com> wrote: > > > On Sun, Feb 14, 2016 at 11:19 PM, Robert McLeod <robbmcl...@gmail.com> > wrote: > >> >> 4.) I took a stab at converting from distutils to setuptools but this >> seems challenging with numpy as a dependency. I wonder if anyone has tried >> monkey-patching so that setup.py build_ext uses distutils and then pass the >> interpreter.pyd/so as a data file, or some other such chicanery? >> > > Not sure what you mean, since numexpr already uses setuptools: > https://github.com/pydata/numexpr/blob/master/setup.py#L22. What is the > real goal you're trying to achieve? > > This monkeypatching is a bad idea: > https://github.com/robbmcleod/numexpr/blob/numexpr-3.0/setup.py#L19. Both > setuptools and numpy.distutils already do that, and that's already one too > many. So you definitely don't want to add a third place. You can use the > -j (--parallel) flag to numpy.distutils instead, see > http://docs.scipy.org/doc/numpy-dev/user/building.html#parallel-builds > > Ralf > Dear Ralf, Yes, this appears to be a bad idea. I was just trying to think about if I could use the more object-oriented approach that I am familiar with in setuptools to easily build wheels for Pypi. Thanks for the comments and links; I didn't know I could parallelize the numpy build. Robert -- Robert McLeod, Ph.D. Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universität Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcl...@unibas.ch robert.mcl...@bsse.ethz.ch <robert.mcl...@ethz.ch> robbmcl...@gmail.com ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster
On Mon, Feb 15, 2016 at 4:24 PM, Jeff Reback <jeffreb...@gmail.com> wrote: > > just an FYI. > > pandas implemented a RangeIndex in upcoming 0.18.0, mainly for memory savings, > see here, similar to how python range/xrange work. > > though there are substantial perf benefits, mainly with set operations, see here > though didn't officially benchmark these. Since it is a numpy-aware object (unlike the builtins), you can (and have, if I'm reading the code correctly) implement __array__() such that it does the correctly performant thing and call np.arange(). RangeIndex won't be adversely impacted by retaining the status quo. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
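[Editor's note: a minimal sketch of the mechanism being described, a range-backed object whose `__array__` defers to np.arange so that conversion costs a single arange call. The class is hypothetical, not pandas' actual RangeIndex.]

```python
import numpy as np

class RangeLike:
    """Lazy range: stores only start/stop/step, materializes via np.arange."""
    def __init__(self, start, stop, step=1):
        self.start, self.stop, self.step = start, stop, step

    def __array__(self, dtype=None, copy=None):
        # np.array(obj) lands here, so conversion is one np.arange call.
        return np.arange(self.start, self.stop, self.step, dtype=dtype)

arr = np.array(RangeLike(0, 10, 2))   # array([0, 2, 4, 6, 8])
```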
[Numpy-discussion] Numexpr-3.0 proposal
Hello everyone, I've done some work on making a new version of Numexpr that would fix some of the limitations of the original virtual machine with regards to data types and operation/function count. Basically I re-wrote the Python and C sides to use 4-byte words, instead of null-terminated strings, for operations and passing types. This means the number of operations and types isn't significantly limited anymore. Francesc Alted suggested I should come here and get some advice from the community. I wrote a short proposal on the Wiki here: https://github.com/pydata/numexpr/wiki/Numexpr-3.0-Branch-Overview One can see my branch here: https://github.com/robbmcleod/numexpr/tree/numexpr-3.0 If anyone has any comments they'd be welcome. Questions from my side for the group: 1.) Numpy casting: I downloaded the Numpy source and after browsing it seems the best approach is probably to just use numpy.core.numerictypes.find_common_type? 2.) Can anyone foresee any issues with casting built-in Python types (i.e. float and integer) to their OS-dependent numpy equivalents? Numpy already seems to do this. 3.) Is anyone enabling the Intel VML library? There are a number of comments in the code that suggest it's not accelerating the code. It also seems to cause problems with bundling numexpr with cx_freeze. 4.) I took a stab at converting from distutils to setuptools but this seems challenging with numpy as a dependency. I wonder if anyone has tried monkey-patching so that setup.py build_ext uses distutils and then pass the interpreter.pyd/so as a data file, or some other such chicanery? (I was going to ask about attaching a debugger, but I just noticed: https://wiki.python.org/moin/DebuggingWithGdb ) Ciao, Robert -- Robert McLeod, Ph.D. 
Center for Cellular Imaging and Nano Analytics (C-CINA) Biozentrum der Universität Basel Mattenstrasse 26, 4058 Basel Work: +41.061.387.3225 robert.mcl...@unibas.ch robert.mcl...@bsse.ethz.ch <robert.mcl...@ethz.ch> robbmcl...@gmail.com ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Hook in __init__.py to let distributors patch numpy
I would add a numpy/_distributor_init.py module and unconditionally import it in the __init__.py. Its contents in our upstream sources would just be a docstring: """Distributors! Put your initialization code here! """ One important technical benefit is that the unconditional import won't hide ImportErrors in the distributor's code. On Fri, Feb 12, 2016 at 1:19 AM, Matthew Brett <matthew.br...@gmail.com> wrote: > Hi, > > Over at https://github.com/numpy/numpy/issues/5479 we're discussing > Windows wheels. > > One thing that we would like, to be able to ship Windows wheels, is to > be able to put some custom checks into numpy when you build the > wheels. > > Specifically, for Windows, we're building on top of ATLAS BLAS / > LAPACK, and we need to check that the system on which the wheel is > running, has SSE2 instructions, otherwise we know ATLAS will crash > (almost everybody does have SSE2 these days). > > The way I propose we do that, is this patch here: > > https://github.com/numpy/numpy/pull/7231
>
> diff --git a/numpy/__init__.py b/numpy/__init__.py
> index 0fcd509..ba3ba16 100644
> --- a/numpy/__init__.py
> +++ b/numpy/__init__.py
> @@ -190,6 +190,12 @@ def pkgload(*packages, **options):
>  test = testing.nosetester._numpy_tester().test
>  bench = testing.nosetester._numpy_tester().bench
>
> +# Allow platform-specific build to intervene in numpy init
> +try:
> +    from . import _distributor_init
> +except ImportError:
> +    pass
> +
>  from . import core
>  from .core import *
>  from . import compat
>
> So, numpy __init__.py looks for a module `_distributor_init`, in which > the distributor might have put custom code to do any checks and > initialization needed for the particular platform. We don't by > default ship a `_distributor_init.py` but leave it up to packagers to > generate this when building binaries. > > Does that sound like a sensible approach to y'all? 
> > Cheers, > > Matthew > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Numpy 1.11.0b2 released
> (we've had a few recent issues with libgfortran accidentally missing as a requirement of scipy). On this topic, you may be able to get some mileage out of adapting pypa/auditwheel, which can load up extension module `.so` files inside a wheel (or conda package) and walk the shared library dependency tree like the runtime linker (using pyelftools), and check whether things are going to resolve properly and where shared libraries are loaded from. Something like that should be able to, with minimal adaptation to use the conda dependency resolver, check that a conda package properly declares all of the shared library dependencies it actually needs. -Robert On Sat, Feb 6, 2016 at 3:42 PM, Michael Sarahan <msara...@gmail.com> wrote: > FWIW, we (Continuum) are working on a CI system that builds conda > recipes. Part of this is testing not only individual packages that change, > but also any downstream packages that are also in the repository of > recipes. The configuration for this is in > https://github.com/conda/conda-recipes/blob/master/.binstar.yml and the > project doing the dependency detection is in > https://github.com/ContinuumIO/ProtoCI/ > > This is still being established (particularly, provisioning build > workers), but please talk with us if you're interested. > > Chris, it may still be useful to use docker here (perhaps on the build > worker, or elsewhere), also, as the distinction between build machines and > user machines is important to make. Docker would be great for making sure > that all dependency requirements are met on end-user systems (we've had a > few recent issues with libgfortran accidentally missing as a requirement of > scipy). 
> > Best, > Michael > > On Sat, Feb 6, 2016 at 5:22 PM Chris Barker <chris.bar...@noaa.gov> wrote: > >> On Fri, Feb 5, 2016 at 3:24 PM, Nathaniel Smith <n...@pobox.com> wrote: >> >>> On Fri, Feb 5, 2016 at 1:16 PM, Chris Barker <chris.bar...@noaa.gov> >>> wrote: >>> >> >> >>> >> > If we set up a numpy-testing conda channel, it could be used to >>> cache >>> >> > binary builds for all he versions of everything we want to test >>> >> > against. >>> >> Anaconda doesn't always have the >>> > latest builds of everything. >> >> >> OK, this may be more or less helpful, depending on what we want to built >> against. But a conda environment (maybe tied to a custom channel) really >> does make a nice contained space for testing that can be set up fast on a >> CI server. >> >> If whoever is setting up a test system/matrix thinks this would be >> useful, I'd be glad to help set it up. >> >> -Chris >> >> >> >> >> >> -- >> >> Christopher Barker, Ph.D. >> Oceanographer >> >> Emergency Response Division >> NOAA/NOS/OR(206) 526-6959 voice >> 7600 Sand Point Way NE (206) 526-6329 fax >> Seattle, WA 98115 (206) 526-6317 main reception >> >> chris.bar...@noaa.gov >> ___ >> NumPy-Discussion mailing list >> NumPy-Discussion@scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- -Robert ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
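As a sketch of the adaptation suggested above: once a tool like auditwheel has extracted the sonames a package's binaries actually link against, the policy check itself reduces to a set difference against what the declared dependencies (plus a system whitelist) provide. The function and argument names here are hypothetical, not auditwheel API:

```python
def undeclared_sonames(needed, provided_by_deps, system_whitelist=()):
    """Return sonames a package links against that neither its declared
    dependencies nor the OS baseline provide (hypothetical helper)."""
    provided = set(provided_by_deps) | set(system_whitelist)
    return sorted(set(needed) - provided)

# The libgfortran incident mentioned above, in miniature: the extension
# modules need libgfortran, but no declared dependency provides it.
missing = undeclared_sonames(
    needed=["libc.so.6", "libgfortran.so.3", "libopenblas.so.0"],
    provided_by_deps=["libopenblas.so.0"],
    system_whitelist=["libc.so.6"],
)
```

A CI step could fail the build whenever `missing` is non-empty, catching the "accidentally missing requirement" class of bug before release.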
Re: [Numpy-discussion] Behavior of np.random.uniform
On Thu, Jan 21, 2016 at 7:06 AM, Jaime Fernández del Río <jaime.f...@gmail.com> wrote: > > There doesn't seem to be much of a consensus on the way to go, so leaving things as they are and have been seems the wisest choice for now, thanks for all the feedback. I will work with Greg on documenting the status quo properly. Ugh. Be careful in documenting the way things currently work. No one intended for it to work that way! No one should rely on high
___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Behavior of np.random.uniform
On Tue, Jan 19, 2016 at 5:35 PM, Sebastian Berg <sebast...@sipsolutions.net> wrote: > > On Di, 2016-01-19 at 16:28 +, G Young wrote: > > In rand range, it raises an exception if low >= high. > > > > I should also add that AFAIK enforcing low >= high with floats is a > > lot trickier than it is for integers. I have been knee-deep in > > corner cases for some time with randint where numbers that are > > visually different are cast as the same number by numpy due to > > rounding and representation issues. That situation only gets worse > > with floats. > > > > Well, actually random.uniform docstring says: > > Get a random number in the range [a, b) or [a, b] depending on > rounding. Which docstring are you looking at? The current one says [low, high) http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.uniform.html#numpy.random.uniform -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Should I use pip install numpy in linux?
Hi all, Just as a heads up: Nathaniel and I wrote a draft PEP on binary linux wheels that is now being discussed on distutils-sig, so you can check that out and participate in the conversation if you're interested. - PEP on python.org: https://www.python.org/dev/peps/pep-0513/ - PEP on github with some typos fixed: https://github.com/manylinux/manylinux/blob/master/pep-513.rst - Email archive: https://mail.python.org/pipermail/distutils-sig/2016-January/027997.html -Robert On Tue, Jan 19, 2016 at 10:05 AM, Ralf Gommers <ralf.gomm...@gmail.com> wrote: > > > On Tue, Jan 19, 2016 at 5:57 PM, Chris Barker - NOAA Federal < > chris.bar...@noaa.gov> wrote: > >> >> > 2) continue to support those users fairly poorly, and at substantial >> > ongoing cost >> >> I'm curious what the cost is for this poor support -- throw the source >> up on PyPi, and we're done. The cost comes in when trying to build >> binaries... >> > > I'm sure Nathaniel means the cost to users of failed installs and of numpy > losing users because of that, not the cost of building binaries. > > > Option 1 would require overwhelming consensus of the community, which >> > for better or worse is presumably not going to happen while >> > substantial portions of that community are still using pip/PyPI. >> >> Are they? Which community are we talking about? The community I'd like >> to target are web developers that aren't doing what they think of as >> "scientific" applications, but could use a little of the SciPy stack. >> These folks are committed to pip, and are very reluctant to introduce >> a difficult dependency. Binary wheels would help these folks, but >> that is not a community that exists yet ( or it's small, anyway) >> >> All that being said, I'd be happy to see binary wheels for the core >> SciPy stack on PyPi. It would be nice for people to be able to do a >> bit with Numpy or pandas, it MPL, without having to jump ship to a >> whole new way of doing things. 
>> > > This is indeed exactly why we need binary wheels. Efforts to provide those > will not change our strong recommendation to our users that they're better > off using a scientific Python distribution. > > Ralf > > > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Behavior of np.random.uniform
On Tue, Jan 19, 2016 at 5:40 PM, Charles R Harris <charlesr.har...@gmail.com> wrote: > > On Tue, Jan 19, 2016 at 10:36 AM, Robert Kern <robert.k...@gmail.com> wrote: >> >> On Tue, Jan 19, 2016 at 5:27 PM, Charles R Harris < charlesr.har...@gmail.com> wrote: >> > >> >> > On Tue, Jan 19, 2016 at 9:23 AM, Chris Barker - NOAA Federal < chris.bar...@noaa.gov> wrote: >> >> >> >> What does the standard lib do for rand range? I see that randint Is closed on both ends, so order doesn't matter, though if it raises for b<a, then that's a precedent we could follow. >> > >> > randint is not closed on the high end. The now deprecated random_integers is the function that does that. >> > >> > For floats, it's good to have various interval options. For instance, in generating numbers that will be inverted or have their log taken it is good to avoid zero. However, the names 'low' and 'high' are misleading... >> >> They are correctly leading the users to the manner in which the author intended the function to be used. The *implementation* is misleading by allowing users to do things contrary to the documented intent. ;-) >> >> With floating point and general intervals, there is not really a good way to guarantee that the generated results avoid the "open" end of the specified interval or even stay *within* that interval. This function is definitely not intended to be used as `uniform(closed_end, open_end)`. > > Well, it is possible to make that happen if one is careful or directly sets the bits in ieee types... For the unit interval, certainly. For general bounds, I am not so sure. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Behavior of np.random.uniform
On Tue, Jan 19, 2016 at 5:27 PM, Charles R Harris <charlesr.har...@gmail.com> wrote: > > On Tue, Jan 19, 2016 at 9:23 AM, Chris Barker - NOAA Federal < chris.bar...@noaa.gov> wrote: >> >> What does the standard lib do for rand range? I see that randint Is closed on both ends, so order doesn't matter, though if it raises for b<a, then that's a precedent we could follow. > > randint is not closed on the high end. The now deprecated random_integers is the function that does that. > > For floats, it's good to have various interval options. For instance, in generating numbers that will be inverted or have their log taken it is good to avoid zero. However, the names 'low' and 'high' are misleading... They are correctly leading the users to the manner in which the author intended the function to be used. The *implementation* is misleading by allowing users to do things contrary to the documented intent. ;-) With floating point and general intervals, there is not really a good way to guarantee that the generated results avoid the "open" end of the specified interval or even stay *within* that interval. This function is definitely not intended to be used as `uniform(closed_end, open_end)`. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Behavior of np.random.uniform
On Tue, Jan 19, 2016 at 5:36 PM, Robert Kern <robert.k...@gmail.com> wrote: > > On Tue, Jan 19, 2016 at 5:27 PM, Charles R Harris < charlesr.har...@gmail.com> wrote: > > > > > On Tue, Jan 19, 2016 at 9:23 AM, Chris Barker - NOAA Federal < chris.bar...@noaa.gov> wrote: > >> > >> What does the standard lib do for rand range? I see that randint Is closed on both ends, so order doesn't matter, though if it raises for b<a, then that's a precedent we could follow. > > > > randint is not closed on the high end. The now deprecated random_integers is the function that does that. > > > > For floats, it's good to have various interval options. For instance, in generating numbers that will be inverted or have their log taken it is good to avoid zero. However, the names 'low' and 'high' are misleading... > > They are correctly leading the users to the manner in which the author intended the function to be used. The *implementation* is misleading by allowing users to do things contrary to the documented intent. ;-) > > With floating point and general intervals, there is not really a good way to guarantee that the generated results avoid the "open" end of the specified interval or even stay *within* that interval. This function is definitely not intended to be used as `uniform(closed_end, open_end)`. There are special cases that *can* be implemented and are worth doing so as they are building blocks for other distributions that do need to avoid 0 or 1 as you say. Full-featured RNG suites do offer these: [0, 1] [0, 1) (0, 1] (0, 1) -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
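For reference, the half-open and open unit-interval variants listed above can be built on top of numpy's documented [low, high) `uniform` with standard transformations. This is a sketch of those tricks, not numpy API, and, as the thread notes, for general (low, high) bounds floating-point rounding can still land on an endpoint:

```python
import numpy as np

rng = np.random.RandomState(12345)

# [0, 1): what np.random.uniform documents.
a = rng.uniform(0.0, 1.0)

# (0, 1]: reflect [0, 1) so the open end moves from 1 to 0.
b = 1.0 - rng.uniform(0.0, 1.0)

# (0, 1): raise the closed end to the smallest float above 0,
# useful when the result will be inverted or passed to log().
c = rng.uniform(np.nextafter(0.0, 1.0), 1.0)
```

The (0, 1) variant is the one that matters in practice, since `np.log(0.0)` and `1.0 / 0.0` both misbehave.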
Re: [Numpy-discussion] Software Capabilities of NumPy in Our Tensor Survey Paper
On Fri, Jan 15, 2016 at 5:30 PM, Nathaniel Smith <n...@pobox.com> wrote: > > On Jan 15, 2016 8:36 AM, "Li Jiajia" <jiaji...@gatech.edu> wrote: > > > > Hi all, > > I’m a PhD student in Georgia Tech. Recently, we’re working on a survey paper about tensor algorithms: basic tensor operations, tensor decomposition and some tensor applications. We are making a table to compare the capabilities of different software and planning to include NumPy. We’d like to make sure these parameters are correct to make a fair compare. Although we have looked into the related documents, please help us to confirm these. Besides, if you think there are more features of your software and a more preferred citation, please let us know. We’ll consider to update them. We want to show NumPy supports tensors, and we also include "scikit-tensor” in our survey, which is based on NumPy. > > Please let me know any confusion or any advice! > > Thanks a lot! :-) > > > > Notice: > > 1. “YES/NO” to show whether or not the software supports the operation or has the feature. > > 2. “?” means we’re not sure of the feature, and please help us out. > > 3. “Tensor order” means the maximum number of tensor dimensions that users can do with this software. > > 4. For computational cores, > > 1) "Element-wise Tensor Operation (A * B)” includes element-wise add/minus/multiply/divide, also Kronecker, outer and Katri-Rao products. If the software contains one of them, we mark “YES”. > > 2) “TTM” means tensor-times-matrix multiplication. We distinguish TTM from tensor contraction. If the software includes tensor contraction, it can also support TTM. > > 3) For “MTTKRP”, we know most software can realize it through the above two operations. We mark it “YES”, only if an specified optimization for the whole operation. 
> > NumPy has support for working with multidimensional tensors, if you like, but it doesn't really use the tensor language and notation (preferring instead to think in terms of "arrays" as a somewhat more computationally focused and less mathematically focused conceptual framework). > > Which is to say that I actually have no idea what all those jargon terms you're asking about mean :-) I am suspicious that NumPy supports more of those operations than you have marked, just under different names/notation, but really can't tell either way for sure without knowing what exactly they are. In particular check if your operations can be expressed with einsum() http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.einsum.html -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
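To make the einsum suggestion concrete, here is a sketch of tensor-times-matrix (TTM) multiplication along mode 1 of a third-order array, expressed as an einsum contraction; the shapes are arbitrary examples:

```python
import numpy as np

T = np.arange(24.0).reshape(2, 3, 4)  # 3rd-order tensor, shape (2, 3, 4)
M = np.ones((5, 3))                   # matrix applied along mode 1

# TTM along mode 1: contract index j of T against the columns of M.
Y = np.einsum('ijk,rj->irk', T, M)    # result has shape (2, 5, 4)
```

General tensor contractions, outer products, and element-wise products can all be written in the same index notation, which is why einsum is the natural place to check for the operations in the survey table.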
Re: [Numpy-discussion] Should I use pip install numpy in linux?
On Fri, Jan 15, 2016 at 11:56 AM, Travis Oliphant <tra...@continuum.io> wrote: > > > I still submit that this is not the best use of time. Conda *already* solves the problem. My sadness is that people keep working to create an ultimately inferior solution rather than just help make a better solution more accessible. People mistakenly believe that wheels and conda packages are equivalent. They are not. If they were we would not have created conda. We could not do what was necessary with wheels and contorting wheels to become conda packages was and still is a lot more work. Now, obviously, it's just code and you can certainly spend effort and time to migrate wheels so that they function equivalently to conda packages --- but what is the point, really? > > Why don't we work together to make the open-source conda project and open-source conda packages more universally accessible? The factors that motivate my interest in making wheels for Linux (i.e. the proposed manylinux tag) work on PyPI are:
- All (new) Python installations come with pip. As a package author writing documentation, I count on users having pip installed, but I can't count on conda.
- I would like to see Linux have feature parity with OS X and Windows with respect to pip and PyPI.
- I want the PyPA tools like pip to be as good as possible.
- I'm confident that the manylinux proposal will work, and it's very straightforward.
-Robert ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Get rid of special scalar arithmetic
On Wed, Jan 13, 2016 at 5:18 AM, Charles R Harris <charlesr.har...@gmail.com> wrote: > > Hi All, > > I've opened issue #7002, reproduced below, for discussion. >> >> Numpy umath has a file scalarmath.c.src that implements scalar arithmetic using special functions that are about 10x faster than the equivalent ufuncs. >> >> In [1]: a = np.float64(1) >> >> In [2]: timeit a*a >> 1000 loops, best of 3: 69.5 ns per loop >> >> In [3]: timeit np.multiply(a, a) >> 100 loops, best of 3: 722 ns per loop >> >> I contend that in large programs this improvement in execution time is not worth the complexity and maintenance overhead; it is unlikely that scalar-scalar arithmetic is a significant part of their execution time. Therefore I propose to use ufuncs for all of the scalar-scalar arithmetic. This would also bring the benefits of __numpy_ufunc__ to scalars with minimal effort. > > Thoughts? Not all important-to-optimize programs are large in our field; interactive use is rampant. The scalar optimizations weren't added speculatively: people noticed that their Numeric code ran much slower under numpy and were reluctant to migrate. I was forever responding on comp.lang.python, "It's because scalar arithmetic hasn't been optimized yet. We know how to do it, we just need a volunteer to do the work. Contributions gratefully accepted!" The most critical areas tended to be optimization where you are often working with implicit scalars that pop out in the optimization loop. -- Robert Kern ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Should I use pip install numpy in linux?
> And in any case we have lots of users who don't use conda and are thus doomed to support both ecosystems regardless, so we might as well make the best of it :-). Yes, this is the key. Conda is a great tool for a lot of users / use cases, but it's not for everyone. Anyways, I think I've made a pretty good start on the tooling for a wheel ABI tag for a LSB-style base system that represents a common set of shared libraries and symbol versions provided by "many" linuxes (previously discussed by Nathaniel here: https://code.activestate.com/lists/python-distutils-sig/26272/) -Robert On Mon, Jan 11, 2016 at 5:29 PM, Nathaniel Smith <n...@pobox.com> wrote: > On Jan 11, 2016 3:54 PM, "Chris Barker" <chris.bar...@noaa.gov> wrote: > > > > On Mon, Jan 11, 2016 at 11:02 AM, David Cournapeau <courn...@gmail.com> > wrote: > >>> > >>> If we get all that worked out, we still haven't made any progress > toward the non-standard libs that aren't python. This is the big "scipy > problem" -- fortran, BLAS, hdf, ad infinitum. > >>> > >>> I argued for years that we could build binary wheels that hold each of > these, and other python packages could depend on them, but pypa never > seemed to like that idea. > >> > >> > >> I don't think that's an accurate statement. There are issues to solve > around this, but I did not encounter push back, either on the ML or face to > face w/ various pypa members at Pycon, etc... There may be push backs for a > particular detail, but making "pip install scipy" or "pip install > matplotlib" a reality on every platform is something everybody agrees on > > > > > > sure, everyone wants that. But when it gets deeper, they don't want to > have a bunch of pip-installable binary wheels that are simply clibs > re-packaged as a dependency. And, then you have the problem of those being > "binary wheel" dependencies, rather than "package" dependencies. 
> > > > e.g.: > > > > this particular build of pillow depends on the libpng and libjpeg > wheels, but the Pillow package, in general, does not. And you would have > different dependencies on Windows, and OS-X, and Linux. > > > > pip/wheel simply was not designed for that, and I didn't get any warm > and fuzzy feelings from dist-utils sig that it ever would. And again, > then you are re-designing conda. > > I agree that talking about such things on distutils-sig tends to elicit a > certain amount of puzzled incomprehension, but I don't think it matters -- > wheels already have everything you need to support this. E.g. wheels for > different platforms can trivially have different dependencies. (They even > go to some lengths to make sure this is possible for pure python packages > where the same wheel can be used on multiple platforms.) When distributing > a library-in-a-wheel then you need a little bit of hackishness to make sure > the runtime loader can find the library, which conda would otherwise handle > for you, but AFAICT it's like 10 lines of code or something. > > And in any case we have lots of users who don't use conda and are thus > doomed to support both ecosystems regardless, so we might as well make the > best of it :-). > > -n > > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Defining a base linux-64 environment [was: Should I use pip install numpy in linux?]
I started working on a tool for checking linux wheels for "manylinux" compatibility, and fixing them up if possible, based on the same ideas as Matthew Brett's delocate <https://github.com/matthew-brett/delocate> for OS X. Current WIP code, if anyone wants to help / throw peanuts, is here: https://github.com/rmcgibbo/deloc8. It's currently fairly modest and can only list non-whitelisted external shared library dependencies, and verify that sufficiently old versioned symbols for glibc and its ilk are used. -Robert On Sun, Jan 10, 2016 at 1:19 AM, Robert McGibbon <rmcgi...@gmail.com> wrote: > Hi all, > > I followed Nathaniel's advice and restricted the search down to the > packages included in the Anaconda release (as opposed to all of the > packages in their repositories), and fixed some technical issues with the > way I was doing the analysis. > > The new list is much smaller. Here are the shared libraries that the > components of Anaconda require that the system provides on Linux 64: > > libpanelw.so.5, libncursesw.so.5, libgcc_s.so.1, libstdc++.so.6, > libm.so.6, libdl.so.2, librt.so.1, libcrypt.so.1, libc.so.6, libnsl.so.1, > libutil.so.1, libpthread.so.0, libX11.so.6, libXext.so.6, > libgobject-2.0.so.0, libgthread-2.0.so.0, libglib-2.0.so.0, > libXrender.so.1, libICE.so.6, libSM.so.6, libGL.so.1. > > Many of these libraries are required simply for the interpreter. The > remaining ones, which aren't required by the interpreter itself but are > required by some other package in Anaconda, are: > > libgcc_s.so.1, libstdc++.so.6, libXext.so.6, libSM.so.6, > libgthread-2.0.so.0, libgobject-2.0.so.0, libglib-2.0.so.0, libICE.so.6, > libXrender.so.1, and libGL.so.1. > > Most of these are parts of X11 required by Qt ( > http://doc.qt.io/qt-5/linux-requirements.html). 
> > -Robert > > > > On Sat, Jan 9, 2016 at 4:42 PM, Robert McGibbon <rmcgi...@gmail.com> > wrote: > >> > Maybe a better approach would be to look at what libraries are used on >> by an up-to-date default Anaconda install (on the assumption that this >> is the best tested configuration) >> >> That's not a bad idea. I also have a couple other ideas about how to >> filter >> this based on using debian popularity-contests and the package graph. I >> will report back when I have more info. >> >> -Robert >> >> On Sat, Jan 9, 2016 at 3:04 PM, Nathaniel Smith <n...@pobox.com> wrote: >> >>> On Sat, Jan 9, 2016 at 3:52 AM, Robert McGibbon <rmcgi...@gmail.com> >>> wrote: >>> > Hi all, >>> > >>> > I went ahead and tried to collect a list of all of the libraries that >>> could >>> > be considered to constitute the "base" system for linux-64. The >>> strategy I >>> > used was to leverage off the work done by the folks at Continuum by >>> > searching through their pre-compiled binaries from >>> > https://repo.continuum.io/pkgs/free/linux-64/ to find shared >>> libraries that >>> > were dependened on (according to ldd) that were not accounted for by >>> the >>> > declared dependencies that each package made known to the conda package >>> > manager. >>> > >>> > The full list of these system libraries, sorted in from >>> > most-commonly-depend-on to rarest, is below. There are 158 of them. >>> [...] >>> > So it's not perfect. But it might be a useful starting place. >>> >>> Unfortunately, yeah, it looks like there's a lot of false positives in >>> here :-(. For example your list contains liblzma and libsqlite, but >>> both of these are shipped as dependencies of python itself. So >>> probably someone just forgot to declare the dependency explicitly, but >>> got away with it because the libraries were pulled in anyway. 
>>> >>> Maybe a better approach would be to look at what libraries are used on >>> by an up-to-date default Anaconda install (on the assumption that this >>> is the best tested configuration), and then erase from the list all >>> libraries that are shipped by this configuration (ignoring declared >>> dependencies since those seem to be unreliable)? It's better to be >>> conservative here, since the end goal is to come up with a list of >>> external libraries that we're confident have actually been tested for >>> compatibility by lots and lots of different users. >>> >>> -n >>> >>> -- >>> Nathaniel J. Smith -- http://vorpus.org >>> ___ >>> NumPy-Discussion mailing list >>> NumPy-Discussion@scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> > ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
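The versioned-symbol check deloc8 performs boils down to comparing each GLIBC version requirement found in a binary against a policy maximum. Extracting the symbol versions from the ELF file (e.g. with pyelftools) is elided in this sketch; only the policy comparison is shown:

```python
def too_new(symbol_versions, max_glibc=(2, 5)):
    """Return the GLIBC version requirements that exceed the policy.

    symbol_versions: strings such as "GLIBC_2.2.5", as found in the
    version-requirement sections of an ELF binary (extraction elided).
    """
    bad = []
    for v in symbol_versions:
        if not v.startswith("GLIBC_"):
            continue  # e.g. CXXABI_/GLIBCXX_ would need their own policy
        parsed = tuple(int(x) for x in v[len("GLIBC_"):].split("."))
        if parsed > max_glibc:  # lexicographic tuple comparison
            bad.append(v)
    return bad
```

A wheel whose binaries yield a non-empty result was built against too new a glibc and will fail to load on older distributions, which is exactly what building on CentOS 5 is meant to prevent.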
Re: [Numpy-discussion] Should I use pip install numpy in linux?
> > Right. There's a small problem which is that the base linux system > > isn't just "CentOS 5", it's "CentOS 5 and here's the list of libraries > > that you're allowed to link to: ...", where that list is empirically > > chosen to include only stuff that really is installed on ~all linux > > machines and for which the ABI really has been stable in practice over > > multiple years and distros (so e.g. no OpenSSL). > > > > Does anyone know who maintains Anaconda's linux build environment? > I strongly suspect it was originally set up by Aaron Meurer. Who maintains it now that he is no longer at Continuum is a good question. From looking at all of the external libraries referenced by binaries included in Anaconda and the conda repos, I am not confident that they have a totally strict policy here, or at least not one that is enforced by tooling. The sonames I listed here <https://mail.scipy.org/pipermail/numpy-discussion/2016-January/074602.html> cover all of the external dependencies used by the latest Anaconda release, but earlier releases and other conda-installable packages from the default channel are not so strict. -Robert ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Defining a base linux-64 environment [was: Should I use pip install numpy in linux?]
Hi all, I followed Nathaniel's advice and restricted the search down to the packages included in the Anaconda release (as opposed to all of the packages in their repositories), and fixed some technical issues with the way I was doing the analysis. The new list is much smaller. Here are the shared libraries that the components of Anaconda require that the system provides on Linux 64: libpanelw.so.5, libncursesw.so.5, libgcc_s.so.1, libstdc++.so.6, libm.so.6, libdl.so.2, librt.so.1, libcrypt.so.1, libc.so.6, libnsl.so.1, libutil.so.1, libpthread.so.0, libX11.so.6, libXext.so.6, libgobject-2.0.so.0, libgthread-2.0.so.0, libglib-2.0.so.0, libXrender.so.1, libICE.so.6, libSM.so.6, libGL.so.1. Many of these libraries are required simply for the interpreter. The remaining ones, which aren't required by the interpreter itself but are required by some other package in Anaconda, are: libgcc_s.so.1, libstdc++.so.6, libXext.so.6, libSM.so.6, libgthread-2.0.so.0, libgobject-2.0.so.0, libglib-2.0.so.0, libICE.so.6, libXrender.so.1, and libGL.so.1. Most of these are parts of X11 required by Qt ( http://doc.qt.io/qt-5/linux-requirements.html). -Robert On Sat, Jan 9, 2016 at 4:42 PM, Robert McGibbon <rmcgi...@gmail.com> wrote: > > Maybe a better approach would be to look at what libraries are used on > by an up-to-date default Anaconda install (on the assumption that this > is the best tested configuration) > > That's not a bad idea. I also have a couple other ideas about how to filter > this based on using debian popularity-contests and the package graph. I > will report back when I have more info. > > -Robert > > On Sat, Jan 9, 2016 at 3:04 PM, Nathaniel Smith <n...@pobox.com> wrote: > >> On Sat, Jan 9, 2016 at 3:52 AM, Robert McGibbon <rmcgi...@gmail.com> >> wrote: >> > Hi all, >> > >> > I went ahead and tried to collect a list of all of the libraries that >> could >> > be considered to constitute the "base" system for linux-64. 
The >> strategy I >> > used was to leverage off the work done by the folks at Continuum by >> > searching through their pre-compiled binaries from >> > https://repo.continuum.io/pkgs/free/linux-64/ to find shared libraries >> that >> > were dependened on (according to ldd) that were not accounted for by >> the >> > declared dependencies that each package made known to the conda package >> > manager. >> > >> > The full list of these system libraries, sorted in from >> > most-commonly-depend-on to rarest, is below. There are 158 of them. >> [...] >> > So it's not perfect. But it might be a useful starting place. >> >> Unfortunately, yeah, it looks like there's a lot of false positives in >> here :-(. For example your list contains liblzma and libsqlite, but >> both of these are shipped as dependencies of python itself. So >> probably someone just forgot to declare the dependency explicitly, but >> got away with it because the libraries were pulled in anyway. >> >> Maybe a better approach would be to look at what libraries are used on >> by an up-to-date default Anaconda install (on the assumption that this >> is the best tested configuration), and then erase from the list all >> libraries that are shipped by this configuration (ignoring declared >> dependencies since those seem to be unreliable)? It's better to be >> conservative here, since the end goal is to come up with a list of >> external libraries that we're confident have actually been tested for >> compatibility by lots and lots of different users. >> >> -n >> >> -- >> Nathaniel J. Smith -- http://vorpus.org >> ___ >> NumPy-Discussion mailing list >> NumPy-Discussion@scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Defining a base linux-64 environment [was: Should I use pip install numpy in linux?]
> do those packages use ld --as-needed for linking? Is it possible to check this? I mean, there are over 7000 packages that I check. I don't know how they were all built. It's totally possible for many of them to be unused. A reasonably common thing might be that packages use ctypes or dlopen to dynamically load shared libraries that are actually just optional (and catch the error and recover gracefully if the library can't be loaded). -Robert On Sat, Jan 9, 2016 at 4:20 AM, Julian Taylor <jtaylor.deb...@googlemail.com > wrote: > On 09.01.2016 12:52, Robert McGibbon wrote: > > Hi all, > > > > I went ahead and tried to collect a list of all of the libraries that > > could be considered to constitute the "base" system for linux-64. The > > strategy I used was to leverage off the work done by the folks at > > Continuum by searching through their pre-compiled binaries > > from https://repo.continuum.io/pkgs/free/linux-64/ to find shared > > libraries that were dependened on (according to ldd) that were not > > accounted for by the declared dependencies that each package made known > > to the conda package manager. > > > > do those packages use ld --as-needed for linking? > there are a lot libraries in that list that I highly doubt are directly > used by the packages. > > > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
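The optional-dependency pattern described above looks roughly like this with ctypes; the library name is chosen only as an example, matching the X11/GL sonames discussed in the thread:

```python
import ctypes
import ctypes.util

# Try to load an optional system library; degrade gracefully if the
# system does not provide it, instead of failing at import time.
_name = ctypes.util.find_library("GL")  # may resolve to e.g. libGL.so.1
try:
    _libgl = ctypes.CDLL(_name) if _name else None
except OSError:
    _libgl = None

HAVE_GL = _libgl is not None
```

A package using this pattern shows up in an ldd-style scan as depending on the library even though it runs fine without it, which is one source of the false positives being discussed.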
[Numpy-discussion] Defining a base linux-64 environment [was: Should I use pip install numpy in linux?]
", "numpy-1.8.2-py34_0", "numpy-1.9.0-py27_0", "numpy-1.9.0-py34_0", "numpy-1.9.1-py27_0", "numpy-1.9.1-py34_0", "numpy-1.9.2-py27_0", "numpy-1.9.2-py34_0"]. Note that this list of numpy versions doesn't include the latest ones -- all of the numpy-1.10 binaries made by Continuum pick up libgfortan from a conda package and don't depend on it being provided by the system. Also, the final '_0' or '_1' segment of many of these package names is the build number, which is to make a new release of the same release of a package, usually because of a packaging problem. So many of these packages were probably built incorrectly and superseded by new builds with a higher build number. So it's not perfect. But it might be a useful starting place. -Robert ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Defining a base linux-64 environment [was: Should I use pip install numpy in linux?]
> Maybe a better approach would be to look at what libraries are used by an
> up-to-date default Anaconda install (on the assumption that this is the
> best-tested configuration)

That's not a bad idea. I also have a couple of other ideas about how to
filter this, based on using Debian popularity-contest data and the package
graph. I will report back when I have more info.

-Robert

On Sat, Jan 9, 2016 at 3:04 PM, Nathaniel Smith <n...@pobox.com> wrote:

> On Sat, Jan 9, 2016 at 3:52 AM, Robert McGibbon <rmcgi...@gmail.com>
> wrote:
> > Hi all,
> >
> > I went ahead and tried to collect a list of all of the libraries that
> > could be considered to constitute the "base" system for linux-64. The
> > strategy I used was to leverage the work done by the folks at Continuum
> > by searching through their pre-compiled binaries from
> > https://repo.continuum.io/pkgs/free/linux-64/ to find shared libraries
> > that were depended on (according to ldd) but were not accounted for by
> > the declared dependencies that each package made known to the conda
> > package manager.
> >
> > The full list of these system libraries, sorted from
> > most-commonly-depended-on to rarest, is below. There are 158 of them.
> [...]
> > So it's not perfect. But it might be a useful starting place.
>
> Unfortunately, yeah, it looks like there are a lot of false positives in
> here :-(. For example, your list contains liblzma and libsqlite, but
> both of these are shipped as dependencies of Python itself. So probably
> someone just forgot to declare the dependency explicitly, but got away
> with it because the libraries were pulled in anyway.
>
> Maybe a better approach would be to look at what libraries are used by
> an up-to-date default Anaconda install (on the assumption that this is
> the best-tested configuration), and then erase from the list all
> libraries that are shipped by this configuration (ignoring declared
> dependencies, since those seem to be unreliable)? It's better to be
> conservative here, since the end goal is to come up with a list of
> external libraries that we're confident have actually been tested for
> compatibility by lots and lots of different users.
>
> -n
>
> --
> Nathaniel J. Smith -- http://vorpus.org
Re: [Numpy-discussion] Should I use pip install numpy in linux?
Does anyone know if there's been any movements with the PyPI folks on allowing linux wheels to be uploaded? I know you can never be certain what's provided by the distro, but it seems like if Anaconda can solve the cross-distro-binary-distribution-of-compiled-python-extensions problem, there shouldn't be much technically different for Linux wheels. -Robert On Fri, Jan 8, 2016 at 9:12 AM, Matthew Brett <matthew.br...@gmail.com> wrote: > Hi, > > On Fri, Jan 8, 2016 at 4:28 PM, Yuxiang Wang <yw...@virginia.edu> wrote: > > Dear Nathaniel, > > > > Gotcha. That's very helpful. Thank you so much! > > > > Shawn > > > > On Thu, Jan 7, 2016 at 10:01 PM, Nathaniel Smith <n...@pobox.com> wrote: > >> On Thu, Jan 7, 2016 at 6:18 PM, Yuxiang Wang <yw...@virginia.edu> > wrote: > >>> Dear all, > >>> > >>> I know that in Windows, we should use either Christoph's package or > >>> Anaconda for MKL-optimized numpy. In Linux, the fortran compiler issue > >>> is solved, so should I directly used pip install numpy to get numpy > >>> with a reasonable BLAS library? > >> > >> pip install numpy should work fine; whether it gives you a reasonable > >> BLAS library will depend on whether you have the development files for > >> a reasonable BLAS library installed, and whether numpy's build system > >> is able to automatically locate them. Generally this means that if > >> you're on a regular distribution and remember to install a decent BLAS > >> -dev or -devel package, then you'll be fine. > >> > >> On Debian/Ubuntu, 'apt install libopenblas-dev' is probably enough to > >> ensure something reasonable happens. > >> > >> Anaconda is also an option on linux if you want MKL (or openblas). 
> I wrote a page on using pip with Debian / Ubuntu here:
> https://matthew-brett.github.io/pydagogue/installing_on_debian.html
>
> Cheers,
>
> Matthew
Re: [Numpy-discussion] Should I use pip install numpy in linux?
> Both Anaconda and Canopy build on a base default Linux system so that
> the built binaries will work on many Linux systems.

I think the base linux system is CentOS 5, and from my experience, it
seems like this approach has worked very well. Those packages are
compatible with essentially all Linuxes that are more recent than CentOS 5
(which is ancient). I have not heard of anyone complaining that the
packages they install through conda don't work on their CentOS 4 or
Ubuntu 6.06 box.

I assume Python / pip is probably used on a wider diversity of linux
flavors than conda is, so I'm sure that binaries built on CentOS 5 won't
work for absolutely _every_ linux user, but it does seem to cover the
substantial majority of linux users.

Building redistributable linux binaries that work across a large number of
distros and distro versions is definitely tricky. If you run ``python
setup.py bdist_wheel`` on your Fedora Rawhide box, you can't really expect
the wheel to work for too many other linux users. So given that, I can see
why PyPI would want to be careful about accepting Linux wheels. But it
seems like, if they make the upload something like

```
twine upload numpy-1.9.2-cp27-none-linux_x86_64.whl \
  --yes-yes-i-know-this-is-dangerous-but-i-know-what-i'm-doing
```

that this would potentially be able to let packages like numpy serve their
linux users better without risking too much junk being uploaded to PyPI.

-Robert

On Fri, Jan 8, 2016 at 3:50 PM, Matthew Brett <matthew.br...@gmail.com>
wrote:

> Hi,
>
> On Fri, Jan 8, 2016 at 11:27 PM, Chris Barker <chris.bar...@noaa.gov>
> wrote:
> > On Fri, Jan 8, 2016 at 1:58 PM, Robert McGibbon <rmcgi...@gmail.com>
> > wrote:
> >>
> >> I'm not sure if this is the right path for numpy or not,
> >
> > probably not -- AFAICT, the PyPA folks aren't interested in solving the
> > problems we have in the scipy community -- we can tweak around the
> > edges, but we won't get there without a commitment to really solve the
> > issues -- and if pip did that, it would essentially be conda -- no one
> > wants to re-implement conda.
>
> Well - as the OP was implying, it really should not be too difficult.
>
> We (here in Berkeley) have discussed how to do this for Linux, including
> (Nathaniel mainly) what would be sensible for pypi to do, in terms of
> platform labels.
>
> Both Anaconda and Canopy build on a base default Linux system so that
> the built binaries will work on many Linux systems.
>
> At the moment, Linux wheels have the platform tag of either linux_i686
> (32-bit) or linux_x86_64 - example filenames:
>
> numpy-1.9.2-cp27-none-linux_i686.whl
> numpy-1.9.2-cp27-none-linux_x86_64.whl
>
> Obviously these platform tags are rather useless, because they don't
> tell you very much about whether this wheel will work on your own
> system.
>
> If we started building Linux wheels on a base system like that of
> Anaconda or Canopy, we might like another platform tag that tells you
> that this wheel is compatible with a wide range of systems. So the job
> of negotiating with distutils-sig is trying to find a good name for this
> base system - we thought that 'manylinux' was a good one - and then put
> in a pull request to pip to recognize 'manylinux' as compatible when
> running pip install from a range of Linux systems.
>
> Cheers,
>
> Matthew
Re: [Numpy-discussion] Should I use pip install numpy in linux?
Well, it's always possible to copy the dependencies like libopenblas.so
into the wheel and fix up the RPATHs, similar to the way the Windows
wheels work. I'm not sure if this is the right path for numpy or not, but
it seems like something that would be suitable for some projects with
compiled extensions. But it's categorically ruled out by the PyPI policy,
IIUC. Perhaps this is OT for this thread, and I should ask on
distutils-sig.

-Robert

On Fri, Jan 8, 2016 at 12:12 PM, Oscar Benjamin
<oscar.j.benja...@gmail.com> wrote:

> On 8 Jan 2016 19:07, "Robert McGibbon" <rmcgi...@gmail.com> wrote:
> >
> > Does anyone know if there's been any movements with the PyPI folks on
> > allowing linux wheels to be uploaded?
> >
> > I know you can never be certain what's provided by the distro, but it
> > seems like if Anaconda can solve the
> > cross-distro-binary-distribution-of-compiled-python-extensions problem,
> > there shouldn't be much technically different for Linux wheels.
>
> Anaconda controls all of the dependent non-Python libraries which are
> outside of the pip/pypi ecosystem. Pip/wheel doesn't have that option
> until such libraries are packaged up for PyPI (e.g. pyopenblas).
>
> --
> Oscar
Re: [Numpy-discussion] Should I use pip install numpy in linux?
> Doesn't building on CentOS 5 also mean using a quite old version of gcc? I have had pretty good luck using the (awesomely named) Holy Build Box <http://phusion.github.io/holy-build-box/>, which is a CentOS 5 docker image with a newer gcc version installed (but I guess the same old libc). I'm not 100% sure how it works, but it's quite nice. For example, you can use c++11 and still keep all the binary compatibility benefits of CentOS 5. -Robert On Fri, Jan 8, 2016 at 7:38 PM, Nathaniel Smith <n...@pobox.com> wrote: > On Fri, Jan 8, 2016 at 7:17 PM, Nathan Goldbaum <nathan12...@gmail.com> > wrote: > > Doesn't building on CentOS 5 also mean using a quite old version of gcc? > > Yes. IIRC CentOS 5 ships with gcc 4.4, and you can bump that up to gcc > 4.8 by using the Redhat Developer Toolset release (which is gcc + > special backport libraries to let it generate RHEL5/CentOS5-compatible > binaries). (I might have one or both of those version numbers slightly > wrong.) > > > I've never tested this, but I've seen claims on the anaconda mailing > list of > > ~25% slowdowns compared to building from source or using system packages, > > which was attributed to building using an older gcc that doesn't > optimize as > > well as newer versions. > > I'd be very surprised if that were a 25% slowdown in general, as > opposed to a 25% slowdown on some particular inner loop that happened > to neatly match some new feature in a new gcc (e.g. something where > the new autovectorizer kicked in). But yeah, in general this is just > an inevitable trade-off when it comes to distributing binaries: you're > always going to pay some penalty for achieving broad compatibility as > compared to artisanally hand-tuned binaries specialized for your > machine's exact OS version, processor, etc. Not much to be done, > really. 
> At some point the baseline for compatibility will switch to "compile
> everything on CentOS 6", and that will be better, but it will still be
> worse than binaries that target CentOS 7, and so on and so forth.
>
> -n
>
> --
> Nathaniel J. Smith -- http://vorpus.org
Re: [Numpy-discussion] Should I use pip install numpy in linux?
> Continuum and Enthought both have a whole list of packages beyond glibc
> that are safe enough to link to, including a bunch of ones that would be
> big pains to statically link everywhere (libX11, etc.). That's the
> useful piece of information that goes beyond just CentOS5 + RH devtools
> + static linking -- can't tell if the "Holy Build Box" has anything like
> that.

Probably-crazy idea: one could reconstruct that list by downloading all of
https://repo.continuum.io/pkgs/free/linux-64/, untarring everything, and
running `ldd` on all of the binaries and .so files. Can't be that hard...
right?

-Robert

On Fri, Jan 8, 2016 at 8:03 PM, Nathaniel Smith <n...@pobox.com> wrote:

> On Fri, Jan 8, 2016 at 7:41 PM, Robert McGibbon <rmcgi...@gmail.com>
> wrote:
> >> Doesn't building on CentOS 5 also mean using a quite old version of
> >> gcc?
> >
> > I have had pretty good luck using the (awesomely named) Holy Build Box,
> > which is a CentOS 5 docker image with a newer gcc version installed
> > (but I guess the same old libc). I'm not 100% sure how it works, but
> > it's quite nice. For example, you can use c++11 and still keep all the
> > binary compatibility benefits of CentOS 5.
>
> They say they have gcc 4.8:
> https://github.com/phusion/holy-build-box#isolated-build-environment-based-on-docker-and-centos-5
> so I bet they're using RH's devtools gcc. This means that it works via
> the labor of some unsung programmers at RH who went through all the
> library changes between gcc 4.4 and 4.8, and put together a version of
> 4.8 that for every important symbol knows whether it's available in the
> old 4.4 libraries or not; for the ones that are, it dynamically links
> them; for the ones that aren't, it has a special static library that it
> pulls them out of. Like sewer cleaning, it's the kind of very
> impressive, incredibly valuable infrastructure work that I'm really glad
> someone does. Someone else who's not me...
>
> Continuum and Enthought both have a whole list of packages beyond glibc
> that are safe enough to link to, including a bunch of ones that would be
> big pains to statically link everywhere (libX11, etc.). That's the
> useful piece of information that goes beyond just CentOS5 + RH devtools
> + static linking -- can't tell if the "Holy Build Box" has anything like
> that.
>
> -n
>
> --
> Nathaniel J. Smith -- http://vorpus.org
Re: [Numpy-discussion] deprecate random.random_integers
On Sun, Jan 3, 2016 at 11:51 PM, G Young <gfyoun...@gmail.com> wrote:
>
> Hello all,
>
> In light of the discussion in #6910, I have gone ahead and deprecated
> random_integers in my most recent PR here. As this is an API change
> (sort of), what are people's thoughts on this deprecation?

I'm reasonably in favor. random_integers() with its closed-interval
convention only exists because it existed in Numeric's RandomArray module.
The closed-interval convention has broadly been considered a mistake
introduced early in the stdlib random module, one that was rectified by
the introduction and promotion of random.randrange() instead.

--
Robert Kern
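The two conventions are easy to see side by side. The stdlib's randint is closed-interval like random_integers; randrange and numpy's randint are half-open (a quick sketch, not part of the original thread):

```python
import random
import numpy as np

# stdlib: randint samples the closed interval [a, b],
#         randrange samples the half-open interval [a, b)
closed = {random.randint(1, 6) for _ in range(5000)}
half_open = {random.randrange(1, 6) for _ in range(5000)}
print(sorted(closed))     # includes the endpoint 6
print(sorted(half_open))  # excludes the endpoint: values are 1..5

# numpy's randint follows the half-open convention, like randrange
vals = np.random.randint(1, 6, size=5000)
assert vals.max() < 6  # the upper bound is never returned
```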
Re: [Numpy-discussion] Numpy funding update
On Wed, Dec 30, 2015 at 10:54 AM, Ralf Gommers <ralf.gomm...@gmail.com>
wrote:
>
> Hi all,
>
> A quick good news message: OSDC has made a $5k contribution to NumFOCUS,
> which is split between support for a women in technology workshop and
> support for Numpy:
> http://www.numfocus.org/blog/osdc-donates-5k-to-support-numpy-women-in-tech
> This was a very nice surprise to me, and a first sign that the FSA
> (fiscal sponsorship agreement) we recently signed with NumFOCUS is going
> to yield significant benefits for Numpy.
>
> NumFOCUS is also doing a special end-of-year fundraiser. Funds donated
> (up to $5k) will be tripled by anonymous sponsors:
> http://www.numfocus.org/blog/numfocus-end-of-year-fundraising-drive-5000-matching-gift-challenge
> So think of Numpy (or your other favorite NumFOCUS-sponsored project of
> course) if you're considering a holiday season charitable gift!

That sounds great! Do we have any concrete plans for spending that money,
yet?

--
Robert Kern
Re: [Numpy-discussion] FeatureRequest: support for array construction from iterators
On Mon, Dec 14, 2015 at 3:56 PM, Benjamin Root <ben.v.r...@gmail.com>
wrote:
> By the way, any reason why this works?
> >>> np.array(xrange(10))
> array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

It's not a generator. It's a true sequence that just happens to have a
special implementation rather than being a generic container.

>>> len(xrange(10))
10
>>> xrange(10)[5]
5

--
Robert Kern
Re: [Numpy-discussion] FeatureRequest: support for array construction from iterators
On Mon, Dec 14, 2015 at 5:41 PM, Benjamin Root <ben.v.r...@gmail.com>
wrote:
>
> Heh, never noticed that. Was it implemented more like a
> generator/iterator in older versions of Python?

No, it predates generators and iterators, so it has always had to be
implemented like that.

--
Robert Kern
[Numpy-discussion] ANN: SfePy 2015.4
I am pleased to announce release 2015.4 of SfePy.

Description
---

SfePy (simple finite elements in Python) is a software for solving systems
of coupled partial differential equations by the finite element method or
by isogeometric analysis (preliminary support). It is distributed under
the new BSD license.

Home page: http://sfepy.org
Mailing list: http://groups.google.com/group/sfepy-devel
Git (source) repository, issue tracker, wiki: http://github.com/sfepy

Highlights of this release
--

- basic support for restart files
- new type of linear combination boundary conditions
- balloon inflation example

For full release notes see http://docs.sfepy.org/doc/release_notes.html#id1
(rather long and technical).

Best regards,
Robert Cimrman on behalf of the SfePy development team

---

Contributors to this release in alphabetical order:

Robert Cimrman
Grant Stephens
Re: [Numpy-discussion] reshaping array question
On Tue, Nov 17, 2015 at 3:48 PM, Neal Becker <ndbeck...@gmail.com> wrote:
>
> I have an array of shape
> (7, 24, 2, 1024)
>
> I'd like an array of
> (7, 24, 2048)
>
> such that the elements on the last dimension are interleaving the
> elements from the 3rd dimension
>
> [0,0,0,0] -> [0,0,0]
> [0,0,1,0] -> [0,0,1]
> [0,0,0,1] -> [0,0,2]
> [0,0,1,1] -> [0,0,3]
> ...
>
> What might be the simplest way to do this?

np.transpose(A, (-2, -1)).reshape(A.shape[:-2] + (-1,))

> A different question, suppose I just want to stack them
>
> [0,0,0,0] -> [0,0,0]
> [0,0,0,1] -> [0,0,1]
> [0,0,0,2] -> [0,0,2]
> ...
> [0,0,1,0] -> [0,0,1024]
> [0,0,1,1] -> [0,0,1025]
> [0,0,1,2] -> [0,0,1026]
> ...

A.reshape(A.shape[:-2] + (-1,))

--
Robert Kern
Re: [Numpy-discussion] reshaping array question
On Nov 17, 2015 6:53 PM, "Sebastian Berg" <sebast...@sipsolutions.net>
wrote:
>
> On Di, 2015-11-17 at 13:49 -0500, Neal Becker wrote:
> > Robert Kern wrote:
> >
> > > On Tue, Nov 17, 2015 at 3:48 PM, Neal Becker <ndbeck...@gmail.com>
> > > wrote:
> > >>
> > >> I have an array of shape
> > >> (7, 24, 2, 1024)
> > >>
> > >> I'd like an array of
> > >> (7, 24, 2048)
> > >>
> > >> such that the elements on the last dimension are interleaving the
> > >> elements from the 3rd dimension
> > >>
> > >> [0,0,0,0] -> [0,0,0]
> > >> [0,0,1,0] -> [0,0,1]
> > >> [0,0,0,1] -> [0,0,2]
> > >> [0,0,1,1] -> [0,0,3]
> > >> ...
> > >>
> > >> What might be the simplest way to do this?
> > >
> > > np.transpose(A, (-2, -1)).reshape(A.shape[:-2] + (-1,))
> >
> > I get an error on that 1st transpose:
>
> Transpose needs a slightly different input. If you look at the help, it
> should be clear. The help might also point to np.swapaxes, which may be
> a bit more straightforward for this exact case.

Sorry about that. Was in a rush and working from a faulty memory.
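For the record, a corrected version of the answer using np.swapaxes, as Sebastian suggests (a small sketch, not from the original thread):

```python
import numpy as np

A = np.arange(7 * 24 * 2 * 1024).reshape(7, 24, 2, 1024)

# Interleave: [i, j, k, l] -> [i, j, 2*l + k]
B = A.swapaxes(-2, -1).reshape(A.shape[:-2] + (-1,))
assert B.shape == (7, 24, 2048)
assert B[0, 0, 0] == A[0, 0, 0, 0]
assert B[0, 0, 1] == A[0, 0, 1, 0]
assert B[0, 0, 2] == A[0, 0, 0, 1]
assert B[0, 0, 3] == A[0, 0, 1, 1]

# Stack: [i, j, k, l] -> [i, j, 1024*k + l] (a plain reshape suffices)
C = A.reshape(A.shape[:-2] + (-1,))
assert C[0, 0, 1024] == A[0, 0, 1, 0]
```

Note that np.swapaxes takes the two axes as separate arguments, whereas np.transpose expects a full permutation of all axes, which is why the original one-liner raised an error.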